Recommendation system
Recommender systems (also called recommendation systems, recommendation engines, recommendation frameworks, or recommendation platforms) are a specific type of information filtering system that attempts to recommend information items (movies, TV programs, video on demand, music, books, news, images, web pages, scientific literature such as research papers, etc.) or social elements (e.g. people, events or groups) that are likely to be of interest to the user.

A recommender system helps users who lack the competence to evaluate the potentially overwhelming number of alternatives on their own. In their simplest form, recommender systems provide a personalized, ranked list of items by predicting which items are most suitable, based on the user's history, preferences and constraints.

Typically, a recommender system compares a user profile to some reference characteristics and seeks to predict the 'rating' or 'preference' that a user would give to an item they have not yet considered. These characteristics may come from the information item (the content-based approach) or from the user's social environment (the collaborative filtering approach).

Overview

When building the user's profile, a distinction is made between explicit and implicit forms of data collection.

Examples of explicit data collection include the following:
  • Asking a user to rate an item on a sliding scale.
  • Asking a user to rank a collection of items from favorite to least favorite.
  • Presenting two items to a user and asking them to choose the better one.
  • Asking a user to create a list of items that they like.


Examples of implicit data collection include the following:
  • Observing the items that a user views in an online store.
  • Analyzing item/user viewing times.
  • Keeping a record of the items that a user purchases online.
  • Obtaining a list of items that a user has listened to or watched on their computer.
  • Analyzing the user's social network and discovering similar likes and dislikes.


The recommender system compares the collected data to similar and dissimilar data collected from others and calculates a list of recommended items for the user. Several commercial and non-commercial examples are listed in the article on collaborative filtering systems. Montaner provides the first overview of recommender systems, from an intelligent agents perspective. Adomavicius provides a more recent overview of recommender systems. Herlocker provides an overview of evaluation techniques for recommender systems.

Recommender systems are a useful alternative to search algorithms since they help users discover items they might not have found by themselves. Recommender systems are often implemented using search engines that index non-traditional data.

Algorithms

One of the most commonly used algorithms in recommender systems is the k-nearest neighbor (k-NN) approach. In a social network, a particular user's neighborhood of users with similar tastes or interests can be found by calculating the Pearson correlation between their ratings. The user's preference for an item can then be predicted from the preference data of the top-N nearest neighbors, weighted by their similarity to the user.
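
The neighborhood approach described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the toy `ratings` data, the function names, the choice to keep only positively correlated neighbors, and the mean-centered weighting are all assumptions made for the example.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two users over the items both have rated."""
    common = [item for item in a if item in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * sqrt(
        sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def mean_rating(user_ratings):
    return sum(user_ratings.values()) / len(user_ratings)


def predict(ratings, user, item, n=2):
    """Predict a rating from the top-n most similar users who rated the item."""
    neighbors = []
    for other, other_ratings in ratings.items():
        if other != user and item in other_ratings:
            sim = pearson(ratings[user], other_ratings)
            if sim > 0:  # assumption: ignore dissimilar users
                neighbors.append((sim, other))
    top = sorted(neighbors, reverse=True)[:n]
    base = mean_rating(ratings[user])
    if not top:
        return base  # no usable neighbors: fall back to the user's own mean
    # Similarity-weighted average of each neighbor's deviation from their mean.
    num = sum(sim * (ratings[o][item] - mean_rating(ratings[o])) for sim, o in top)
    den = sum(sim for sim, _ in top)
    return base + num / den


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ann": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 2, "C": 3, "D": 4},
    "cat": {"A": 1, "B": 5, "C": 2, "D": 1},
}

print(predict(ratings, "ann", "D"))
```

In this toy data only bob correlates positively with ann, so the prediction for item D is driven by bob's rating alone.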

Collaborative Filtering

Another family of algorithms that is widely used in recommender systems is collaborative filtering. Collaborative filtering methods are based on collecting and analyzing a large amount of information on users' behavior, activities or preferences and predicting what users will like based on their similarity to other users. One of the most common types of collaborative filtering is item-to-item collaborative filtering (people who buy x also buy y), an algorithm popularized by Amazon.com's recommender system. User-based collaborative filtering attempts to model the social process of asking a friend for a recommendation. A particular type of collaborative filtering algorithm uses matrix factorization, a low-rank matrix approximation technique. A key advantage of the collaborative filtering approach is that it does not rely on machine-analyzable content and is therefore capable of accurately recommending complex items such as movies without requiring an "understanding" of the item itself.
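
The "people who buy x also buy y" idea can be illustrated with a small Python sketch. It is an assumption-laden toy, not a description of any production system: the `baskets` data and function names are hypothetical, and cosine similarity over sets of buyers is only one of several possible item-to-item similarity measures.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(baskets):
    """Cosine similarity between items, based on which users bought both."""
    buyers = defaultdict(set)
    for user, items in baskets.items():
        for item in items:
            buyers[item].add(user)
    sims = defaultdict(dict)
    for x in buyers:
        for y in buyers:
            if x != y:
                overlap = len(buyers[x] & buyers[y])
                if overlap:
                    sims[x][y] = overlap / sqrt(len(buyers[x]) * len(buyers[y]))
    return sims


def also_bought(sims, item, n=3):
    """Items most similar to the given one, best first (ties broken by name)."""
    ranked = sorted(sims.get(item, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:n]]


# Hypothetical purchase histories.
baskets = {
    "u1": {"book", "lamp"},
    "u2": {"book", "lamp", "pen"},
    "u3": {"book", "pen"},
    "u4": {"lamp"},
}

sims = item_similarities(baskets)
print(also_bought(sims, "book"))
```

Note that no item attributes are used anywhere: the similarity comes entirely from user behavior, which is the advantage the paragraph above describes.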

Building user profiles using collaborative filtering can be problematic from a privacy point of view. Many European countries have a strong culture of data privacy, and any attempt to introduce user profiling can result in a negative customer response.

Content Based Filtering

Another family of algorithms that is widely used in recommender systems is content-based filtering. Content-based filtering methods are based on information about the items that are going to be recommended. In other words, these algorithms try to recommend items similar to those that a user liked in the past. In particular, various candidate items are compared with items previously rated by the user, and the best-matching items are recommended. This approach has its roots in information retrieval and information filtering research. These methods use an item profile, i.e. a set of attributes (features) characterizing the item within the system. The system creates a content-based profile of each user as a weighted vector of item features. The weights denote the importance of each feature to the user and can be computed from individually rated content vectors using a variety of techniques. Simple approaches use the average values of the rated item vectors, while more sophisticated methods use Bayesian classifiers (and other machine learning techniques, including clustering, decision trees, and artificial neural networks) to estimate the probability that the user is going to like the item.
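
The simple averaging approach mentioned above might look like the following Python sketch. The feature vectors, film names, and the use of cosine similarity to match candidates against the profile are assumptions made for illustration; the sketch also assumes all ratings are positive.

```python
from math import sqrt


def build_profile(item_features, user_ratings):
    """Rating-weighted average of the feature vectors of the rated items."""
    dims = len(next(iter(item_features.values())))
    total = sum(user_ratings.values())  # assumed to be greater than zero
    profile = [0.0] * dims
    for item, rating in user_ratings.items():
        for d, value in enumerate(item_features[item]):
            profile[d] += rating * value / total
    return profile


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(item_features, user_ratings, n=2):
    """Rank unrated items by how closely they match the user's profile."""
    profile = build_profile(item_features, user_ratings)
    candidates = [i for i in item_features if i not in user_ratings]
    return sorted(
        candidates, key=lambda i: cosine(profile, item_features[i]), reverse=True
    )[:n]


# Hypothetical item profiles: [action, comedy, romance].
item_features = {
    "Film A": [1, 0, 0],
    "Film B": [0, 1, 1],
    "Film C": [1, 1, 0],
    "Film D": [0, 0, 1],
}
user_ratings = {"Film A": 5, "Film B": 1}

print(recommend(item_features, user_ratings))
```

Because the user rated the action film highly, the profile is dominated by the action feature, and the unrated film that shares it ranks first.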

Hybrid Recommender Systems

Recent research has demonstrated that a hybrid approach, combining collaborative filtering and content-based filtering, could be more effective in some cases. Hybrid approaches can be implemented in several ways: by making content-based and collaborative-based predictions separately and then combining them; by adding content-based capabilities to a collaborative-based approach (and vice versa); or by unifying the approaches into one model. Several studies empirically compare the performance of hybrid methods with pure collaborative and content-based methods and demonstrate that hybrid methods can provide more accurate recommendations than pure approaches. These methods can also be used to overcome some of the common problems in recommender systems, such as cold start and the sparsity problem.
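
The first hybrid strategy, making the two predictions separately and then combining them, can be sketched as follows. The function name, the linear weighting scheme, and the `full_trust_at` threshold are hypothetical choices for illustration, not taken from any particular system.

```python
def hybrid_score(cf_score, content_score, n_ratings, full_trust_at=20):
    """Blend a collaborative and a content-based prediction.

    The collaborative score gets more weight as the item accumulates
    ratings. With no collaborative score at all (cold start), the
    content-based score is used on its own.
    """
    if cf_score is None:
        return content_score
    weight = min(n_ratings / full_trust_at, 1.0)
    return weight * cf_score + (1 - weight) * content_score


print(hybrid_score(None, 3.5, 0))   # new item: content-based only
print(hybrid_score(4.0, 3.0, 10))   # partly trusted collaborative score
print(hybrid_score(4.0, 3.0, 50))   # well-rated item: collaborative only
```

Falling back to the content-based score when no ratings exist is one simple way a hybrid can soften the cold start problem.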

The Netflix Prize

The Netflix Prize, a contest with a dataset of over 100 million movie ratings and a grand prize of $1,000,000, energized the search for new and more accurate algorithms from 2006 to 2009, when the grand prize was awarded. A planned follow-up contest was cancelled in 2010 due to privacy concerns and a lawsuit regarding the sharing of user data. The most accurate algorithm in 2007 used 107 different algorithmic approaches, blended into a single prediction:


Predictive accuracy is substantially improved when blending multiple predictors. Our experience is that most efforts should be concentrated in deriving substantially different approaches, rather than refining a single technique. Consequently, our solution is an ensemble of many methods.
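
The effect of blending can be illustrated with a toy Python example. The numbers below are invented for the illustration and the equal-weight average is an assumption; real ensembles typically fit the blend weights on held-out data.

```python
from math import sqrt


def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def blend(*predictors):
    """Equal-weight average of several predictors' outputs."""
    return [sum(values) / len(values) for values in zip(*predictors)]


# Hypothetical true ratings and two predictors that err in different ways.
actual = [4.0, 2.0, 5.0, 3.0]
predictor_1 = [4.5, 2.5, 4.0, 3.5]
predictor_2 = [3.5, 2.0, 5.5, 2.0]

blended = blend(predictor_1, predictor_2)

print(rmse(predictor_1, actual))
print(rmse(predictor_2, actual))
print(rmse(blended, actual))
```

Here the blended error is lower than either individual error because the two predictors make partly opposite mistakes, which is why substantially different approaches are worth combining.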

See also

  • Rating site
  • Cold start
  • Collaborative filtering
  • Collective intelligence
  • Content Discovery Platform
  • Enterprise bookmarking
  • Personalized marketing
  • Preference elicitation
  • Product finders

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 