Recommender system
== Approaches ==

=== Collaborative filtering ===
{{Main|Collaborative filtering}}
[[File: Collaborative filtering.gif|thumb|An example of collaborative filtering based on a rating system]]
One widely used approach to the design of recommender systems is [[collaborative filtering]].<ref name="Breese98">{{cite conference |author1=John S. Breese |author2=David Heckerman |author3=Carl Kadie |name-list-style=amp |year = 1998 |title = Empirical analysis of predictive algorithms for collaborative filtering |conference = In Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence (UAI'98) |arxiv=1301.7363 }} </ref> Collaborative filtering is based on the assumption that people who agreed in the past will agree in the future, and that they will like items similar to those they liked in the past. The system generates recommendations using only information about rating profiles for different users or items. By locating peer users or items with a rating history similar to that of the current user or item, it generates recommendations from this neighborhood. Collaborative filtering methods are classified as memory-based and model-based.
A well-known example of memory-based approaches is the user-based algorithm,<ref>{{cite report|author1=Breese, John S.|author2=Heckerman, David|author3=Kadie, Carl|url=http://research.microsoft.com/pubs/69656/tr-98-12.pdf|title=Empirical Analysis of Predictive Algorithms for Collaborative Filtering|year=1998|publisher=Microsoft Research}}</ref> while that of model-based approaches is [[matrix factorization (recommender systems)|matrix factorization]].<ref>{{cite journal|title=Matrix Factorization Techniques for Recommender Systems|doi=10.1109/MC.2009.263|volume=42|journal= Computer|pages=30–37|date=2009-08-01|last1=Koren|first1=Yehuda|last2=Volinsky|first2=Chris|issue=8 |citeseerx=10.1.1.147.8295|s2cid=58370896 }}</ref>

A key advantage of the collaborative filtering approach is that it does not rely on machine-analyzable content, so it can accurately recommend complex items such as movies without requiring an "understanding" of the item itself.

Many algorithms have been used to measure user similarity or item similarity in recommender systems. Examples include the [[k-nearest neighbors algorithm|k-nearest neighbor]] (k-NN) approach<ref>{{cite web |last1 = Sarwar |first1 = B. |last2 = Karypis |first2 = G. |last3 = Konstan |first3 = J. |last4 = Riedl |first4 = J. |year = 2000 |url = http://glaros.dtc.umn.edu/gkhome/node/122 |title = Application of Dimensionality Reduction in Recommender System A Case Study }}</ref> and the [[Pearson correlation]], as first implemented by Allen.<ref>{{citation |mode=cs1 | last = Allen | first = R.B. | date=1990 | title=User Models: Theory, Method, Practice| publisher = International J. Man-Machine Studies}}</ref>

When building a model from a user's behavior, a distinction is often made between explicit and [[implicit data collection|implicit]] forms of [[data collection]].

Examples of explicit data collection include the following:
* Asking a user to rate an item on a sliding scale.
* Asking a user to search.
* Asking a user to rank a collection of items from favorite to least favorite.
* Presenting two items to a user and asking them to choose the better one.
* Asking a user to create a list of items that they like (see ''[[Rocchio algorithm|Rocchio classification]]'' or other similar techniques).

Examples of [[implicit data collection]] include the following:
* Observing the items that a user views in an online store.
* Analyzing item/user viewing times.<ref>{{Cite conference |last1 = Parsons |first1 = J. |last2 = Ralph |first2 = P. |last3 = Gallagher |first3 = K. |date = July 2004 |title = Using viewing time to infer user preference in recommender systems |conference = AAAI Workshop in Semantic Web Personalization, San Jose, California }}</ref>
* Keeping a record of the items that a user purchases online.
* Obtaining a list of items that a user has listened to or watched on their computer.
* Analyzing the user's social network and discovering similar likes and dislikes.

Collaborative filtering approaches often suffer from three problems: [[Cold start (computing)|cold start]], scalability, and sparsity.<ref name="Lee2007">Sanghack Lee and Jihoon Yang and Sung-Yong Park, [https://books.google.com/books?id=u4qzlZAEjegC&dq=sparsity+problem+content-based&pg=PA396 Discovery of Hidden Similarity on Collaborative Filtering to Overcome Sparsity Problem], Discovery Science, 2007.</ref>
* '''Cold start''': For a new user or item, there is not enough data to make accurate recommendations.
*: One commonly implemented solution to this problem is the [[Multi-armed bandit|multi-armed bandit algorithm]].<ref>{{Cite book|last1=Felício|first1=Crícia Z.|last2=Paixão|first2=Klérisson V.R.|last3=Barcelos|first3=Celia A.Z.|last4=Preux|first4=Philippe|title=Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization |chapter=A Multi-Armed Bandit Model Selection for Cold-Start User Recommendation |date=2017-07-09|chapter-url=https://doi.org/10.1145/3079628.3079681|series=UMAP '17|location=Bratislava, Slovakia|publisher=Association for Computing Machinery|pages=32–40|doi=10.1145/3079628.3079681|isbn=978-1-4503-4635-1|s2cid=653908|url=https://hal.inria.fr/hal-01517967/file/umap2017.4hal.pdf }}</ref><ref name=":3" /><ref name="rubens2016"/><ref name="elahi2016"/><ref name="bi2017"/>
* '''Scalability''': Many of the environments in which these systems make recommendations have millions of users and products. Thus, a large amount of computing power is often necessary to calculate recommendations.
* '''Sparsity''': The number of items sold on major e-commerce sites is extremely large. Even the most active users will have rated only a small subset of the overall database. Thus, even the most popular items have very few ratings.
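
The memory-based, user-based approach described above can be illustrated with a minimal Python sketch. It is not taken from any of the cited works: the rating data, the function names <code>pearson</code> and <code>predict</code>, the neighborhood size <code>k</code>, and the choice to ignore negatively correlated neighbors are all assumptions made for illustration. Similarity is measured with the Pearson correlation over co-rated items, and a rating is predicted from the ''k'' most similar users who rated the target item.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
RATINGS = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 2, "C": 5, "D": 4},
    "carol": {"A": 1, "B": 5, "D": 2},
    "dave":  {"A": 5, "B": 2, "C": 4, "D": 5},
}


def pearson(ratings, u, v):
    """Pearson correlation of users u and v over the items both have rated."""
    common = sorted(set(ratings[u]) & set(ratings[v]))
    if len(common) < 2:
        return 0.0  # too little overlap to measure agreement
    mean_u = sum(ratings[u][i] for i in common) / len(common)
    mean_v = sum(ratings[v][i] for i in common) / len(common)
    num = sum((ratings[u][i] - mean_u) * (ratings[v][i] - mean_v) for i in common)
    den_u = sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in common))
    den_v = sqrt(sum((ratings[v][i] - mean_v) ** 2 for i in common))
    return num / (den_u * den_v) if den_u and den_v else 0.0


def predict(ratings, user, item, k=2):
    """Predict a rating from the k most similar users who rated the item.

    The prediction is the user's mean rating plus the similarity-weighted
    average of the neighbors' deviations from their own mean ratings.
    """
    def mean(name):
        return sum(ratings[name].values()) / len(ratings[name])

    neighbors = [
        (pearson(ratings, user, other), other)
        for other in ratings
        if other != user and item in ratings[other]
    ]
    # Assumption of this sketch: only positively correlated neighbors are used.
    neighbors = sorted((n for n in neighbors if n[0] > 0), reverse=True)[:k]
    if not neighbors:
        return mean(user)  # fall back to the user's own average
    num = sum(sim * (ratings[other][item] - mean(other)) for sim, other in neighbors)
    den = sum(sim for sim, _ in neighbors)
    return mean(user) + num / den


if __name__ == "__main__":
    print(predict(RATINGS, "alice", "D"))
```

Because every prediction compares the target user with all other users, the cost grows with the number of users and items, which is the scalability problem listed above. With very few co-rated items the correlation falls back to zero, which is how sparsity and cold start show up in this sketch.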

One of the most famous examples of collaborative filtering is item-to-item collaborative filtering (people who buy x also buy y), an algorithm popularized by [[Amazon.com]]'s recommender system.<ref name="patft.uspto.gov">[http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=/netahtml/PTO/search-bool.html&r=1&f=G&l=50&co1=AND&d=PTXT&s1=6,266,649.PN.&OS=PN/6,266,649&RS=PN/6,266,649 Collaborative Recommendations Using Item-to-Item Similarity Mappings] {{webarchive|url=https://web.archive.org/web/20150316185024/http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&co1=AND&d=PTXT&s1=6%2C266%2C649.PN.&OS=PN%2F6%2C266%2C649&RS=PN%2F6%2C266%2C649 |date=2015-03-16 }}</ref>

Many [[social networks]] originally used collaborative filtering to recommend new friends, groups, and other social connections by examining the network of connections between a user and their friends.<ref name="handbook">{{cite book |last1=Ricci |first1=Francesco |last2=Rokach |first2=Lior |last3=Shapira |first3=Bracha |editor1-last=Ricci |editor1-first=Francesco |editor2-last=Rokach |editor2-first=Lior |editor3-last=Shapira |editor3-first=Bracha |title=Recommender Systems Handbook |date=2022 |publisher=Springer |location=New York |isbn=978-1-0716-2196-7 |edition=3 | chapter=Recommender Systems: Techniques, Applications, and Challenges |chapter-url = https://link.springer.com/chapter/10.1007/978-1-0716-2197-4_1 |doi=10.1007/978-1-0716-2197-4_1|pages=1–35}}</ref> Collaborative filtering is still used as part of hybrid systems.

=== Content-based filtering ===
Another common approach when designing recommender systems is '''content-based filtering'''.
Content-based filtering methods are based on a description of the item and a profile of the user's preferences.<ref name="Aggarwal16Book">{{cite book|last1=Aggarwal|first1=Charu C.|title=Recommender Systems: The Textbook|date=2016|publisher=Springer|isbn=978-3-319-29657-9}}</ref><ref>{{cite book|author=Peter Brusilovsky|title=The Adaptive Web|url=https://archive.org/details/adaptivewebmetho00brus|url-access=limited|year=2007|isbn=978-3-540-72078-2|page=[https://archive.org/details/adaptivewebmetho00brus/page/n331 325]|publisher=Springer |author-link=Peter Brusilovsky}}</ref> These methods are best suited to situations where there is known data on an item (name, location, description, etc.), but not on the user. Content-based recommenders treat recommendation as a user-specific classification problem and learn a classifier for the user's likes and dislikes based on an item's features.

In this system, keywords are used to describe the items, and a [[user profile]] is built to indicate the type of item this user likes. In other words, these algorithms try to recommend items similar to those that a user liked in the past or is examining in the present. This approach does not rely on a user sign-in mechanism to generate this often temporary profile. In particular, various candidate items are compared with items previously rated by the user, and the best-matching items are recommended. This approach has its roots in [[information retrieval]] and [[information filtering]] research.

To create a [[user profile]], the system mostly focuses on two types of information:
# A model of the user's preference.
# A history of the user's interaction with the recommender system.

These methods use an item profile (i.e., a set of discrete attributes and features) characterizing the item within the system. To abstract the features of the items in the system, an item representation algorithm is applied.
A widely used algorithm is the [[tf–idf]] representation (also called vector space representation).<ref>{{cite journal |doi=10.1016/j.knosys.2018.05.001 |doi-access=free|title=A content-based recommender system for computer science publications|year=2018|last1=Wang|first1=Donghui|last2=Liang|first2=Yanchun|last3=Xu|first3=Dong|last4=Feng|first4=Xiaoyue|last5=Guan|first5=Renchu|journal=Knowledge-Based Systems|volume=157|pages=1–9}}</ref> The system creates a content-based profile of users based on a weighted vector of item features. The weights denote the importance of each feature to the user and can be computed from individually rated content vectors using a variety of techniques. Simple approaches use the average values of the rated item vectors, while more sophisticated methods use machine learning techniques such as [[Naive Bayes classifier|Bayesian classifiers]], [[cluster analysis]], [[decision trees]], and [[artificial neural networks]] to estimate the probability that the user is going to like the item.<ref>{{cite news|title=Online Recommender Systems – How Does a Website Know What I Want?|url=http://blogs.ams.org/mathgradblog/2015/05/25/online-recommender-systems-website-want/|last=Blanda, Stephanie|work=American Mathematical Society|date=May 25, 2015|access-date=October 31, 2016}}</ref>

A key issue with content-based filtering is whether the system can learn user preferences from users' actions regarding one content source and use them across other content types. When the system is limited to recommending content of the same type as the user is already using, the value of the recommendation system is significantly lower than when content types from other services can also be recommended. For example, recommending news articles based on news browsing is useful. It would be much more useful, however, if music, videos, products, discussions, etc., from different services could be recommended based on news browsing.
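
The tf–idf representation and the "average of rated item vectors" profile described above can be sketched in Python as follows. This is an illustrative sketch only: the item texts, the function names, the particular tf–idf weighting (term frequency times the logarithm of inverse document frequency), and the use of cosine similarity for matching are assumptions, not details taken from the cited sources.

```python
import math
from collections import Counter

# Hypothetical item descriptions; each item is reduced to a bag of keywords.
ITEMS = {
    "doc1": "space rocket launch orbit",
    "doc2": "rocket engine test launch",
    "doc3": "recipe pasta tomato sauce",
    "doc4": "orbit satellite space station",
}


def tfidf_vectors(items):
    """Map each item to a sparse {term: tf-idf weight} vector."""
    tokenized = {name: text.split() for name, text in items.items()}
    n_items = len(tokenized)
    # Document frequency: the number of items in which each term occurs.
    df = Counter(term for words in tokenized.values() for term in set(words))
    vectors = {}
    for name, words in tokenized.items():
        tf = Counter(words)
        vectors[name] = {
            term: (count / len(words)) * math.log(n_items / df[term])
            for term, count in tf.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profile(vectors, liked):
    """User profile as the average of the vectors of the liked items."""
    profile = {}
    for name in liked:
        for term, weight in vectors[name].items():
            profile[term] = profile.get(term, 0.0) + weight / len(liked)
    return profile


def recommend(vectors, liked):
    """Rank the items the user has not yet liked by similarity to the profile."""
    profile = build_profile(vectors, liked)
    candidates = [name for name in vectors if name not in liked]
    return sorted(candidates, key=lambda n: cosine(profile, vectors[n]), reverse=True)


if __name__ == "__main__":
    print(recommend(tfidf_vectors(ITEMS), ["doc1", "doc2"]))
```

In this sketch an item that shares no keywords with the profile gets a similarity of zero, which illustrates why a profile learned from one content type carries over poorly to items described with a different vocabulary.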
To overcome the limitation of recommending only within a single content type, most content-based recommender systems now use some form of hybrid system.

Content-based recommender systems can also include opinion-based recommender systems. In some cases, users are allowed to leave text reviews or feedback on the items. These user-generated texts are implicit data for the recommender system because they are potentially rich sources of both the features or aspects of the item and the users' evaluation of, or sentiment toward, the item. Features extracted from user-generated reviews serve as improved [[metadata]] for items: like metadata, they reflect aspects of the item, and they are also the aspects that users widely care about. Sentiments extracted from the reviews can be seen as users' rating scores on the corresponding features. Popular approaches to opinion-based recommender systems use various techniques, including [[text mining]], [[information retrieval]], [[sentiment analysis]] (see also [[Multimodal sentiment analysis]]) and [[deep learning]].<ref>X.Y. Feng, H. Zhang, Y.J. Ren, P.H. Shang, Y. Zhu, Y.C. Liang, R.C. Guan, D. Xu, (2019), "[https://www.jmir.org/2019/5/e12957/ The Deep Learning–Based Recommender System "Pubmender" for Choosing a Biomedical Publication Venue: Development and Validation Study]", ''[[Journal of Medical Internet Research]]'', 21 (5): e12957</ref>

=== Hybrid recommendations approaches ===
Most recommender systems now use a hybrid approach, combining [[collaborative filtering]], content-based filtering, and other approaches. There is no reason why several different techniques of the same type could not be hybridized.
Hybrid approaches can be implemented in several ways: by making content-based and collaborative-based predictions separately and then combining them; by adding content-based capabilities to a collaborative-based approach (and vice versa); or by unifying the approaches into one model.<ref name="Toward the Next Generation of Recommender Systems" /> Several studies have empirically compared the performance of hybrid methods with that of pure collaborative and content-based methods and demonstrated that hybrid methods can provide more accurate recommendations than pure approaches. These methods can also be used to overcome some of the common problems in recommender systems, such as cold start and the sparsity problem, as well as the knowledge engineering bottleneck in [[Knowledge base|knowledge-based]] approaches.<ref>Rinke Hoekstra, [http://www.semantic-web-journal.net/sites/default/files/swj32.pdf The Knowledge Reengineering Bottleneck], Semantic Web – Interoperability, Usability, Applicability 1 (2010) 1, IOS Press</ref>

[[Netflix]] is a good example of the use of hybrid recommender systems.<ref>{{cite journal|last1=Gomez-Uribe|first1=Carlos A.|last2=Hunt|first2=Neil|title=The Netflix Recommender System|journal=ACM Transactions on Management Information Systems|date=28 December 2015|volume=6|issue=4|pages=1–19|doi=10.1145/2843948|doi-access=free}}</ref> The website makes recommendations by comparing the watching and searching habits of similar users (i.e., collaborative filtering) as well as by offering movies that share characteristics with films that a user has rated highly (content-based filtering).

Some hybridization techniques include:
*'''Weighted''': The scores of different recommendation components are combined numerically.
*'''Switching''': The system chooses among recommendation components and applies the selected one.
*'''Mixed''': Recommendations from different recommenders are presented together.
*'''Cascade''': Recommenders are given strict priority, with the lower priority ones breaking ties in the scoring of the higher ones.
*'''Meta-level''': One recommendation technique is applied and produces some sort of model, which is then the input used by the next technique.<ref name=hybrids>Robin Burke, [http://www.dcs.warwick.ac.uk/~acristea/courses/CS411/2010/Book%20-%20The%20Adaptive%20Web/HybridWebRecommenderSystems.pdf Hybrid Web Recommender Systems] {{Webarchive|url=https://web.archive.org/web/20140912085014/http://www.dcs.warwick.ac.uk/~acristea/courses/CS411/2010/Book%20-%20The%20Adaptive%20Web/HybridWebRecommenderSystems.pdf |date=2014-09-12 }}, pp. 377–408, The Adaptive Web, Peter Brusilovsky, Alfred Kobsa, Wolfgang Nejdl (Eds.), Lecture Notes in Computer Science, Vol. 4321, Springer-Verlag, Berlin, Germany, May 2007, ISBN 978-3-540-72078-2.</ref>
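
Two of the hybridization techniques listed above, weighted and switching, can be sketched in a few lines of Python. This is a hedged illustration and not the scheme of any particular system: the function names, the mixing weight <code>alpha</code>, the <code>min_ratings</code> threshold, and the rule "use collaborative scores only once the user has enough ratings" are all hypothetical choices.

```python
def weighted_hybrid(cf_scores, cb_scores, alpha=0.7):
    """Weighted hybrid: blend two components' scores numerically.

    cf_scores and cb_scores map item -> score from a collaborative and a
    content-based component. alpha is a hypothetical mixing weight; an item
    missing from one component is treated as scoring 0.0 there.
    """
    items = set(cf_scores) | set(cb_scores)
    return {
        item: alpha * cf_scores.get(item, 0.0) + (1 - alpha) * cb_scores.get(item, 0.0)
        for item in items
    }


def switching_hybrid(cf_scores, cb_scores, n_user_ratings, min_ratings=5):
    """Switching hybrid: select one component and apply it alone.

    Assumed rule for this sketch: rely on collaborative scores only when the
    user has at least min_ratings ratings, otherwise use content-based scores.
    """
    return cf_scores if n_user_ratings >= min_ratings else cb_scores


def top_n(scores, n=3):
    """Return the n highest-scoring items, breaking ties alphabetically."""
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


if __name__ == "__main__":
    cf = {"x": 1.0, "y": 0.0}
    cb = {"x": 0.0, "y": 0.5, "z": 1.0}
    print(top_n(weighted_hybrid(cf, cb, alpha=0.5)))
    print(top_n(switching_hybrid(cf, cb, n_user_ratings=1)))
```

The switching rule in this sketch reflects the point made above about cold start: a new user has too few ratings for collaborative filtering, so the content-based component is used until more ratings are available.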