=== Collaborative filtering ===
{{Main|Collaborative filtering}}
[[File: Collaborative filtering.gif|thumb|An example of collaborative filtering based on a rating system]]
One widely used approach to the design of recommender systems is [[collaborative filtering]].<ref name="Breese98">{{cite conference |author1=John S. Breese |author2=David Heckerman |author3=Carl Kadie  |name-list-style=amp |year = 1998 |title = Empirical analysis of predictive algorithms for collaborative filtering |conference =  In Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence (UAI'98) |arxiv=1301.7363 }}</ref> Collaborative filtering is based on the assumption that people who agreed in the past will agree in the future, and that they will like similar kinds of items as they liked in the past. The system generates recommendations using only information about rating profiles for different users or items. By locating peer users or items with a rating history similar to that of the current user or item, it generates recommendations from this neighborhood. Collaborative filtering methods are classified as memory-based and model-based. A well-known example of a memory-based approach is the user-based algorithm,<ref>{{cite report|author1=Breese, John S.|author2=Heckerman, David|author3=Kadie, Carl|url=http://research.microsoft.com/pubs/69656/tr-98-12.pdf|title=Empirical Analysis of Predictive Algorithms for Collaborative Filtering|year=1998|publisher=Microsoft Research}}</ref> while a well-known example of a model-based approach is [[matrix factorization (recommender systems)|matrix factorization]].<ref>{{cite journal|title=Matrix Factorization Techniques for Recommender Systems|doi=10.1109/MC.2009.263|volume=42|journal= Computer|pages=30–37|date=2009-08-01|last1=Koren|first1=Yehuda|last2=Volinsky|first2=Chris|issue=8 |citeseerx=10.1.1.147.8295|s2cid=58370896 }}</ref>
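
The neighborhood idea described above can be illustrated with a minimal sketch of a user-based (memory-based) predictor. The toy ratings, the function names <code>cosine_similarity</code> and <code>predict</code>, and the choice of cosine similarity with a similarity-weighted average are all assumptions made for illustration; they are not taken from the cited sources, and practical systems typically add mean-centering and other refinements.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 5, "B": 3, "C": 4, "D": 5},
    "carol": {"A": 1, "B": 5, "D": 2},
    "dave":  {"A": 4, "B": 3, "C": 5, "D": 4},
}

def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict(user, item, k=2):
    """Similarity-weighted average of the ratings that the k most
    similar users (the neighborhood) gave to the item."""
    neighbors = [
        (cosine_similarity(ratings[user], other_ratings), other_ratings[item])
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    neighbors.sort(key=lambda pair: pair[0], reverse=True)
    top = neighbors[:k]
    total = sum(sim for sim, _ in top)
    if total == 0:
        return None  # no usable neighbors rated this item
    return sum(sim * rating for sim, rating in top) / total

# Predict how "alice" might rate item "D", which she has not rated.
print(predict("alice", "D"))
```

In this toy data the two nearest neighbors of "alice" are "bob" and "dave", so the prediction falls between their ratings for item "D". Note that no information about the items themselves is used, only the rating profiles.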

A key advantage of the collaborative filtering approach is that it does not rely on machine-analyzable content, so it can accurately recommend complex items such as movies without requiring an "understanding" of the item itself. Many algorithms have been used to measure user similarity or item similarity in recommender systems. Examples include the [[k-nearest neighbors algorithm|k-nearest neighbor]] (k-NN) approach<ref>{{cite web |last1 = Sarwar |first1 = B. |last2 = Karypis |first2 = G. |last3 = Konstan |first3 = J. |last4 = Riedl |first4 = J. |year = 2000 |url = http://glaros.dtc.umn.edu/gkhome/node/122 |title = Application of Dimensionality Reduction in Recommender System A Case Study }}</ref> and the [[Pearson correlation]], as first implemented by Allen.<ref>{{citation |mode=cs1 | last = Allen | first = R.B. | date=1990 | title=User Models: Theory, Method, Practice| publisher = International J. Man-Machine Studies}}</ref>
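
As an illustration of one such similarity measure, the following sketch computes the Pearson correlation between two users over the items both have rated. The function name <code>pearson</code>, the sample users, and the convention of returning 0.0 for degenerate cases are assumptions made for this example rather than details from the cited works.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two users over their co-rated items.

    a and b map item -> rating. Returns 0.0 when fewer than two items
    are shared or when either user's shared ratings have zero variance.
    """
    common = sorted(set(a) & set(b))
    n = len(common)
    if n < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / n
    mean_b = sum(b[i] for i in common) / n
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

# Hypothetical users: bob rates every shared item one point lower than
# alice, while carol's preferences run roughly opposite to alice's.
alice = {"A": 5, "B": 3, "C": 4}
bob = {"A": 4, "B": 2, "C": 3, "D": 5}
carol = {"A": 1, "B": 5, "C": 2}

print(pearson(alice, bob))
print(pearson(alice, carol))
```

Because the ratings are centered on each user's own mean, a user who consistently rates one point lower than another is still treated as highly similar, which is one reason correlation-based measures are often preferred to raw distances for rating data.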

When building a model from a user's behavior, a distinction is often made between explicit and [[implicit data collection|implicit]] forms of [[data collection]].

Examples of explicit data collection include the following:
* Asking a user to rate an item on a sliding scale.
* Asking a user to search.
* Asking a user to rank a collection of items from favorite to least favorite.
* Presenting two items to a user and asking them to choose the better one.
* Asking a user to create a list of items that they like (see ''[[Rocchio algorithm|Rocchio classification]]'' or other similar techniques).

Examples of [[implicit data collection]] include the following:
* Observing the items that a user views in an online store.
* Analyzing item/user viewing times.<ref>{{Cite conference |last1 = Parsons |first1 = J. |last2 = Ralph |first2 = P. |last3 = Gallagher |first3 = K. |date = July 2004 |title = Using viewing time to infer user preference in recommender systems |conference = AAAI Workshop in Semantic Web Personalization, San Jose, California }}.</ref>
* Keeping a record of the items that a user purchases online.
* Obtaining a list of items that a user has listened to or watched on their computer.
* Analyzing the user's social network and discovering similar likes and dislikes.

Collaborative filtering approaches often suffer from three problems: [[Cold start (computing)|cold start]], scalability, and sparsity.<ref name="Lee2007">Sanghack Lee and Jihoon Yang and Sung-Yong Park, [https://books.google.com/books?id=u4qzlZAEjegC&dq=sparsity+problem+content-based&pg=PA396 Discovery of Hidden Similarity on Collaborative Filtering to Overcome Sparsity Problem], Discovery Science, 2007.</ref>
* '''Cold start''': For a new user or item, there is not enough data to make accurate recommendations. One commonly implemented solution to this problem is the [[Multi-armed bandit|multi-armed bandit algorithm]].<ref>{{Cite book|last1=Felício|first1=Crícia Z.|last2=Paixão|first2=Klérisson V.R.|last3=Barcelos|first3=Celia A.Z.|last4=Preux|first4=Philippe|title=Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization |chapter=A Multi-Armed Bandit Model Selection for Cold-Start User Recommendation |date=2017-07-09|chapter-url=https://doi.org/10.1145/3079628.3079681|series=UMAP '17|location=Bratislava, Slovakia|publisher=Association for Computing Machinery|pages=32–40|doi=10.1145/3079628.3079681|isbn=978-1-4503-4635-1|s2cid=653908|url=https://hal.inria.fr/hal-01517967/file/umap2017.4hal.pdf }}</ref><ref name=":3" /><ref name="rubens2016"/><ref name="elahi2016"/><ref name="bi2017"/>
* '''Scalability''': There are millions of users and products in many of the environments in which these systems make recommendations. Thus, a large amount of computation power is often necessary to calculate recommendations.
* '''Sparsity''': The number of items sold on major e-commerce sites is extremely large. Even the most active users will have rated only a small subset of the overall database. Thus, even the most popular items have very few ratings relative to the number of users.
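
To make the sparsity problem concrete, sparsity is commonly expressed as the fraction of user–item pairs for which no rating exists. The short sketch below computes this quantity; the function name <code>sparsity</code> and the toy data are illustrative assumptions only.

```python
def sparsity(ratings, n_users, n_items):
    """Fraction of user-item pairs that have no rating."""
    observed = sum(len(items) for items in ratings.values())
    return 1.0 - observed / (n_users * n_items)

# Hypothetical data: 3 users, a catalog of 4 items (A-D), 4 ratings in total.
ratings = {"u1": {"A": 5, "B": 3}, "u2": {"A": 4}, "u3": {"C": 2}}
print(sparsity(ratings, n_users=3, n_items=4))
```

By the same formula, a site with one million users and 100,000 items on which every user rated 100 items would still leave 99.9% of the user–item matrix empty.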

One of the most famous examples of collaborative filtering is item-to-item collaborative filtering (people who buy x also buy y), an algorithm popularized by [[Amazon.com]]'s recommender system.<ref name="patft.uspto.gov">[http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=/netahtml/PTO/search-bool.html&r=1&f=G&l=50&co1=AND&d=PTXT&s1=6,266,649.PN.&OS=PN/6,266,649&RS=PN/6,266,649 Collaborative Recommendations Using Item-to-Item Similarity Mappings] {{webarchive|url=https://web.archive.org/web/20150316185024/http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&co1=AND&d=PTXT&s1=6%2C266%2C649.PN.&OS=PN%2F6%2C266%2C649&RS=PN%2F6%2C266%2C649 |date=2015-03-16 }}</ref>
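
The "people who buy x also buy y" idea can be sketched by comparing the sets of users who bought each item. The following example is a simplified illustration and should not be read as Amazon's actual algorithm; the purchase data, the function names <code>build_item_index</code> and <code>similar_items</code>, and the use of cosine similarity between buyer sets are all assumptions.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical purchase histories: user -> set of purchased items.
purchases = {
    "u1": {"book", "lamp", "pen"},
    "u2": {"book", "pen"},
    "u3": {"book", "lamp"},
    "u4": {"pen", "ink"},
}

def build_item_index(purchases):
    """Invert user -> items into item -> set of users who bought it."""
    buyers = defaultdict(set)
    for user, items in purchases.items():
        for item in items:
            buyers[item].add(user)
    return buyers

def similar_items(item, buyers, top_n=3):
    """Rank other items by the cosine similarity of their buyer sets."""
    target = buyers.get(item, set())
    scores = []
    for other, users in buyers.items():
        if other == item:
            continue
        overlap = len(target & users)
        if overlap:
            score = overlap / sqrt(len(target) * len(users))
            scores.append((score, other))
    # Highest score first; ties broken alphabetically for determinism.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scores[:top_n]]

buyers = build_item_index(purchases)
print(similar_items("book", buyers))
```

Because the item-to-item similarities depend only on the purchase history, they can be computed ahead of time and looked up when a user views an item, which is one reason item-based approaches are attractive at large scale.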

Many [[social networks]] originally used collaborative filtering to recommend new friends, groups, and other social connections by examining the network of connections between a user and their friends.<ref name="handbook">{{cite book |last1=Ricci |first1=Francesco |last2=Rokach |first2=Lior |last3=Shapira |first3=Bracha |editor1-last=Ricci |editor1-first=Francesco |editor2-last=Rokach |editor2-first=Lior |editor3-last=Shapira |editor3-first=Bracha |title=Recommender Systems Handbook |date=2022 |publisher=Springer |location=New York |isbn=978-1-0716-2196-7 |edition=3  | chapter=Recommender Systems: Techniques, Applications, and Challenges |chapter-url = https://link.springer.com/chapter/10.1007/978-1-0716-2197-4_1 |doi=10.1007/978-1-0716-2197-4_1|pages=1–35}}</ref> Collaborative filtering is still used as part of hybrid systems.