=== Reinforcement learning for recommender systems ===

The recommendation problem can be framed as a special instance of a reinforcement learning problem in which the user is the environment and the recommender system is the agent: the agent acts on the environment by recommending items and receives a reward, for instance a click or other engagement by the user.<ref name="yt" /><ref name="srl">{{cite arXiv|last1=Xin|first1=Xin|last2=Karatzoglou|first2=Alexandros|last3=Arapakis|first3=Ioannis|last4=Jose|first4=Joemon|title=Self-Supervised Reinforcement Learning for Recommender Systems|year=2020|class=cs.LG|eprint=2006.05779}}</ref><ref name="sQ">{{Cite journal|last1=Ie|first1=Eugene|last2=Jain|first2=Vihan|last3=Narvekar|first3=Sanmit|last4=Agarwal|first4=Ritesh|last5=Wu|first5=Rui|last6=Cheng|first6=Heng-Tze|last7=Chandra|first7=Tushar|last8=Boutilier|first8=Craig|title=SlateQ: A Tractable Decomposition for Reinforcement Learning with Recommendation Sets|url=https://research.google/pubs/pub48200/|journal=Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19)|year=2019|pages=2592–2599}}</ref> One aspect of reinforcement learning that is particularly useful for recommender systems is that models, or policies, can be learned from the reward provided to the recommendation agent.
This contrasts with traditional techniques that rely on supervised learning, which are less flexible. Reinforcement learning techniques can potentially be used to train models that are optimized directly for metrics of engagement and user interest.<ref name="jd">{{Cite book|last1=Zou|first1=Lixin|last2=Xia|first2=Long|last3=Ding|first3=Zhuoye|last4=Song|first4=Jiaxing|last5=Liu|first5=Weidong|last6=Yin|first6=Dawei|title=Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining |chapter=Reinforcement Learning to Optimize Long-term User Engagement in Recommender Systems |chapter-url=https://dl.acm.org/doi/10.1145/3292500.3330668|series=KDD '19|year=2019|pages=2810–2818|doi=10.1145/3292500.3330668|arxiv=1902.05570|isbn=978-1-4503-6201-6|s2cid=62903207}}</ref>
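
The agent, environment and reward loop described above can be illustrated with a minimal sketch. It uses an epsilon-greedy multi-armed bandit, which is the simplest single-state case of reinforcement learning, and is not the method of any of the cited works. The simulated user, the click probabilities, and the names <code>simulate_user</code> and <code>train_recommender</code> are assumptions made only for illustration.

```python
import random


def simulate_user(item, click_prob, rng):
    """Hypothetical environment: reward is 1 if the user clicks, else 0."""
    return 1 if rng.random() < click_prob[item] else 0


def train_recommender(click_prob, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent that learns item values from click rewards only."""
    rng = random.Random(seed)
    n_items = len(click_prob)
    value = [0.0] * n_items  # running estimate of the reward for each item
    count = [0] * n_items    # number of times each item was recommended

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: recommend a random item.
            item = rng.randrange(n_items)
        else:
            # Exploit: recommend the item with the highest estimated reward.
            item = max(range(n_items), key=lambda i: value[i])

        reward = simulate_user(item, click_prob, rng)

        # Incremental mean update of the estimated reward for this item.
        count[item] += 1
        value[item] += (reward - value[item]) / count[item]

    return value, count


if __name__ == "__main__":
    # Assumed click probabilities for three items (illustrative values only).
    values, counts = train_recommender([0.05, 0.2, 0.6])
    print("estimated values:", values)
    print("recommendation counts:", counts)
```

The agent is never told which item is best; it only observes the reward for the items it recommends, so the quantity being optimized is the engagement signal itself. Methods used in practice additionally model user state and long-term returns, which this sketch omits.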