Paper list in the area of reinforcement learning for recommendation systems
https://github.com/cszhangzhen/DRL4Recsys
SIGIR, Self-Supervised Reinforcement Learning for Recommender Systems, https://arxiv.org/abs/2006.05779
WSDM, Model-Based Reinforcement Learning for Whole-Chain Recommendations, https://arxiv.org/abs/1902.03987
WSDM, End-to-End Deep Reinforcement Learning based Recommendation with Supervised Embedding, https://dl.acm.org/doi/abs/10.1145/3336191.3371858
WSDM, Pseudo Dyna-Q: A Reinforcement Learning Framework for Interactive Recommendation, https://dl.acm.org/doi/abs/10.1145/3336191.3371801
AAAI, Simulating User Feedback for Reinforcement Learning Based Recommendations, https://arxiv.org/pdf/1906.11462.pdf
KBS, State representation modeling for deep reinforcement learning based recommendation, https://www.sciencedirect.com/science/article/abs/pii/S095070512030407X
MOReL: Model-Based Offline Reinforcement Learning, https://arxiv.org/abs/2005.05951
KDD, MBCAL: Sample Efficient and Variance Reduced Reinforcement Learning for Recommender Systems, https://arxiv.org/pdf/1911.02248.pdf
Generator and Critic: A Deep Reinforcement Learning Approach for Slate Re-ranking in E-commerce, https://arxiv.org/pdf/2005.12206.pdf
NIPS, A Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation, paper and code: http://papers.nips.cc/paper/9257-a-model-based-reinforcement-learning-with-adversarial-training-for-online-recommendation
NIPS, Benchmarking Batch Deep Reinforcement Learning Algorithms, https://arxiv.org/abs/1910.01708, code: https://github.com/sfujim/BCQ
ICML, Off-Policy Deep Reinforcement Learning without Exploration, https://arxiv.org/abs/1812.02900, code: https://github.com/sfujim/BCQ
ICML, Challenges of Real-World Reinforcement Learning, https://arxiv.org/abs/1904.12901
ICML, Horizon: Facebook's Open Source Applied Reinforcement Learning Platform, https://arxiv.org/pdf/1811.00260.pdf
ICML, Generative Adversarial User Model for Reinforcement Learning Based Recommendation System, paper and code: http://proceedings.mlr.press/v97/chen19f.html
KDD, Deep Reinforcement Learning for List-wise Recommendations, https://arxiv.org/pdf/1801.00209.pdf, code: https://github.com/luozachary/drl-rec
WSDM, Top-K Off-Policy Correction for a REINFORCE Recommender System, https://arxiv.org/pdf/1812.02353.pdf
SIGWEB, Deep reinforcement learning for search, recommendation, and online advertising: a survey, https://dl.acm.org/doi/abs/10.1145/3320496.3320500
UIST, Learning Cooperative Personalized Policies from Gaze Data, https://dl.acm.org/doi/abs/10.1145/3332165.3347933
Toward Simulating Environments in Reinforcement Learning Based Recommendations, https://arxiv.org/abs/1906.11462
RecSys, PyRecGym: a reinforcement learning gym for recommender systems, https://dl.acm.org/doi/abs/10.1145/3298689.3346981
RecSys, Revisiting offline evaluation for implicit-feedback recommender systems, https://dl.acm.org/doi/pdf/10.1145/3298689.3347069
IJCAI, Reinforcement Learning for Slate-based Recommender Systems: A Tractable Decomposition and Practical Methodology, https://arxiv.org/pdf/1905.12767.pdf
AAAI, Virtual-Taobao: Virtualizing Real-world Online Retail Environment for Reinforcement Learning, https://arxiv.org/pdf/1805.10000.pdf
WWW, Towards Neural Mixture Recommender for Long Range Dependent User Sequences, https://dl.acm.org/doi/abs/10.1145/3308558.3313650
Deep Reinforcement Learning for Online Advertising in Recommender Systems, https://arxiv.org/abs/1909.03602
Towards Characterizing Divergence in Deep Q-Learning, https://arxiv.org/abs/1903.08894
Dynamic Search -- Optimizing the Game of Information Seeking, https://arxiv.org/abs/1909.12425
RecSim: A Configurable Simulation Platform for Recommender Systems, https://arxiv.org/abs/1909.04847
KDD, Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application, https://arxiv.org/pdf/1803.00710.pdf
WWW, DRN: A Deep Reinforcement Learning Framework for News Recommendation, http://www.personal.psu.edu/~gjz5038/paper/www2018_reinforceRec/www2018_reinforceRec.pdf
https://github.com/higgsfield/RL-Adventure-2, PyTorch tutorial of: actor-critic / proximal policy optimization / ACER / DDPG / twin delayed DDPG (TD3) / soft actor-critic / generative adversarial imitation learning / hindsight experience replay
Key Papers from OpenAI, https://spinningup.openai.com/en/latest/spinningup/keypapers.html
Strategic Exploration in Reinforcement Learning - New Algorithms and Learning Guarantees, https://www.ml.cmu.edu/research/phd-dissertation-pdfs/cmu-ml-19-116-dann.pdf
Learning to Recommend via Meta Parameter Partition, https://arxiv.org/pdf/1912.04108.pdf
Adversarial Machine Learning in Recommender Systems: State of the art and Challenges, https://arxiv.org/abs/2005.10322
WWW2020, Mixed Negative Sampling for Learning Two-tower Neural Networks in Recommendations, https://dl.acm.org/doi/abs/10.1145/3366424.3386195
ICLR2020, On the Variance of the Adaptive Learning Rate and Beyond, paper and code: https://github.com/LiyuanLucasLiu/RAdam
WSDM2020, Unbiased Recommender Learning from Missing-Not-At-Random Implicit Feedback, https://dl.acm.org/doi/abs/10.1145/3336191.3371783
RecSys2019, Recommending what video to watch next: a multitask ranking system, https://dl.acm.org/doi/abs/10.1145/3298689.3346997
RecSys2019, Addressing delayed feedback for continuous training with neural networks in CTR prediction, https://dl.acm.org/doi/abs/10.1145/3298689.3347002
IJCAI2019, Sequential Recommender Systems: Challenges, Progress and Prospects, https://arxiv.org/abs/2001.04830
KDD2019, Fairness in Recommendation Ranking through Pairwise Comparisons, https://dl.acm.org/doi/abs/10.1145/3292500.3330745
BoTorch: Programmable Bayesian Optimization in PyTorch, https://arxiv.org/abs/1910.06403