ReRec: Reasoning-Augmented LLM-based Recommendation Assistant via Reinforcement Fine-tuning
2026-04-09 • Information Retrieval
Information Retrieval, Artificial Intelligence
AI summary
The authors created ReRec, a new method that helps large language models (LLMs) make better and more thoughtful recommendations when asked complicated questions. Their approach uses special rewards to guide the learning process, breaks down the model's reasoning to fix mistakes, and organizes training by difficulty to improve learning. Tests show ReRec works better than other systems and keeps the model's original skills intact. They also shared their code for others to use.
large language models, recommendation systems, reinforcement learning, reward shaping, advantage estimation, curriculum learning, NDCG@K, multi-step reasoning, fine-tuning
Authors
Jiani Huang, Shijie Wang, Liangbo Ning, Wenqi Fan, Qing Li
Abstract
With the rise of LLMs, there is an increasing need for intelligent recommendation assistants that can handle complex queries and provide personalized, reasoning-driven recommendations. LLM-based recommenders show potential but face challenges in multi-step reasoning, underscoring the need for reasoning-augmented systems. To address this gap, we propose ReRec, a novel reinforcement fine-tuning (RFT) framework designed to improve LLM reasoning in complex recommendation tasks. Our framework introduces three key components: (1) Dual-Graph Enhanced Reward Shaping, which integrates recommendation metrics such as NDCG@K with Query Alignment and Preference Alignment Scores to provide fine-grained reward signals for LLM optimization; (2) Reasoning-aware Advantage Estimation, which decomposes LLM outputs into reasoning segments and penalizes incorrect steps to improve the reasoning behind recommendations; and (3) an Online Curriculum Scheduler, which dynamically assesses query difficulty and organizes the training curriculum to ensure stable learning during RFT. Experiments demonstrate that ReRec outperforms state-of-the-art baselines while preserving core abilities such as instruction following and general knowledge. Our code is available at https://github.com/jiani-huang/ReRec.
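To make the reward-shaping idea in component (1) concrete, the following is a minimal sketch of how a ranking metric such as NDCG@K might be combined with two alignment scores into one scalar reward. NDCG@K here is the standard linear-gain formula, with the ideal ranking taken from the returned list only. The function `shaped_reward`, the weights `w_query` and `w_pref`, and the additive combination are assumptions made for illustration; the abstract does not specify how the scores are computed or combined, and this is not the authors' implementation.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k items (linear gain)."""
    return sum(
        rel / math.log2(rank + 2)  # rank is 0-based, so position 1 gets log2(2) = 1
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """NDCG@K: DCG of the given order divided by DCG of the ideal order.

    The ideal order is obtained by sorting the same relevance list,
    which is a common simplification.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def shaped_reward(
    relevances: Sequence[float],
    k: int,
    query_alignment: float,
    preference_alignment: float,
    w_query: float = 0.5,  # hypothetical weight, not from the paper
    w_pref: float = 0.5,   # hypothetical weight, not from the paper
) -> float:
    """Hypothetical additive combination of NDCG@K and two alignment scores."""
    return (
        ndcg_at_k(relevances, k)
        + w_query * query_alignment
        + w_pref * preference_alignment
    )


if __name__ == "__main__":
    # The single relevant item is ranked third out of three.
    print(ndcg_at_k([0, 0, 1], k=3))
    print(shaped_reward([0, 0, 1], k=3, query_alignment=1.0, preference_alignment=0.0))
```

The point of the sketch is that a ranking metric alone gives a sparse signal (it is zero whenever no relevant item is retrieved), whereas adding graded alignment terms lets two equally unsuccessful outputs receive different rewards.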