When Chatbots Accommodate: What AI Companions Optimize for in Vulnerable Conversations

2026-06-03 · Human-Computer Interaction

AI summary

The authors studied how AI chatbots respond when people share feelings of loneliness or distress. They created a new way to categorize both user emotions and chatbot replies during long conversations. Using a technique called Inverse Reinforcement Learning, they analyzed nearly 48,000 chat turns to see the different strategies used by GPT-4.1, Character.AI, and Replika. They found each chatbot has a unique approach: GPT-4.1 mostly gives advice, Character.AI uses mixed tactics, and Replika focuses on asking questions and staying engaged. Their method reveals patterns not visible in usual tests, offering a better way to understand and improve chatbot safety.

AI companion chatbots, user vulnerability, chatbot response, Inverse Reinforcement Learning, conversational AI, GPT-4.1, Character.AI, Replika, decision policy, safety evaluation
Authors
Minh Duc Chu, Yifan Wu, Zhiyi Chen, Angel Hsing-Chi Hwang, Luca Luceri
Abstract
Millions turn to AI companion chatbots during loneliness, grief, and personal crises. How these companion platforms respond in such moments can shape the trajectory of a user's vulnerable state. Yet we lack tools to characterize what each platform actually does when users open up. Existing audits score reactions to pre-defined crisis prompts and miss the underlying decision policy that governs sustained interaction. We address these gaps with two key contributions. First, we introduce the AI Companion Vulnerability-Response Taxonomy, a paired taxonomy of user vulnerability and chatbot response designed for analyzing extended companion chatbot interactions. Second, we infer the response policy each platform follows across distinct vulnerability scenarios by applying Inverse Reinforcement Learning to ~48k turns of real-world user conversations with GPT-4.1, Character.AI, and Replika. Our findings reveal what AI companions prioritize in conversations with vulnerable users: GPT-4.1 reaches for advice, Character.AI spreads its response across different strategies without a dominant mode, and Replika consistently asks questions and stays present. Each, however, downweights the responses that introduce corrective friction: GPT-4.1 probes less as conversations continue and when interacting with psychologically high-risk users; Replika advises bonded users more and challenges them less; Character.AI shows no committed engagement strategy on internal distress. Estimated policies are invisible to output-level audits, providing a new lens for auditing chatbots in the wild and enabling more realistic safety evaluation.
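
To make the idea of inferring a response policy from logged turns concrete, here is a minimal sketch. The abstract does not state which Inverse Reinforcement Learning formulation the authors use, so this assumes the simplest one: a single-step maximum-entropy model in which the chatbot picks response a in user state s with probability proportional to exp(r(s, a)). The state labels, action labels, and counts below are hypothetical placeholders, not the paper's taxonomy or data.

```python
import math

# Hypothetical labels: the paper's taxonomy categories are not listed in the abstract.
STATES = ["loneliness", "grief"]
ACTIONS = ["advise", "ask_question", "challenge"]

# Toy counts of (user state, chatbot response) pairs. Illustrative only.
COUNTS = {
    "loneliness": {"advise": 60, "ask_question": 30, "challenge": 10},
    "grief": {"advise": 20, "ask_question": 70, "challenge": 10},
}


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fit_rewards(counts, steps=2000, lr=0.5):
    """Fit tabular rewards r(s, a) by gradient ascent on the log-likelihood
    of the observed responses under the policy pi(a | s) = softmax(r(s, .)).

    Assumption: one-step (bandit) maximum-entropy IRL, a deliberate
    simplification that ignores how a response affects later turns.
    """
    rewards = {s: [0.0] * len(ACTIONS) for s in STATES}
    for _ in range(steps):
        for s in STATES:
            n = sum(counts[s].values())
            empirical = [counts[s][a] / n for a in ACTIONS]
            policy = softmax(rewards[s])
            # Gradient of the mean log-likelihood: empirical minus model frequency.
            rewards[s] = [
                r + lr * (e - p)
                for r, e, p in zip(rewards[s], empirical, policy)
            ]
    return rewards


if __name__ == "__main__":
    fitted = fit_rewards(COUNTS)
    for s in STATES:
        ranked = sorted(zip(ACTIONS, fitted[s]), key=lambda pair: -pair[1])
        print(s, [(a, round(r, 2)) for a, r in ranked])
```

In this toy setting the fitted rewards simply recover the log-frequencies of each response, up to a per-state constant, so a rarely used response such as "challenge" ends up with the lowest reward. A sequential formulation, in which a response also shapes the user's next state, would need a transition model and is beyond what the abstract specifies.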