FileGram: Grounding Agent Personalization in File-System Behavioral Traces

2026-04-06 | Computer Vision and Pattern Recognition

Computer Vision and Pattern Recognition, Artificial Intelligence
AI summary

The authors introduce FileGram, a system designed to improve how AI agents remember and personalize their actions based on users' file system behavior. They created a tool to simulate realistic user workflows, a benchmark to test AI memory related to file activities, and a new way for AI to build user profiles from detailed file actions rather than just conversations. Their experiments show that existing AI memory systems struggle with these tasks, but FileGram's components perform well. By sharing their work openly, the authors aim to help future research on personalized AI agents that work closely with users' file data.

coworking AI agents, file systems, personalization, behavioral traces, multimodal action sequences, memory systems, persona drift, benchmark, user profiles, procedural/semantic/episodic memory
Authors
Shuai Liu, Shulin Tian, Kairui Hu, Yuhao Dong, Zhe Yang, Bo Li, Jingkang Yang, Chen Change Loy, Ziwei Liu
Abstract
Coworking AI agents operating within local file systems are rapidly emerging as a paradigm in human-AI interaction. However, effective personalization remains limited by severe data constraints: strict privacy barriers and the difficulty of jointly collecting multimodal real-world traces prevent scalable training and evaluation. Existing methods also remain interaction-centric and overlook the dense behavioral traces in file-system operations. To address this gap, we propose FileGram, a comprehensive framework that grounds agent memory and personalization in file-system behavioral traces. It comprises three core components: (1) FileGramEngine, a scalable persona-driven data engine that simulates realistic workflows and generates fine-grained multimodal action sequences at scale; (2) FileGramBench, a diagnostic benchmark grounded in file-system behavioral traces for evaluating memory systems on profile reconstruction, trace disentanglement, persona drift detection, and multimodal grounding; and (3) FileGramOS, a bottom-up memory architecture that builds user profiles directly from atomic actions and content deltas rather than dialogue summaries, encoding these traces into procedural, semantic, and episodic channels with query-time abstraction. Extensive experiments show that FileGramBench remains challenging for state-of-the-art memory systems and that FileGramEngine and FileGramOS are effective. By open-sourcing the framework, we hope to support future research on personalized memory-centric file-system agents.
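
To make the bottom-up idea behind FileGramOS more concrete, the following is a minimal illustrative sketch and not the authors' implementation. The abstract does not specify a record schema or what each channel stores, so everything here is an assumption: the `FileAction` fields, the `TraceMemory` class, and the reading of the channels as operation counts (procedural), terms from content deltas (semantic), and a time-ordered log (episodic). The `profile` method stands in for query-time abstraction by summarizing the raw channels only when asked.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FileAction:
    """One atomic file-system action (hypothetical schema, not from the paper)."""
    timestamp: float
    op: str           # e.g. "create", "edit", "rename", "delete"
    path: str
    delta: str = ""   # text added by this action, if any


@dataclass
class TraceMemory:
    """Toy three-channel store; the channel semantics are assumptions."""
    procedural: Counter = field(default_factory=Counter)  # how the user works: op counts
    semantic: Counter = field(default_factory=Counter)    # what they work on: delta terms
    episodic: list = field(default_factory=list)          # what happened when: raw log

    def ingest(self, action: FileAction) -> None:
        # Route a single atomic action into all three channels.
        self.procedural[action.op] += 1
        self.semantic.update(word.lower() for word in action.delta.split())
        self.episodic.append(action)

    def profile(self, top_k: int = 3) -> dict:
        # Summarize the raw channels only at query time.
        ordered = sorted(self.episodic, key=lambda a: a.timestamp)
        return {
            "frequent_ops": [op for op, _ in self.procedural.most_common(top_k)],
            "frequent_terms": [t for t, _ in self.semantic.most_common(top_k)],
            "last_path": ordered[-1].path if ordered else None,
        }


memory = TraceMemory()
for action in [
    FileAction(1.0, "create", "notes/plan.md", "budget draft"),
    FileAction(2.0, "edit", "notes/plan.md", "budget review"),
    FileAction(3.0, "edit", "notes/todo.md", "call vendor"),
]:
    memory.ingest(action)

result = memory.profile(top_k=1)
```

In this toy setup the profile is derived from file actions alone, with no dialogue summary involved, which is the contrast the abstract draws with interaction-centric methods. A real system would need richer deltas (including multimodal content) and a far less naive abstraction step.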