Tstars-Tryon 1.0: Robust and Realistic Virtual Try-On for Diverse Fashion Items

2026-04-21 Computer Vision and Pattern Recognition

AI summary

The authors introduce Tstars-Tryon 1.0, a virtual try-on system that remains reliable under difficult conditions such as extreme poses, severe lighting variation, and motion blur. It produces photorealistic images of garments on people, preserving texture, material, and structure while largely avoiding common AI-generated artifacts. The system can compose up to six reference images across eight fashion categories, with control over the person's identity and the background. It is optimized for near real-time inference and has been deployed at large scale in the Taobao App. The authors also release a benchmark to support future research.

virtual try-on, image generation, photorealistic rendering, multi-image composition, pose variation, inference speed, AI artifacts, end-to-end model, benchmark, commercial deployment
Authors
Mengting Chen, Zhengrui Chen, Yongchao Du, Zuan Gao, Taihang Hu, Jinsong Lan, Chao Lin, Yefeng Shen, Xingjian Wang, Zhao Wang, Zhengtao Wu, Xiaoli Xu, Zhengze Xu, Hao Yan, Mingzhou Zhang, Jun Zheng, Qinye Zhou, Xiaoyong Zhu, Bo Zheng
Abstract
Recent advances in image generation and editing have opened new opportunities for virtual try-on. However, existing methods still struggle to meet complex real-world demands. We present Tstars-Tryon 1.0, a commercial-scale virtual try-on system that is robust, realistic, versatile, and highly efficient. First, our system maintains a high success rate across challenging cases like extreme poses, severe illumination variations, motion blur, and other in-the-wild conditions. Second, it delivers highly photorealistic results with fine-grained details, faithfully preserving garment texture, material properties, and structural characteristics, while largely avoiding common AI-generated artifacts. Third, beyond apparel try-on, our model supports flexible multi-image composition (up to 6 reference images) across 8 fashion categories, with coordinated control over person identity and background. Fourth, to overcome the latency bottlenecks of commercial deployment, our system is heavily optimized for inference speed, delivering near real-time generation for a seamless user experience. These capabilities are enabled by an integrated system design spanning end-to-end model architecture, a scalable data engine, robust infrastructure, and a multi-stage training paradigm. Extensive evaluation and large-scale product deployment demonstrate that Tstars-Tryon 1.0 achieves leading overall performance. To support future research, we also release a comprehensive benchmark. The model has been deployed at industrial scale on the Taobao App, serving millions of users with tens of millions of requests.