HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation
Zunnan Xu, Zhentao Yu, Zixiang Zhou, Jun Zhou, and others
CVPR 2025
HunyuanVideo: A Systematic Framework for Large Video Generative Models
Core contributor in Hunyuan Foundation Model Team
Technical Report, 2024
Audio-Visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation
Fa-Ting Hong, Zunnan Xu, Zixiang Zhou, Jun Zhou, Xiu Li, Qin Lin, Qinglin Lu, Dan Xu
ICCV 2025
Alignment is All You Need: A Training-free Augmentation Strategy for Pose-guided Video Generation
Xiaoyu Jin*, Zunnan Xu*, Mingwen Ou, Wenming Yang
CVG@ICML 2024
Zero-Shot 3D-Aware Trajectory-Guided Image-to-Video Generation via Test-Time Training
Ruicheng Zhang*, Jun Zhou*, Zunnan Xu*, Zihao Liu, Jiehui Huang, Mingyang Zhang, Yu Sun, Xiu Li
AAAI 2026
FireEdit: Fine-Grained Instruction-Based Image Editing via Region-Aware Vision Language Model
Jun Zhou, Jiahao Li, Zunnan Xu, Hanhui Li, Yiji Cheng, Fa-Ting Hong, Qin Lin, Qinglin Lu, Xiaodan Liang
CVPR 2025
InterAnimate: Taming Region-aware Diffusion Model for Realistic Human Interaction Animation
Yukang Lin, Yan Hong, Zunnan Xu, Xindi Li, Chao Xu, Chuanbiao Song, Ronghui Li, Haoxing Chen, Jun Lan, Huijia Zhu, and others
ACMMM 2025
Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation
Zunnan Xu*, Zhihong Chen*, Yong Zhang, Yibing Song, Xiang Wan, Guanbin Li
ICCV 2023
Igniting VLMs Toward the Embodied Space
Core contributor in X Square Robot Team
Technical Report, 2025
SAM-R1: Leveraging SAM for Reward Feedback in Multimodal Segmentation via Reinforcement Learning
Jiaqi Huang*, Zunnan Xu*, Jun Zhou, Ting Liu, Yicheng Xiao, Mingwen Ou, Bowen Ji, Xiu Li, Kehong Yuan
NeurIPS 2025
Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation
Jiaqi Huang*, Zunnan Xu*, Ting Liu, Yong Liu, Haonan Han, Kehong Yuan, Xiu Li
AAAI 2025
Enhancing Fine-Grained Multi-Modal Alignment via Adapters: A Parameter-Efficient Training Framework
Zunnan Xu, Jiaqi Huang, Ting Liu, Yong Liu, Haonan Han, Kehong Yuan, Xiu Li
WANT@ICML 2024
MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension
Ting Liu*, Zunnan Xu*, Yue Hu, Liangtao Shi, Zhiqiang Wang, Quanjun Yin
EMNLP 2024
MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models
Zunnan Xu, Yukang Lin, Haonan Han, Sicheng Yang, Ronghui Li, Yachao Zhang, Xiu Li
NeurIPS 2024
Chain of Generation: Multi-Modal Gesture Synthesis via Cascaded Conditional Control
Zunnan Xu, Yachao Zhang, Sicheng Yang, Ronghui Li, Xiu Li
AAAI 2024
Separate to Collaborate: Dual-Stream Diffusion Model for Coordinated Piano Hand Motion Synthesis
Zihao Liu*, Mingwen Ou*, Zunnan Xu*, Jiaqi Huang, Haonan Han, Ronghui Li, Xiu Li
ACMMM 2025
FreeTalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models
Sicheng Yang, Zunnan Xu, Haiwei Xue, Yongkang Cheng, Shaoli Huang, Mingming Gong, Zhiyong Wu
ICASSP 2024
AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward
Haonan Han, Xiangzuo Wu, Huan Liao, Zunnan Xu, Zhongyuan Hu, Ronghui Li, Yachao Zhang, Xiu Li
CVPR 2025
REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment
Haonan Han, Rui Yang, Huan Liao, Jiankai Xing, Zunnan Xu, Xiaoming Yu, Junwei Zha, Xiu Li, Wanhua Li
ICCV 2025
Consistent123: One Image to Highly Consistent 3D Asset Using Case-Aware Diffusion Priors
Yukang Lin, Haonan Han, Chaoqun Gong, Zunnan Xu, Yachao Zhang, Xiu Li
ACMMM 2024