Hi, I am a student at Tsinghua University. Currently, my research focuses on multi-modal learning and generative models. I have broad interests and am always open to exploring new and meaningful topics. As my technical expertise has deepened, I focus less on paper quantity and more on rethinking problems and offering simple, effective solutions.
My research topics include vision-language understanding and video/3D generation.