I am a first-year PhD student in computer science at Peking University (PKU), supervised by Prof. Li Yuan (Deep Learning) and Prof. Fanyang Mo (AI4Science). I have published over 10 papers at top international AI conferences, with 700+ total Google Scholar citations. I am currently a research intern in multimodal generation on the Baidu ERNIE team through the "ERNIE Star Top Talent Program", under the supervision of Jingdong Wang and Haifeng Wang.
My current research focuses primarily on:
1. Multimodal Generation and Unified Models: I study hybrid architectures that combine autoregressive (AR) and diffusion models, including their complementary mechanisms, architectural design, and joint training strategies, with the goal of advancing unified multimodal understanding and generation.
2. Vision-Language Models: I am also interested in how language can be used to learn effective visual representations.
I am also interested in applications of these methods to:
1. AIGC Detection (my previous research direction): Developing generalizable and interpretable methods for detecting AI-generated images and videos, and designing effective discriminators tailored to DiT and other diffusion-based generative models.
2. AI4Science: Using DL/ML methods and the latest reasoning models to address challenges in scientific domains, especially chemistry and drug discovery. I believe that interdisciplinary cross-pollination and integration can spark new insights and innovative thinking.
If you are interested in my work and would like to collaborate, please do not hesitate to contact me!
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
Zhiyuan Yan 🧑‍💻, Junyan Ye 🧑‍💻, Weijia Li 📮, Zilong Huang, Shenghai Yuan, Xiangyang He, Kaiqing Lin, Jun He, Conghui He, Li Yuan 📮
ArXiv, 2025
ImgEdit: A Unified Image Editing Dataset and Benchmark
Yang Ye 🧑‍💻, Xianyi He 🧑‍💻, Zongjian Li 🧑‍💻, Bin Lin 🧑‍💻, Shenghai Yuan 🧑‍💻, Zhiyuan Yan 🧑‍💻, Bohan Hou, Li Yuan 📮
ArXiv, 2025
Navigating Chemical-Linguistic Sharing Space with Heterogeneous Molecular Encoding
Liuzhenghao Lv 🧑‍💻, Hao Li 🧑‍💻, Yu Wang, Zhiyuan Yan, Zijun Chen, Zongying Lin, Li Yuan 📮, Yonghong Tian 📮
Nature Communications (reviewed), 2024
Orthogonal Subspace Decomposition for Generalizable AI-Generated Image Detection
Zhiyuan Yan 🧑‍💻, Jiangming Wang 🧑‍💻, Zhendong Wang 🧑‍💻, Peng Jin, Ke-Yue Zhang, Shen Chen, Taiping Yao, Shouhong Ding 📮, Baoyuan Wu, Li Yuan 📮
ICML, Oral, 2024
Stacking Brick by Brick: Aligned Feature Isolation for Incremental Face Forgery Detection
Jikang Cheng 🧑‍💻, Zhiyuan Yan 🧑‍💻, Ying Zhang, Hao Li, Jiaxin Ai, Qin Zou, Chen Li, Zhongyuan Wang
CVPR, 2025
Exploring Unbiased Deepfake Detection via Token-Level Shuffling and Mixing
Xinghe Fu 🧑‍💻, Zhiyuan Yan 🧑‍💻, Taiping Yao 📮, Shen Chen, Xi Li 📮
AAAI, Oral, 2024
𝒳2-DFD: A framework for e𝒳plainable and e𝒳tendable Deepfake Detection
Yize Chen 🧑‍💻, Zhiyuan Yan 🧑‍💻, Siwei Lyu, Baoyuan Wu 📮
ArXiv, 2024
DF40: Toward Next-Generation Deepfake Detection
Zhiyuan Yan, Taiping Yao, Shen Chen, Yandan Zhao, Xinghe Fu, Junwei Zhu, Donghao Luo, Chengjie Wang, Shouhong Ding, Yunsheng Wu, and Li Yuan 📮
NeurIPS, 2024
Can We Leave Deepfake Data Behind in Training Deepfake Detector?
Jikang Cheng, Zhiyuan Yan, Ying Zhang, Yuhao Luo, Zhongyuan Wang 📮, Chen Li
NeurIPS, 2024
Generalizing Deepfake Video Detection with Plug-and-Play: Video-Level Blending and Spatiotemporal Adapter Tuning
Zhiyuan Yan, Yandan Zhao, Shen Chen, Xinghe Fu, Taiping Yao, Shouhong Ding, Li Yuan 📮
CVPR, 2025
Transcending Forgery Specificity with Latent Space Augmentation for Generalizable Deepfake Detection
Zhiyuan Yan, Yuhao Luo, Siwei Lyu, Qingshan Liu, and Baoyuan Wu 📮
CVPR, 2024
DeepfakeBench: A Comprehensive Benchmark of Deepfake Detection
Zhiyuan Yan, Yong Zhang, Xinhang Yuan, Siwei Lyu, and Baoyuan Wu 📮
NeurIPS, 2023
UCF: Uncovering Common Features for Generalizable Deepfake Detection
Zhiyuan Yan 🧑‍💻, Yong Zhang 🧑‍💻, Yanbo Fan, and Baoyuan Wu 📮
ICCV, 2023
HelixADMET: a robust and endpoint extensible ADMET system incorporating self-supervised knowledge transfer
Shanzhuo Zhang, Zhiyuan Yan, Yueyang Huang, Lihang Liu, Donglong He, Wei Wang, Xiaomin Fang 📮, Xiaonan Zhang, Fan Wang 📮, Hua Wu, and Haifeng Wang
Bioinformatics, 2022