Zhiyuan Yan
📖 First-year PhD Student
🏫 Peking University, previously at CUHK-SZ
💡 Vision Language Model, Generative AI, AIGC Detection, AI4Science
Email / GitHub / Scholar / OpenReview
Biography
  • I am a first-year PhD student in computer science at Peking University (PKU), supervised by Prof. Li Yuan (Deep Learning) and Prof. Fanyang Mo (AI4Science). I have published more than 10 papers at top international AI conferences, with 700+ total citations on Google Scholar. I am currently a research intern working on multimodal generation with the Baidu ERNIE team through the "ERNIE Star Top Talent Program", under the supervision of Jingdong Wang and Haifeng Wang.

  • My current research focuses primarily on:
    1. Multimodal Generation and Unified Models: I study hybrid architectures that combine autoregressive (AR) and diffusion models, in particular their complementary mechanisms, architectural design, and joint training strategies, with the goal of advancing unified multimodal understanding and generation.
    2. Vision-Language Models: I am also interested in how language can be used to learn effective visual representations.
  • I am also interested in applying these methods to:
    1. AIGC Detection (my previous research direction): Developing generalizable and interpretable methods for detecting AI-generated images and videos, and designing effective discriminators tailored to DiT and other diffusion-based generative models.
    2. AI4Science: Using DL/ML methods and the latest reasoning models to address challenges in scientific domains, especially chemistry and drug discovery. I believe that interdisciplinary cross-pollination can spark new insights and innovative thinking.
    If you are interested in my work and would like to collaborate, please do not hesitate to contact me!
    News
    Research Highlights (🧑‍💻 Co-first Author, 📮 Corresponding Author)
    GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
    Zhiyuan Yan 🧑‍💻, Junyan Ye 🧑‍💻, Weijia Li 📮, Zilong Huang, Shenghai Yuan, Xiangyang He, Kaiqing Lin, Jun He, Conghui He, Li Yuan 📮
    ArXiv, 2025
    arXiv / Project Page
    ImgEdit: A Unified Image Editing Dataset and Benchmark
    Yang Ye 🧑‍💻, Xianyi He 🧑‍💻, Zongjian Li 🧑‍💻, Bin Lin 🧑‍💻, Shenghai Yuan 🧑‍💻, Zhiyuan Yan 🧑‍💻, Bohan Hou, Li Yuan 📮
    ArXiv, 2025
    arXiv / Project Page
    Navigating Chemical-Linguistic Sharing Space with Heterogeneous Molecular Encoding
    Liuzhenghao Lv 🧑‍💻, Hao Li 🧑‍💻, Yu Wang, Zhiyuan Yan, Zijun Chen, Zongying Lin, Li Yuan 📮, Yonghong Tian 📮
    Nature Communications (reviewed), 2024
    arXiv / Project Page
    Orthogonal Subspace Decomposition for Generalizable AI-Generated Image Detection
    Zhiyuan Yan 🧑‍💻, Jiangming Wang 🧑‍💻, Zhendong Wang 🧑‍💻, Peng Jin, Ke-Yue Zhang, Shen Chen, Taiping Yao, Shouhong Ding 📮, Baoyuan Wu, Li Yuan 📮
    ICML, Oral 🏆, 2024
    arXiv
    Stacking Brick by Brick: Aligned Feature Isolation for Incremental Face Forgery Detection
    Jikang Cheng 🧑‍💻, Zhiyuan Yan 🧑‍💻, Ying Zhang, Hao Li, Jiaxin Ai, Qin Zou, Chen Li, Zhongyuan Wang
    CVPR, 2025
    arXiv
    Exploring Unbiased Deepfake Detection via Token-Level Shuffling and Mixing
    Xinghe Fu 🧑‍💻, Zhiyuan Yan 🧑‍💻, Taiping Yao 📮, Shen Chen, Xi Li 📮
    AAAI, Oral 🎤, 2024
    arXiv
    𝒳²-DFD: A framework for e𝒳plainable and e𝒳tendable Deepfake Detection
    Yize Chen 🧑‍💻, Zhiyuan Yan 🧑‍💻, Siwei Lyu, Baoyuan Wu 📮
    ArXiv, 2024
    arXiv
    DF40: Toward Next-Generation Deepfake Detection
    Zhiyuan Yan, Taiping Yao, Shen Chen, Yandan Zhao, Xinghe Fu, Junwei Zhu, Donghao Luo, Chengjie Wang, Shouhong Ding, Yunsheng Wu, and Li Yuan 📮
    NeurIPS, 2024
    arXiv / Project Page
    Can We Leave Deepfake Data Behind in Training Deepfake Detector?
    Jikang Cheng, Zhiyuan Yan, Ying Zhang, Yuhao Luo, Zhongyuan Wang 📮, Chen Li
    NeurIPS, 2024
    arXiv
    Generalizing Deepfake Video Detection with Plug-and-Play: Video-Level Blending and Spatiotemporal Adapter Tuning
    Zhiyuan Yan, Yandan Zhao, Shen Chen, Xinghe Fu, Taiping Yao, Shouhong Ding, Li Yuan 📮
    CVPR, 2025
    arXiv
    Transcending Forgery Specificity with Latent Space Augmentation for Generalizable Deepfake Detection
    Zhiyuan Yan, Yuhao Luo, Siwei Lyu, Qingshan Liu, and Baoyuan Wu 📮
    CVPR, 2024
    arXiv / Project Page
    DeepfakeBench: A Comprehensive Benchmark of Deepfake Detection
    Zhiyuan Yan, Yong Zhang, Xinhang Yuan, Siwei Lyu, and Baoyuan Wu 📮
    NeurIPS, 2023
    arXiv / Project Page
    UCF: Uncovering Common Features for Generalizable Deepfake Detection
    Zhiyuan Yan 🧑‍💻, Yong Zhang 🧑‍💻, Yanbo Fan, and Baoyuan Wu 📮
    ICCV, 2023
    arXiv / Project Page
    HelixADMET: a robust and endpoint extensible ADMET system incorporating self-supervised knowledge transfer
    Shanzhuo Zhang, Zhiyuan Yan, Yueyang Huang, Lihang Liu, Donglong He, Wei Wang, Xiaomin Fang 📮, Xiaonan Zhang, Fan Wang 📮, Hua Wu, and Haifeng Wang
    Bioinformatics, 2022
    arXiv / Project Page
    Experience
    Academic Services
    Contact