About Me

Hello! 😊 I’m a third-year PhD candidate at The Chinese University of Hong Kong (CUHK) in the Department of Computer Science and Engineering, supervised by Prof. James Cheng. My research interests include Agentic Reinforcement Learning, Visual Language Models, and Machine Learning Theory.

On the application side, I am interested in building efficient and effective agent systems with reinforcement learning, especially in real-world, multi-turn settings. On the theory side, I have done research on graph neural networks and attention mechanisms, supervised by Dr. Yifei Wang at MIT CSAIL. I have also collaborated closely with Dr. Xinyi Wu at MIT IDSS and Dr. Kevin Qinghong Lin at Oxford Torr Visual Group.

Prior to coming to CUHK, I was an undergraduate student at Harbin Institute of Technology, where I worked as a research intern at SCIR, supervised by Prof. Libo Qin.

If you are seeking any form of academic collaboration, please feel free to email me at wjqkoko@gmail.com.

I expect to graduate in Fall 2027 and am currently looking for internship opportunities or full-time roles starting in late 2027. Please feel free to reach out!

🔥 News

2026.07: 🎉 Our TCOD is accepted by COLM 2026.
2026.06: ✍️ New blog post out: On-Policy Distillation Pitfalls, sharing the lessons and pitfalls behind our TCOD work. Feel free to read and discuss it on my blog!
2026.05: 🎉 Our VideoASMR-Bench is now public, has reached over 5k downloads, and has been accepted by the CVPR 2026 VGBE Workshop.
2026.01: 🎉 Our PROST-LLM is accepted by ICASSP 2026.
2025.09: 🎉 Our three papers are accepted by NeurIPS 2025.
2025.08: 🎉 Our MLMT is accepted by 2025 IEEE ASRU.
2025.05: 🎉 Our TON is accepted by ICML 2025 EXAIT Workshop.
2025.04: 🎉 Our PIGDreamer is accepted by ICML 2025.
2025.02: 🎉 Our DivIL is accepted by TMLR 2025.
2024.09: 🎉 Our Reasoning Boundary is accepted by NeurIPS 2024 (Oral).
2024.03: 🎉 Our MISTS is accepted by AAAI 2024 (Oral).

📝 Publications

† indicates equal contribution.

First-author

	TCOD: Exploring Temporal Curriculum in On-Policy Distillation for Multi-turn Autonomous Agents Jiaqi Wang, Wenhao Zhang, Weijie Shi, Yaliang Li, James Cheng. COLM 2026 [paper] [code] [huggingface paper]
	VideoASMR-Bench: Can AI-Generated ASMR Videos Fool VLMs and Humans? Jiaqi Wang†, Weijia Wu†, Yi Zhan, Rui Zhao, Ming Hu, James Cheng, Wei Liu, Philip Torr, Kevin Qinghong Lin. CVPR 2026 Workshop VGBE [paper] [homepage] [code] [dataset] [huggingface paper]
	Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models Jiaqi Wang†, Kevin QH. Lin†, James Cheng, Mike Z. Shou. NeurIPS 2025, ICML 2025 EXAIT Workshop [paper] [code] [huggingface]
	A Signed Graph Approach to Understanding and Mitigating Oversmoothing in GNNs Jiaqi Wang†, Xinyi Wu†, James Cheng, Yifei Wang. NeurIPS 2025 [paper] [code]
	DivIL: Unveiling and Addressing Over-Invariance for Out-of-Distribution Generalization Jiaqi Wang†, Yuhang Zhou†, Zhixiong Zhang†, Qiguang Chen, Yongqiang Chen, James Cheng. TMLR 2025 [paper] [code]

Non-first-author

	PROST-LLM: Progressively Enhancing the Speech-to-Speech Translation Capability in LLMs Jing Xu, Jiaqi Wang, Daxin Tan, Xiao Chen. ICASSP 2026 [paper]
	Visual Thoughts: A Unified Perspective of Understanding Multimodal Chain-of-Thought Zihui Cheng, Qiguang Chen, Xiao Xu, Jiaqi Wang, Weiyun Wang, Hao Fei, Yidong Wang, Alex Jinpeng Wang, Zhi Chen, Wanxiang Che, Libo Qin. NeurIPS 2025 [paper]
	Enhancing Multilingual Speech Generation and Recognition Abilities in LLMs with Constructed Code-switched Data Jing Xu, Daxin Tan, Jiaqi Wang, Xiao Chen. IEEE ASRU 2025 [paper]
	PIGDreamer: Privileged Information Guided World Models for Safe Partially Observable Reinforcement Learning Dongchi Huang, Jiaqi Wang, Yang Li, Chunhe Xia, Tianle Zhang, Kaige Zhang. ICML 2025 [paper] [code]
	Unlocking the capabilities of thought: A reasoning boundary framework to quantify and optimize chain-of-thought Qiguang Chen, Libo Qin, Jiaqi Wang, Jingxuan Zhou, Wanxiang Che. NeurIPS 2024, Oral [paper] [code]
	Enhancing evolving domain generalization through dynamic latent representations Binghui Xie, Yongqiang Chen, Jiaqi Wang, Kaiwen Zhou, Bo Han, Wei Meng, James Cheng. AAAI 2024, Oral [paper]

🏆 Honors

2025.10, NeurIPS Scholar Award.
2021.10, National Scholarship.

💻 Internships

2025.10 - 2026.04, Research Intern @ Alibaba Tongyi Lab, Hangzhou.
2022.05 - 2022.08, Engineer Intern @ Microsoft Azure Spring App, Shanghai.