I am a Ph.D. student in the Human-Computer Communications Laboratory at The Chinese University of Hong Kong, supervised by Prof. Helen Meng since August 2019. My research focuses on AI ethics, particularly the development of ethical Large Language Models (LLMs).
My current work investigates the lifecycle of LLM development, from data curation to model deployment, through a normative lens, aiming to identify and address ethical issues embedded in existing practices so that LLMs become better aligned with human values. My published work tackles these challenges through biased-data identification, adversarial training, and structured inference-time reasoning, seeking to bridge the gap between abstract ethical principles and aligned, reliable model behavior. More broadly, I am interested in moving beyond reactive patches toward proactive, normatively sound approaches to ethical AI.
I was an intern in the Speech and Language Processing group at Huawei Noah's Ark Lab, supervised by Dr. Fei Mi and Dr. Yitong Li, where I collaborated with the CoAI group at Tsinghua University on safety issues in dialogue systems. Before that, I was an intern in the NLP group at the JingDong AI Research Institute, working on task-oriented dialogue systems.
Based on our work on machine ethics (Zhou et al., NAACL 2024) and Purple-teaming LLMs (Zhou et al., NAACL 2024), I was invited as a contributing interviewee for the MoralPLAI project (Institute for Ethics in AI, Technical University of Munich). Our perspectives on AI ethics were incorporated into a research-based play developed collaboratively with other researchers.
The CDial-Bias dataset (Zhou & Deng et al., EMNLP 2022) and COLD benchmark (Deng & Zhou et al., EMNLP 2022) have been adopted by CLEVA (Chinese Language Models EVAluation Platform) as standard evaluation tasks for assessing social bias and offensiveness in Chinese language models. The released datasets and models have about 400 monthly downloads on Hugging Face.
Based on the CDial-Bias work, we were invited to organize NLPCC 2022 Shared Task 7: Fine-Grained Dialogue Social Bias Measurement, with multiple participating teams [overview].
The JDDC Corpus (Chen et al., LREC 2020), a large-scale multi-turn Chinese dialogue dataset constructed during my internship at JingDong, has been widely used in dialogue systems research.
Hong Kong ICT Awards — Student Innovation Award, HKITDA, 2019 — Shortlisted, Certificate of Excellence
JingDong Dialogue Challenge Technological Innovation Award, JD.com, 2018 — Top 10 among >600 participants
New Asia College Head’s List (Merit), CUHK, 2019 — Top 10% in Department of Information Engineering
Dean’s List, Faculty of Engineering, CUHK, 2018 & 2019 — Top 10% in Faculty of Engineering
Chiu Fuksan Scholarship, CUHK, 2019 — Academic Achievement Scholarship
Mr. and Mrs. Chan Foo Chuen Scholarships, CUHK, 2018 — New Asia College Academic Achievement Scholarship
2025, Dec.: Coordinated the Young Scholar Poster Session of the Responsible AI in Action Workshop.
2025, Nov.: Invited talk: “Data Bias in NLP” at The Chinese University of Hong Kong
2024, Mar.: Invited talk: “Recent Advances in AI Ethics and Generative AI” at The Chinese University of Hong Kong
2024, Feb.: Invited talk: “Generative AI, the Future of Work, and Talent Cultivation” at City University of Hong Kong
2023, Dec.: Student Organizing Chair, 2023 International Doctoral Forum
2022-2023, Term 1: Guest lecturer, SEEM5630: Conversational AI systems
2022, Sept.: Organizer, NLPCC 2022 Shared Task 7: Fine-Grained Dialogue Social Bias Measurement
2019-2023, Term 2: Teaching Assistant, ENGG1130: Multivariable Calculus
Purple-teaming LLMs with Adversarial Defender Training
Jingyan Zhou, Kun Li, Junan Li, Jiawen Kang, Minda Hu, Xixin Wu, Helen Meng
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation
Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Lifeng Jin, Linfeng Song, Haitao Mi, Helen Meng
Rethinking Machine Ethics – Can LLMs Perform Moral Reasoning through the Lens of Moral Theories?
Jingyan Zhou, Minda Hu, Junan Li, Xiaoying Zhang, Xixin Wu, Irwin King, Helen Meng
Towards Identifying Social Bias in Dialog Systems: Frame, Datasets, and Benchmarks
Jingyan Zhou*, Jiawen Deng*, Fei Mi, Yitong Li, Yasheng Wang, Minlie Huang, Xin Jiang, Qun Liu, Helen Meng [data]
COLD: A Benchmark for Chinese Offensive Language Detection
Jiawen Deng*, Jingyan Zhou*, Hao Sun, Chujie Zheng, Fei Mi, Helen Meng, Minlie Huang [data]
PanGu-Bot: Efficient Generative Dialogue Pre-training from Pre-trained Language Model
Fei Mi, Yitong Li, Yulong Zeng, Jingyan Zhou, Yasheng Wang, Chuanfei Xu, Lifeng Shang, Xin Jiang, Shiqi Zhao, Qun Liu [Blog]
Overview of NLPCC 2022 Shared Task 7: Fine-Grained Dialogue Social Bias Measurement
Jingyan Zhou, Fei Mi, Helen Meng, Jiawen Deng [webpage]
SGP-TOD: Building Task Bots Effortlessly via Schema-Guided LLM Prompting
Xiaoying Zhang, Baolin Peng, Kun Li, Jingyan Zhou, Helen Meng
(All equal contribution) Mudit Chaudhary, Borislav Dzodzo, Sida Huang, Chun Hei Lo, Mingzhi Lyu, Lun Yiu Nie, Jinbo Xing, Tianhua Zhang, Xiaoying Zhang, Jingyan Zhou, Hong Cheng, Wai Lam, Helen Meng [code]
Automatic Extraction of Semantic Patterns in Dialogs using Convex Polytopic Model
Jingyan Zhou, Xiaoying Zhang, Xiaohan Feng, King Keung Wu, Helen Meng
Jingyan Zhou, Xiaohan Feng, King Keung Wu, Helen Meng
The JDDC Corpus: A Large-Scale Multi-Turn Chinese Dialogue Dataset for E-commerce Customer Service
Meng Chen, Ruixue Liu, Lei Shen, Shaozu Yuan, Jingyan Zhou, Youzheng Wu, Xiaodong He, Bowen Zhou