🗨 About Me
I am an incoming PhD student in the Computational Biology and Bioinformatics (CBB) program at USC. I am currently a visiting researcher at Westlake University, working with Prof. Fajie Yuan and Dr. Zitong Jerry Wang. I received my Master’s degree in Computer Science from Johns Hopkins University and my Bachelor’s degree in Computer Science from Wenzhou-Kean University.
As an undergraduate, I was advised by Prof. Aloysius Wong at Wenzhou-Kean University. After graduation, I was honored to work with Prof. Chongzhi Zang at the University of Virginia Medical School on computational genomics. In addition, I worked with Prof. Yanjun Li at UFL on AI-driven drug discovery and later developed a single-cell foundation model in Prof. Xiaojie Qiu’s Lab at Stanford University. I am also collaborating with Prof. Han Xiao’s Group at Rice University Chemistry to develop a robust platform for machine-learning-guided protein evolution.
I work broadly at the intersection of machine learning and biology. I am interested in foundation models for the life sciences and in how to use them to discover and interpret new biology. If you’re interested in a discussion, feel free to reach out to me at shiyujia@usc.edu.
Research Interests:
- Deep Learning Methods for Molecular Interactions
- Large Language Models for Life Sciences
- Computational Genomics and Systems Biology
- Drug Discovery
📖 Education
- 2025 - 2030, incoming PhD student, Computational Biology and Bioinformatics. University of Southern California. Los Angeles, CA
- 2022 - 2024, Master of Science in Engineering, Computer Science. Johns Hopkins University. Baltimore, MD
- 2018 - 2022, Bachelor of Science, Computer Science. Wenzhou-Kean University. Wenzhou, China
📰 News
- 2025.01: one paper on developing a single-cell foundation model has been released on bioRxiv, see our X post here.
- 2024.09: one paper on accurate nanoplastics classification has been accepted by ACS Nano, featured on the cover.
- 2023.07: one paper on developing a multi-agent simulation model for pandemic spread has been accepted by ALIFE 2023.
- 2022.08: one paper on developing a protein recognition webserver has been accepted by Bioinformatics.
📝 Publications

Computational approaches and bioinformatic tools for the identification of cryptic enzymes Shiyu Jiang, Aloysius Wong, Chunyun Bi. 2025. As a book chapter in Cryptic Enzymes and Moonlighting. Book

Decoding the Molecular Language of Proteins with Evolla Xibin Zhou †, Chenchen Han †, Yingqi Zhang ‡, Jin Su ‡, Kai Zhuang ‡, Shiyu Jiang ‡, Zichen Yuan, Wei Zheng, Fengyuan Dai, Yuyang Zhou, Yuyang Tao, Dan Wu, Fajie Yuan. bioRxiv, 2025. Online Server

Toward a privacy-preserving predictive foundation model of single-cell transcriptomics with federated learning and tabular modeling Jiayuan Ding †, Jianhui Lin †, Shiyu Jiang †, Yixin Wang, Ziyang Mao, Zhaoyu Fang, Jiliang Tang, Min Li, Xiaojie Qiu. bioRxiv, 2025. GitHub

ProTrek: Navigating the Protein Universe through Tri-Modal Contrastive Learning Jin Su †, Yan He †, Shiyang You †, Shiyu Jiang ‡, Xibin Zhou ‡, Xuting Zhang, Yuxuan Wang, Igor Tolstoy, Hongyuan Lu, Xing Chang, Fajie Yuan. bioRxiv, 2024, (In Submission). Online Server

SaprotHub: Making Protein Modeling Accessible to All Biologists Jin Su, Zhikai Li, Chenchen Han, Yuyang Zhou, Yan He, Junjie Shan, Xibin Zhou, Xing Chang, Shiyu Jiang, Dacheng Ma, The OPMC, Martin Steinegger, Sergey Ovchinnikov, Fajie Yuan. bioRxiv, 2024, (In Submission). GitHub | OPMC

Integrating Metal–Phenolic Networks-Mediated Separation and Machine Learning-Aided Surface-Enhanced Raman Spectroscopy for Accurate Nanoplastics Quantification and Classification Haoxin Ye, Shiyu Jiang, Yan Yan, Bin Zhao, Edward R Grant, David D Kitts, Rickey Y Yada, Anubhav Pratap-Singh, Alberto Baldelli, Tianxi Yang. ACS Nano, 2024. Featured on Cover

Simulating Disease Spread During Disaster Scenarios Shiyu Jiang, Heejoong Kim, Fabio Henrique Tanaka, Claus Aranha, Anna Bogdanova, Kimia Ghobadi, Anton Dahbura. The International Conference on Artificial Life, 2023. GitHub

HNOXPred: a web tool for the prediction of gas-sensing H-NOX proteins from amino acid sequence Shiyu Jiang, Hemn Barzan Abdalla, Chuyun Bi, Yi Zhu, Xuechen Tian, Yixin Yang, Aloysius Wong. Bioinformatics, 2022. Online Server | GitHub

Deblur-yolo: Real-time object detection with efficient blind motion deblurring
Shen Zheng, Yuxiong Wu, Shiyu Jiang, Changjie Lu, Gaurav Gupta. International Joint Conference on Neural Networks, 2021
🧑‍💻 Experience
- 2024.08 - Present, Visiting Researcher, Representation Learning Lab & Cell Ethology Lab, Westlake University, Hangzhou, China.
- 2024.01 - 2024.07, Lab Specialist, Chongzhi Zang Lab, University of Virginia School of Medicine, Charlottesville, VA.
- 2022.06 - 2022.08, Software Engineer Intern, Alibaba Cloud - PolarDB, Hangzhou, China.
- 2021.09 - 2022.03, Applied Research Intern, Institute of Automation, Chinese Academy of Sciences, Beijing, China.
🔨 Models
Genomics
- SICER 2.0 (Spatial-clustering Identification of ChIP-Enriched Regions): a ChIP-Seq broad peak calling data analysis method.
- Tabula: a privacy-preserving predictive foundation model for single-cell transcriptomics, leveraging federated learning and tabular modeling.
Protein
- ProTrek: a tri-modal protein language model that jointly models protein sequence, structure and function (SSF).
- Evolla: a protein-language generative model designed to decode the molecular language of proteins.
- SaprotHub: making protein modeling accessible to all biologists.
- HNOXPred (Prediction of Heme-Nitric oxide/OXygen domains): a web server to predict gas-sensing H-NOX proteins from amino acid sequences.
Molecule
Other
- Koudou: an agent-based model that simulates infectious disease spread in a college-town scenario.
🌎 Miscellaneous
Outside of work, you’ll often find me at the gym, playing soccer, road cycling, or hiking. I also enjoy playing table tennis and the piano occasionally.
