Curriculum Vitae
Postgraduate Student with Computer Vision and Medical AI Experience | Learning Embodied Intelligence
Education
M.S. in Artificial Intelligence and Adaptive Systems
Sussex Artificial Intelligence Institute, Zhejiang Gongshang University
Hangzhou, China | 2024-Present
- Core Courses: Intelligence in Animals and Machines, Intelligent Systems Techniques, Image Processing, Natural Language Processing, Machine Learning
- Supervisors: Assistant Professor Peter Wijeratne (University of Sussex) and Professor Xie Mande (Zhejiang Gongshang University)
- Research Direction: Integrating physics models into a VAE framework, mainly to enhance the interpretability of the latent space.
- Expected Graduation: March 2026
B.S. in Computer Science and Technology
Wenzhou Business College
Wenzhou, China | 2019-2023
- GPA: 3.41/5.0 (84.7/100)
- Thesis: "Smoking behavior detection based on deep learning and skeletal framework"
- Relevant Coursework: Data Structures and Algorithms, Python Programming, Data Analysis, Machine Learning
- Thesis Advisor: Prof. Fangjun Kuang
Projects & Internship Experience
Research Intern
Wenzhou Medical University First Affiliated Hospital - Hepato-Pancreato-Biliary Surgery Laboratory
Wenzhou, China | Sep 2022 - Jan 2023
- Assisted in developing medical image preprocessing software for clinical applications, resulting in a software copyright registration (2022SR0252378)
- Contributed to deep learning models for leukemia diagnosis based on tongue image analysis
- Helped create machine learning algorithms for exosome feature analysis in hepatocellular carcinoma research, contributing to a paper published in Frontiers in Cell and Developmental Biology
- Technologies: PyTorch, TensorFlow, OpenCV
Student Team Leader, National Innovation Training Project
Wenzhou Business College
June 2022 - June 2023
- Led a team of 4 students in exploring attention mechanisms for enhancing YOLO models
- Implemented and tested self-attention module modifications, exploring methods to improve detection accuracy
- Contributed to securing 3 software copyrights and 1 patent application based on project outcomes
- Technologies: PyTorch, YOLO, Computer Vision, OpenCV
- Advisor: Prof. Siyang Zhang
Intern
Zhejiang University Urban-Rural Planning & Design Institute Co., Ltd.
Hangzhou, China | Sep 2025 - Dec 2025
- Enterprise/government scenario project (NDA): intelligent document processing and multi‑agent workflow
- Introduced MCP tools and pluggable toolchains, enabling multi‑agent collaboration and observability
- Designed user interruption and resume mechanisms for stable production operation
Enterprise Collaboration Project
Industry partner
2025
- Migrated CUDA‑based YOLOv8 training/inference pipeline to 8× Huawei Ascend 910B (NPU)
- Completed CANN/ACL adaptation and HCCL multi‑card training; fixed operator discrepancies and aligned accuracy
- Optimized data pipelines and graph-mode execution while maintaining near‑baseline mAP; supported containerized deployment
Publications
Machine Learning Identifies Exosome Features Related to Hepatocellular Carcinoma
Journal: Frontiers in Cell and Developmental Biology (September 2022)
Authors: Kai Zhu, Qiqi Tao, Jiatao Yan, Zhichao Lang, Xinmiao Li, Yifei Li, Congcong Fan, Zhengping Yu
DOI: 10.3389/fcell.2022.1020415
Impact Factor: 5.8
Co-first author (third position): Designed and implemented the ML analysis pipeline. Compared multiple algorithms (Random Forest, SVM‑RFE, LASSO) to identify and validate high‑value exosome biomarkers from high‑dimensional proteomics data.
Multi-omics and Machine Learning-driven CD8+ T Cell Heterogeneity Score for Prognosis
Journal: Molecular Therapy Nucleic Acids (December 2024)
Authors: Di He, Zhan Yang, Tian Zhang, Yaxian Luo, Lianjie Peng, Jiatao Yan, Tao Qiu, Jingyu Zhang, Luying Qin, Zhichao Liu, Xiaoting Zhang, Lining Lin, Mouyuan Sun
DOI: 10.1016/j.omtn.2024.102413
Impact Factor: 6.4
Contribution: Provided ML support, implementing methods including LASSO regression to identify key prognostic genes from multi‑omics data and supply features for building the CD8+ T cell heterogeneity score (CD8THS).
Using Multiomics and Machine Learning: Insights into Improving the Outcomes of Clear Cell Renal Cell Carcinoma via the SRD5A3-AS1/hsa-let-7e-5p/RRM2 Axis
Journal: ACS Omega (June 2025)
Authors: Mouyuan Sun, Zhan Yang, Yaxian Luo, Luying Qin, Lianjie Peng, Chaoran Pan, Jiatao Yan, Tao Qiu, Yan Zhang
Impact Factor: 3.7
Contribution: Implemented the ML pipeline to identify key features of the SRD5A3‑AS1/hsa‑let‑7e‑5p/RRM2 axis and quantify its prognostic value in clear cell renal cell carcinoma.
A multi-data fusion deep learning model for prognostic prediction in upper tract urothelial carcinoma
Journal: Frontiers in Oncology (August 2025)
Authors: Hongdi Sun, Siping Chen, Yongxing Bao, Fengyan You, Honghui Zhu, Xin Yao, Lianguo Chen, Jiangwei Miao, Fanggui Shao, Xiaomin Gao, Binwei Lin
DOI: 10.3389/fonc.2025.1644250
Contribution: Designed and implemented deep learning architectures for multi-phase CT analysis; integrated imaging features with clinical tabular data to build a comprehensive prognostic model; participated in model validation and optimization, providing key technical support for the publication.
Manuscripts in Preparation
YOLOv11-LCDFS: Enhanced Smoking Detection With Low-light Enhancement
In Revision
Authors: Jiatao Yan, Zhuzikai Zheng, Zhengtan Yang, Hao Jiang, Peichen Wang, Fangjun Kuang, Siyang Zhang
First author: working on a YOLO-based architecture with integrated low-light enhancement capabilities, specialized loss functions, attention mechanisms, and optimized upsampling techniques for improved detection in challenging lighting conditions.
Software Copyrights & Patent Application
Patent Application
Title: Smoking Behavior Recognition Camera and Determination Method
Application Number: 202310277784.1
Status: Withdrawn; the application was concluded because of academic credential requirements, not technical merit
Inventors: Jiatao Yan, Siyang Zhang, Fangjun Kuang, Peichen Wang, Zhuzikai Zheng, Hao Jiang, Zhengtan Yang, Hanwen Bao, Chunqiu Xia
Summary: A method combining object detection with pose estimation for real-time smoking behavior detection in public spaces.
Software Copyrights
- Medical Image Computing Software (2022SR0252378), Registered: April 2022
- Human Skeleton Recognition Software (2022SR1258998), Registered: October 2022
- Cigarette Recognition Software (2022SR1277520), Registered: October 2022
- Smoking Behavior Detection Software (2022SR1277521), Registered: October 2022
Academic Achievements & Awards
Yale/UNC-CH - Geophysical Waveform Inversion
Ranked 255/1365 | Top 19% | Kaggle Global Competition | July 1, 2025
BYU - Locating Bacterial Flagellar Motors 2025
Ranked 315/1136 | Top 28% | Kaggle Global Competition | June 5, 2025
Predict Calorie Expenditure Competition
Ranked 178/4316 | Top 5% | Kaggle Global Competition | June 1, 2025
Stanford RNA 3D Folding Competition
Bronze Medal | Ranked 144/1516 | Top 10% | Kaggle Global Competition | May 23, 2025
Predict Podcast Listening Time Competition
Ranked 116/3310 | Top 4% | Kaggle Global Competition | May 1, 2025
HuBMAP + HPA Competition
Ranked 441/1174 | Top 38% | Kaggle Global Competition | September 2022
18th Challenge Cup College Student Competition
Bronze Medal | Zhejiang Province Level | May 2023
4th National "Chuanzhi Cup" IT Skills Competition
Provincial Excellent Award | Zhejiang Province | December 2021
2023 Wenzhou Computer Society Student Member Innovation and Entrepreneurship Award
3rd Prize | Wenzhou | April 2024
Projects
YOLOv11-LCDFS: Enhanced Smoking Detection With Low-light Enhancement
Extension of smoking detection research focusing on improved object detection in challenging lighting conditions. Developing a YOLO-based architecture with specialized components addressing the unique challenges of low-light environments.
- Designing custom loss functions specifically optimized for low-light object detection scenarios
- Implementing attention mechanisms to focus on key visual features in varying illumination conditions
- Optimizing upsampling techniques to preserve fine details in dark environments
- Integrating lightweight low-light enhancement module into the detection pipeline
Technologies: PyTorch, YOLO, Computer Vision, CUDA, Attention Mechanisms
Status: Ongoing (April 2025)
Related publications: YOLOv11‑LCDFS: Enhanced Smoking Detection With Low‑light Enhancement (manuscript in preparation)
Multi-modal Medical Image Analysis for Cancer Research
Developing medical image analysis systems for cancer research.
- Designing medical image segmentation algorithms for hepatocellular carcinoma and renal cell carcinoma
- Designing methods for integrating clinical tabular data with imaging features for comprehensive analysis
- Implementing multi-modal fusion techniques for combining different CT scan phases
- Developing 3D volumetric segmentation approaches for comprehensive anatomical analysis
Technologies: Python, Deep Learning, 3D Segmentation, Multi-modal Fusion, PyDicom, NumPy
Status: Ongoing (April 2025)
Related publications: Multi‑modal medical image segmentation and fusion (manuscript in preparation)
Disease Detection Using Deep Learning
Developed deep learning models for automated disease detection and classification from medical imaging data, with a focus on diabetic foot ulcer detection.
- Implemented and compared multiple YOLOv8 architecture variants enhanced with advanced attention mechanisms (GAM, CBAM, ECA, CoordAtt) for precise localization of diabetic wounds in clinical images
- Designed and evaluated custom YOLOv8 architectures integrating novel upsampling techniques including CARAFE and DySample for improved feature map resolution
- Developed triplet-based loss functions and Inner-CIoU mechanisms to improve detection accuracy for wounds of varying sizes and appearances
- Employed dynamic convolution techniques to adaptively capture wound features across diverse clinical settings and lighting conditions
- Constructed and curated specialized datasets of diabetic foot ulcers for training and comprehensive validation
- Achieved significant improvements in both detection accuracy and inference speed compared to baseline models, with particular gains for small and atypical lesions
- Implemented model explainability techniques to visualize feature importance and attention maps for clinical interpretation
Technologies: PyTorch, YOLO, Attention Mechanisms (GAM/CBAM/ECA/CoordAtt), Custom Loss Functions, Feature Upsampling Techniques, Dynamic Convolutions
Date: August 2024
Twitter Quality and Spam Detection System
Developed a machine learning system for Twitter content quality assessment and spam detection, achieving over 88% accuracy in distinguishing legitimate tweets from spam.
- Implemented comprehensive data preprocessing techniques for Twitter data, including text normalization, feature extraction, and handling of missing values
- Engineered complex features by combining user metrics (follower count, following count) and behavioral patterns to enhance classification performance
- Applied sentiment analysis and content analysis techniques to identify quality patterns in tweets
- Developed and compared multiple machine learning models to optimize classification accuracy
- Created data visualizations to communicate findings and identify key patterns in Twitter content quality assessment
Technologies: Python, NLTK, scikit-learn, Pandas, Matplotlib, Seaborn
Date: August 2024
Airbnb Price Analysis and Prediction System
Developed a data analysis and machine learning system to predict Airbnb listing prices in New York City from property features and location data.
- Performed extensive exploratory data analysis on NYC Airbnb dataset with over 48,000 listings, creating visualizations to reveal pricing patterns across neighborhoods
- Implemented geospatial analysis to visualize property distribution and identify high-value areas using Python GIS libraries
- Engineered relevant features by transforming categorical variables and creating new metrics to better capture pricing factors
- Developed a random forest regression model with tuned hyperparameters to predict continuous listing prices
- Created a classification model to categorize listings into price brackets, achieving high accuracy through model tuning with RandomizedSearchCV
- Built interactive visualizations to help hosts and travelers understand pricing determinants in the NYC short-term rental market
Technologies: Python, Pandas, scikit-learn, GeoPandas, Matplotlib, Seaborn, Random Forest
Date: January 2024
Airline Sentiment Analysis System
Learning to analyze airline customer sentiment from Twitter data using natural language processing techniques.
- Studying text preprocessing methods for social media data (handling hashtags, mentions, emojis, URLs)
- Learning about different feature extraction approaches including TF-IDF vectorization and N-gram analysis
- Working with various classification models and comparing their performance
- Practicing error analysis to understand patterns in misclassified tweets
- Creating visualizations using Matplotlib, Seaborn, and Plotly to understand sentiment trends
Technologies: Python, NLTK, spaCy, scikit-learn, Pandas, Matplotlib, Seaborn, Plotly, WordCloud
Date: December 2023
Smoking Behavior Detection System
Undergraduate thesis project combining YOLO object detection with MediaPipe skeletal tracking to identify and classify smoking gestures in video streams.
- Utilized YOLO object detection to identify cigarettes and related objects in video footage
- Implemented MediaPipe for real-time skeletal tracking and pose estimation
- Designed algorithms to recognize characteristic smoking hand-to-mouth gesture patterns
Technologies: YOLO, MediaPipe, Pose Estimation, Action Recognition, PyTorch, OpenCV
Date: April 2023
Outcomes: Patent application (202310277784.1), 3 software copyrights, thesis received outstanding evaluation
Heart Disease Prediction System
Learning to analyze health indicators for heart disease prediction using machine learning approaches.
- Working with health metrics from the CDC's BRFSS survey dataset
- Learning about feature selection and preprocessing for health indicators
- Studying both traditional machine learning models and deep learning approaches
- Practicing model evaluation using cross-validation and performance metrics
- Creating visualizations to understand relationships between health factors
Technologies: Python, TensorFlow 2.11.0, scikit-learn, Pandas, NumPy, Matplotlib, Seaborn
Date: May 2022
Research Interests
Medical Artificial Intelligence
- Studying deep learning applications in medical image analysis and disease diagnosis
- Learning about ways to integrate multi-modal clinical data
- Interested in computer-aided diagnosis systems for clinical applications
Embodied Intelligence
- Beginning to learn about how robots might develop intelligence through physical interaction
- Interested in understanding the relationship between perception and action in embodied agents
- Looking forward to studying reinforcement learning in robotic applications
Computer Vision
- Learning about attention-based architectures for object detection
- Studying approaches to human pose estimation and behavior recognition
- Interested in visual feature extraction methods for real-world applications
Multi-Agent Systems
- Starting to learn about how multiple agents might work together
- Interested in understanding group behaviors in agent communities
- Looking forward to studying the basics of multi-agent AI systems
Technical Skills
Programming Languages
- Python
- C, Java, SQL
- C#, JavaScript, Vue
AI & Machine Learning
- Frameworks: PyTorch, TensorFlow
- Areas: Computer Vision, Deep Learning, Reinforcement Learning
- Techniques: CNNs, Attention Mechanisms, Transfer Learning
Development Tools
- Version Control: Git, GitHub
- Documentation: LaTeX, Markdown
- Environment: Linux, Jupyter, Docker
Data Analysis
- Libraries: NumPy, Pandas, SciPy
- Visualization: Matplotlib, Seaborn, Plotly
Languages
- Chinese: Native speaker
- English: IELTS 6.0 (L 6.5 / R 6.5 / W 5.5 / S 6.0); CET‑6 467
Certifications
- DevCloud Summer Training Camp (Huawei)
- Python for AI Development (Shandong University)
- Deep Learning Fundamentals (Shandong University)
Learning Journey & Future Interests
Current Learning Focus
Beginning to explore embodied intelligence and robotics through self-study:
- Basic Algorithms: Starting to learn about reinforcement learning, imitation learning, and control methods for robotics
- Vision-Language-Action Models: Beginning to understand how perception, language, and action might work together in robots
- Robot Learning: Starting to learn about robot manipulation and navigation
- Multi-Agent Systems: Beginning to explore how multiple agents might work together
- Simulation Tools: Learning to use basic simulation environments like MuJoCo or Isaac Gym
Learning Resources
- GitHub Resources:
  - Embodied-AI-Guide (github.com/tianxingchen/Embodied-AI-Guide): A helpful guide for learning about:
    - Basic algorithms in robotics and AI
    - Vision-Language-Action models
    - Common simulation environments
    - Computer vision basics
    - Introduction to robot learning
  - Embodied-AI-Paper-List (github.com/Lumina-EAI/Embodied-AI-Paper-List): Collection of papers to learn from
  - Awesome-Embodied-AI-Job (github.com/StarCycle/Awesome-Embodied-AI-Job): Resource for learning about opportunities in the field
- Reading Materials:
  - Learning from papers at ICRA, CoRL, NeurIPS, CVPR, and ICLR conferences
  - Studying foundation models in robot learning
  - Reading about LLM applications in robotics
  - Learning about sim-to-real transfer methods
Questions I'm Curious About
- How might robots learn from interacting with their environment?
- What might help multiple agents work together effectively?
- How might AI systems learn to break down complex tasks?
- What role might physical interaction play in developing AI systems?
References
Professional and academic references available upon request.
