Curriculum Vitae

Postgraduate Student with Computer Vision and Medical AI Experience | Learning Embodied Intelligence


Education

M.S. in Artificial Intelligence and Adaptive Systems

Sussex Artificial Intelligence Institute, Zhejiang Gongshang University
Hangzhou, China | 2024-Present

  • Core Courses: Intelligence in Animals and Machines, Intelligent Systems Techniques, Image Processing, Natural Language Processing, Machine Learning
  • Supervisors: Assistant Professor Peter Wijeratne (University of Sussex) and Professor Xie Mande (Zhejiang Gongshang University)
  • Research Direction: Integrating physics models into a VAE framework, mainly to enhance the interpretability of the latent space.
  • Expected Graduation: March 2026

B.S. in Computer Science and Technology

Wenzhou Business College
Wenzhou, China | 2019-2023

  • GPA: 3.41/5.0 (84.7/100)
  • Thesis: "Smoking behavior detection based on deep learning and skeletal framework"
  • Relevant Coursework: Data Structures and Algorithms, Python Programming, Data Analysis, Machine Learning
  • Thesis Advisor: Prof. Fangjun Kuang

Projects & Internship Experience

Research Intern

Wenzhou Medical University First Affiliated Hospital - Hepato-Pancreato-Biliary Surgery Laboratory
Wenzhou, China | Sep 2022 - Jan 2023

  • Assisted in developing medical image preprocessing software for clinical applications, resulting in software copyright registration (2022SR0252378)
  • Contributed to deep learning models for leukemia diagnosis based on tongue image analysis
  • Helped create machine learning algorithms for exosome feature analysis in hepatocellular carcinoma research, contributing to a paper published in Frontiers in Cell and Developmental Biology
  • Technologies: PyTorch, TensorFlow, OpenCV

Student Team Leader, National Innovation Training Project

Wenzhou Business College
June 2022 - June 2023

  • Led a team of 4 students in exploring attention mechanisms for enhancing YOLO models
  • Implemented and tested self-attention module modifications, exploring methods to improve detection accuracy
  • Contributed to securing 3 software copyrights and 1 patent application based on project outcomes
  • Technologies: PyTorch, YOLO, Computer Vision, OpenCV
  • Advisor: Prof. Siyang Zhang

Intern

Zhejiang University Urban-Rural Planning & Design Institute Co., Ltd.
Hangzhou, China | Sep 2025 - Dec 2025

  • Enterprise/government project (under NDA): intelligent document processing and multi‑agent workflows
  • Introduced MCP tools and pluggable toolchains, enabling multi‑agent collaboration and observability
  • Designed user interruption and resume mechanisms for stable production operation

Enterprise Collaboration Project

Industry partner
2025

  • Migrated CUDA‑based YOLOv8 training/inference pipeline to 8× Huawei Ascend 910B (NPU)
  • Completed CANN/ACL adaptation and HCCL multi‑card training; fixed operator discrepancies and aligned accuracy
  • Optimized data pipelines and graph‑mode execution, reaching near‑baseline mAP; supported containerized deployment

Publications

Machine Learning Identifies Exosome Features Related to Hepatocellular Carcinoma

Journal: Frontiers in Cell and Developmental Biology (September 2022)

Authors: Kai Zhu, Qiqi Tao, Jiatao Yan, Zhichao Lang, Xinmiao Li, Yifei Li, Congcong Fan, Zhengping Yu

DOI: 10.3389/fcell.2022.1020415

Impact Factor: 5.8

Co-first author (third position): Designed and implemented the ML analysis pipeline. Compared multiple algorithms (Random Forest, SVM‑RFE, LASSO) to identify and validate high‑value exosome biomarkers from high‑dimensional proteomics data.

Multi-omics and Machine Learning-driven CD8+ T Cell Heterogeneity Score for Prognosis

Journal: Molecular Therapy Nucleic Acids (December 2024)

Authors: Di He, Zhan Yang, Tian Zhang, Yaxian Luo, Lianjie Peng, Jiatao Yan, Tao Qiu, Jingyu Zhang, Luying Qin, Zhichao Liu, Xiaoting Zhang, Lining Lin, Mouyuan Sun

DOI: 10.1016/j.omtn.2024.102413

Impact Factor: 6.4

Contribution: Provided ML support, implementing methods including LASSO regression to identify key prognostic genes from multi‑omics data and supply features for building the CD8+ T cell heterogeneity score (CD8THS).

Using Multiomics and Machine Learning: Insights into Improving the Outcomes of Clear Cell Renal Cell Carcinoma via the SRD5A3-AS1/hsa-let-7e-5p/RRM2 Axis

Journal: ACS Omega (June 2025)

Authors: Mouyuan Sun, Zhan Yang, Yaxian Luo, Luying Qin, Lianjie Peng, Chaoran Pan, Jiatao Yan, Tao Qiu, Yan Zhang

DOI: 10.1021/acsomega.5c01337

Impact Factor: 3.7

Contribution: Implemented the ML pipeline to identify key features of the SRD5A3‑AS1/hsa‑let‑7e‑5p/RRM2 axis and quantify its prognostic value in clear cell renal cell carcinoma.

A multi-data fusion deep learning model for prognostic prediction in upper tract urothelial carcinoma

Journal: Frontiers in Oncology (August 2025)

Authors: Hongdi Sun, Siping Chen, Yongxing Bao, Fengyan You, Honghui Zhu, Xin Yao, Lianguo Chen, Jiangwei Miao, Fanggui Shao, Xiaomin Gao, Binwei Lin

DOI: 10.3389/fonc.2025.1644250

Contribution: Designed and implemented deep learning architectures for multi-phase CT analysis; integrated imaging features with clinical tabular data to build a comprehensive prognostic model; participated in model validation and optimization, providing key technical support for the publication.


Manuscripts in Preparation

YOLOv11-LCDFS: Enhanced Smoking Detection With Low-light Enhancement

In Revision

Authors: Jiatao Yan, Zhuzikai Zheng, Zhengtan Yang, Hao Jiang, Peichen Wang, Fangjun Kuang, Siyang Zhang

First author: working on a YOLO-based architecture with integrated low-light enhancement capabilities, specialized loss functions, attention mechanisms, and optimized upsampling techniques for improved detection in challenging lighting conditions.


Software Copyrights & Patent Application

Patent Application

Title: Smoking Behavior Recognition Camera and Determination Method

Application Number: 202310277784.1

Status: Withdrawn (due to academic credential requirements rather than technical merit)

Inventors: Jiatao Yan, Siyang Zhang, Fangjun Kuang, Peichen Wang, Zhuzikai Zheng, Hao Jiang, Zhengtan Yang, Hanwen Bao, Chunqiu Xia

Summary: A method combining object detection with pose estimation for real-time smoking behavior detection in public spaces.

Software Copyrights

  • Medical Image Computing Software (2022SR0252378)
    Registered: April 2022
  • Human Skeleton Recognition Software (2022SR1258998)
    Registered: October 2022
  • Cigarette Recognition Software (2022SR1277520)
    Registered: October 2022
  • Smoking Behavior Detection Software (2022SR1277521)
    Registered: October 2022

Academic Achievements & Awards

Yale/UNC-CH - Geophysical Waveform Inversion

Ranked 255/1365 | Top 19% | Kaggle Global Competition | July 1, 2025

BYU - Locating Bacterial Flagellar Motors 2025

Ranked 315/1136 | Top 28% | Kaggle Global Competition | June 5, 2025

Predict Calorie Expenditure Competition

Ranked 178/4316 | Top 5% | Kaggle Global Competition | June 1, 2025

Stanford RNA 3D Folding Competition

Bronze Medal | Ranked 144/1516 | Top 10% | Kaggle Global Competition | May 23, 2025

Predict Podcast Listening Time Competition

Ranked 116/3310 | Top 4% | Kaggle Global Competition | May 1, 2025

HuBMAP + HPA Competition

Ranked 441/1174 | Top 38% | Kaggle Global Competition | September 2022

18th Challenge Cup College Student Competition

Bronze Medal | Zhejiang Province Level | May 2023

4th National "Chuanzhi Cup" IT Skills Competition

Provincial Excellent Award | Zhejiang Province | December 2021

2023 Wenzhou Computer Society Student Member Innovation and Entrepreneurship Award

3rd Prize | Wenzhou | April 2024


Projects

YOLOv11-LCDFS: Enhanced Smoking Detection With Low-light Enhancement

Extension of smoking detection research focusing on improved object detection in challenging lighting conditions. Developing a YOLO-based architecture with specialized components addressing the unique challenges of low-light environments.

  • Designing custom loss functions specifically optimized for low-light object detection scenarios
  • Implementing attention mechanisms to focus on key visual features in varying illumination conditions
  • Optimizing upsampling techniques to preserve fine details in dark environments
  • Integrating lightweight low-light enhancement module into the detection pipeline

Technologies: PyTorch, YOLO, Computer Vision, CUDA, Attention Mechanisms

Status: Ongoing (April 2025)

Related publications: YOLOv11‑LCDFS: Enhanced Smoking Detection With Low‑light Enhancement (manuscript in preparation)

Multi-modal Medical Image Analysis for Cancer Research

Developing medical image analysis systems for cancer research.

  • Designing medical image segmentation algorithms for hepatocellular carcinoma and renal cell carcinoma
  • Designing methods for integrating clinical tabular data with imaging features for comprehensive analysis
  • Implementing multi-modal fusion techniques for combining different CT scan phases
  • Developing 3D volumetric segmentation approaches for comprehensive anatomical analysis

Technologies: Python, Deep Learning, 3D Segmentation, Multi-modal Fusion, PyDicom, NumPy

Status: Ongoing (April 2025)

Related publications: Multi‑modal medical image segmentation and fusion (manuscript in preparation)

Disease Detection Using Deep Learning

Developed deep learning models for automated disease detection and classification from medical imaging data, with a focus on diabetic foot ulcer detection.

  • Implemented and compared multiple YOLOv8 architecture variants enhanced with attention mechanisms (GAM, CBAM, ECA, CoordAtt) for precise localization of diabetic wounds in clinical images
  • Designed and evaluated custom YOLOv8 architectures integrating the CARAFE and DySample upsampling techniques for improved feature map resolution
  • Developed triplet-based loss functions and Inner-CIoU mechanisms to enhance detection accuracy for wounds of varying sizes and appearances
  • Employed dynamic convolution techniques to adaptively capture wound features across diverse clinical settings and lighting conditions
  • Constructed and curated specialized datasets of diabetic foot ulcers for training and comprehensive validation
  • Achieved significant improvements in both detection accuracy and inference speed compared to baseline models, with particular gains for small and atypical lesions
  • Implemented model explainability techniques to visualize feature importance and attention maps for clinical interpretation

Technologies: PyTorch, YOLO, Attention Mechanisms (GAM/CBAM/ECA), Custom Loss Functions, Feature Upsampling Techniques, Dynamic Convolutions

Date: August 2024

Twitter Quality and Spam Detection System

Developed a machine learning system for Twitter content quality assessment and spam detection, achieving over 88% accuracy in distinguishing legitimate tweets from spam.

  • Implemented comprehensive data preprocessing techniques for Twitter data, including text normalization, feature extraction, and handling of missing values
  • Engineered complex features by combining user metrics (follower count, following count) and behavioral patterns to enhance classification performance
  • Applied sentiment analysis and content analysis techniques to identify quality patterns in tweets
  • Developed and compared multiple machine learning models to optimize classification accuracy
  • Created data visualizations to communicate findings and identify key patterns in Twitter content quality assessment

Technologies: Python, NLTK, scikit-learn, Pandas, Matplotlib, Seaborn

Date: August 2024

Airbnb Price Analysis and Prediction System

Developed a comprehensive data analysis and machine learning system to predict Airbnb listing prices in New York City based on various property features and location data.

  • Performed extensive exploratory data analysis on NYC Airbnb dataset with over 48,000 listings, creating visualizations to reveal pricing patterns across neighborhoods
  • Implemented geospatial analysis to visualize property distribution and identify high-value areas using Python GIS libraries
  • Engineered relevant features by transforming categorical variables and creating new metrics to better capture pricing factors
  • Developed a random forest regression model with tuned hyperparameters to predict continuous listing prices
  • Created a classification model to categorize listings into price brackets, achieving high accuracy through model tuning with RandomizedSearchCV
  • Built interactive visualizations to help hosts and travelers understand pricing determinants in the NYC short-term rental market

Technologies: Python, Pandas, scikit-learn, GeoPandas, Matplotlib, Seaborn, Random Forest

Date: January 2024

Airline Sentiment Analysis System

Learning to analyze airline customer sentiment from Twitter data using natural language processing techniques.

  • Studying text preprocessing methods for social media data (handling hashtags, mentions, emojis, URLs)
  • Learning about different feature extraction approaches including TF-IDF vectorization and N-gram analysis
  • Working with various classification models and comparing their performance
  • Practicing error analysis to understand patterns in misclassified tweets
  • Creating visualizations using Matplotlib, Seaborn, and Plotly to understand sentiment trends

Technologies: Python, NLTK, spaCy, scikit-learn, Pandas, Matplotlib, Seaborn, Plotly, WordCloud

Date: December 2023

Smoking Behavior Detection System

Undergraduate thesis project combining YOLO object detection with MediaPipe skeletal tracking to accurately identify and classify smoking gestures in video streams.

  • Utilized YOLO object detection to identify cigarettes and related objects in video footage
  • Implemented MediaPipe for real-time skeletal tracking and pose estimation
  • Designed algorithms to recognize characteristic smoking hand-to-mouth gesture patterns

Technologies: YOLO, MediaPipe, Pose Estimation, Action Recognition, PyTorch, OpenCV

Date: April 2023

Outcomes: Patent application (202310277784.1), 3 software copyrights, thesis received outstanding evaluation

Heart Disease Prediction System

Learning to analyze health indicators for heart disease prediction using machine learning approaches.

  • Working with health metrics from the CDC's BRFSS survey dataset
  • Learning about feature selection and preprocessing for health indicators
  • Studying both traditional machine learning models and deep learning approaches
  • Practicing model evaluation using cross-validation and performance metrics
  • Creating visualizations to understand relationships between health factors

Technologies: Python, TensorFlow 2.11.0, scikit-learn, Pandas, NumPy, Matplotlib, Seaborn

Date: May 2022


Research Interests

Medical Artificial Intelligence

  • Studying deep learning applications in medical image analysis and disease diagnosis
  • Learning about ways to integrate multi-modal clinical data
  • Interested in computer-aided diagnosis systems for clinical applications

Embodied Intelligence

  • Beginning to learn about how robots might develop intelligence through physical interaction
  • Interested in understanding the relationship between perception and action in embodied agents
  • Looking forward to studying reinforcement learning in robotic applications

Computer Vision

  • Learning about attention-based architectures for object detection
  • Studying approaches to human pose estimation and behavior recognition
  • Interested in visual feature extraction methods for real-world applications

Multi-Agent Systems

  • Starting to learn about how multiple agents might work together
  • Interested in understanding group behaviors in agent communities
  • Looking forward to studying the basics of multi-agent AI systems

Technical Skills

Programming Languages

  • Python
  • C, Java, SQL
  • C#, JavaScript (Vue)

AI & Machine Learning

  • Frameworks: PyTorch, TensorFlow
  • Areas: Computer Vision, Deep Learning, Reinforcement Learning
  • Techniques: CNNs, Attention Mechanisms, Transfer Learning

Development Tools

  • Version Control: Git, GitHub
  • Documentation: LaTeX, Markdown
  • Environment: Linux, Jupyter, Docker

Data Analysis

  • Libraries: NumPy, Pandas, SciPy
  • Visualization: Matplotlib, Seaborn, Plotly

Languages

  • Chinese: Native speaker
  • English: IELTS 6.0 (L 6.5 / R 6.5 / W 5.5 / S 6.0); CET‑6 467

Certifications

  • DevCloud Summer Training Camp (Huawei)
  • Python for AI Development (Shandong University)
  • Deep Learning Fundamentals (Shandong University)

Learning Journey & Future Interests

Current Learning Focus

Beginning to explore embodied intelligence and robotics through self-study:

  • Basic Algorithms: Starting to learn about reinforcement learning, imitation learning, and control methods for robotics
  • Vision-Language-Action Models: Beginning to understand how perception, language, and action might work together in robots
  • Robot Learning: Starting to learn about robot manipulation and navigation
  • Multi-Agent Systems: Beginning to explore how multiple agents might work together
  • Simulation Tools: Learning to use basic simulation environments such as MuJoCo or Isaac Gym

Learning Resources

  • Reading Materials:
    • Learning from papers at ICRA, CoRL, NeurIPS, CVPR, and ICLR conferences
    • Studying foundation models in robot learning
    • Reading about LLM applications in robotics
    • Learning about sim-to-real transfer methods

Questions I'm Curious About

  • How might robots learn from interacting with their environment?
  • What might help multiple agents work together effectively?
  • How might AI systems learn to break down complex tasks?
  • What role might physical interaction play in developing AI systems?

References

Professional and academic references available upon request.