Curriculum Vitae

Postgraduate Student with Computer Vision and Medical AI Experience | Learning Embodied Intelligence and Neuroscience

Education

M.S. in Artificial Intelligence and Adaptive Systems

Sussex Artificial Intelligence Institute, Zhejiang Gongshang University
Hangzhou, China | Sep 2024 - Jan 2026

Core Courses: Intelligence in Animals and Machines, Intelligent Systems Techniques, Image Processing, Natural Language Processing, Machine Learning
Supervisors: Assistant Professor Temitayo Olugbade (University of Sussex), Assistant Professor Peter Wijeratne (University of Sussex), and Professor Xie Mande (Zhejiang Gongshang University)
Research Direction: Integrating physics models into a VAE framework to enhance latent-space interpretability.
Overall Average: 79
Thesis Title: Disentangling physics, anatomical, time, and identity information in latent variables of medical images (interpretable representation learning for Alzheimer's disease-related medical imaging)

B.S. in Computer Science and Technology

Wenzhou Business College
Wenzhou, China | 2019-2023

GPA: 3.41/5.0 (84.7/100)
Thesis: "Smoking behavior detection based on deep learning and skeletal framework"
Relevant Coursework: Data Structures and Algorithms, Python Programming, Data Analysis, Machine Learning
Thesis Advisor: Prof. Fangjun Kuang
Leadership: Student union staff and club president during undergraduate study; class committee and Party branch committee member during graduate study

Projects & Internship Experience

Research Intern

Wenzhou Medical University First Affiliated Hospital - Hepato-Pancreato-Biliary Surgery Laboratory
Wenzhou, China | Sep 2022 - Jan 2023

Assisted in developing medical image preprocessing software for clinical applications, which led to a registered software copyright (2022SR0252378)
Contributed to deep learning models for leukemia diagnosis based on tongue image analysis
Helped create machine learning algorithms for exosome feature analysis in hepatocellular carcinoma research, contributing to a paper published in Frontiers in Cell and Developmental Biology
Technologies: PyTorch, TensorFlow, OpenCV

Student Team Leader, National Innovation Training Project

Wenzhou Business College
June 2022 - June 2023

Led a team of 4 students in exploring attention mechanisms for enhancing YOLO models
Implemented and tested self-attention module modifications, exploring methods to improve detection accuracy
Contributed to securing 3 software copyrights and 1 patent application based on project outcomes
Technologies: PyTorch, YOLO, Computer Vision, OpenCV
Advisor: Prof. Siyang Zhang

Intern

Zhejiang University Urban-Rural Planning & Design Institute Co., Ltd.
Hangzhou, China | Sep 2025 - Dec 2025

Multi-role agent workflow project for document processing/review (NDA)
Delivered a system with document preview, OCR parsing services, and Docker Compose deployment
Introduced MCP tools and a pluggable toolchain to support multi-agent collaboration and observability; supported user interruption and checkpoint resume

Enterprise Collaboration Project

Industry partner
2025

Migrated CUDA‑based YOLOv8 training/inference pipeline to 8× Huawei Ascend 910B (NPU)
Completed CANN/ACL adaptation and HCCL multi‑card training; fixed operator discrepancies and aligned accuracy
Optimized data pipelines and graph-mode execution while keeping mAP close to the baseline; supported containerized deployment

Publications

Machine Learning Identifies Exosome Features Related to Hepatocellular Carcinoma

Journal: Frontiers in Cell and Developmental Biology (September 2022)

Authors: Kai Zhu, Qiqi Tao, Jiatao Yan, Zhichao Lang, Xinmiao Li, Yifei Li, Congcong Fan, Zhengping Yu

DOI: 10.3389/fcell.2022.1020415

Impact Factor: 5.8

Co-first author (third position): Designed and implemented the ML analysis pipeline. Compared multiple algorithms (Random Forest, SVM‑RFE, LASSO) to identify and validate high‑value exosome biomarkers from high‑dimensional proteomics data.

Multi-omics and Machine Learning-driven CD8+ T Cell Heterogeneity Score for Prognosis

Journal: Molecular Therapy Nucleic Acids (December 2024)

Authors: Di He, Zhan Yang, Tian Zhang, Yaxian Luo, Lianjie Peng, Jiatao Yan, Tao Qiu, Jingyu Zhang, Luying Qin, Zhichao Liu, Xiaoting Zhang, Lining Lin, Mouyuan Sun

DOI: 10.1016/j.omtn.2024.102413

Impact Factor: 6.4

Contribution: Provided machine learning support by implementing multiple methods (including LASSO regression) to identify key prognostic genes from multi‑omics data, providing computational support and feature inputs for score construction.

Using Multiomics and Machine Learning: Insights into Improving the Outcomes of Clear Cell Renal Cell Carcinoma via the SRD5A3-AS1/hsa-let-7e-5p/RRM2 Axis

Journal: ACS Omega (June 2025)

Authors: Mouyuan Sun, Zhan Yang, Yaxian Luo, Luying Qin, Lianjie Peng, Chaoran Pan, Jiatao Yan, Tao Qiu, Yan Zhang

DOI: 10.1021/acsomega.5c01337

Impact Factor: 3.7

Contribution: Implemented the complete machine learning analysis pipeline to identify and quantify the prognostic value of the SRD5A3‑AS1/hsa‑let‑7e‑5p/RRM2 axis in clear cell renal cell carcinoma; participated in validation using single‑cell and spatial transcriptomics analyses.

A multi-data fusion deep learning model for prognostic prediction in upper tract urothelial carcinoma

Journal: Frontiers in Oncology (August 2025)

Authors: Hongdi Sun, Siping Chen, Yongxing Bao, Fengyan You, Honghui Zhu, Xin Yao, Lianguo Chen, Jiangwei Miao, Fanggui Shao, Xiaomin Gao, Binwei Lin

DOI: 10.3389/fonc.2025.1644250

Contribution: Designed and implemented deep learning architectures for multi-phase CT analysis; integrated imaging features with clinical tabular data to build a comprehensive prognostic model; participated in model validation and optimization, providing key technical support for the publication.

Manuscripts in Preparation

YOLOv11-LCDFS: Enhanced Smoking Detection With Low-light Enhancement

In Revision

Authors: Jiatao Yan, Zhuzikai Zheng, Zhengtan Yang, Hao Jiang, Peichen Wang, Fangjun Kuang, Siyang Zhang

First author: working on a YOLO-based architecture with integrated low-light enhancement capabilities, specialized loss functions, attention mechanisms, and optimized upsampling techniques for improved detection in challenging lighting conditions.

Software Copyrights & Patent Application

Patent Application

Title: Smoking Behavior Recognition Camera and Determination Method

Application Number: 202310277784.1

Status: Withdrawn (the application was concluded because of academic credential requirements, not technical merit)

Inventors: Jiatao Yan, Siyang Zhang, Fangjun Kuang, Peichen Wang, Zhuzikai Zheng, Hao Jiang, Zhengtan Yang, Hanwen Bao, Chunqiu Xia

Summary: A method that combines object detection with pose estimation for real-time smoking behavior detection in public spaces.

Software Copyrights

Medical Image Computing Software (2022SR0252378)
Registered: April 2022
Human Skeleton Recognition Software (2022SR1258998)
Registered: October 2022
Cigarette Recognition Software (2022SR1277520)
Registered: October 2022
Smoking Behavior Detection Software (2022SR1277521)
Registered: October 2022

Academic Achievements & Awards

ECG Image Digitization

Bronze Medal | Ranked 97/1424 | Top 7% | Kaggle Global Competition | 2026.01.23

Stanford RNA 3D Folding Competition

Ranked 338/1516 | Top 22% | Kaggle Global Competition | 2025.09.25

Yale/UNC-CH - Geophysical Waveform Inversion

Ranked 255/1365 | Top 19% | Kaggle Global Competition | 2025.07.01

BYU - Locating Bacterial Flagellar Motors 2025

Ranked 315/1136 | Top 28% | Kaggle Global Competition | 2025.06.05

Predict Calorie Expenditure

Ranked 178/4316 | Top 5% | Kaggle Competition | 2025.06.01

Predict Podcast Listening Time

Ranked 116/3310 | Top 4% | Kaggle Competition | 2025.05.01

HuBMAP + HPA Competition

Ranked 441/1174 | Top 38% | Kaggle Global Competition | 2022.09.23

18th Challenge Cup College Student Competition

Bronze Medal | Zhejiang Province Level | May 2023

4th National "Chuanzhi Cup" IT Skills Competition

Provincial Excellent Award | Zhejiang Province | December 2021

2023 Wenzhou Computer Society Student Member Innovation and Entrepreneurship Award

Third Prize | Wenzhou | April 2024

Projects

Nuwax Platform: Agent Development (NDA)

Participated in developing multi-agent and task-specific skills for a privately deployed agent platform; packaged skills as standardized tools and configurations. Under NDA, business details are omitted.

Supported packaging the project into OpenClaw for direct execution
Designed input/output constraints and tool-calling boundaries to reduce format drift and invalid calls in multi-turn interactions
Built a reusable tool adapter layer (APIs/files/data queries) with authentication, timeout control, retries, and exception fallbacks

Technologies: Python, Agent Engineering, Tool Calling, Workflow Orchestration, Observability

Date: Feb 2026
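
The retry and fallback behavior mentioned for the tool adapter layer can be illustrated with a minimal sketch. This is not the project's code (which is under NDA): the function name `call_with_retries`, its parameters, and the exponential backoff policy are all assumptions chosen for illustration, and timeout control and authentication are left out.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(
    tool: Callable[[], T],
    fallback: Callable[[Exception], T],
    max_attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `tool`, retrying with exponential backoff; use `fallback` if every attempt fails.

    All names and defaults here are hypothetical, not taken from the project.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        try:
            return tool()
        except Exception as exc:  # a real adapter would catch narrower error types
            last_error = exc
            if attempt < max_attempts - 1:
                # Wait base_delay, 2 * base_delay, 4 * base_delay, ... between attempts.
                sleep(base_delay * 2 ** attempt)
    assert last_error is not None
    return fallback(last_error)
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays; a production adapter would more likely rely on an established retry library.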

Contract Review Agent Module (NDA)

Participated in building an internal contract review assistant module that converts contracts into structured clauses and produces risk notes with evidence references. Under NDA, organization, contract types, and data details are omitted.

Implemented contract parsing and clause indexing to produce structured outputs (clauses, original text locations, citation anchors) for traceable review
Integrated retrieval and tool calling to bind review conclusions with evidence spans (paragraph/page), generating a verifiable risk list
Packaged review task APIs and background processing (queue/retries) to support batch processing and resubmission after manual verification

Technologies: LLM Application Development, Document Parsing, RAG, Tool Calling, Multi-Agent Workflows

Date: Jan 2026

3D-Enhanced Scientific Image Forgery Detection (NDA)

Participated in exploring 3D-enhanced scientific image forgery detection by introducing pseudo-3D and physics-inspired features into pixel-level segmentation, with supporting feature caching and evaluation workflows. Under NDA, data and business details are omitted.

Implemented offline extraction and cache reuse for pseudo-3D and physics-inspired features to reduce repeated computation during training
Improved data preparation and mask alignment scripts to support batch preprocessing, patch-based training, and unified evaluation entrypoints
Generated inference visualizations and offline evaluation reports to compare the impact of different feature combinations

Technologies: PyTorch, Swin Transformer (Training), OpenCV, Feature Caching, Segmentation Evaluation & Reporting

Date: Dec 2025

Biomedical Scientific Image Copy-Move Forgery Detection & Segmentation (NDA)

Participated in building a scientific image integrity screening module that detects suspected copy-move operations and outputs pixel-level localization masks to assist reviewers. Under NDA, project source and data details are omitted.

Implemented high-resolution sliding-window inference and mask stitching to output overlays, masks, and key region coordinates
Implemented post-processing and risk grading strategies (thresholding, area ratio, etc.) to enable fast triage and ranking
Provided FastAPI inference endpoints and batch scripts for offline runs, result persistence, and report generation

Technologies: PyTorch, timm, OpenCV, FastAPI, ONNX/ONNXRuntime (Deployment), Sliding-Window Aggregation, Image Forensics

Date: Nov 2025
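
For readers unfamiliar with sliding-window inference, the tiling step and an area-ratio risk grade can be sketched as follows. This is an illustrative sketch only, not the project's implementation: the function names, the window and stride values in the usage note, and the 1% / 5% grading thresholds are assumptions.

```python
from typing import Iterator, List, Tuple


def window_origins(length: int, window: int, stride: int) -> List[int]:
    """Start offsets along one axis; the last window is shifted back to end at the image edge."""
    if length <= window:
        return [0]
    origins = list(range(0, length - window + 1, stride))
    if origins[-1] != length - window:
        origins.append(length - window)
    return origins


def tiles(height: int, width: int, window: int, stride: int) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (top, left, bottom, right) boxes that together cover the whole image."""
    for top in window_origins(height, window, stride):
        for left in window_origins(width, window, stride):
            yield top, left, min(top + window, height), min(left + window, width)


def risk_grade(mask_pixels: int, total_pixels: int, low: float = 0.01, high: float = 0.05) -> str:
    """Grade an image by the fraction of pixels flagged in the stitched mask (hypothetical thresholds)."""
    ratio = mask_pixels / total_pixels
    if ratio >= high:
        return "high"
    if ratio >= low:
        return "medium"
    return "low"
```

For example, `list(tiles(600, 512, 512, 256))` gives two overlapping tiles; predictions in the overlap would then be merged (for instance by averaging) when stitching the full-resolution mask.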

Scientific Content Review & Image Analysis Platform (NDA)

Participated in developing a platform to assist review of scientific PDFs and images, covering document parsing, text-image extraction, similarity screening, quality evaluation, and report export. Under NDA, clients and data sources are omitted.

Implemented an upload and parsing pipeline: normalized document formats, performed page-by-page parsing, extracted images/text, and generated traceable metadata (unique ID, page number, context text, etc.)
Implemented “evidence filtering → parallel analysis”: filtered non-evidence images via a hybrid vision/rule strategy and ran quality evaluation, image-text consistency checks, and anomaly trace detection in parallel
Implemented duplication and tampering risk detection: combined retrieval-based similarity, local matching, and deep model inference to output verifiable localization evidence and structured summaries (for frontend display/report export)

Technologies: Python, Document Parsing, Image Processing, Vector Retrieval/Similarity Search, Deep Learning Inference, Async Task Orchestration, Docker

Date: Oct 2025

Multi-Agent Document Workflow Platform (NDA)

Participated in developing a document processing and review platform covering upload, preview, OCR, information extraction, and export. Under NDA, organization, data, and details are omitted.

Implemented end-to-end task orchestration and state tracking for long-running document jobs, including progress reporting, retries, step-level reruns, and checkpoint resume
Implemented pluggable integration for document preview and parsing services via a unified adapter layer (request/response schemas, error codes, and timeout policies), enabling parser switching by scenario
Built a multi-role Agent/workflow execution framework based on LangChain, persisting key intermediate artifacts in structured form and recording auditable logs for review, tracing, and reproduction

Technologies: FastAPI, React/Vite, Docker Compose, LangChain, OCR, Multi-Agent Workflows, RAG, Pluggable Toolchain

Date: Sep 2025
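
The checkpoint-resume idea for long-running document jobs can be sketched in a few lines. The sketch below is an assumption-laden illustration rather than the platform's code: `run_with_checkpoint`, the JSON file layout, and the step names in the usage note are hypothetical.

```python
import json
from pathlib import Path
from typing import Callable, List, Tuple

# A step is a (name, function) pair; the function maps the job data to updated job data.
Step = Tuple[str, Callable[[dict], dict]]


def run_with_checkpoint(steps: List[Step], checkpoint: Path) -> dict:
    """Run steps in order, skipping any step already recorded in the checkpoint file."""
    state = {"done": [], "data": {}}
    if checkpoint.exists():
        state = json.loads(checkpoint.read_text(encoding="utf-8"))
    for name, step in steps:
        if name in state["done"]:
            continue  # finished in an earlier run
        state["data"] = step(state["data"])
        state["done"].append(name)
        # Persist after every step so a crash loses at most the step in progress.
        checkpoint.write_text(json.dumps(state), encoding="utf-8")
    return state["data"]
```

If a job with steps such as `("ocr", ...)` and `("extract", ...)` fails during extraction, calling the function again with the same checkpoint path reruns only the extraction step. The job data must be JSON-serializable for this scheme to work.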

YOLOv11-LCDFS: Enhanced Smoking Detection With Low-light Enhancement

Extension of smoking detection research focusing on improved object detection in challenging lighting conditions. Developing a YOLO-based architecture with specialized components addressing the unique challenges of low-light environments. GitHub

Designing custom loss functions specifically optimized for low-light object detection scenarios
Implementing attention mechanisms to focus on key visual features in varying illumination conditions
Optimizing upsampling techniques to preserve fine details in dark environments
Integrating lightweight low-light enhancement module into the detection pipeline

Technologies: PyTorch, YOLO, Computer Vision, CUDA, Attention Mechanisms

Status: Ongoing (April 2025)

Related publications: YOLOv11‑LCDFS: Enhanced Smoking Detection With Low‑light Enhancement (manuscript in preparation)

Multi-modal Medical Image Analysis for Cancer Research

Developing medical image analysis systems for cancer research. GitHub

Designing medical image segmentation algorithms for hepatocellular carcinoma and renal cell carcinoma
Designing methods for integrating clinical tabular data with imaging features for comprehensive analysis
Implementing multi-modal fusion techniques for combining different CT scan phases
Developing 3D volumetric segmentation approaches for comprehensive anatomical analysis

Technologies: Python, Deep Learning, 3D Segmentation, Multi-modal Fusion, PyDicom, NumPy

Status: Ongoing (April 2025)

Related publications: Multi‑modal medical image segmentation and fusion (manuscript in preparation)

Disease Detection Using Deep Learning

Developed deep learning models for automated disease detection and classification from medical imaging data, with a focus on diabetic foot ulcer detection. GitHub

Implemented and compared multiple YOLOv8 architecture variants enhanced with advanced attention mechanisms (GAM, CBAM, ECA, CoordAtt) for precise localization of diabetic wounds in clinical images
Designed and evaluated custom YOLOv8 architectures integrating novel upsampling techniques including CARAFE and DySample for improved feature map resolution
Developed innovative triplet-based loss functions and Inner-CIoU mechanisms to enhance detection accuracy for wounds of varying sizes and appearances
Employed dynamic convolution techniques to adaptively capture wound features across diverse clinical settings and lighting conditions
Constructed and curated specialized datasets of diabetic foot ulcers for training and comprehensive validation
Achieved significant improvements in both detection accuracy and inference speed compared to baseline models, with particular gains for small and atypical lesions
Implemented model explainability techniques to visualize feature importance and attention maps for clinical interpretation

Technologies: PyTorch, YOLO, Attention Mechanisms (GAM/CBAM/ECA), Custom Loss Functions, Feature Upsampling Techniques, Dynamic Convolutions

Date: August 2024

Twitter Quality and Spam Detection System

Developed a machine learning system for Twitter content quality assessment and spam detection that distinguishes legitimate tweets from spam with over 88% accuracy. GitHub

Implemented comprehensive data preprocessing techniques for Twitter data, including text normalization, feature extraction, and handling of missing values
Engineered complex features by combining user metrics (follower count, following count) and behavioral patterns to enhance classification performance
Applied sentiment analysis and content analysis techniques to identify quality patterns in tweets
Developed and compared multiple machine learning models to optimize classification accuracy
Created data visualizations to communicate findings and identify key patterns in Twitter content quality assessment

Technologies: Python, NLTK, scikit-learn, Pandas, Matplotlib, Seaborn

Date: August 2024

Airbnb Price Analysis and Prediction System

Developed a comprehensive data analysis and machine learning system to predict Airbnb listing prices in New York City based on various property features and location data. GitHub

Performed extensive exploratory data analysis on NYC Airbnb dataset with over 48,000 listings, creating visualizations to reveal pricing patterns across neighborhoods
Implemented geospatial analysis to visualize property distribution and identify high-value areas using Python GIS libraries
Engineered relevant features by transforming categorical variables and creating new metrics to better capture pricing factors
Developed a random forest regression model with tuned hyperparameters to predict continuous listing prices
Created a classification model to categorize listings into price brackets, achieving high accuracy through model tuning with RandomizedSearchCV
Built interactive visualizations to help hosts and travelers understand pricing determinants in the NYC short-term rental market

Technologies: Python, Pandas, Scikit-learn, GeoPandas, Matplotlib, Seaborn, RandomForest

Date: January 2024

Airline Sentiment Analysis System

Learning to analyze airline customer sentiment from Twitter data using natural language processing techniques. GitHub

Studying text preprocessing methods for social media data (handling hashtags, mentions, emojis, URLs)
Learning about different feature extraction approaches including TF-IDF vectorization and N-gram analysis
Working with various classification models and comparing their performance
Practicing error analysis to understand patterns in misclassified tweets
Creating visualizations using Matplotlib, Seaborn, and Plotly to understand sentiment trends

Technologies: Python, NLTK, spaCy, scikit-learn, Pandas, Matplotlib, Seaborn, Plotly, WordCloud

Date: December 2023

Smoking Behavior Detection System

Undergraduate thesis project combining YOLO object detection with MediaPipe skeletal tracking to accurately identify and classify smoking gestures in video streams. GitHub

Utilized YOLO object detection to identify cigarettes and related objects in video footage
Implemented MediaPipe for real-time skeletal tracking and pose estimation
Designed algorithms to recognize characteristic smoking hand-to-mouth gesture patterns

Technologies: YOLO, MediaPipe, Pose Estimation, Action Recognition, PyTorch, OpenCV

Date: April 2023

Outcomes: Patent application (202310277784.1), 3 software copyrights, thesis received outstanding evaluation
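
One simple way to recognize a hand-to-mouth pattern from pose keypoints is sketched below. It is a hypothetical heuristic for illustration, not the algorithm used in the thesis or the patent application: the keypoint inputs, the 0.5 distance ratio, and the 3-frame minimum are assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in normalized image coordinates


def hand_near_mouth(wrist: Point, mouth: Point, shoulder_width: float, ratio: float = 0.5) -> bool:
    """True when the wrist is within `ratio` shoulder widths of the mouth.

    Scaling by shoulder width makes the test roughly independent of camera distance.
    """
    return math.dist(wrist, mouth) < ratio * shoulder_width


def count_gestures(near_flags: List[bool], min_frames: int = 3) -> int:
    """Count runs of at least `min_frames` consecutive frames with the hand near the mouth."""
    count = 0
    run = 0
    for flag in near_flags:
        run = run + 1 if flag else 0
        if run == min_frames:  # count each run once, when it first reaches the minimum length
            count += 1
    return count
```

In a full pipeline the per-frame flags would come from pose estimation output, and a gesture would only be treated as smoking when a cigarette detection overlaps the hand region.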

Heart Disease Prediction System

Learning to analyze health indicators for heart disease prediction using machine learning approaches. GitHub

Working with health metrics from the CDC's BRFSS survey dataset
Learning about feature selection and preprocessing for health indicators
Studying both traditional machine learning models and deep learning approaches
Practicing model evaluation using cross-validation and performance metrics
Creating visualizations to understand relationships between health factors

Technologies: Python, TensorFlow 2.11.0, scikit-learn, Pandas, NumPy, Matplotlib, Seaborn

Date: May 2022

Research Interests

Medical Artificial Intelligence

Learning deep learning methods for medical image analysis and disease diagnosis
Multi-modal integration and feature extraction for clinical data
Exploring computer-aided diagnosis systems for real-world clinical use

Embodied Intelligence

Studying how robots may acquire intelligence through physical interaction with the world
Exploring the connection between perception and action in embodied agents
Learning reinforcement learning for robotics applications

Computer Vision

Attention-based object detection architectures
Human pose estimation and behavior recognition
Visual feature extraction for real-world applications

Multi-Agent Systems

Understanding how multiple agents collaborate
Curious about emergent behaviors and collective intelligence
Learning the foundations of multi-agent AI systems

Technical Skills

Programming Languages

Python
C, Java, SQL
C#, JavaScript, Vue

AI & Machine Learning

Frameworks: PyTorch, TensorFlow
Areas: Computer Vision, Deep Learning, Reinforcement Learning
Techniques: CNNs, Attention Mechanisms, Transfer Learning

Development Tools

Version Control: Git, GitHub
Documentation: LaTeX, Markdown
Environment: Linux, Jupyter, Docker

Data Analysis

Libraries: NumPy, Pandas, SciPy
Visualization: Matplotlib, Seaborn, Plotly

Languages

Chinese: Native speaker
English: IELTS 6.0 (L 6.5 / R 6.5 / W 5.5 / S 6.0); CET‑6 467

Certifications

DevCloud Summer Training Camp (Huawei)
Python for AI Development (Shandong University)
Deep Learning Fundamentals (Shandong University)

References

Professional and academic references can be provided upon request.