Curriculum Vitae

Postgraduate Student with Computer Vision and Medical AI Experience | Learning Embodied Intelligence and Neuroscience


Education

M.S. in Artificial Intelligence and Adaptive Systems

Sussex Artificial Intelligence Institute, Zhejiang Gongshang University
Hangzhou, China | Sep 2024 - Jan 2026

  • Core Courses: Intelligence in Animals and Machines, Intelligent Systems Techniques, Image Processing, Natural Language Processing, Machine Learning
  • Supervisors: Assistant Professor Temitayo Olugbade (University of Sussex), Assistant Professor Peter Wijeratne (University of Sussex), and Professor Xie Mande (Zhejiang Gongshang University)
  • Research Direction: Integrating physics models into a VAE framework to enhance latent-space interpretability.
  • Overall Average: 79
  • Thesis Title: Disentangling physics, anatomical, time, and identity information in latent variables of medical images (interpretable representation learning for Alzheimer's disease-related medical imaging)

B.S. in Computer Science and Technology

Wenzhou Business College
Wenzhou, China | 2019-2023

  • GPA: 3.41/5.0 (84.7/100)
  • Thesis: "Smoking behavior detection based on deep learning and skeletal framework"
  • Relevant Coursework: Data Structures and Algorithms, Python Programming, Data Analysis, Machine Learning
  • Thesis Advisor: Prof. Fangjun Kuang
  • Leadership: Student union staff and club president during undergraduate study; class committee and Party branch committee member during graduate study

Projects & Internship Experience

Research Intern

Wenzhou Medical University First Affiliated Hospital - Hepato-Pancreato-Biliary Surgery Laboratory
Wenzhou, China | Sep 2022 - Jan 2023

  • Assisted in developing medical image preprocessing software for clinical applications, resulting in a software copyright registration (2022SR0252378)
  • Contributed to deep learning models for leukemia diagnosis based on tongue image analysis
  • Helped create machine learning algorithms for exosome feature analysis in hepatocellular carcinoma research, contributing to a paper published in Frontiers in Cell and Developmental Biology
  • Technologies: PyTorch, TensorFlow, OpenCV

Student Team Leader, National Innovation Training Project

Wenzhou Business College
June 2022 - June 2023

  • Led a team of 4 students in exploring attention mechanisms for enhancing YOLO models
  • Implemented and tested self-attention module modifications, exploring methods to improve detection accuracy
  • Contributed to securing 3 software copyrights and 1 patent application based on project outcomes
  • Technologies: PyTorch, YOLO, Computer Vision, OpenCV
  • Advisor: Prof. Siyang Zhang

Intern

Zhejiang University Urban-Rural Planning & Design Institute Co., Ltd.
Hangzhou, China | Sep 2025 - Dec 2025

  • Worked on a multi-role agent workflow project for document processing and review (under NDA)
  • Delivered a system with document preview, OCR parsing services, and Docker Compose deployment
  • Introduced MCP tools and a pluggable toolchain to support multi-agent collaboration and observability; supported user interruption and checkpoint resume

Enterprise Collaboration Project

Industry partner
2025

  • Migrated a CUDA-based YOLOv8 training/inference pipeline to 8× Huawei Ascend 910B NPUs
  • Completed CANN/ACL adaptation and HCCL multi-card training; fixed operator discrepancies and aligned accuracy
  • Optimized data pipelines and graph-mode execution while keeping mAP near the baseline; supported containerized deployment

Publications

Machine Learning Identifies Exosome Features Related to Hepatocellular Carcinoma

Journal: Frontiers in Cell and Developmental Biology (September 2022)

Authors: Kai Zhu, Qiqi Tao, Jiatao Yan, Zhichao Lang, Xinmiao Li, Yifei Li, Congcong Fan, Zhengping Yu

DOI: 10.3389/fcell.2022.1020415

Impact Factor: 5.8

Co-first author (third position): Designed and implemented the ML analysis pipeline. Compared multiple algorithms (Random Forest, SVM‑RFE, LASSO) to identify and validate high‑value exosome biomarkers from high‑dimensional proteomics data.

Multi-omics and Machine Learning-driven CD8+ T Cell Heterogeneity Score for Prognosis

Journal: Molecular Therapy Nucleic Acids (December 2024)

Authors: Di He, Zhan Yang, Tian Zhang, Yaxian Luo, Lianjie Peng, Jiatao Yan, Tao Qiu, Jingyu Zhang, Luying Qin, Zhichao Liu, Xiaoting Zhang, Lining Lin, Mouyuan Sun

DOI: 10.1016/j.omtn.2024.102413

Impact Factor: 6.4

Contribution: Provided machine learning support by implementing multiple methods (including LASSO regression) to identify key prognostic genes from multi‑omics data, providing computational support and feature inputs for score construction.

Using Multiomics and Machine Learning: Insights into Improving the Outcomes of Clear Cell Renal Cell Carcinoma via the SRD5A3-AS1/hsa-let-7e-5p/RRM2 Axis

Journal: ACS Omega (June 2025)

Authors: Mouyuan Sun, Zhan Yang, Yaxian Luo, Luying Qin, Lianjie Peng, Chaoran Pan, Jiatao Yan, Tao Qiu, Yan Zhang

DOI: 10.1021/acsomega.5c01337

Impact Factor: 3.7

Contribution: Implemented the complete machine learning analysis pipeline to identify and quantify the prognostic value of the SRD5A3‑AS1/hsa‑let‑7e‑5p/RRM2 axis in clear cell renal cell carcinoma; participated in validation using single‑cell and spatial transcriptomics analyses.

A multi-data fusion deep learning model for prognostic prediction in upper tract urothelial carcinoma

Journal: Frontiers in Oncology (August 2025)

Authors: Hongdi Sun, Siping Chen, Yongxing Bao, Fengyan You, Honghui Zhu, Xin Yao, Lianguo Chen, Jiangwei Miao, Fanggui Shao, Xiaomin Gao, Binwei Lin

DOI: 10.3389/fonc.2025.1644250

Contribution: Designed and implemented deep learning architectures for multi-phase CT analysis; integrated imaging features with clinical tabular data to build a comprehensive prognostic model; participated in model validation and optimization, providing key technical support for the publication.


Manuscripts in Preparation

YOLOv11-LCDFS: Enhanced Smoking Detection With Low-light Enhancement

Status: In Revision

Authors: Jiatao Yan, Zhuzikai Zheng, Zhengtan Yang, Hao Jiang, Peichen Wang, Fangjun Kuang, Siyang Zhang

First author: working on a YOLO-based architecture with integrated low-light enhancement capabilities, specialized loss functions, attention mechanisms, and optimized upsampling techniques for improved detection in challenging lighting conditions.


Software Copyrights & Patent Application

Patent Application

Title: Smoking Behavior Recognition Camera and Determination Method

Application Number: 202310277784.1

Status: Withdrawn. The application was withdrawn because of academic credential requirements, not technical merit.

Inventors: Jiatao Yan, Siyang Zhang, Fangjun Kuang, Peichen Wang, Zhuzikai Zheng, Hao Jiang, Zhengtan Yang, Hanwen Bao, Chunqiu Xia

Summary: A pose-estimation-based method for real-time detection of smoking behavior in public spaces.

Software Copyrights

  • Medical Image Computing Software (2022SR0252378)
    Registered: April 2022
  • Human Skeleton Recognition Software (2022SR1258998)
    Registered: October 2022
  • Cigarette Recognition Software (2022SR1277520)
    Registered: October 2022
  • Smoking Behavior Detection Software (2022SR1277521)
    Registered: October 2022

Academic Achievements & Awards

ECG Image Digitization

Bronze Medal | Ranked 97/1424 | Top 7% | Kaggle Global Competition | 2026.01.23

Stanford RNA 3D Folding Competition

Ranked 338/1516 | Top 22% | Kaggle Global Competition | 2025.09.25

Yale/UNC-CH - Geophysical Waveform Inversion

Ranked 255/1365 | Top 19% | Kaggle Global Competition | 2025.07.01

BYU - Locating Bacterial Flagellar Motors 2025

Ranked 315/1136 | Top 28% | Kaggle Global Competition | 2025.06.05

Predict Calorie Expenditure

Ranked 178/4316 | Top 5% | Kaggle Competition | 2025.06.01

Predict Podcast Listening Time

Ranked 116/3310 | Top 4% | Kaggle Competition | 2025.05.01

HuBMAP + HPA Competition

Ranked 441/1174 | Top 38% | Kaggle Global Competition | 2022.09.23

18th Challenge Cup College Student Competition

Bronze Medal | Zhejiang Province Level | May 2023

4th National "Chuanzhi Cup" IT Skills Competition

Provincial Excellent Award | Zhejiang Province | December 2021

2023 Wenzhou Computer Society Student Member Innovation and Entrepreneurship Award

Third Prize | Wenzhou | April 2024


Projects

Nuwax Platform: Agent Development (NDA)

Participated in developing multi-agent and task-specific skills for a privately deployed agent platform; packaged skills as standardized tools and configurations. Under NDA, business details are omitted.

  • Supported packaging the project into OpenClaw for direct execution
  • Designed input/output constraints and tool-calling boundaries to reduce format drift and invalid calls in multi-turn interactions
  • Built a reusable tool adapter layer (APIs/files/data queries) with authentication, timeout control, retries, and exception fallbacks

Technologies: Python, Agent Engineering, Tool Calling, Workflow Orchestration, Observability

Date: Feb 2026

Contract Review Agent Module (NDA)

Participated in building an internal contract review assistant module that converts contracts into structured clauses and produces risk notes with evidence references. Under NDA, organization, contract types, and data details are omitted.

  • Implemented contract parsing and clause indexing to produce structured outputs (clauses, original text locations, citation anchors) for traceable review
  • Integrated retrieval and tool calling to bind review conclusions with evidence spans (paragraph/page), generating a verifiable risk list
  • Packaged review task APIs and background processing (queue/retries) to support batch processing and resubmission after manual verification

Technologies: LLM Application Development, Document Parsing, RAG, Tool Calling, Multi-Agent Workflows

Date: Jan 2026

3D-Enhanced Scientific Image Forgery Detection (NDA)

Participated in exploring 3D-enhanced scientific image forgery detection by introducing pseudo-3D and physics-inspired features into pixel-level segmentation, with supporting feature caching and evaluation workflows. Under NDA, data and business details are omitted.

  • Implemented offline extraction and cache reuse for pseudo-3D and physics-inspired features to reduce repeated computation during training
  • Improved data preparation and mask alignment scripts to support batch preprocessing, patch-based training, and unified evaluation entrypoints
  • Generated inference visualizations and offline evaluation reports to compare the impact of different feature combinations

Technologies: PyTorch, Swin Transformer (Training), OpenCV, Feature Caching, Segmentation Evaluation & Reporting

Date: Dec 2025

Biomedical Scientific Image Copy-Move Forgery Detection & Segmentation (NDA)

Participated in building a scientific image integrity screening module that detects suspected copy-move operations and outputs pixel-level localization masks to assist reviewers. Under NDA, project source and data details are omitted.

  • Implemented high-resolution sliding-window inference and mask stitching to output overlays, masks, and key region coordinates
  • Implemented post-processing and risk grading strategies (thresholding, area ratio, etc.) to enable fast triage and ranking
  • Provided FastAPI inference endpoints and batch scripts for offline runs, result persistence, and report generation

Technologies: PyTorch, timm, OpenCV, FastAPI, ONNX/ONNXRuntime (Deployment), Sliding-Window Aggregation, Image Forensics

Date: Nov 2025

Scientific Content Review & Image Analysis Platform (NDA)

Participated in developing a platform to assist review of scientific PDFs and images, covering document parsing, text-image extraction, similarity screening, quality evaluation, and report export. Under NDA, clients and data sources are omitted.

  • Implemented an upload and parsing pipeline: normalized document formats, performed page-by-page parsing, extracted images/text, and generated traceable metadata (unique ID, page number, context text, etc.)
  • Implemented “evidence filtering → parallel analysis”: filtered non-evidence images via a hybrid vision/rule strategy and ran quality evaluation, image-text consistency checks, and anomaly trace detection in parallel
  • Implemented duplication and tampering risk detection: combined retrieval-based similarity, local matching, and deep model inference to output verifiable localization evidence and structured summaries (for frontend display/report export)

Technologies: Python, Document Parsing, Image Processing, Vector Retrieval/Similarity Search, Deep Learning Inference, Async Task Orchestration, Docker

Date: Oct 2025

Multi-Agent Document Workflow Platform (NDA)

Participated in developing a document processing and review platform covering upload, preview, OCR, information extraction, and export. Under NDA, organization, data, and business details are omitted.

  • Implemented end-to-end task orchestration and state tracking for long-running document jobs, including progress reporting, retries, step-level reruns, and checkpoint resume
  • Implemented pluggable integration for document preview and parsing services via a unified adapter layer (request/response schemas, error codes, and timeout policies), enabling parser switching by scenario
  • Built a multi-role Agent/workflow execution framework based on LangChain, persisting key intermediate artifacts in structured form and recording auditable logs for review, tracing, and reproduction

Technologies: FastAPI, React/Vite, Docker Compose, LangChain, OCR, Multi-Agent Workflows, RAG, Pluggable Toolchain

Date: Sep 2025

YOLOv11-LCDFS: Enhanced Smoking Detection With Low-light Enhancement

Extension of smoking detection research focusing on improved object detection in challenging lighting conditions. Developing a YOLO-based architecture with specialized components addressing the unique challenges of low-light environments. GitHub

  • Designing custom loss functions specifically optimized for low-light object detection scenarios
  • Implementing attention mechanisms to focus on key visual features in varying illumination conditions
  • Optimizing upsampling techniques to preserve fine details in dark environments
  • Integrating a lightweight low-light enhancement module into the detection pipeline

Technologies: PyTorch, YOLO, Computer Vision, CUDA, Attention Mechanisms

Status: Ongoing (April 2025)

Related publications: YOLOv11‑LCDFS: Enhanced Smoking Detection With Low‑light Enhancement (manuscript in preparation)

Multi-modal Medical Image Analysis for Cancer Research

Developing medical image analysis systems for cancer research. GitHub

  • Designing medical image segmentation algorithms for hepatocellular carcinoma and renal cell carcinoma
  • Designing methods for integrating clinical tabular data with imaging features for comprehensive analysis
  • Implementing multi-modal fusion techniques for combining different CT scan phases
  • Developing 3D volumetric segmentation approaches for comprehensive anatomical analysis

Technologies: Python, Deep Learning, 3D Segmentation, Multi-modal Fusion, PyDicom, NumPy

Status: Ongoing (April 2025)

Related publications: Multi‑modal medical image segmentation and fusion (manuscript in preparation)

Disease Detection Using Deep Learning

Developed deep learning models for automated disease detection and classification from medical imaging data, with a focus on diabetic foot ulcer detection. GitHub

  • Implemented and compared multiple YOLOv8 variants with attention mechanisms (GAM, CBAM, ECA, CoordAtt) for localizing diabetic wounds in clinical images
  • Designed and evaluated custom YOLOv8 architectures with CARAFE and DySample upsampling for improved feature map resolution
  • Developed triplet-based loss functions and Inner-CIoU mechanisms to improve detection accuracy for wounds of varying sizes and appearances
  • Employed dynamic convolution to adaptively capture wound features across diverse clinical settings and lighting conditions
  • Constructed and curated datasets of diabetic foot ulcers for training and validation
  • Improved detection accuracy and inference speed over baseline models, particularly for small and atypical lesions
  • Implemented model explainability techniques to visualize feature importance and attention maps for clinical interpretation

Technologies: PyTorch, YOLO, Attention Mechanisms (GAM/CBAM/ECA), Custom Loss Functions, Feature Upsampling Techniques, Dynamic Convolutions

Date: August 2024

Twitter Quality and Spam Detection System

Developed a machine learning system for Twitter content quality assessment and spam detection, achieving over 88% accuracy in distinguishing legitimate tweets from spam. GitHub

  • Implemented comprehensive data preprocessing techniques for Twitter data, including text normalization, feature extraction, and handling of missing values
  • Engineered complex features by combining user metrics (follower count, following count) and behavioral patterns to enhance classification performance
  • Applied sentiment analysis and content analysis techniques to identify quality patterns in tweets
  • Developed and compared multiple machine learning models to optimize classification accuracy
  • Created data visualizations to communicate findings and identify key patterns in Twitter content quality assessment

Technologies: Python, NLTK, scikit-learn, Pandas, Matplotlib, Seaborn

Date: August 2024

Airbnb Price Analysis and Prediction System

Developed a comprehensive data analysis and machine learning system to predict Airbnb listing prices in New York City based on various property features and location data. GitHub

  • Performed extensive exploratory data analysis on NYC Airbnb dataset with over 48,000 listings, creating visualizations to reveal pricing patterns across neighborhoods
  • Implemented geospatial analysis to visualize property distribution and identify high-value areas using Python GIS libraries
  • Engineered relevant features by transforming categorical variables and creating new metrics to better capture pricing factors
  • Developed a RandomForest regression model to predict continuous listing prices with optimized hyperparameters
  • Created a classification model to categorize listings into price brackets, achieving high accuracy through model tuning with RandomizedSearchCV
  • Built interactive visualizations to help hosts and travelers understand pricing determinants in the NYC short-term rental market

Technologies: Python, Pandas, Scikit-learn, GeoPandas, Matplotlib, Seaborn, RandomForest

Date: January 2024

Airline Sentiment Analysis System

Learning project analyzing airline customer sentiment from Twitter data using natural language processing techniques. GitHub

  • Studied text preprocessing methods for social media data (handling hashtags, mentions, emojis, URLs)
  • Explored feature extraction approaches including TF-IDF vectorization and N-gram analysis
  • Trained several classification models and compared their performance
  • Practiced error analysis to understand patterns in misclassified tweets
  • Created visualizations using Matplotlib, Seaborn, and Plotly to understand sentiment trends

Technologies: Python, NLTK, spaCy, scikit-learn, Pandas, Matplotlib, Seaborn, Plotly, WordCloud

Date: December 2023

Smoking Behavior Detection System

Undergraduate thesis project combining YOLO object detection with MediaPipe skeletal tracking to accurately identify and classify smoking gestures in video streams. GitHub

  • Utilized YOLO object detection to identify cigarettes and related objects in video footage
  • Implemented MediaPipe for real-time skeletal tracking and pose estimation
  • Designed algorithms to recognize characteristic smoking hand-to-mouth gesture patterns

Technologies: YOLO, MediaPipe, Pose Estimation, Action Recognition, PyTorch, OpenCV

Date: April 2023

Outcomes: Patent application (202310277784.1), 3 software copyrights, thesis received outstanding evaluation

Heart Disease Prediction System

Learning project analyzing health indicators for heart disease prediction using machine learning approaches. GitHub

  • Worked with health metrics from the CDC's BRFSS survey dataset
  • Studied feature selection and preprocessing for health indicators
  • Explored both traditional machine learning models and deep learning approaches
  • Practiced model evaluation using cross-validation and performance metrics
  • Created visualizations to understand relationships between health factors

Technologies: Python, TensorFlow 2.11.0, scikit-learn, Pandas, NumPy, Matplotlib, Seaborn

Date: May 2022


Research Interests

Medical Artificial Intelligence

  • Learning deep learning methods for medical image analysis and disease diagnosis
  • Multi-modal integration and feature extraction for clinical data
  • Exploring computer-aided diagnosis systems for real-world clinical use

Embodied Intelligence

  • Studying how robots may acquire intelligence through physical interaction with the world
  • Exploring the connection between perception and action in embodied agents
  • Learning reinforcement learning for robotics applications

Computer Vision

  • Attention-based object detection architectures
  • Human pose estimation and behavior recognition
  • Visual feature extraction for real-world applications

Multi-Agent Systems

  • Understanding how multiple agents collaborate
  • Exploring emergent behaviors and collective intelligence
  • Learning the foundations of multi-agent AI systems

Technical Skills

Programming Languages

  • Python
  • C, Java, SQL
  • C#, JavaScript, Vue

AI & Machine Learning

  • Frameworks: PyTorch, TensorFlow
  • Areas: Computer Vision, Deep Learning, Reinforcement Learning
  • Techniques: CNNs, Attention Mechanisms, Transfer Learning

Development Tools

  • Version Control: Git, GitHub
  • Documentation: LaTeX, Markdown
  • Environment: Linux, Jupyter, Docker

Data Analysis

  • Libraries: NumPy, Pandas, SciPy
  • Visualization: Matplotlib, Seaborn, Plotly

Languages

  • Chinese: Native speaker
  • English: IELTS 6.0 (L 6.5 / R 6.5 / W 5.5 / S 6.0); CET‑6 467

Certifications

  • DevCloud Summer Training Camp (Huawei)
  • Python for AI Development (Shandong University)
  • Deep Learning Fundamentals (Shandong University)

References

Professional and academic references can be provided upon request.