Curriculum Vitae
Postgraduate Student with Computer Vision and Medical AI Experience | Learning Embodied Intelligence and Neuroscience
Education
M.S. in Artificial Intelligence and Adaptive Systems
Sussex Artificial Intelligence Institute, Zhejiang Gongshang University
Hangzhou, China | Sep 2024 - Jan 2026
- Core Courses: Intelligence in Animals and Machines, Intelligent Systems Techniques, Image Processing, Natural Language Processing, Machine Learning
- Supervisors: Assistant Professor Temitayo Olugbade (University of Sussex), Assistant Professor Peter Wijeratne (University of Sussex), and Professor Xie Mande (Zhejiang Gongshang University)
- Research Direction: Integrating physics models into a VAE framework to enhance latent-space interpretability.
- Overall Average: 79
- Thesis Title: Disentangling physics, anatomical, time, and identity information in latent variables of medical images (interpretable representation learning for Alzheimer's disease-related medical imaging)
B.S. in Computer Science and Technology
Wenzhou Business College
Wenzhou, China | 2019-2023
- GPA: 3.41/5.0 (84.7/100)
- Thesis: "Smoking behavior detection based on deep learning and skeletal framework"
- Relevant Coursework: Data Structures and Algorithms, Python Programming, Data Analysis, Machine Learning
- Thesis Advisor: Prof. Fangjun Kuang
- Leadership: Student union staff and club president during undergraduate study; class committee and Party branch committee member during graduate study
Projects & Internship Experience
Research Intern
Wenzhou Medical University First Affiliated Hospital - Hepato-Pancreato-Biliary Surgery Laboratory
Wenzhou, China | Sep 2022 - Jan 2023
- Assisted in developing medical image preprocessing software for clinical applications, resulting in software copyright registration (2022SR0252378)
- Contributed to deep learning models for leukemia diagnosis based on tongue image analysis
- Helped create machine learning algorithms for exosome feature analysis in hepatocellular carcinoma research, contributing to a paper published in Frontiers in Cell and Developmental Biology
- Technologies: PyTorch, TensorFlow, OpenCV
Student Team Leader, National Innovation Training Project
Wenzhou Business College
June 2022 - June 2023
- Led a team of 4 students in exploring attention mechanisms for enhancing YOLO models
- Implemented and tested self-attention module modifications, exploring methods to improve detection accuracy
- Contributed to securing 3 software copyrights and 1 patent application based on project outcomes
- Technologies: PyTorch, YOLO, Computer Vision, OpenCV
- Advisor: Prof. Siyang Zhang
Intern
Zhejiang University Urban-Rural Planning & Design Institute Co., Ltd.
Hangzhou, China | Sep 2025 - Dec 2025
- Worked on a multi-role agent workflow project for document processing and review (under NDA)
- Delivered a system with document preview, OCR parsing services, and Docker Compose deployment
- Introduced MCP tools and a pluggable toolchain to support multi-agent collaboration and observability; supported user interruption and checkpoint resume
Enterprise Collaboration Project
Industry partner
2025
- Migrated a CUDA-based YOLOv8 training/inference pipeline to 8× Huawei Ascend 910B NPUs
- Completed CANN/ACL adaptation and HCCL multi-card training; resolved operator discrepancies and aligned accuracy
- Optimized data pipelines and graph-mode execution while keeping mAP near the baseline; supported containerized deployment
Publications
Machine Learning Identifies Exosome Features Related to Hepatocellular Carcinoma
Journal: Frontiers in Cell and Developmental Biology (September 2022)
Authors: Kai Zhu, Qiqi Tao, Jiatao Yan, Zhichao Lang, Xinmiao Li, Yifei Li, Congcong Fan, Zhengping Yu
DOI: 10.3389/fcell.2022.1020415
Impact Factor: 5.8
Co-first author (third position): Designed and implemented the ML analysis pipeline. Compared multiple algorithms (Random Forest, SVM‑RFE, LASSO) to identify and validate high‑value exosome biomarkers from high‑dimensional proteomics data.
Multi-omics and Machine Learning-driven CD8+ T Cell Heterogeneity Score for Prognosis
Journal: Molecular Therapy Nucleic Acids (December 2024)
Authors: Di He, Zhan Yang, Tian Zhang, Yaxian Luo, Lianjie Peng, Jiatao Yan, Tao Qiu, Jingyu Zhang, Luying Qin, Zhichao Liu, Xiaoting Zhang, Lining Lin, Mouyuan Sun
DOI: 10.1016/j.omtn.2024.102413
Impact Factor: 6.4
Contribution: Provided machine learning support by implementing multiple methods (including LASSO regression) to identify key prognostic genes from multi-omics data, supplying feature inputs for score construction.
Using Multiomics and Machine Learning: Insights into Improving the Outcomes of Clear Cell Renal Cell Carcinoma via the SRD5A3-AS1/hsa-let-7e-5p/RRM2 Axis
Journal: ACS Omega (June 2025)
Authors: Mouyuan Sun, Zhan Yang, Yaxian Luo, Luying Qin, Lianjie Peng, Chaoran Pan, Jiatao Yan, Tao Qiu, Yan Zhang
Impact Factor: 3.7
Contribution: Implemented the complete machine learning analysis pipeline to identify and quantify the prognostic value of the SRD5A3‑AS1/hsa‑let‑7e‑5p/RRM2 axis in clear cell renal cell carcinoma; participated in validation using single‑cell and spatial transcriptomics analyses.
A multi-data fusion deep learning model for prognostic prediction in upper tract urothelial carcinoma
Journal: Frontiers in Oncology (August 2025)
Authors: Hongdi Sun, Siping Chen, Yongxing Bao, Fengyan You, Honghui Zhu, Xin Yao, Lianguo Chen, Jiangwei Miao, Fanggui Shao, Xiaomin Gao, Binwei Lin
DOI: 10.3389/fonc.2025.1644250
Contribution: Designed and implemented deep learning architectures for multi-phase CT analysis; integrated imaging features with clinical tabular data to build a comprehensive prognostic model; participated in model validation and optimization, providing key technical support for the publication.
Manuscripts in Preparation
YOLOv11-LCDFS: Enhanced Smoking Detection With Low-light Enhancement
In Revision
Authors: Jiatao Yan, Zhuzikai Zheng, Zhengtan Yang, Hao Jiang, Peichen Wang, Fangjun Kuang, Siyang Zhang
First author: working on a YOLO-based architecture with integrated low-light enhancement capabilities, specialized loss functions, attention mechanisms, and optimized upsampling techniques for improved detection in challenging lighting conditions.
Software Copyrights & Patent Application
Patent Application
Title: Smoking Behavior Recognition Camera and Determination Method
Application Number: 202310277784.1
Status: Withdrawn; the application was discontinued because of academic credential requirements, not technical merit
Inventors: Jiatao Yan, Siyang Zhang, Fangjun Kuang, Peichen Wang, Zhuzikai Zheng, Hao Jiang, Zhengtan Yang, Hanwen Bao, Chunqiu Xia
Summary: A method combining object detection and pose estimation for real-time smoking behavior detection in public spaces.
Software Copyrights
- Medical Image Computing Software (2022SR0252378)
Registered: April 2022
- Human Skeleton Recognition Software (2022SR1258998)
Registered: October 2022
- Cigarette Recognition Software (2022SR1277520)
Registered: October 2022
- Smoking Behavior Detection Software (2022SR1277521)
Registered: October 2022
Academic Achievements & Awards
ECG Image Digitization
Bronze Medal | Ranked 97/1424 | Top 7% | Kaggle Global Competition | 2026.01.23
Stanford RNA 3D Folding Competition
Ranked 338/1516 | Top 22% | Kaggle Global Competition | 2025.09.25
Yale/UNC-CH - Geophysical Waveform Inversion
Ranked 255/1365 | Top 19% | Kaggle Global Competition | 2025.07.01
BYU - Locating Bacterial Flagellar Motors 2025
Ranked 315/1136 | Top 28% | Kaggle Global Competition | 2025.06.05
Predict Calorie Expenditure
Ranked 178/4316 | Top 5% | Kaggle Competition | 2025.06.01
Predict Podcast Listening Time
Ranked 116/3310 | Top 4% | Kaggle Competition | 2025.05.01
HuBMAP + HPA Competition
Ranked 441/1174 | Top 38% | Kaggle Global Competition | 2022.09.23
18th Challenge Cup College Student Competition
Bronze Medal | Zhejiang Province Level | May 2023
4th National "Chuanzhi Cup" IT Skills Competition
Provincial Excellent Award | Zhejiang Province | December 2021
2023 Wenzhou Computer Society Student Member Innovation and Entrepreneurship Award
Third Prize | Wenzhou | April 2024
Projects
Nuwax Platform: Agent Development (NDA)
Participated in developing multi-agent and task-specific skills for a privately deployed agent platform; packaged skills as standardized tools and configurations. Under NDA, business details are omitted.
- Supported packaging the project into OpenClaw for direct execution
- Designed input/output constraints and tool-calling boundaries to reduce format drift and invalid calls in multi-turn interactions
- Built a reusable tool adapter layer (APIs/files/data queries) with authentication, timeout control, retries, and exception fallbacks
Technologies: Python, Agent Engineering, Tool Calling, Workflow Orchestration, Observability
Date: Feb 2026
Contract Review Agent Module (NDA)
Participated in building an internal contract review assistant module that converts contracts into structured clauses and produces risk notes with evidence references. Under NDA, organization, contract types, and data details are omitted.
- Implemented contract parsing and clause indexing to produce structured outputs (clauses, original text locations, citation anchors) for traceable review
- Integrated retrieval and tool calling to bind review conclusions with evidence spans (paragraph/page), generating a verifiable risk list
- Packaged review task APIs and background processing (queue/retries) to support batch processing and resubmission after manual verification
Technologies: LLM Application Development, Document Parsing, RAG, Tool Calling, Multi-Agent Workflows
Date: Jan 2026
3D-Enhanced Scientific Image Forgery Detection (NDA)
Participated in exploring 3D-enhanced scientific image forgery detection by introducing pseudo-3D and physics-inspired features into pixel-level segmentation, with supporting feature caching and evaluation workflows. Under NDA, data and business details are omitted.
- Implemented offline extraction and cache reuse for pseudo-3D and physics-inspired features to reduce repeated computation during training
- Improved data preparation and mask alignment scripts to support batch preprocessing, patch-based training, and unified evaluation entrypoints
- Generated inference visualizations and offline evaluation reports to compare the impact of different feature combinations
Technologies: PyTorch, Swin Transformer (Training), OpenCV, Feature Caching, Segmentation Evaluation & Reporting
Date: Dec 2025
Biomedical Scientific Image Copy-Move Forgery Detection & Segmentation (NDA)
Participated in building a scientific image integrity screening module that detects suspected copy-move operations and outputs pixel-level localization masks to assist reviewers. Under NDA, project source and data details are omitted.
- Implemented high-resolution sliding-window inference and mask stitching to output overlays, masks, and key region coordinates
- Implemented post-processing and risk grading strategies (thresholding, area ratio, etc.) to enable fast triage and ranking
- Provided FastAPI inference endpoints and batch scripts for offline runs, result persistence, and report generation
Technologies: PyTorch, timm, OpenCV, FastAPI, ONNX/ONNXRuntime (Deployment), Sliding-Window Aggregation, Image Forensics
Date: Nov 2025
Scientific Content Review & Image Analysis Platform (NDA)
Participated in developing a platform to assist review of scientific PDFs and images, covering document parsing, text-image extraction, similarity screening, quality evaluation, and report export. Under NDA, clients and data sources are omitted.
- Implemented an upload and parsing pipeline: normalized document formats, performed page-by-page parsing, extracted images/text, and generated traceable metadata (unique ID, page number, context text, etc.)
- Implemented “evidence filtering → parallel analysis”: filtered non-evidence images via a hybrid vision/rule strategy and ran quality evaluation, image-text consistency checks, and anomaly trace detection in parallel
- Implemented duplication and tampering risk detection: combined retrieval-based similarity, local matching, and deep model inference to output verifiable localization evidence and structured summaries (for frontend display/report export)
Technologies: Python, Document Parsing, Image Processing, Vector Retrieval/Similarity Search, Deep Learning Inference, Async Task Orchestration, Docker
Date: Oct 2025
Multi-Agent Document Workflow Platform (NDA)
Participated in developing a document processing and review platform covering upload, preview, OCR, information extraction, and export. Under NDA, organization, data, and details are omitted.
- Implemented end-to-end task orchestration and state tracking for long-running document jobs, including progress reporting, retries, step-level reruns, and checkpoint resume
- Implemented pluggable integration for document preview and parsing services via a unified adapter layer (request/response schemas, error codes, and timeout policies), enabling parser switching by scenario
- Built a multi-role Agent/workflow execution framework based on LangChain, persisting key intermediate artifacts in structured form and recording auditable logs for review, tracing, and reproduction
Technologies: FastAPI, React/Vite, Docker Compose, LangChain, OCR, Multi-Agent Workflows, RAG, Pluggable Toolchain
Date: Sep 2025
YOLOv11-LCDFS: Enhanced Smoking Detection With Low-light Enhancement
Extension of smoking detection research focusing on improved object detection in challenging lighting conditions. Developing a YOLO-based architecture with specialized components addressing the unique challenges of low-light environments. GitHub
- Designing custom loss functions specifically optimized for low-light object detection scenarios
- Implementing attention mechanisms to focus on key visual features in varying illumination conditions
- Optimizing upsampling techniques to preserve fine details in dark environments
- Integrating lightweight low-light enhancement module into the detection pipeline
Technologies: PyTorch, YOLO, Computer Vision, CUDA, Attention Mechanisms
Status: Ongoing (April 2025)
Related publications: YOLOv11‑LCDFS: Enhanced Smoking Detection With Low‑light Enhancement (manuscript in preparation)
Multi-modal Medical Image Analysis for Cancer Research
Developing medical image analysis systems for cancer research. GitHub
- Designing medical image segmentation algorithms for hepatocellular carcinoma and renal cell carcinoma
- Designing methods for integrating clinical tabular data with imaging features for comprehensive analysis
- Implementing multi-modal fusion techniques for combining different CT scan phases
- Developing 3D volumetric segmentation approaches for comprehensive anatomical analysis
Technologies: Python, Deep Learning, 3D Segmentation, Multi-modal Fusion, PyDicom, NumPy
Status: Ongoing (April 2025)
Related publications: Multi‑modal medical image segmentation and fusion (manuscript in preparation)
Disease Detection Using Deep Learning
Developed deep learning models for automated disease detection and classification from medical imaging data, with a focus on diabetic foot ulcer detection. GitHub
- Implemented and compared multiple YOLOv8 architecture variants enhanced with advanced attention mechanisms (GAM, CBAM, ECA, CoordAtt) for precise localization of diabetic wounds in clinical images
- Designed and evaluated custom YOLOv8 architectures integrating novel upsampling techniques including CARAFE and DySample for improved feature map resolution
- Developed innovative triplet-based loss functions and Inner-CIoU mechanisms to enhance detection accuracy for wounds of varying sizes and appearances
- Employed dynamic convolution techniques to adaptively capture wound features across diverse clinical settings and lighting conditions
- Constructed and curated specialized datasets of diabetic foot ulcers for training and comprehensive validation
- Achieved significant improvements in both detection accuracy and inference speed compared to baseline models, with particular gains for small and atypical lesions
- Implemented model explainability techniques to visualize feature importance and attention maps for clinical interpretation
Technologies: PyTorch, YOLO, Attention Mechanisms (GAM/CBAM/ECA), Custom Loss Functions, Feature Upsampling Techniques, Dynamic Convolutions
Date: August 2024
Twitter Quality and Spam Detection System
Developed an advanced machine learning system for Twitter content quality assessment and spam detection, achieving over 88% accuracy in distinguishing between legitimate tweets and spam content. GitHub
- Implemented comprehensive data preprocessing techniques for Twitter data, including text normalization, feature extraction, and handling of missing values
- Engineered complex features by combining user metrics (follower count, following count) and behavioral patterns to enhance classification performance
- Applied sentiment analysis and content analysis techniques to identify quality patterns in tweets
- Developed and compared multiple machine learning models to optimize classification accuracy
- Created data visualizations to communicate findings and identify key patterns in Twitter content quality assessment
Technologies: Python, NLTK, scikit-learn, Pandas, Matplotlib, Seaborn
Date: August 2024
Airbnb Price Analysis and Prediction System
Developed a comprehensive data analysis and machine learning system to predict Airbnb listing prices in New York City based on various property features and location data. GitHub
- Performed extensive exploratory data analysis on NYC Airbnb dataset with over 48,000 listings, creating visualizations to reveal pricing patterns across neighborhoods
- Implemented geospatial analysis to visualize property distribution and identify high-value areas using Python GIS libraries
- Engineered relevant features by transforming categorical variables and creating new metrics to better capture pricing factors
- Developed a RandomForest regression model to predict continuous listing prices with optimized hyperparameters
- Created a classification model to categorize listings into price brackets, achieving high accuracy through model tuning with RandomizedSearchCV
- Built interactive visualizations to help hosts and travelers understand pricing determinants in the NYC short-term rental market
Technologies: Python, Pandas, Scikit-learn, GeoPandas, Matplotlib, Seaborn, RandomForest
Date: January 2024
Airline Sentiment Analysis System
Learning to analyze airline customer sentiment from Twitter data using natural language processing techniques. GitHub
- Studying text preprocessing methods for social media data (handling hashtags, mentions, emojis, URLs)
- Learning about different feature extraction approaches including TF-IDF vectorization and N-gram analysis
- Working with various classification models and comparing their performance
- Practicing error analysis to understand patterns in misclassified tweets
- Creating visualizations using Matplotlib, Seaborn, and Plotly to understand sentiment trends
Technologies: Python, NLTK, spaCy, scikit-learn, Pandas, Matplotlib, Seaborn, Plotly, WordCloud
Date: December 2023
Smoking Behavior Detection System
Undergraduate thesis project combining YOLO object detection with MediaPipe skeletal tracking to accurately identify and classify smoking gestures in video streams. GitHub
- Utilized YOLO object detection to identify cigarettes and related objects in video footage
- Implemented MediaPipe for real-time skeletal tracking and pose estimation
- Designed algorithms to recognize characteristic smoking hand-to-mouth gesture patterns
Technologies: YOLO, MediaPipe, Pose Estimation, Action Recognition, PyTorch, OpenCV
Date: April 2023
Outcomes: Patent application (202310277784.1), 3 software copyrights, thesis received outstanding evaluation
Heart Disease Prediction System
Learning to analyze health indicators for heart disease prediction using machine learning approaches. GitHub
- Working with health metrics from the CDC's BRFSS survey dataset
- Learning about feature selection and preprocessing for health indicators
- Studying both traditional machine learning models and deep learning approaches
- Practicing model evaluation using cross-validation and performance metrics
- Creating visualizations to understand relationships between health factors
Technologies: Python, TensorFlow 2.11.0, scikit-learn, Pandas, NumPy, Matplotlib, Seaborn
Date: May 2022
Research Interests
Medical Artificial Intelligence
- Learning deep learning methods for medical image analysis and disease diagnosis
- Multi-modal integration and feature extraction for clinical data
- Exploring computer-aided diagnosis systems for real-world clinical use
Embodied Intelligence
- Studying how robots may acquire intelligence through physical interaction with the world
- Exploring the connection between perception and action in embodied agents
- Learning reinforcement learning for robotics applications
Computer Vision
- Attention-based object detection architectures
- Human pose estimation and behavior recognition
- Visual feature extraction for real-world applications
Multi-Agent Systems
- Understanding how multiple agents collaborate
- Curious about emergent behaviors and collective intelligence
- Learning the foundations of multi-agent AI systems
Technical Skills
Programming Languages
- Python
- C, Java, SQL
- C#, JavaScript, Vue
AI & Machine Learning
- Frameworks: PyTorch, TensorFlow
- Areas: Computer Vision, Deep Learning, Reinforcement Learning
- Techniques: CNNs, Attention Mechanisms, Transfer Learning
Development Tools
- Version Control: Git, GitHub
- Documentation: LaTeX, Markdown
- Environment: Linux, Jupyter, Docker
Data Analysis
- Libraries: NumPy, Pandas, SciPy
- Visualization: Matplotlib, Seaborn, Plotly
Languages
- Chinese: Native speaker
- English: IELTS 6.0 (L 6.5 / R 6.5 / W 5.5 / S 6.0); CET-6: 467
Certifications
- DevCloud Summer Training Camp (Huawei)
- Python for AI Development (Shandong University)
- Deep Learning Fundamentals (Shandong University)
References
Professional and academic references can be provided upon request.
