CV | Yu S. Huang

General Information

Full Name	Yu S. Huang/黄宇
Languages	Chinese, English

Experience

2022 - Present

Senior Director of Bioinformatics

臻和 Genecast Biotechnology Corp Ltd, Shanghai/Beijing/Wuhan, China

Define and execute long-term technical strategy for AI-driven precision oncology, aligned with corporate product pipelines and business goals.
Lead the development of multimodal AI platforms integrating sequence, structure, and epigenomic data for non-invasive cancer detection.
Built enterprise-grade AI computing infrastructure (K8s, PyTorch, distributed storage, high-speed interconnect) to support large-scale computing.
Lead and mentor a high-performance team of algorithm scientists, bioinformaticians, and software engineers to deliver end-to-end solutions from in silico modeling to experimental validation.
Manage cross-disciplinary teams and promote tight integration between computational models and experimental biology.
Lead external scientific engagement, conference presentations, high-impact publications, and IP strategy; drive research-to-product translation.
Developed an AI model for multi-cancer early detection (Nature Communications, 2023).
Optimize core bioinformatics algorithms using Deep/Machine/Statistical Learning techniques.
Developed AI models for MRD (Minimal Residual Disease) fixed-panel and WES custom-panel products.
Teach Bayesian Statistics, Machine/Deep Learning, Julia/Rust Programming.

2015 - 2021

Professor/Principal Investigator

Drug Discovery and Design Center, Shanghai Institute of Materia Medica (SIMM), Chinese Academy of Sciences (CAS)

Led the establishment of an AI-driven drug discovery center and built a mature structure-based drug design and virtual screening system.
Developed Fergie (VAE-based small molecule generation) and Deffini (a DNN for structure-based virtual screening) to enable structure-guided drug design at scale; designed one kinase inhibitor molecule that entered the preclinical candidate (PCC) phase.
Developed core algorithms for genomic variant calling, copy number analysis, and methylation sequencing to support early-stage innovative drug R&D.
Directed national/provincial research projects, built academic-industry partnerships, and delivered high-impact publications.
Built the AI computing infrastructure for CAS SIMM.
Taught from Russell & Norvig (2020), "Artificial Intelligence: A Modern Approach".
Taught from Chris Bishop (2006), "Pattern Recognition and Machine Learning".
Taught Julia programming, Matrix Computations, and Optimization.

2014 - 2015

Bioinformatics Scientist

Illumina Inc., San Diego, California, USA

Developed algorithms and pipelines for high-throughput sequencing data analysis.
Built MethylSeq analysis tool on Illumina BaseSpace for bisulfite sequencing data processing.
Developed UFlow, a Directed-Acyclic-Graph workflow system that speeds up Illumina bioinformatics workflows by >50X.
Wrote bioinformatics libraries in Go that speed up sequencing analysis by >100X.

Education

2010

PhD in Computational Biology and Bioinformatics

University of Southern California, Los Angeles, USA

2003

B.S. in Biology

Fudan University, Shanghai, China

Honors and Awards

2023
- Jiangsu Province Innovation and Entrepreneurship Talent Program (江苏省双创人才)
2016
- China Thousand-Talent Program
2015
- Hundred-Talent Program of Chinese Academy of Sciences
2003-2008
- Merit Award Fellowship, University of Southern California.
2022
- Third Award, Computer Programming Contest, Fudan University.
1999-2003
- People's Scholarship, Fudan University.
1998
- Third Award, National High School Mathematics Competition of China.
1996
- Third Award, Junior High School Physics Competition, Shanghai.

Expertise & Skills

AI & Deep Learning
- Transformers, VAE, Multimodal Fusion, Statistical Learning, Deep Neural Networks
AI for Protein Design
- Protein Language Models, Generative AI, Diffusion Models, Structure-aware Models, AlphaFold / Rosetta workflows
AI Drug Discovery
- Virtual Screening, De novo Design, Structure-Based Drug Design
AI Infra & HPC
- PyTorch, TensorFlow, K8s, Kubeflow, Lustre FS, InfiniBand, OpenMPI
Programming
- Rust, Python, C++, Julia, Go, R, Java, SQL

Open Source Projects

2017
Accucopy
- A computational method that infers allele-specific copy number alterations from low-coverage, low-purity tumor sequencing data.
2021
eGADA
- Enhanced GADA: a fast segmentation algorithm based on Sparse Bayesian Learning (also known as the Relevance Vector Machine). It can be applied to array intensity data, NGS sequencing data, or any sequential data that behaves like a stepwise function. Enhancements include: 1) a customized Red-Black tree that significantly speeds up the final backward elimination step; 2) an implementation in C++, which is better structured than the C original; 3) an exported eGADA.so that serves as a Python API.

Hobbies

Read, Surf, Snowboard, Swim
