CV
General Information
Full Name | Yu S. Huang/黄宇 |
Languages | Chinese, English |
Experience
-
2022 - Senior Director of Bioinformatics
臻和 Genecast Biotechnology Corp Ltd, Shanghai/Beijing/Wuxi, China - Led the modelling for the Genecast Multi-Cancer Early Detection product (THEMIS, https://www.genecast.com.cn/solutions/detail?id=29; Bie et al., Nature Communications 2023).
- Led the computational development for fixed-panel and WES custom-panel MRD (Minimal Residual Disease) products.
- Led the optimization of core bioinformatics algorithms.
- Taught Bayesian statistics in Julia with Turing.jl, following [Statistical Rethinking by Richard McElreath](https://www.yfish.org/teaching/SR2/).
-
2015 - 2021 Professor/Principal Investigator, Director of Bioinformatics
Shanghai Institute of Materia Medica, Chinese Academy of Sciences (中科院) - Developed AI models, algorithms, distributed computing platforms, and databases for personalized medicine (target discovery, validation, biomarkers), and AI models for drug design and virtual screening.
- Developed Accucopy and Accurity in C++.
- Taught from Christopher Bishop's 2006 book "Pattern Recognition and Machine Learning".
- Taught the Julia programming language, matrix computing, and optimization.
-
2014-2015 Bioinformatics Scientist
Illumina Inc., San Diego, California, USA - Developed the MethylSeq app in BaseSpace (C#).
- Developed a directed-acyclic-graph (DAG) workflow system in C# that sped up Illumina bioinformatics workflows by >50X.
- Developed bioinformatics libraries in Go that sped up the analysis by >100X.
- Performed competitive analyses in forensics, cancer, whole-genome, and exome sequencing.
Education
-
2010 PhD in Computational Biology and Bioinformatics
University of Southern California, Los Angeles, USA
2003 B.S. in Biology
Fudan University, Shanghai, China
Open Source Projects
-
2017-now Accucopy
- A computational method that infers allele-specific copy number alterations from low-coverage, low-purity tumor sequencing data.
-
2021-now eGADA
- Enhanced GADA: a fast segmentation algorithm based on Sparse Bayesian Learning (also known as the Relevance Vector Machine). It can be applied to array intensity data, NGS data, or any sequential data that behaves like a step function. Enhancements include: 1) a customized red-black tree that significantly speeds up the final backward elimination step; 2) a C++ implementation, which is better structured than C; 3) an exported shared library, eGADA.so, that provides a Python API.
Honors and Awards
-
2023 - A provincial-level talent program, Jiangsu Province
-
2016 - China Thousand-Talent Program
-
2015 - Hundred-Talent Program of Chinese Academy of Sciences
-
2003-2008 - Merit Award Fellowship, University of Southern California.
-
2002 - Third Award, Computer Programming Contest, Fudan University.
-
1999-2003 - People's Scholarship, Fudan University.
-
1998 - Third Award, National High School Mathematics Competition of China.
-
1996 - Third Award, Junior High School Physics Competition, Shanghai.
Expertise & Skills
-
Modelling & Algorithm
- Statistical Learning, Machine/Deep Learning, Optimization
-
Programming
- Rust, Python, C++, Julia, Go, R, Java, SQL, shell, awk
-
Library
- Parallel computing (Open MPI, MPICH), Boost C++ Libraries, Pegasus workflow system
-
SysAdmin
- PostgreSQL, MySQL, Lustre FS, ZFS, NFS, Ceph, LDAP, Kubernetes (K8s), Kubeflow, iptables, NGINX
Hobbies
- Surfing, snowboarding, swimming, reading