CV

General Information

Full Name Yu S. Huang/黄宇
Languages Chinese, English

Experience

  • 2022 -
    Senior Director of Bioinformatics
    臻和 Genecast Biotechnology Corp Ltd, Shanghai/Beijing/Wuxi, China
    • Led the modelling for the Genecast Multi-Cancer Early Detection test, THEMIS (https://www.genecast.com.cn/solutions/detail?id=29; Bie et al., Nature Communications 2023).
    • Led the computational development for fixed-panel and WES custom-panel MRD (Minimal Residual Disease) products.
    • Led the optimization of core bioinformatics algorithms.
    • Taught Bayesian statistics in Julia via Turing.jl, following [Statistical Rethinking by Richard McElreath](https://www.yfish.org/teaching/SR2/).
  • 2015 - 2021
    Professor/Principal Investigator, Director of Bioinformatics
    Shanghai Institute of Materia Medica, Chinese Academy of Sciences (中科院)
    • Developed AI models, algorithms, distributed computing platforms, and databases for personalized medicine (target discovery, validation, biomarkers), and AI models for drug design and virtual screening.
    • Developed Accucopy and Accurity in C++.
    • Taught Christopher Bishop's 2006 book "Pattern Recognition and Machine Learning".
    • Taught the Julia programming language, matrix computing, and optimization.
  • 2014-2015
    Bioinformatics Scientist
    Illumina Inc., San Diego, California, USA
    • Developed the MethylSeq app in BaseSpace (C#).
    • Developed a directed-acyclic-graph (DAG) workflow system in C# that sped up Illumina bioinformatics workflows by >50X.
    • Developed bioinformatics libraries in Go that sped up the analysis by >100X.
    • Conducted competitive analyses in forensics, cancer, whole-genome, and exome sequencing.

Education

  • 2010
    PhD in Computational Biology and Bioinformatics
    University of Southern California, Los Angeles, USA
  • 2003
    B.S. in Biology
    Fudan University, Shanghai, China

Open Source Projects

  • 2017-now
    Accucopy
    • A computational method that infers allele-specific copy number alterations from low-coverage, low-purity tumor sequencing data.
  • 2021-now
    eGADA
    • Enhanced GADA: a fast segmentation algorithm based on Sparse Bayesian Learning (also known as the Relevance Vector Machine). It can be applied to array intensity data, NGS sequencing data, or any sequential data that behaves like a stepwise function. Enhancements include: 1) a customized red-black tree that significantly speeds up the final backward elimination step; 2) an implementation in C++, which is better structured than C; 3) an exported Python API, eGADA.so.

Honors and Awards

  • 2023
    • A provincial-level talent program of Jiangsu Province
  • 2016
    • China Thousand-Talent Program
  • 2015
    • Hundred-Talent Program of Chinese Academy of Sciences
  • 2003-2008
    • Merit Award Fellowship, University of Southern California.
  • 2022
    • Third Award, Computer Programming Contest, Fudan University.
  • 1999-2003
    • People's Scholarship, Fudan University.
  • 1998
    • Third Award, National High School Mathematics Competition of China.
  • 1996
    • Third Award, Junior High School Physics Competition, Shanghai.

Expertise & Skills

  • Modelling & Algorithm
    • Statistical Learning, Machine/Deep Learning, Optimization
  • Programming
    • Rust, Python, C++, Julia, Go, R, Java, SQL, shell, awk
  • Library
    • Parallel computing (Open MPI, MPICH), Boost C++ Libraries, Pegasus workflow system
  • SysAdmin
    • PostgreSQL, MySQL, Lustre, ZFS, NFS, Ceph, LDAP, Kubernetes (K8s), Kubeflow, iptables, NGINX

Hobbies

  • Surfing, snowboarding, swimming, reading