Yu S. Huang


Beijing/Wuxi/Shanghai, China

Since 2022, I have been Senior Director of Bioinformatics at Genecast Corp Ltd., working on algorithm and model development for cancer therapy selection, prognosis, monitoring, and early screening. Since 2015, I have been a Principal Investigator and Director of Bioinformatics at the Shanghai Institute of Materia Medica (SIMM), Chinese Academy of Sciences. My interest is in developing fast and accurate models, algorithms, distributed computing platforms, and large-scale databases for bioinformatics, AI-aided drug design, and other data-modelling fields.

I was formerly a bioinformatics scientist at Illumina Inc (San Diego site) and a postdoc in Human Genetics at UCLA. I completed my Ph.D. in Computational Biology and Bioinformatics at USC, working primarily on statistical methods in association mapping and population genetics of Arabidopsis thaliana, under the supervision of Magnus Nordborg. I also worked on graph-theory algorithms for gene function and network inference. Being in a PhD program founded by a mathematician (M.S. Waterman), I learnt all I could about statistics and probability. In July 2003, I received a B.S. in Biology from Fudan University. Fascinated with computers since I tested my first BASIC program on an Intel 8088 PC in 8th grade, I learnt C/C++, PostgreSQL, Java, Python, and everything about Linux as an undergraduate.

In my spare time, I read broadly and enjoy sports that involve elongated boards: surfboards, skateboards (Original Apex 34, Landyachtz Grom Race, Dinghy), and snowboards.

  • GitHub: github.com/polyactis
  • ORCID: 0000-0001-5967-4948
  • Expertise: Bioinformatics, Machine/Deep/Statistical Learning, AI Drug Design, Optimization, Distributed Computing
  • Programming: Julia, Python, Rust, C/C++, SQL, shell, R, awk
  • Occasional: Go, C#, Vue.js, Java, PHP, Perl, FORTRAN, Pascal, MATLAB
  • Libraries: parallel computing (Open MPI, MPICH), Boost C++ Libraries, Pegasus workflow system
  • SysAdmin: PostgreSQL, MySQL, Lustre FS, ZFS, NFS, Ceph, LDAP, K8S, Kubeflow, iptables, NGINX

selected publications

  1. Multimodal analysis of cell-free DNA whole-methylome sequencing for cancer detection and localization
    F Bie, Z Wang, Y Li, and 22 more authors
    Nature Communications, 2023
  2. eGADA: enhanced Genomic Alteration Detection Algorithm, a fast Sparse-Bayesian-Learning based genomic segmentation algorithm
    bioRxiv, 2023
  3. Deffini: A family-specific deep neural network model for structure-based virtual screening
    D Zhou, F Liu, Y Zheng, and 3 more authors
    Computers in Biology and Medicine, 2022
  4. Accucopy: Accurate and Fast Inference of Allele-specific Copy Number Alterations from Low-coverage Low-purity Tumor Sequencing Data
    X Fan, G Luo, and YS Huang
    BMC Bioinformatics, 2021