Yu S. Huang


Beijing/Wuxi/Shanghai, China

Since 2022, I have been Senior Director of Bioinformatics at Genecast Corp Ltd., working on algorithm and model development for cancer therapy selection, prognosis, monitoring, and early screening. Since 2015, I have been a Principal Investigator and Director of Bioinformatics at the Shanghai Institute of Materia Medica (SIMM), Chinese Academy of Sciences. My interest is developing fast and accurate models, algorithms, distributed computing platforms, and large-scale databases for bioinformatics, AI-aided drug design, and other data-modelling fields.

I was formerly a bioinformatics scientist at Illumina Inc (San Diego site) and a postdoc in Human Genetics at UCLA. I completed my Ph.D. in Computational Biology and Bioinformatics at USC, working primarily on statistical methods in association mapping and population genetics of Arabidopsis thaliana, under the supervision of Magnus Nordborg. I also worked on graph-theory algorithms for gene function and network inference. Being in a PhD program founded by a mathematician (M.S. Waterman), I learnt all I could about statistics and probability. In July 2003, I received a B.S. in Biology from Fudan University. Fascinated with computers since I tested my first BASIC program on an Intel 8088 PC in 8th grade, I learnt C/C++, PostgreSQL, Java, Python, and everything about Linux as an undergraduate.

In my spare time, I read broadly and enjoy sports that involve elongated boards: surfboards, skateboards (Original Apex 34, Landyachtz Grom Race, Dinghy), and snowboards.

GitHub: https://github.com/polyactis

Expertise: Bioinformatics, Machine/Deep/Statistical Learning, AI Drug Design, Optimization, Distributed Computing

Programming: Python, C/C++, Rust, SQL, shell, R, awk

Occasional: Go, C#, Vue.js, Java, Julia, PHP, Perl, FORTRAN, Pascal, MATLAB

Library: Parallel computing (Open MPI, MPICH), Boost C++ Library, Pegasus workflow system

SysAdmin: PostgreSQL, MySQL, Lustre FS, ZFS, NFS, Ceph, LDAP, K8S, Kubeflow, iptables, NGINX


selected publications

  1. eGADA
    eGADA: enhanced Genomic Alteration Detection Algorithm, a fast Sparse-Bayesian-Learning based genomic segmentation algorithm
    bioRxiv, 2023
  2. Deffini
    Deffini: A family-specific deep neural network model for structure-based virtual screening
    D Zhou, F Liu, Y Zheng, and 3 more authors
    Computers in Biology and Medicine, 2022
  3. Accucopy
    Accucopy: Accurate and Fast Inference of Allele-specific Copy Number Alterations from Low-coverage Low-purity Tumor Sequencing Data
    X Fan, G Luo, and YS Huang
    BMC Bioinformatics, 2021