I have been a Senior Director of Bioinformatics at Genecast Biotechnology Corp. 臻和生物科技 since early 2022.
I was a Principal Investigator at SIMM (Shanghai Institute of Materia Medica, Chinese Academy of Sciences; 中国科学院上海药物研究所) from 2015 to 2021. My team focused on developing AI models, algorithms, distributed computing platforms, and databases in bioinformatics for personalized medicine (target discovery, validation, biomarkers), computer-aided drug design, and virtual screening.
Previously, at Illumina Inc., I was the main developer behind the BaseSpace app MethylSeq, cloud bioinformatics software that detects methylated cytosines in DNA from bisulfite-treated next-generation sequencing data. To make better use of the computing resources of multi-core systems, I wrote a new directed-acyclic-graph (DAG)-based workflow system in C# that made MethylSeq run more than 5X faster. I also wrote bioinformatics software in Go, which sped up some analyses considerably (>50X vs. Python). I was the lead bioinformatician in interdisciplinary projects involving cancer, forensics, exome, and whole-genome sequencing.
Before Illumina, I did a three-and-a-half-year postdoc with Prof. Nelson Freimer at Human Genetics, UCLA (Nov 2010 - Mar 2014), working on trait mapping and population genetics projects in vervet monkeys, analyzing whole-genome DNA sequences from >700 monkeys of a vervet pedigree and >100 monkeys from wild populations. In Oct 2010, I completed my PhD in Computational Biology and Bioinformatics at USC, working primarily on association mapping and population genetics of Arabidopsis thaliana under the supervision of Magnus Nordborg. I also worked on gene function and network inference from gene expression data using graph theory.
Fascinated with computers ever since I tested my first BASIC program on an Intel 8088 PC in 8th grade, I learnt C/C++, PostgreSQL, Java, Python, and everything about Linux as an undergraduate. Being in a PhD program founded by a mathematician (M.S. Waterman), I learnt all I could about statistics and probability.
GitHub: https://github.com/polyactis
ORCID: https://orcid.org/0000-0001-5967-4948
Blog Posts
polyactis at gmail.com | |
Education | 2003.08 - 2010.10 University of Southern California, Los Angeles, Ph.D. in Bioinformatics |
| 1999.09 - 2003.07 复旦大学 Fudan University, Shanghai, B.S. in Biological Sciences |
| 1996.09 - 1999.07 川沙中学 Shanghai Chuansha Senior High School |
Employment | 2022 - present Senior Director of Bioinformatics, 臻和生物科技 Genecast Biotechnology Corp. Ltd., China |
| 2015 - 2021 Director of Bioinformatics, Professor, Shanghai Institute of Materia Medica, China |
| 2014 - 2015 Bioinformatics Scientist, Illumina Inc., San Diego, USA |
| 2010 - 2014 PostDoc, University of California, Los Angeles, USA |
Research Directions |
|
Expertise | Bioinformatics, Machine/Deep/Statistical Learning, AI Models & Algorithms, Optimization, Distributed Computing, Population Genetics |
| Programming (https://github.com/polyactis) |
| Daily: Python, C/C++, Rust, SQL, shell, R, awk |
| Occasional: Go, C#, Vue.js, Java, Julia, PHP, Perl, FORTRAN, Pascal, MATLAB |
| Library: Parallel Computing (Open MPI, MPICH), Boost C++ Library, Pegasus workflow system |
| SysAdmin: PostgreSQL DB, Lustre FS, ZFS, LDAP, K8S, Kubeflow, iptables, NGINX, NFS, Ceph, MySQL |
Awards & Honours |
|
Notable works |
|
Hobbies |