
Accurity (Luo et al. 2018)

Software that infers tumor purity and ploidy from paired tumor-normal whole-genome sequencing data. It distinguishes itself from other tools by performing well on low-purity and low-coverage samples.

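Personalized cancer therapy requires mutation analysis of tumor tissue samples obtained during surgery in order to decide the next course of treatment (targeted drugs, cancer immunotherapy, etc.). Tumor samples usually contain non-tumor cells (such as normal immune cells); the fraction of tumor cells in the sample is the tumor purity. Low tumor purity increases the uncertainty of downstream analysis and lowers the chance that the subsequent treatment succeeds, so accurately estimating purity from the tumor sample is a key step in personalized cancer therapy. Compared with traditional imaging-based methods, ultra-low-depth (~0.5X) sequencing offers a fast, inexpensive, and automated route to purity estimation, but existing algorithms do not predict purity very accurately on ultra-low-depth data. Our software, Accurity, is built on a detailed statistical model and performs particularly well on ultra-low-depth data.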
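To make the effect of tumor purity concrete, here is a minimal Python sketch of the standard mixture relationship between purity, copy number, and read depth. It is a textbook relationship, not a description of Accurity's statistical model; the function name `expected_depth_ratio` and its parameters are hypothetical, and it assumes the normal cells are diploid.

```python
def expected_depth_ratio(purity, tumor_copy_number, tumor_ploidy=2.0,
                         normal_copy_number=2.0):
    """Expected tumor/normal read-depth ratio for one genomic segment.

    Assumption (not from Accurity): the sample is a simple mixture of tumor
    cells (fraction `purity`) and normal cells, and depth is normalized by
    the genome-wide average copy number of the mixture.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    # Average number of copies of this segment per cell in the mixed sample.
    segment_copies = purity * tumor_copy_number + (1.0 - purity) * normal_copy_number
    # Average number of copies per cell across the whole genome.
    genome_copies = purity * tumor_ploidy + (1.0 - purity) * normal_copy_number
    return segment_copies / genome_copies


if __name__ == "__main__":
    # Single-copy loss (copy number 1) in a diploid tumor at two purities.
    for purity in (0.8, 0.2):
        print(purity, round(expected_depth_ratio(purity, 1), 3))
```

Under these assumptions a single-copy loss yields a depth ratio of 0.6 at 80% purity but 0.9 at 20% purity, which illustrates why low-purity samples are harder: the copy-number signal shrinks toward 1 and is easily lost in noise at low coverage.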

We are currently developing algorithms to infer cancer clonal evolution.

Check the Accurity page for more details.




Personalized Medicine Bioinformatics Platform

To access the bioinformatics data generated in the Personalized Medicine pilot program (个性化药物先导专项), please log in at https://bioinfo.simm.ac.cn/. The platform holds more than 100 TB of NGS genomics, transcriptomics, histo-imaging, and proteomics data generated along the drug development pipeline.


PatientStratifier

PatientStratifier is a software package that stratifies patients based on their biomarker and drug response data. Its core is a machine learning module that learns from existing patient biomarker and drug response data. It also includes a component called PatientRecommender, which recommends whether a patient should be given a drug based on the patient's biomarker data.

Contact us for more details.


Parallel workflow to analyze the NGS data (Huang et al. 2015)

A workflow that analyzes ~900 genomes (cumulative coverage ~4000X). Starting from billions of 100bp paired-end reads from the Illumina GenomeAnalyzer II, the whole workflow consists of several sub-workflows: the read-filtering sub-workflow (whose main program is a custom-written Java program based on GATK libraries), the read-alignment sub-workflow (main program is bwa [Li et al. 2009]; stampy [Lunter et al. 2011] was used in testing), the base-quality-score-recalibration sub-workflow by GATK [DePristo et al. 2011], the genotype-calling sub-workflow by SAMtools [Li et al. 2009] and GATK [DePristo et al. 2011], the pedigree-calling sub-workflow (main program is TrioCaller [Chen et al. 2012]), and other sub-workflows that carry out variant filtering and statistics calculation (transition/transversion ratio, allele frequency, Mendelian inconsistency, and population genetic measures such as nucleotide diversity, Hardy-Weinberg equilibrium p-value, and linkage disequilibrium). All workflows interact with the vervet PostgreSQL database through SQLAlchemy/Elixir. The workflows were constructed in a MapReduce manner using APIs from the Pegasus workflow management system, yielding a flexible system that can use the full parallel computing power of a cluster. The main code can be found at http://code.google.com/p/vervet-web/. Substantial Java and C++ code is in private git repositories (contact polyactis@gmail.com if interested).
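As an illustration of one of the statistics listed above, the following is a minimal Python sketch of a transition/transversion (Ts/Tv) ratio computed over biallelic SNPs. It is not taken from the workflow's code base; the function names and the `(ref, alt)` tuple input format are assumptions made for this example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True if the substitution stays within purines or within pyrimidines."""
    pair = {ref.upper(), alt.upper()}
    return pair == PURINES or pair == PYRIMIDINES


def ts_tv_ratio(snps):
    """Ts/Tv ratio for an iterable of (ref, alt) single-base substitutions.

    Hypothetical input format: each SNP is a tuple of two distinct bases.
    """
    transitions = 0
    transversions = 0
    for ref, alt in snps:
        if ref.upper() == alt.upper():
            raise ValueError("ref and alt must differ: %r" % ((ref, alt),))
        if is_transition(ref, alt):
            transitions += 1
        else:
            transversions += 1
    if transversions == 0:
        raise ValueError("no transversions; ratio is undefined")
    return transitions / transversions


if __name__ == "__main__":
    toy_snps = [("A", "G"), ("C", "T"), ("T", "C"), ("A", "C"), ("G", "T")]
    print(ts_tv_ratio(toy_snps))
```

A genome-wide Ts/Tv ratio far from the value expected for the species is a common sign of false-positive variant calls, which is why it is useful as a variant-filtering statistic.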

This is a job-dependency DAG (directed acyclic graph).
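A job-dependency DAG of this kind can be sketched in a few lines of Python. The example below uses the standard-library `graphlib` module (Python 3.9+), not the Pegasus API, and the job names, sample names, and dependency structure are hypothetical: per-sample filtering and alignment jobs play the "map" role, and a joint genotype-calling job plays the "reduce" role.

```python
from graphlib import TopologicalSorter


def build_toy_workflow(samples):
    """Return a dict mapping each job to the set of jobs it depends on.

    Hypothetical layout: filter -> align per sample (map), then one
    genotype-calling job that waits for every alignment (reduce).
    """
    deps = {}
    for sample in samples:
        deps["filter_" + sample] = set()
        deps["align_" + sample] = {"filter_" + sample}
    deps["genotype_calling"] = {"align_" + sample for sample in samples}
    return deps


def parallel_batches(deps):
    """Group jobs into batches; jobs within a batch can run concurrently."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    for batch in parallel_batches(build_toy_workflow(["s1", "s2", "s3"])):
        print(batch)
```

Each batch contains only jobs whose dependencies have already finished, so a scheduler is free to dispatch all jobs in a batch to different cluster nodes at once.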

This is the job-duration vs. time diagram from a toy example, illustrating which jobs take the most time. The real-data workflows involve at least 100 times as many jobs.



Arabidopsis GWAS web app and database (Seren et al. 2012; Huang et al. 2011; Atwell et al. 2010)

The most up-to-date URL is http://arabidopsis.gmi.oeaw.ac.at:5000/ (old links to http://arabidopsis.usc.edu/ are redirected to this GMI site). The MySQL database dump can be downloaded from http://arabidopsis.gmi.oeaw.ac.at/public_db_dump.tar.gz. All the code for the version demonstrated in Huang et al. 2011 can be found in this tarball (Pylons web server, web client built with Google Web Toolkit, etc.).
 

A second-generation version is at http://gwas.gmi.oeaw.ac.at/index.html (Seren et al. 2012, source code link). The Arabidopsis polymorphism effort is at https://cynin.gmi.oeaw.ac.at/home/resources/atpolydb.

GitHub homepage: https://github.com/polyactis

