RNA-Seq De novo Assembly Using Trinity
2013-12-20 14:09
Trinity represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. Trinity combines three independent software modules, Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-seq reads. Trinity partitions the sequence data into many individual de Bruijn graphs, each representing the transcriptional complexity at a given gene or locus, and then processes each graph independently to extract full-length splicing isoforms and to tease apart transcripts derived from paralogous genes. Briefly, the process works as follows:
Inchworm assembles the RNA-seq data into the unique sequences of transcripts, often generating full-length transcripts for a dominant isoform, but then reports just the unique portions of alternatively spliced transcripts.
Chrysalis groups the Inchworm contigs into clusters and constructs a complete de Bruijn graph for each cluster. Each cluster represents the full transcriptional complexity for a given gene (or set of genes that share sequences in common). Chrysalis then partitions the full read set among these disjoint graphs.
Butterfly then processes the individual graphs in parallel, tracing the paths that reads and pairs of reads take within the graph, ultimately reporting full-length transcripts for alternatively spliced isoforms, and teasing apart transcripts that correspond to paralogous genes.
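To make the graph idea concrete, here is a toy Python sketch of a de Bruijn graph built from reads, followed by an enumeration of the paths through it. It is not Trinity's implementation: the two isoform sequences, the read length of 8, the k-mer size of 4, and the function names `build_de_bruijn` and `enumerate_paths` are all made up for illustration. In particular, this sketch lists every simple path, whereas Butterfly uses reads and read pairs to decide which paths to report.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def enumerate_paths(graph, start, end):
    """Return every simple path from start to end, spelled out as a sequence."""
    paths = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == end:
            # Overlapping nodes: first node plus the last base of each later node.
            paths.append(path[0] + "".join(n[-1] for n in path[1:]))
            continue
        for nxt in sorted(graph.get(node, ())):
            if nxt not in path:  # skip cycles in this toy version
                stack.append((nxt, path + [nxt]))
    return sorted(paths)


# Hypothetical isoforms of one gene: the second skips the middle exon.
isoform_a = "ATGCCA" + "GGTTC" + "TAGACG"
isoform_b = "ATGCCA" + "TAGACG"

# Simulated error-free reads: every length-8 window of each isoform.
reads = [iso[i:i + 8] for iso in (isoform_a, isoform_b)
         for i in range(len(iso) - 8 + 1)]

graph = build_de_bruijn(reads, k=4)
for transcript in enumerate_paths(graph, "ATG", "ACG"):
    print(transcript)
```

The graph branches at the node `CCA`, where the two isoforms diverge, and rejoins at `TAG`; the two paths through the graph spell out the two isoforms.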
Typical Trinity Command Line
Trinity.pl --seqType fq --JM 10G --left reads_1.fq --right reads_2.fq --CPU 6
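When the same command has to be run for several samples, it can be convenient to assemble it in a script. The following is a minimal Python sketch that only builds the argument list; the function name `trinity_command` and its defaults are hypothetical, and the flags are exactly those shown in the command above, not a complete list of Trinity options.

```python
import shlex


def trinity_command(left, right, seq_type="fq", jm="10G", cpu=6,
                    program="Trinity.pl"):
    """Return the argument list for a paired-end run, using only the
    flags that appear in the command line above."""
    return [
        program,
        "--seqType", seq_type,
        "--JM", jm,
        "--left", left,
        "--right", right,
        "--CPU", str(cpu),
    ]


cmd = trinity_command("reads_1.fq", "reads_2.fq")
# Print the command for inspection instead of running it.
print(shlex.join(cmd))
```

On a machine where Trinity is installed, the list could be passed to `subprocess.run(cmd, check=True)` to start the assembly.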
http://www.vcru.wisc.edu/simonlab/bioinformatics/programs/trinity/docs/index.html