De novo RNA-seq Assembly Pipeline
Short-read RNA-seq de novo assembly is a well-established method for studying transcription in organisms that lack a reference genome sequence. Available software packages such as Trinity and Oases have proven able to build high-quality contigs from short reads, but there is still room for improvement on several points:
- compactness: they often produce distinct contigs that are contained in, or overlap, one another,
- chimerism: the contigs contain different kinds of chimeras, such as duplicated open reading frames,
- substitution, insertion and deletion errors: the consensus sequences built by the assembler contain errors, some of which can be corrected using the read alignments.
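To make the compactness issue concrete, here is a minimal Python sketch that drops contigs fully contained in a longer contig, on either strand. It is only an illustration of the idea and not DRAP's own compaction procedure; the function names `reverse_complement` and `remove_contained` are hypothetical, and exact substring matching is a simplifying assumption (real assemblies need error-tolerant alignment).

```python
from typing import Dict


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    table = str.maketrans("ACGTN", "TGCAN")
    return seq.upper().translate(table)[::-1]


def remove_contained(contigs: Dict[str, str]) -> Dict[str, str]:
    """Keep only contigs that are not an exact substring of a longer contig.

    Assumption for this sketch: containment means exact inclusion on the
    forward or reverse strand. Mismatches and partial overlaps are ignored.
    """
    # Visit the longest contigs first, so that any contig able to contain
    # the current one has already been examined.
    ordered = sorted(contigs.items(), key=lambda item: (-len(item[1]), item[0]))
    kept: Dict[str, str] = {}
    for name, seq in ordered:
        forward = seq.upper()
        reverse = reverse_complement(forward)
        if any(forward in other or reverse in other for other in kept.values()):
            continue  # redundant: already represented by a kept contig
        kept[name] = forward
    return kept


example = {
    "c1": "ACGTACGTTT",
    "c2": "CGTAC",   # included in c1 on the forward strand
    "c3": "AAACGT",  # included in c1 on the reverse strand
    "c4": "GGGG",    # not included in any other contig
}
compact = remove_contained(example)
```

In this toy input only `c1` and `c4` survive, since `c2` and `c3` carry no sequence that `c1` does not already represent.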
DRAP (De novo RNA-seq Assembly Pipeline) includes three modules:
- runDrap chains an Oases or Trinity assembly of the reads from a given sample with several compaction and correction steps. It produces several assembly files filtered at different FPKM thresholds, either for all contigs or only for contigs comprising an open reading frame. A report file presents the resulting assembly and alignment metrics.
- runMeta gathers the assemblies of all samples and merges them into a single representative contig set. It also removes the redundancy between sets and produces a general report including assembly and alignment metrics.
- runAssessment processes different contig sets built from the same read sets to generate assembly and alignment metrics, which are collected in a report. It helps to choose the best assembly.
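The FPKM-based filtering mentioned for runDrap can be sketched as follows. This is a hedged illustration of the standard FPKM formula (fragments per kilobase of contig per million mapped fragments), not DRAP's actual code; the function names, the input layout and the thresholds 1 and 5 are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Tuple


def fpkm(fragments: int, length_bp: int, total_fragments: int) -> float:
    """Fragments per kilobase of contig per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_fragments)


def filter_by_fpkm(
    contigs: Dict[str, Tuple[int, int]],
    thresholds: Iterable[float],
) -> Dict[float, List[str]]:
    """Return, for each threshold, the contigs whose FPKM reaches it.

    `contigs` maps a contig name to (length in bp, mapped fragment count).
    """
    total = sum(count for _, count in contigs.values())
    if total == 0:
        raise ValueError("no mapped fragments: FPKM is undefined")
    values = {
        name: fpkm(count, length, total)
        for name, (length, count) in contigs.items()
    }
    return {
        t: sorted(name for name, value in values.items() if value >= t)
        for t in thresholds
    }


# Hypothetical counts: one million mapped fragments in total.
counts = {
    "a": (1000, 2),        # FPKM = 2
    "b": (2000, 999_988),  # FPKM = 499994
    "c": (500, 10),        # FPKM = 20
}
subsets = filter_by_fpkm(counts, thresholds=[1, 5])
```

Each threshold yields its own contig subset, mirroring the idea of several assembly files at increasing expression cut-offs: here contig `a` passes the threshold of 1 but not the threshold of 5.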
New features and changes:
- A Docker image is now available.
- Read normalization can be run using Trinity or Khmer.
- Parallelization of multi-sample assembly processing.
- Parallelization of local (non-SGE) assembly processing.
- TSA checking now includes detection of vectors/adaptors and contaminants.
- New embedded Trinity version is the latest production version, i.e. v2.3.
- New embedded BUSCO version is 2.0.
- Add BUSCO to the runAssessment procedure.
- Add runAssessment to evaluate assemblies.
- Add runCheck to check the workflow dependencies.
- Add assembly scoring by TransRate.