De novo RNA-Seq Assembly Pipeline
Short-read RNA-Seq de novo assembly is a well-established method for studying transcription in organisms that lack a reference genome sequence. Available software packages such as Trinity and Oases have proven able to build high-quality contigs from short reads, but there is still room for improvement on several points:
- compactness: they often produce distinct contigs that are included in one another or overlap one another,
- chimerism: the contigs contain different kinds of chimeras, such as duplicated open reading frames,
- substitution, insertion, deletion errors: the consensus sequences built by the assembler contain errors which can be partly corrected using the read alignments.
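
To make the compactness issue concrete, here is a minimal Python sketch that drops contigs fully included in a longer contig, on either strand. It is an illustration only: it uses exact substring matching, and the function names (`reverse_complement`, `remove_contained`) are hypothetical. It does not reproduce the compaction steps DRAP actually runs.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def remove_contained(contigs):
    """Return the names of contigs that are not exactly contained in a longer kept contig.

    contigs: dict mapping contig name -> sequence.
    Containment is tested on both strands with plain substring search,
    which is an assumption made for this sketch (real data needs
    alignment-based, error-tolerant comparison).
    """
    # Visit the longest contigs first so shorter ones are compared against them.
    ordered = sorted(contigs, key=lambda name: (-len(contigs[name]), name))
    kept = []
    for name in ordered:
        seq = contigs[name].upper()
        rc = reverse_complement(seq)
        if any(seq in contigs[k].upper() or rc in contigs[k].upper() for k in kept):
            continue  # included in a longer contig: redundant
        kept.append(name)
    return kept


example = {
    "a": "ACGTTGCAAG",
    "b": "GTTGC",    # included in "a" on the forward strand
    "d": "TTTTTT",   # unrelated to "a"
    "e": "CTTGC",    # included in "a" on the reverse strand
}
print(remove_contained(example))
```

Exact matching keeps the sketch short; it would miss the partially overlapping contigs mentioned above, which need an alignment-based approach.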
DRAP includes three modules:
- runDrap chains an Oases or Trinity assembly of the reads from a given sample with several compaction and correction steps. It produces several assembly files filtered at different FPKM thresholds, either for all contigs or for contigs comprising an open reading frame. A report file presents the resulting assembly and alignment metrics.
- runMeta gathers all the sample assemblies and merges the results into a single representative contig set. It also removes the redundancy between sets and produces a general report including assembly and alignment metrics.
- runAssessment processes different contig sets built from the same read sets to generate assembly and alignment metrics, which are collected in a report. It helps to choose the best assembly.
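
The FPKM filtering mentioned for runDrap can be sketched as follows. This is a hedged illustration using the standard FPKM definition (fragments per kilobase of contig per million mapped fragments); the function names, the input dictionaries and the threshold value are assumptions made for the example, not DRAP's actual code or defaults.

```python
def fpkm(read_count, contig_length, total_mapped_reads):
    """Standard FPKM: fragments per kilobase of contig per million mapped fragments."""
    return read_count * 1e9 / (contig_length * total_mapped_reads)


def filter_by_fpkm(counts, lengths, threshold):
    """Return the names of contigs whose FPKM is at least `threshold`.

    counts:  dict contig name -> number of fragments mapped to the contig
    lengths: dict contig name -> contig length in bases
    Both dictionaries are hypothetical inputs for this sketch.
    """
    total = sum(counts.values())
    if total == 0:
        return []  # nothing mapped, so no contig can pass the filter
    return [
        name for name in counts
        if fpkm(counts[name], lengths[name], total) >= threshold
    ]


counts = {"c1": 900, "c2": 100}
lengths = {"c1": 1000, "c2": 100000}
# The threshold is illustrative only; real thresholds depend on the pipeline settings.
print(filter_by_fpkm(counts, lengths, threshold=5000))
```

Running the filter once per threshold yields one contig set per threshold, which mirrors the idea of producing several assembly files at different stringency levels.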
- External tools compatibility: TransDecoder-v5.0, Trinity-v2.8.4.
- Add option --assemblies to runMeta to perform meta-assembly directly from fasta files.
- Add SLURM scheduler compatibility.
- Provide a patch to reduce TransRate memory requirement.
- Fix various bugs.
- Update Docker image.