The following examples use two samples stored in $INSTALL_FOLDER/test/data. The two samples, sampleA and sampleB, are RNA-seq samples of fish embryos at different stages of development.
RunDrap
This workflow produces an assembly from one sample/tissue/development stage. In our example, runDrap must be launched on sampleA and then on sampleB. Fastq input files should be renamed so that they end in .fastq or .fq if they are uncompressed, or in .fastq.gz or .fq.gz if they are gzipped. This ensures compatibility with several tools.
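The renaming rule above can be sketched as a small shell helper. This is only an illustration: `fastq_target_name` is a hypothetical name, not part of DRAP, and the helper looks at the file name only (it assumes that a file ending in .gz really is gzipped).

```shell
#!/usr/bin/env bash
# fastq_target_name FILE
# Print FILE with an extension accepted by the workflow:
# .fastq or .fq, optionally followed by .gz.
# Hypothetical helper for illustration, not part of DRAP.
fastq_target_name() {
    local dir base gz=""
    dir=$(dirname -- "$1")
    base=$(basename -- "$1")
    case "$base" in
        *.gz) gz=".gz"; base="${base%.gz}" ;;   # remember and strip a trailing .gz
    esac
    case "$base" in
        *.fastq|*.fq) ;;                        # already an accepted extension
        *.*) base="${base%.*}.fastq" ;;         # replace another extension, e.g. .txt
        *)   base="${base}.fastq" ;;            # no extension at all
    esac
    printf '%s/%s%s\n' "$dir" "$base" "$gz"
}
```

A file could then be renamed with `mv -- "$f" "$(fastq_target_name "$f")"`.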
    $INSTALL_FOLDER/runDrap \
      --R1 $INSTALL_FOLDER/test/data/sampleA_R1.fastq.gz \
      --R2 $INSTALL_FOLDER/test/data/sampleA_R2.fastq.gz \
      --ref $INSTALL_FOLDER/test/data/Danio_rerio.pep.fasta \
      --dbg oases --kmer 25,31,37,43,49 \
      --outdir $OUT_FOLDER/oases_splA \
      --dbg-mem 16 --norm-mem 16

    $INSTALL_FOLDER/runDrap \
      --R1 $INSTALL_FOLDER/test/data/sampleB_R1.fastq.gz \
      --R2 $INSTALL_FOLDER/test/data/sampleB_R2.fastq.gz \
      --ref $INSTALL_FOLDER/test/data/Danio_rerio.pep.fasta \
      --dbg oases --kmer 25,31,37,43,49 \
      --outdir $OUT_FOLDER/oases_splB \
      --dbg-mem 16 --norm-mem 16
The options --dbg-mem and --norm-mem are not required for execution on an HPC cluster (see drap.cfg).
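Since the two runDrap commands above differ only in the sample name, they can be wrapped in a function. The following is a minimal sketch under stated assumptions: `run_drap_sample` and the `DRY_RUN` switch are hypothetical names introduced here, not DRAP options, and the default values for INSTALL_FOLDER and OUT_FOLDER are placeholders to adapt. Only runDrap options already shown above are used.

```shell
#!/usr/bin/env bash
# Placeholder defaults (assumptions): set these to your own installation
# and output folders before running.
INSTALL_FOLDER="${INSTALL_FOLDER:-/path/to/drap}"
OUT_FOLDER="${OUT_FOLDER:-$PWD/drap_out}"

# run_drap_sample SAMPLE
# Run runDrap with oases on test sample SAMPLE (A or B).
# If DRY_RUN is set to a non-empty value, the command is printed
# instead of being executed (hypothetical switch of this sketch).
run_drap_sample() {
    local sample="$1"
    local data="$INSTALL_FOLDER/test/data"
    ${DRY_RUN:+echo} "$INSTALL_FOLDER/runDrap" \
        --R1 "$data/sample${sample}_R1.fastq.gz" \
        --R2 "$data/sample${sample}_R2.fastq.gz" \
        --ref "$data/Danio_rerio.pep.fasta" \
        --dbg oases --kmer 25,31,37,43,49 \
        --outdir "$OUT_FOLDER/oases_spl${sample}" \
        --dbg-mem 16 --norm-mem 16
}
```

Both samples can then be processed in turn with `for s in A B; do run_drap_sample "$s" || break; done`. Setting `DRY_RUN=1` first prints the commands so they can be checked before launching.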
RunMeta
This workflow merges assemblies from several samples/tissues/development stages into one assembly without redundancy. In our example, we merge the information from the two development stages.
    $INSTALL_FOLDER/runMeta \
      --drap-dirs $OUT_FOLDER/oases_splA,$OUT_FOLDER/oases_splB \
      --ref $INSTALL_FOLDER/test/data/Danio_rerio.pep.fasta \
      --outdir $OUT_FOLDER/meta_oases
RunAssessment
This workflow evaluates the quality of one assembly or compares several assemblies produced from the same dataset. In our example, we want to compare an assembly produced with DRAP oases and an assembly produced with DRAP trinity.
RunDrap and runMeta with trinity
    # RunDrap with trinity on first development stage
    $INSTALL_FOLDER/runDrap \
      --R1 $INSTALL_FOLDER/test/data/sampleA_R1.fastq.gz \
      --R2 $INSTALL_FOLDER/test/data/sampleA_R2.fastq.gz \
      --ref $INSTALL_FOLDER/test/data/Danio_rerio.pep.fasta \
      --dbg trinity \
      --outdir $OUT_FOLDER/trinity_splA \
      --dbg-mem 16 --norm-mem 16

    # RunDrap with trinity on second development stage
    $INSTALL_FOLDER/runDrap \
      --R1 $INSTALL_FOLDER/test/data/sampleB_R1.fastq.gz \
      --R2 $INSTALL_FOLDER/test/data/sampleB_R2.fastq.gz \
      --ref $INSTALL_FOLDER/test/data/Danio_rerio.pep.fasta \
      --dbg trinity \
      --outdir $OUT_FOLDER/trinity_splB \
      --dbg-mem 16 --norm-mem 16
Wait for both executions to finish, then launch runMeta.
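If the two runDrap commands are started in the background from the same shell, the waiting step can be scripted. The sketch below is a generic shell pattern, not a DRAP feature: `wait_all` is a hypothetical helper, and it assumes that each runDrap process only exits once its workflow has finished and returns a non-zero status on failure. Running both samples at once also needs enough memory for both.

```shell
#!/usr/bin/env bash
# wait_all PID...
# Wait for every background process given by PID and return non-zero
# if at least one of them failed. Hypothetical helper, not part of DRAP.
wait_all() {
    local pid status=0
    for pid in "$@"; do
        wait "$pid" || status=1   # keep waiting for the others even after a failure
    done
    return "$status"
}
```

Each runDrap command is then ended with `&`, its process ID is saved from `$!` (for example `pid_a=$!`), and runMeta is launched only on success with `wait_all "$pid_a" "$pid_b" &&` placed before the runMeta command.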
    # RunMeta to merge DRAP trinity results
    $INSTALL_FOLDER/runMeta \
      --drap-dirs $OUT_FOLDER/trinity_splA,$OUT_FOLDER/trinity_splB \
      --ref $INSTALL_FOLDER/test/data/Danio_rerio.pep.fasta \
      --outdir $OUT_FOLDER/meta_trinity
Comparison between assemblies
    # Rename assemblies
    ln -s transcripts_fpkm_1.fa $OUT_FOLDER/meta_oases/oases_fpkm_1.fa
    ln -s transcripts_fpkm_1.fa $OUT_FOLDER/meta_trinity/trinity_fpkm_1.fa

    # Launch assessment
    $INSTALL_FOLDER/runAssessment \
      --assemblies $OUT_FOLDER/meta_oases/oases_fpkm_1.fa,$OUT_FOLDER/meta_trinity/trinity_fpkm_1.fa \
      --R1 $INSTALL_FOLDER/test/data/sampleA_R1.fastq.gz,$INSTALL_FOLDER/test/data/sampleB_R1.fastq.gz \
      --R2 $INSTALL_FOLDER/test/data/sampleA_R2.fastq.gz,$INSTALL_FOLDER/test/data/sampleB_R2.fastq.gz \
      --protein-db $INSTALL_FOLDER/test/data/Danio_rerio.pep.fasta \
      --outdir $OUT_FOLDER/assessment
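The ln -s commands above use a bare file name as the link target. This works because a relative symlink target is resolved from the directory that contains the link, not from the current directory. The sketch below wraps the same step with a check that the target exists. `link_assembly` is a hypothetical helper name, and `-f` is added so that the step can be repeated without error.

```shell
#!/usr/bin/env bash
# link_assembly DIR NEW_NAME
# Create DIR/NEW_NAME as a relative symlink to DIR/transcripts_fpkm_1.fa.
# Fails if the assembly file is missing. Hypothetical helper, not part of DRAP.
link_assembly() {
    local dir="$1" new_name="$2"
    if [ ! -f "$dir/transcripts_fpkm_1.fa" ]; then
        echo "missing $dir/transcripts_fpkm_1.fa" >&2
        return 1
    fi
    # The target is resolved relative to the link's own directory,
    # so the bare file name is enough.
    ln -sf transcripts_fpkm_1.fa "$dir/$new_name"
}
```

For the example above this would be called as `link_assembly "$OUT_FOLDER/meta_oases" oases_fpkm_1.fa` and `link_assembly "$OUT_FOLDER/meta_trinity" trinity_fpkm_1.fa`.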
Rerun
If an execution failed (timeout, error in the reference path, ...), you can rerun only the failed steps.
Rerun the command with the same options: the workflow will run the unfinished steps.
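For transient problems such as timeouts, rerunning the identical command can be automated with a small retry wrapper. This is a generic shell sketch and not a DRAP feature: `retry` is a hypothetical helper, and it assumes that the workflow command returns a non-zero exit status when it fails. It will not help with errors that need a corrected option, such as a wrong reference path.

```shell
#!/usr/bin/env bash
# retry MAX COMMAND [ARGS...]
# Run COMMAND with the same arguments until it succeeds, at most MAX times.
# Returns 1 if every attempt failed. Hypothetical helper, not part of DRAP.
retry() {
    local max="$1" attempt=1
    shift
    until "$@"; do
        [ "$attempt" -ge "$max" ] && return 1
        attempt=$((attempt + 1))
        echo "rerunning, attempt $attempt of $max" >&2
    done
}
```

The wrapper is placed in front of an unchanged command, for example `retry 3` followed by the full runDrap command and its options.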