KAT (K-mer Analysis Toolkit) is a set of tools for analysing k-mer counts. Here we focus on one plot generated by KAT: spectra-cn. This plot compares the k-mers found in the input reads with the k-mers found in the assembly. We simulated data sets with different read error and heterozygosity rates to understand their impact on the plot, and produced videos that show how the plot evolves as these rates change.
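To make the comparison concrete, here is a minimal Python sketch of the kind of table behind a spectra-cn plot: for each distinct k-mer in the reads, it records how often the k-mer occurs in the reads (its multiplicity) and how many times it occurs in the assembly (its copy number). This is an illustration only, not KAT's implementation. The function names are hypothetical, the toy sequences are invented, and as a simplification the sketch counts forward-strand k-mers only, without merging a k-mer with its reverse complement.

```python
from collections import Counter


def count_kmers(sequences, k):
    """Count every forward-strand k-mer across an iterable of sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def spectra_cn(reads, assembly, k):
    """Return {assembly copy number: Counter({read multiplicity: distinct k-mers})}."""
    read_counts = count_kmers(reads, k)
    asm_counts = count_kmers(assembly, k)
    matrix = {}
    for kmer, multiplicity in read_counts.items():
        copy_number = asm_counts.get(kmer, 0)  # 0 means absent from the assembly
        matrix.setdefault(copy_number, Counter())[multiplicity] += 1
    return matrix


# Toy data (invented): the third read carries one substitution.
reads = ["ACGTAC", "ACGTAC", "ACGTTC"]
assembly = ["ACGTAC"]
result = spectra_cn(reads, assembly, k=4)
```

In this toy case the two k-mers created by the substitution end up in the copy-number 0 row at multiplicity 1, which is where k-mers produced by read errors are expected to accumulate.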
Feel free to use them for your training sessions and to share them with colleagues.
We simulated a 5 Mb reference genome and generated FASTQ files at 50x depth using Grinder (0.5.4), with different percentages of substitutions following a uniform distribution.
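The idea of reads with uniformly distributed substitutions can be sketched in a few lines of Python. This is not Grinder and does not reproduce its options or its error model. The function names, the read length and the toy genome size are assumptions chosen for illustration; the read length used in the actual simulation is not stated here.

```python
import random

BASES = "ACGT"


def substitute(seq, rate, rng):
    """Replace each base with a different one, independently, with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate_reads(genome, read_length, depth, error_rate, seed=0):
    """Sample reads at uniform random positions until the target depth is reached."""
    rng = random.Random(seed)
    # Number of reads needed so that total read bases = depth * genome size.
    n_reads = depth * len(genome) // read_length
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(genome) - read_length + 1)
        reads.append(substitute(genome[start:start + read_length], error_rate, rng))
    return reads


# Toy example: a 1 kb random genome (assumed size), 100 bp reads (assumed), 50x, 1% errors.
genome_rng = random.Random(1)
genome = "".join(genome_rng.choice(BASES) for _ in range(1000))
reads = simulate_reads(genome, read_length=100, depth=50, error_rate=0.01)
```

Each substitution in a read can create up to k new k-mers that are absent from the genome, so raising `error_rate` increases the number of low-multiplicity k-mers in the read set.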
We simulated a 5 Mb reference genome and generated a FASTQ file at 50x coverage using Grinder (0.5.4). We then inserted different percentages of substitutions into the reference FASTA file and generated another FASTQ file from it. Finally, we took half of each FASTQ file to create our data set.
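A rough Python sketch of these two steps, mutating the reference and mixing half of each read set, could look as follows. The function names are hypothetical and the approach is an assumption about how such a mix can be built, not the exact procedure used here. In particular, taking the first half of a read list only gives an unbiased subsample if the reads are in random order.

```python
import random


def mutate_reference(seq, rate, seed=0):
    """Substitute a fraction `rate` of positions, chosen uniformly without replacement."""
    rng = random.Random(seed)
    n_sub = round(len(seq) * rate)
    positions = rng.sample(range(len(seq)), n_sub)
    bases = list(seq)
    for pos in positions:
        # Always pick a base that differs from the current one.
        bases[pos] = rng.choice([b for b in "ACGT" if b != bases[pos]])
    return "".join(bases)


def take_half(reads):
    """Keep the first half of a read list (assumes the reads are in random order)."""
    return reads[: len(reads) // 2]


def heterozygous_read_set(reads_hap1, reads_hap2):
    """Combine half of the reads from each haplotype into one data set."""
    return take_half(reads_hap1) + take_half(reads_hap2)


# Toy example: 1% substitutions on a 1 kb sequence, then a 50/50 read mix.
reference = "ACGT" * 250
alternate = mutate_reference(reference, rate=0.01)
mixed = heterozygous_read_set(["r1"] * 10, ["r2"] * 10)
```

If both FASTQ files are at the same depth, each half contributes about half of that depth, so k-mers specific to one haplotype appear at roughly half the multiplicity of k-mers shared by both.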
We simulated a 5 Mb reference genome and used Grinder (0.5.4) to generate FASTQ files at 50x coverage with different percentages of substitutions following a uniform distribution. We then inserted different percentages of substitutions into the reference genome and generated another FASTQ file from it, also with substitutions in the reads. Finally, we took half of each FASTQ file to create our data set.