Hello!
I assembled my raw reads (fastq) into contigs (fasta) before mapping those contigs to a reference genome (output: sam). My command line:
minimap2 -t 14 -x map-ont -a ../AMelMel_ref.fasta ../polished_assembly_v2_1.fasta > v2_polished_assembly_mapping_1.sam
However, because I'm mapping contigs, I lose all the sequencing depth information I had with my raw reads. The problem is that, since I apply a polishing step to my contigs, I can't map my raw reads directly to the reference genome.
Do you know a way to give Minimap2 either the raw reads (fastq) or a second sam file (my raw reads mapped to the contigs), along with my contigs and my reference genome? Or any other way to keep the depth information throughout the whole process?
Thank you!
What does this mean? Can you show an example?
I mean that since my overlapping reads are now represented by non-overlapping contigs (which are consensus sequences), when I map them they always give a sequencing depth of 1 (since they don't overlap, a base can be covered by one contig at most).
For example, here's a result I get on a scaffold:
But as these contigs are made of several overlapping reads, my real sequencing depth shouldn't be 1 (or at least I suppose, but I'm still a beginner in genomics so maybe I'm missing something). That's why I'm trying to include my raw reads in my contigs mapping.
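To make the point concrete, here is a toy sketch (made-up coordinates, not my real data) that counts how many intervals cover each position of a small scaffold. The `max_depth` helper is a hypothetical name used only for this illustration; it reads 1-based inclusive "start end" pairs and prints the highest per-base depth.

```shell
# Toy illustration with made-up coordinates (not real data).
# max_depth is a hypothetical helper: it reads 1-based inclusive
# "start end" pairs on stdin and prints the maximum per-base depth.
max_depth() {
    awk '{ for (p = $1; p <= $2; p++) if (++d[p] > max) max = d[p] }
         END { print max + 0 }'
}

# Three overlapping reads: positions 8-10 are covered by all of them.
printf '1 10\n5 15\n8 20\n' | max_depth   # prints 3

# The two non-overlapping contigs built from those reads.
printf '1 12\n13 20\n' | max_depth        # prints 1
```

The reads reach a depth of 3 where they overlap, while the contigs built from them never exceed 1.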
I hope it's clear now!
It is not clear why you are assembling the data if you have a reference genome available. Are you using a related reference for the alignment of the contigs? If you are interested in coverage in terms of reads, why don't you align your sequence data directly to the reference?
Circling back to the question above. If you have a reference, there is no need to assemble the data unless you expect the genome you are sequencing to be significantly different. There are ways to do reference-assisted genome assembly in case you want to consider that option.
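Aligning the raw reads directly to the reference and reading the depth from that alignment could look roughly like the sketch below. It assumes `minimap2`, `samtools` and `awk` are installed, and that the reads are Nanopore reads (hence the `map-ont` preset from the original command). The function names, the output file names and the `../raw_reads.fastq` path in the usage note are placeholders, not files from this thread.

```shell
# Sketch only: function names and output file names are hypothetical.

# Align raw reads to the reference, sort and index the alignment,
# then write per-base depth (including zero-coverage positions).
map_reads_to_ref() (
    set -euo pipefail
    ref=$1; reads=$2; threads=${3:-14}
    minimap2 -t "$threads" -ax map-ont "$ref" "$reads" \
        | samtools sort -@ "$threads" -o reads_vs_ref.bam -
    samtools index reads_vs_ref.bam
    samtools depth -a reads_vs_ref.bam > reads_vs_ref.depth.tsv
)

# Mean of column 3 (depth) of `samtools depth` output read from stdin.
mean_depth() {
    awk '{ sum += $3 } END { if (NR > 0) printf "%.2f\n", sum / NR }'
}
```

Usage, with a placeholder path for the raw reads: `map_reads_to_ref ../AMelMel_ref.fasta ../raw_reads.fastq 14`, followed by `mean_depth < reads_vs_ref.depth.tsv`.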
I chose to assemble my data before mapping because I saw it done in some papers, but I may be confusing it with de novo assembly. Also, I'm studying the whole holobiont and trying to assemble MAGs in parallel, so I guess I mixed everything up and wanted to apply a similar approach to both the host and the microbiome.
I'll try aligning my reads directly to the reference genome (which is from a different subspecies than my data, so I don't know how divergent they are), and if that doesn't work well I'll look into reference-assisted genome assembly methods.
Thanks a lot for all your answers!