Question

Keep the sequencing depth info when mapping contigs in Minimap2

4 months ago

pauline10albert • 0

Hello !

I assembled my raw reads (FASTQ) into contigs (FASTA) and then mapped those contigs to a reference genome (output: SAM). My command line:

minimap2 -t 14 -x map-ont -a ../AMelMel_ref.fasta ../polished_assembly_v2_1.fasta > v2_polished_assembly_mapping_1.sam

However, because I'm mapping contigs, I lose all the sequencing depth information I have with my raw reads. The problem is that, because I use a polishing step on my contigs, I can't map my raw reads directly to the reference genome.

Do you perhaps know a way to include in Minimap2 either the raw reads (fastq), or another sam of the mapping of my raw reads on the contigs, along with my contigs and my reference genome? Or any other way to keep the depth info throughout the whole process?

Thank you!

depth coverage minimap2 • 495 views

ADD COMMENT • link 4 months ago by pauline10albert • 0

"I lose all the sequencing depth information I have with my raw reads"

What does this mean? Can you show an example?

ADD REPLY • link 4 months ago by GenoMax 152k

I mean that since my overlapping reads are now represented by non-overlapping contigs (which are consensus sequences), when I map them they always give a sequencing depth of 1 (since they don't overlap, a base can be covered by one contig at most).

For example, here's a result I get on a scaffold:

#rname      startpos  endpos    numreads  covbases  coverage  meandepth   meanbaseq  meanmapq
CM010319.1  1         27693668  2         7022      0.025356  0.00025356  255        60

But since these contigs are built from several overlapping reads, my real sequencing depth shouldn't be 1 (at least I suppose so; I'm still a beginner in genomics, so maybe I'm missing something). That's why I'm trying to include my raw reads in my contig mapping.

I hope it's clear now!
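
As a quick sanity check on the table above, the coverage and meandepth columns can be reproduced from covbases and endpos alone. This is only a sketch, and it assumes the two aligned contigs do not overlap, so every covered base has a depth of exactly 1 (the variable names are just for illustration):

```shell
# Values copied from the CM010319.1 row of the table above.
endpos=27693668   # length of the reference sequence
covbases=7022     # reference bases covered by at least one contig

# Percentage of the reference covered by at least one alignment.
coverage=$(awk -v c="$covbases" -v l="$endpos" 'BEGIN { printf "%.6f", 100 * c / l }')

# Assumption: depth is 1 on every covered base, so the summed depth equals covbases.
meandepth=$(awk -v c="$covbases" -v l="$endpos" 'BEGIN { printf "%.8f", c / l }')

echo "coverage=${coverage}% meandepth=${meandepth}"
```

Both numbers match the table, which is consistent with a depth of 1 wherever a contig aligns and 0 everywhere else on this scaffold.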

ADD REPLY • link 4 months ago by pauline10albert • 0

It is not clear why you are assembling the data if you have a reference genome available. Are you using a related reference for the alignment of the contigs? If you are interested in finding out coverage in terms of reads why don't you align your sequence data directly to the reference?

"The problem is that, because I use a polishing step on my contigs, I can't map my raw reads directly to the reference genome."

Circling back to the question above: if you have a reference, there is no need to assemble the data unless you expect the genome you are sequencing to be significantly different. There are ways to do reference-assisted genome assembly in case you want to consider that option.
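
For the direct-alignment route, a minimal sketch could look like the following. It assumes minimap2 and samtools are installed and on the PATH, and that the reads are Nanopore reads (as the map-ont preset in the original command suggests). The function name depth_from_reads, the read file name, and the output prefix are hypothetical placeholders, not anything from the original post.

```shell
#!/usr/bin/env bash
# Sketch: per-reference depth from raw reads mapped straight to the reference.
# Assumes minimap2 and samtools are on PATH; all file names are placeholders.
depth_from_reads() {
    local ref="$1" reads="$2" prefix="$3" threads="${4:-4}"

    # Map the raw reads (not the contigs) so overlapping reads stack up as depth,
    # then coordinate-sort the alignments into a BAM file.
    minimap2 -t "$threads" -ax map-ont "$ref" "$reads" \
        | samtools sort -@ "$threads" -o "${prefix}.sorted.bam" - || return 1

    samtools index "${prefix}.sorted.bam" || return 1

    # One row per reference sequence (numreads, covbases, coverage, meandepth, ...).
    samtools coverage "${prefix}.sorted.bam" > "${prefix}.coverage.tsv"
}

# Example call (hypothetical read file name):
# depth_from_reads ../AMelMel_ref.fasta ../raw_reads.fastq reads_vs_ref 14
```

The resulting table has the same columns as the one posted earlier in the thread, but numreads and meandepth now reflect the reads themselves rather than the consensus contigs.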

ADD REPLY • link 4 months ago by GenoMax 152k

I chose to assemble my data prior to mapping because I saw it done in some papers, but I might be confusing it with de novo assembly. Also, I'm studying the whole holobiont and trying to assemble MAGs in parallel, so I guess I mixed everything up and wanted to apply a similar approach to both the host and the microbiome.

I'll try aligning my reads directly to the reference genome (which is from a different subspecies than my data, so I don't know how divergent they are), and if that doesn't work well I'll take a look at reference-assisted genome assembly methods.

Thanks a lot for all your answers!

ADD REPLY • link 4 months ago by pauline10albert • 0