Question: Low Mapping percentage with Minimap2 due to contamination
6 months ago
rah20 wrote:

Hi everyone,

I'm analysing Oxford Nanopore sequenced DNA from human cells, which I align to the hg19 UCSC reference genome. Most of the time I get a high mapping percentage; however, in a few cases I get mapping percentages below 5%.

Of course I'm able to tune the parameters of minimap2 a bit, but for those samples I never get more than 5% of the sequences mapped. The basic mapping is done as follows:

minimap2 -t 8 -ax map-ont --secondary=no hg19.25chr.mmi xx.fastq | samtools sort - > xx.bam

Then, for the cases with low mapping percentages, I extract some of the unmapped sequences, and by using BLAST I find that the low mapping percentages are due to viral contamination of my samples.

Does anyone know of a better method or database for assessing which bacteria, viruses, or other species the unmapped reads align to, instead of identifying the contaminating species by BLASTing a random subset of reads and then realigning to their respective reference genomes?
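A minimal sketch of that extraction step, for illustration only. It assumes the alignments are available as uncompressed SAM text (for example the direct output of minimap2 -a before sorting), and the function name split_sam is hypothetical. Secondary and supplementary records are skipped so that each read is counted once.

```python
# SAM flag bits, as defined in the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def split_sam(sam_lines):
    """Count primary records and collect unmapped reads as FASTQ strings.

    Returns (total_reads, mapped_reads, unmapped_fastq_records).
    """
    total = 0
    mapped = 0
    unmapped = []
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # count each read only once
        total += 1
        if flag & UNMAPPED:
            unmapped.append(f"@{name}\n{seq}\n+\n{qual}\n")
        else:
            mapped += 1
    return total, mapped, unmapped


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python split_sam.py < xx.sam > unmapped.fastq
    total, mapped, records = split_sam(sys.stdin)
    sys.stdout.writelines(records)
    if total:
        print(f"mapped: {100.0 * mapped / total:.1f}% of {total} reads",
              file=sys.stderr)
```

The unmapped FASTQ records can then be passed to BLAST or to any other classifier.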

Thanks for your help.

6 months ago

Since you have long reads, you are probably going to be limited in which of the currently available tools you can use to screen for contamination. Kraken is generally used for this, but I am not sure whether it will accept long reads.

You could use a tool from BBMap to chop your reads up into smaller pieces and then potentially use Kraken on those.
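If BBMap is not at hand, the chopping itself can be sketched in plain Python. This is only an illustration of the idea, not BBMap's implementation: the function names shred and shred_fastq, the default piece length of 500 bases, and the "_<offset>" suffix added to read names are all assumptions, and the input is assumed to be a simple four-line-per-record FASTQ.

```python
import io


def shred(seq, qual, length=500, overlap=0, min_length=1):
    """Split one read into (start, sequence, quality) pieces of at most `length` bases."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must satisfy 0 <= overlap < length")
    step = length - overlap
    pieces = []
    for start in range(0, len(seq), step):
        piece = seq[start:start + length]
        if len(piece) >= min_length:
            pieces.append((start, piece, qual[start:start + length]))
        if start + length >= len(seq):
            break  # this piece already reached the end of the read
    return pieces


def shred_fastq(handle, length=500, overlap=0, min_length=1):
    """Yield FASTQ records for the pieces of every read in a 4-line FASTQ handle."""
    while True:
        header = handle.readline()
        if header == "":  # end of file
            break
        if not header.strip():  # tolerate stray blank lines
            continue
        seq = handle.readline().strip()
        handle.readline()  # the '+' separator line
        qual = handle.readline().strip()
        name = header.strip()[1:].split()[0]
        for start, s, q in shred(seq, qual, length, overlap, min_length):
            yield f"@{name}_{start}\n{s}\n+\n{q}\n"


if __name__ == "__main__":
    demo = io.StringIO("@read1 extra\nACGTACGTAC\n+\nIIIIIIIIII\n")
    for record in shred_fastq(demo, length=4):
        print(record, end="")
```

The offset in each piece name makes it possible to trace a classified piece back to its position in the original long read.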

modified 6 months ago • written 6 months ago by genomax 71k

Thanks for suggesting cutting the reads into smaller pieces; having long reads has been exactly one of my problems.

modified 6 months ago • written 6 months ago

You can try Kraken.

written 6 months ago by Mehmet 490

Thank you, I'll give it a look.

written 6 months ago
