BAM file with reads unmapped against a genome other than the reference
4.1 years ago
Vca80553 • 0

Hello everyone,

I started with bioinformatics two weeks ago, so my question may be a bit basic, but I couldn't find an answer in other threads. I would really appreciate your help.

This is the case:

I mapped my paired-end reads to my viral reference genome with NextGenMap and kept only the pairs in which both mates mapped. Now I want to filter out the reads that also map to human DNA (my contaminant). To do that, I took "viral_mapped_paired_end_reads.bam", converted it to FASTQ (R1 and R2), and mapped it to hg19. From the resulting BAM file I extracted the unmapped pairs. So I assume these are the paired-end reads that mapped to the virus before but do not map to human. This BAM file contains the reads I want, but only as records unmapped against the human reference genome; it carries no information about the viral reference genome.

Now, how do I continue? I want to do coverage analysis for the viral genome, for example. Can I use the unmapped BAM file, or do I need to take viral_mapped_paired_end_reads.bam and filter out the reads that mapped to human? If so, is that done by extracting read IDs, or do the IDs change depending on the reference genome?

Thanks a lot

bam unmapped no reference genome • 1.4k views
4.1 years ago
mastal511 ★ 2.1k

Map your reads to hg19 first, remove the reads that map, then align the unmapped reads to the viral genome.
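
The "remove the reads that map" step comes down to a test on the SAM FLAG field: a pair is kept only when both the read and its mate are flagged as unmapped (bits 0x4 and 0x8). With samtools this selection is commonly written as `samtools view -b -f 12 -F 256`. Below is a minimal Python sketch of the same logic on SAM text, assuming the hg19 alignment has been dumped to SAM (for example with `samtools view -h`); the function names are illustrative and not part of any library.

```python
# Sketch: keep only pairs in which neither mate aligned to the host (hg19).
# Works on SAM text lines; function names here are illustrative.

UNMAPPED = 0x4         # this read is unmapped
MATE_UNMAPPED = 0x8    # the mate of this read is unmapped
SECONDARY = 0x100      # secondary alignment
SUPPLEMENTARY = 0x800  # supplementary alignment


def both_mates_unmapped(flag: int) -> bool:
    """True if the FLAG marks a primary record whose whole pair is unmapped."""
    wanted = UNMAPPED | MATE_UNMAPPED
    is_primary = not (flag & (SECONDARY | SUPPLEMENTARY))
    return (flag & wanted) == wanted and is_primary


def filter_host_unmapped(sam_lines):
    """Yield header lines plus records from pairs with no host alignment."""
    for line in sam_lines:
        if line.startswith("@"):  # SAM header lines
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if both_mates_unmapped(int(fields[1])):  # column 2 is FLAG
            yield line
```

The records that pass this filter can then be converted back to paired FASTQ (for example with `samtools fastq`) and aligned to the viral genome, so that the final BAM is expressed in viral coordinates and can be used for coverage analysis.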


Thanks a lot! I thought it would be faster the other way around, as the viral genome is less than 10,000 bp. I will do it as you say.
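
On the read-ID part of the question: read names (the QNAME column) are copied from the FASTQ and do not depend on the reference, so the existing viral BAM could in principle be subset by name instead of being realigned. A minimal Python sketch of that alternative follows, assuming both alignments are available as SAM text and that a trailing `/1` or `/2` (which some BAM-to-FASTQ conversions append) should be ignored when comparing names. All function names are hypothetical.

```python
# Sketch: subset the viral alignment to reads whose names appear in the
# host-unmapped set. Assumes SAM text input; names are illustrative.


def normalize_name(qname: str) -> str:
    """Drop a trailing /1 or /2 mate suffix, if present."""
    if qname.endswith(("/1", "/2")):
        return qname[:-2]
    return qname


def read_names(sam_lines) -> set:
    """Collect the normalized read names found in SAM records."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        names.add(normalize_name(line.split("\t", 1)[0]))
    return names


def keep_by_name(viral_sam_lines, keep_names):
    """Yield viral header lines and the records whose name is in keep_names."""
    for line in viral_sam_lines:
        if line.startswith("@"):
            yield line
        elif normalize_name(line.split("\t", 1)[0]) in keep_names:
            yield line
```

Here `keep_names` would be built by calling `read_names` on the host-unmapped records, and `keep_by_name` would then be run over the original viral alignment, which keeps the viral coordinates intact.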

