How to properly subset a BAM file?
20 months ago

Hello!

I've been pulling my hair out Googling this and trying the solutions I found, none of which worked in the end. It baffles me that Samtools does not have a command to do just this.

I've tried view:

samtools view -h in.bam chr{1..22} chr{X,Y,M} > out.bam

This properly removes reads, but not the corresponding header lines of the unwanted contigs.

I've tried reheader:

samtools reheader -c " <sed commands that delete the header lines for the unwanted contigs> " in.bam > out.bam

But while the header looks correct afterwards and out.bam can be indexed, the resulting file is truncated after a few hundred reads, even though the file size is several GB!

What's going on? What is the proper, standard, official way of subsetting a BAM without breaking the BAM format, as I have?

Before you ask, I really need to remove both the reads and the header lines. The alternative is to ask several developers to change their programs, and I suspect that's the worse solution.

Apologies for my frustration, and big big thanks in advance!

Joel


Before you ask, I really need to remove both the reads and the header lines.

Nevertheless, I'm asking: WHY?!


Because the files are being parsed by programs I'm using in my research, and the extra contigs are causing crashes since they're not among the standard chr1-22, X, Y, M contigs!
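For what it's worth, a likely explanation for the truncation: BAM records store their reference (and mate reference) as a numeric index into the header's @SQ list, and samtools reheader replaces only the header without rewriting the records. Deleting @SQ lines therefore shifts or invalidates those indices. One way around it, sketched below, is to filter the header and the reads together as SAM text and let samtools re-encode the result. The helper name keep_contigs is made up for this example, and dropping reads whose mate sits on a removed contig is an assumption about what is wanted, not something the thread settles.

```shell
#!/usr/bin/env bash
# Sketch only: keep chr1-22, chrX, chrY, chrM in both header and records.
# keep_contigs is a hypothetical helper, not part of samtools.

keep_contigs() {
  # Reads SAM text on stdin, writes filtered SAM text on stdout.
  awk -F'\t' '
    BEGIN {
      for (i = 1; i <= 22; i++) keep["chr" i] = 1
      keep["chrX"] = 1; keep["chrY"] = 1; keep["chrM"] = 1
    }
    /^@SQ/ {
      # Print the @SQ line only if its SN: name is a wanted contig.
      for (i = 2; i <= NF; i++)
        if (substr($i, 1, 3) == "SN:" && (substr($i, 4) in keep)) { print; break }
      next
    }
    /^@/ { print; next }  # pass all other header lines through unchanged
    # Keep a read only if it maps to a wanted contig AND its mate reference
    # (column 7) is "=", "*", or another wanted contig. Assumption: reads
    # with mates on removed contigs are dropped rather than repaired.
    ($3 in keep) && ($7 == "=" || $7 == "*" || ($7 in keep)) { print }
  '
}

# Usage (requires samtools; unmapped reads are dropped as well):
#   samtools view -h in.bam | keep_contigs | samtools view -b -o out.bam -
#   samtools index out.bam
```

Because the final samtools view re-encodes the records against the filtered header, the reference indices stay consistent with the remaining @SQ lines. Other header lines (for example @PG) are passed through untouched, and it is worth checking the result with samtools quickcheck or samtools idxstats before handing it to downstream tools.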
