Extracting matching reads by read ID
7 weeks ago

What tool would you recommend to compare two BAM files and extract matching reads by read ID?

BAM • 426 views

Do you mean without extracting the read names and doing the comparison outside the BAM files? filterbyname.sh from BBMap would be an option. You could also come up with a clever way of using pipes or process substitution. I may post an example later.

Mostly looking for performance-savvy solutions (and general inspiration if there's not a specific tool that would do it)

well, to be fair, I was mostly searching for a clever way to actually compare two BAM files directly, but it seems I'll have to go via extracting the read names first and then use those for subsetting (which is well covered in those posts)
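
For what it's worth, the "extract the read names first, then subset" route can be done in a single streaming pass with a hash lookup. Below is a minimal sketch, not a definitive implementation: filter_sam_by_names is a hypothetical helper name, and it assumes the names file holds one read name per line and that the input is SAM text (for example from samtools view -h).

```shell
#!/bin/sh
# filter_sam_by_names NAMES_FILE < in.sam > out.sam
# Hypothetical helper: keeps SAM header lines plus every alignment whose
# read name (column 1) is listed in NAMES_FILE, one name per line.
filter_sam_by_names() {
    awk -F '\t' -v names="$1" '
        BEGIN {
            # Load the wanted read names into a hash for constant-time lookups.
            while ((getline name < names) > 0) keep[name] = 1
            close(names)
        }
        /^@/ { print; next }   # header lines pass through unchanged
        ($1 in keep)           # alignments: print only if the name is wanted
    '
}

# Possible usage (file names are placeholders):
#   samtools view file1.bam | cut -f1 | sort -u > names_in_file1
#   samtools view -h file2.bam | filter_sam_by_names names_in_file1 \
#       | samtools view -b -o common.bam -
```

Memory grows with the number of distinct read names, since the whole list is held in the hash, but file2.bam itself is only streamed once and never sorted.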

7 weeks ago
GenoMax 111k
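# Collect the unique read names (column 1) from the first BAM file
samtools view file1.bam | cut -f1 | sort -u > names_in_file1

# Keep only the reads in the second BAM file whose names are in that list
filterbyname.sh -Xmx4g in=file2.bam names=names_in_file1 out=file.fq.gz include=t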

file.fq.gz will contain the reads from file2.bam whose names also occur in file1.bam.
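
If BBMap is not at hand, the same two steps can probably be done with samtools alone. This is only a sketch under assumptions: it relies on samtools view supporting -N (--qname-file), which as far as I know needs samtools 1.12 or later, and subset_bam_by_names plus all file names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: subset_bam_by_names REF_BAM QUERY_BAM OUT_BAM
# Writes to OUT_BAM the alignments of QUERY_BAM whose read names occur in REF_BAM.
# Assumes a samtools build whose `view` accepts -N / --qname-file.
subset_bam_by_names() {
    ref_bam=$1      # BAM that supplies the read names
    query_bam=$2    # BAM to be subset
    out_bam=$3      # resulting BAM
    names=$(mktemp) || return 1

    # Step 1: unique read names (column 1) of the reference BAM.
    samtools view "$ref_bam" | cut -f1 | sort -u > "$names"

    # Step 2: keep only alignments whose read name is in the list; -b writes BAM.
    samtools view -b -N "$names" -o "$out_bam" "$query_bam"
    status=$?

    rm -f "$names"
    return "$status"
}

# Possible usage:
#   subset_bam_by_names file1.bam file2.bam common.bam
```

Unlike the fq.gz output above, this keeps the alignment records as they are, so the result is a BAM file straight away.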

nice, except that I'd prefer a BAM file in the end, but I think that's an option for filterbyname.sh

Correct. You can simply use out=filtered.bam.

7 weeks ago
GenoMax 111k

There is this: https://genome.sph.umich.edu/wiki/BamUtil:_diff

@Pierre also seems to have a tool for this: Comparison between .bam files

BAM file comparison
