Question: How to change/substitute only the QNAME (read ID) of every read in a BAM file?
4.4 years ago by
Joe30 wrote:

Hi, I need to replace the QNAMEs in one BAM file with the QNAMEs from another BAM file, without changing the header, the sequences, or anything else. Here is an example (sequences and other fields are not shown). I have A.bam:


M04402:7:000000000-ANGUV:1:1106:13178:9134 147  chr1    17027   0 ...............

M04402:7:000000000-ANGUV:1:1106:13178:9134 99   chr1    17205   0....................

M04402:7:000000000-ANGUV:1:2115:20665:6740 147  chr1    17344   0.................
and so on......

I have another file, B.bam, whose QNAMEs are:

M00601:223:AM07W:1:1102:11074:13214 163 chr1    19754   24.............

M00601:223:AM07W:1:2101:17585:17440 99  chr1    17205   0....................

and so on......

I want to change all the A.bam QNAMEs (e.g. M04402:7:000000000-ANGUV:1:1106:13178:9134) to the B.bam QNAMEs (e.g. M00601:223:AM07W:1:1102:11074:13214), changing only the QNAME and nothing else.

The only approach I can think of is to use "sed" on the command line.


modified 4.4 years ago by GenoMax94k • written 4.4 years ago by Joe30

While it can probably be done using the code posted by @i.sudbery, I can't figure out the use case. Why do you want to do this?

written 4.4 years ago by GenoMax94k

The reason I want to do this comes from my other question, "How to merge two identical BAM files?". I want to change the QNAMEs and then redo what I described in that post.

written 4.4 years ago by Joe30

I see. You are creating a fake duplicate with the same sequence data but different read headers (QNAMEs). You could just change ANGUV to BNGUV, which would make it look like a new flowcell :-)
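A minimal sketch of that renaming in plain Python, for illustration only. It assumes Illumina-style read names in which the third colon-separated field carries the flowcell ID (as in the A.bam example above); the function name `rename_flowcell` and its default arguments are hypothetical, not part of any library.

```python
def rename_flowcell(qname: str, old: str = "ANGUV", new: str = "BNGUV") -> str:
    """Return qname with the flowcell suffix `old` swapped for `new`.

    Assumption: the name is colon-separated and the flowcell ID is the
    third field, e.g. M04402:7:000000000-ANGUV:1:1106:13178:9134.
    Names that do not match are returned unchanged.
    """
    fields = qname.split(":")
    if len(fields) > 2 and fields[2].endswith(old):
        # Touch only the flowcell field, so tile and x/y coordinates stay intact.
        fields[2] = fields[2][: -len(old)] + new
    return ":".join(fields)


if __name__ == "__main__":
    print(rename_flowcell("M04402:7:000000000-ANGUV:1:1106:13178:9134"))
```

Because the same substitution is applied to every record, both mates of a pair still end up with identical names.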

written 4.4 years ago by GenoMax94k

It works, thanks a lot

written 4.4 years ago by Joe30
4.4 years ago by
Sheffield, UK
i.sudbery10k wrote:

It would be easy with pysam in Python, assuming that the reads are paired up in the order in which they appear in the two BAM files:

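from pysam import AlignmentFile

# Open both inputs; C.bam reuses the header of A.bam.
bam1 = AlignmentFile("A.bam")
bam2 = AlignmentFile("B.bam")
outbam = AlignmentFile("C.bam", "wb", template=bam1)

# Pair reads by position in the files. On Python 3 the built-in zip
# replaces itertools.izip; it stops at the end of the shorter file.
for read1, read2 in zip(bam1.fetch(until_eof=True), bam2.fetch(until_eof=True)):
    read1.query_name = read2.query_name
    outbam.write(read1)  # write the renamed read to C.bam

outbam.close()
bam1.close()
bam2.close()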
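The same order-based pairing can also be sketched at the text level, on SAM lines such as those printed by `samtools view -h`, which is closer to the "sed" idea in the question. This is only an illustration: the function name `replace_qnames` is hypothetical, and it relies on the SAM conventions that header lines start with "@" and that QNAME is the first tab-separated column.

```python
from typing import Iterable, Iterator


def replace_qnames(sam_lines: Iterable[str], new_names: Iterable[str]) -> Iterator[str]:
    """Yield SAM lines with column 1 (QNAME) replaced, in file order.

    Header lines (starting with "@") pass through untouched, and every
    other column of an alignment line is copied as-is.
    """
    names = iter(new_names)
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # header: keep exactly as in the input
            continue
        name = next(names, None)
        if name is None:
            raise ValueError("ran out of replacement QNAMEs")
        # Split off the first column only; the rest of the record is preserved.
        _old_qname, sep, rest = line.partition("\t")
        yield name + sep + rest


if __name__ == "__main__":
    a_lines = [
        "@HD\tVN:1.6\n",
        "M04402:7:000000000-ANGUV:1:1106:13178:9134\t147\tchr1\t17027\t0\n",
    ]
    b_names = ["M00601:223:AM07W:1:1102:11074:13214"]
    for out_line in replace_qnames(a_lines, b_names):
        print(out_line, end="")
```

Pairing by order gives each record whatever name sits at the same position in the other list, so two mates keep a shared name only if the replacement names line up that way.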

written 4.4 years ago by i.sudbery10k