Question: How to change/substitute only the QNAME (read ID) of every read in a BAM file?
Joe (USA) wrote, 2.2 years ago:

Hi, I need to replace the QNAMEs in one BAM file with the QNAMEs from another BAM file, without changing the header, the sequences, or anything else. Here is an example (sequences and other fields are not shown). I have A.bam:

A.bamHeader

M04402:7:000000000-ANGUV:1:1106:13178:9134 147  chr1    17027   0 ...............

M04402:7:000000000-ANGUV:1:1106:13178:9134 99   chr1    17205   0....................

M04402:7:000000000-ANGUV:1:2115:20665:6740 147  chr1    17344   0.................
and so on......

I have another file, B.bam, whose QNAMEs are:

M00601:223:AM07W:1:1102:11074:13214 163 chr1    19754   24.............

M00601:223:AM07W:1:2101:17585:17440 99  chr1    17205   0....................

and so on......

I want to change all the A.bam QNAMEs (e.g. M04402:7:000000000-ANGUV:1:1106:13178:9134) to the B.bam QNAMEs (e.g. M00601:223:AM07W:1:1102:11074:13214), changing only the QNAME and nothing else.

The only approach I can think of is to use "sed" on the command line.

Thanks!


While it can probably be done using the code posted by @i.sudbery, I can't figure out the use case. Why do you want to do this?

Reply written 2.2 years ago by genomax

The reason I want to do this is explained in my other question, "How to merge two identical BAM files?". I want to change the QNAMEs and then redo what I posted in that question.

Reply written 2.2 years ago by Joe

I see. You are creating a fake duplicate with same sequence data but different headers. You could just change ANGUV to BNGUV which would make it a new flowcell :-)

Reply written 2.2 years ago by genomax

It works, thanks a lot

Reply written 2.2 years ago by Joe
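For anyone wanting to try the flowcell-renaming idea from the comments, here is a minimal sketch that works on SAM text lines. The function name rename_flowcell and the default old/new strings are hypothetical and chosen only to match the example above. It assumes tab-separated SAM records with the QNAME in the first column, and it passes header lines (those starting with "@") through unchanged.

```python
def rename_flowcell(sam_lines, old="ANGUV", new="BNGUV"):
    """Yield SAM lines with `old` replaced by `new` in the QNAME field only.

    `sam_lines` is any iterable of SAM text lines (for example sys.stdin).
    """
    for line in sam_lines:
        # Header lines start with '@'; leave them exactly as they are.
        if line.startswith("@"):
            yield line
            continue
        # Split off the first tab-separated field (QNAME); the remaining
        # fields (FLAG, RNAME, POS, ...) are kept byte for byte.
        qname, sep, rest = line.partition("\t")
        yield qname.replace(old, new) + sep + rest
```

One way to use it would be to feed it the output of "samtools view -h A.bam", write the yielded lines to standard output, and pipe them into "samtools view -b" to get a BAM file back. Because only the first column is edited, a match for the old string elsewhere in a record is left alone, which a plain global "sed" substitution would not guarantee.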
i.sudbery (Sheffield, UK) wrote, 2.2 years ago:

It would be easy with pysam in Python, assuming that the order of both BAM files was to be maintained:

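from pysam import AlignmentFile

# Read names are taken from B.bam; everything else comes from A.bam.
bam1 = AlignmentFile("A.bam")
bam2 = AlignmentFile("B.bam")
# Reuse A.bam's header for the output file.
outbam = AlignmentFile("C.bam", "wb", template=bam1)

# Pair reads by their position in each file. The built-in zip replaces
# Python 2's itertools.izip; it stops at the end of the shorter file.
for read1, read2 in zip(bam1.fetch(until_eof=True), bam2.fetch(until_eof=True)):
    read1.query_name = read2.query_name
    outbam.write(read1)

outbam.close()
bam1.close()
bam2.close()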
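
Pairing reads with izip or zip stops silently at the end of the shorter input, so a mismatch in read counts would go unnoticed. As a rough sketch of a stricter pairing, the helper below (swap_qnames is a hypothetical name, not part of pysam) works on plain SAM alignment lines without header lines, for example the output of "samtools view", and raises an error if the two inputs have different numbers of records.

```python
from itertools import zip_longest


def swap_qnames(records_a, records_b):
    """Yield records_a lines with their QNAME replaced by the QNAME of the
    record at the same position in records_b.

    Both inputs are iterables of tab-separated SAM alignment lines
    (no '@' header lines). Raises ValueError if their lengths differ.
    """
    missing = object()  # sentinel marking an exhausted input
    for rec_a, rec_b in zip_longest(records_a, records_b, fillvalue=missing):
        if rec_a is missing or rec_b is missing:
            raise ValueError("inputs contain different numbers of records")
        # QNAME is the first tab-separated field of each record.
        new_qname = rec_b.split("\t", 1)[0]
        _, sep, rest = rec_a.partition("\t")
        yield new_qname + sep + rest
```

As with the pysam loop, this relies on the two files being in the intended order. It does not check that mates of a pair still share a QNAME afterwards.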