Question: Split a BAM by multiple chromosomes into a single BAM file
l0o0 wrote:

Hi, I am using a bowtie + samtools pipeline to call SNPs. Splitting the BAM file and calling SNPs per chromosome saves a lot of time.

But the reference genome has many scaffolds, so splitting the BAM by chromosome produces a lot of per-scaffold BAM files. Instead, I want to split the BAM by scaffold*, so that all the scaffolds end up in one file.

Is there any way to do that?

I tried this command:

samtools view in.bam scaffold1 scaffold2 -b > scaffold1_2.bam

but I don't know how to check that scaffold1_2.bam contains both scaffold1 and scaffold2.

Thanks

dariober wrote:

Maybe something along these lines?

First, prepare a string of the required scaffold names. You can extract all the scaffold names from the BAM header and use grep to keep only those matching a certain pattern. This example keeps only scaffolds starting with "chr1":

chroms=$(samtools view -H in.bam \
| awk '$1 == "@SQ" {sub("SN:", "", $2); print $2}' \
| grep '^chr1')

Then pass this string to samtools. If the string is really long, you might need xargs to split it; otherwise you may exceed the maximum length of a single command (assuming you are on a *nix system):

echo $chroms | xargs samtools view -b in.bam > scaffold1_2.bam
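
One caveat with the xargs route: if the name list is long enough that xargs splits it across several samtools invocations, each invocation writes its own complete BAM stream (header included) into the same redirect, so the result would probably not be a single valid BAM. A possible workaround, sketched below, is to write the selected scaffolds to a BED file and pass it with the -L option of samtools view. The function name header_to_bed and the file names are made up for illustration, and the sketch assumes every @SQ line carries both an SN and an LN tag.

```shell
# Hypothetical helper: read SAM header text on stdin and print one BED line
# (name, 0, length) for every @SQ entry whose name matches the regex in $1.
header_to_bed() {
    awk -F'\t' -v pat="$1" '
        $1 == "@SQ" {
            name = ""; len = ""
            # Find SN and LN by tag rather than by column position.
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) name = substr($i, 4)
                if ($i ~ /^LN:/) len  = substr($i, 4)
            }
            if (name != "" && len != "" && name ~ pat)
                printf "%s\t0\t%s\n", name, len
        }'
}

# Intended use (file names are placeholders):
#   samtools view -H in.bam | header_to_bed '^scaffold' > scaffolds.bed
#   samtools view -b -L scaffolds.bed in.bam > scaffolds.bam
```

Because the scaffold names travel in a file rather than on the command line, the argument-length limit should not come into play, and samtools runs only once.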

(Not fully tested...)

Hi dariober. I've tested your commands, and they work. Thanks for your reply.

Reply written 3.1 years ago by l0o0
l0o0 wrote:

I've tried the command above, and the output contains both scaffolds!

samtools view in.bam scaffold1 scaffold2 -b > scaffold1_2.bam

It works.
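
For anyone wanting to repeat that check, one option is to tally the reference-name column (column 3, RNAME) of the alignment records. This is only a sketch: count_refs is a hypothetical helper name, and it expects SAM text on stdin, such as the output of samtools view.

```shell
# Hypothetical helper: read SAM alignment records on stdin and print
# "<reference name><TAB><record count>" for each reference, sorted by name.
count_refs() {
    awk -F'\t' '
        /^@/ { next }     # skip header lines if any are present
        { n[$3]++ }       # column 3 is RNAME, the reference name
        END { for (r in n) printf "%s\t%d\n", r, n[r] }
    ' | LC_ALL=C sort
}

# Intended use:
#   samtools view scaffold1_2.bam | count_refs
```

If only scaffold1 and scaffold2 show up in the output, the subset worked. On an indexed BAM, samtools idxstats reports per-reference read counts too, though it lists every sequence named in the header, including those with zero reads.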
