Question: Split a BAM by multiple chromosomes into a single BAM file
3.4 years ago by
l0o0190
China
l0o0190 wrote:

Hi, I am using a bowtie + samtools pipeline to call SNPs. Splitting the BAM file and calling SNPs per chromosome saves a lot of time.

But the reference genome has many scaffolds, so splitting the BAM by chromosome produces a lot of per-scaffold BAM files. Instead, I want to split by the pattern scaffold*, so that all the scaffolds end up in a single file.

Is there any way to do that?

I tried this command:

samtools view in.bam scaffold1 scaffold2 -b > scaffold1_2.bam

but I don't know how to check that scaffold1_2.bam contains both scaffold1 and scaffold2.

 

Thanks

snp genome • 3.1k views
3.4 years ago by
dariober10k
WCIP | Glasgow | UK
dariober10k wrote:

Maybe something on these lines?

First prepare a string of the required scaffold names. You can extract all the scaffold names from the BAM header and use grep to keep only those matching a certain pattern. In this example, get only the scaffolds starting with "chr1":

# Collect reference names from the @SQ header lines, keeping those that start with "chr1"
chroms=$(samtools view -H in.bam \
| awk '$1 == "@SQ" {sub("SN:", "", $2); print $2}' \
| grep '^chr1')
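The same header parsing can be sketched as a small reusable function. This is only a sketch: the function name `select_refs` and the sample header are made up for illustration, and it looks for the SN: tag in any column rather than assuming it is the second field.

```shell
# Hypothetical helper: read SAM header lines on stdin and print the
# reference names (SN: tag of @SQ lines) that match the regex in $1.
select_refs() {
    awk -v pat="$1" -F'\t' '
        $1 == "@SQ" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SN:/) {
                    name = substr($i, 4)   # drop the "SN:" prefix
                    if (name ~ pat) print name
                }
        }'
}

# Made-up header for illustration; with a real file, pipe in
# `samtools view -H in.bam` instead of this text.
header=$(printf '@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:scaffold1\tLN:500\n@SQ\tSN:scaffold2\tLN:300\n')

# Keep only the names starting with "scaffold".
scaffolds=$(printf '%s\n' "$header" | select_refs '^scaffold')
printf '%s\n' "$scaffolds"
```

With a real BAM this would be used as `samtools view -H in.bam | select_refs '^scaffold'`.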

Then pass this string to samtools (region queries like this need in.bam to be coordinate-sorted and indexed). If the string is really long, you might need xargs to split it; otherwise you could exceed the maximum length of a single command (assuming you are on a *nix system):

echo $chroms | xargs samtools view -b in.bam > scaffold1_2.bam

(Not fully tested...)
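One caveat with very long name lists: if xargs does split the list into several samtools invocations, each one writes its own complete BAM (header included) to the same output stream, and the concatenation is probably not a valid single BAM. A possible alternative, sketched below, is to write the selected scaffolds to a BED file and pass it with the `-L` option of `samtools view`. The function names `header_to_bed` and `extract_refs` are hypothetical, and as far as I know `-L` filters while reading through the whole file, so it may be slower than indexed region queries.

```shell
# Hypothetical helper: turn the @SQ header lines on stdin into BED records
# (name, 0, length) for reference names matching the regex in $1.
header_to_bed() {
    awk -v pat="$1" -F'\t' -v OFS='\t' '
        $1 == "@SQ" {
            name = ""; len = ""
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) name = substr($i, 4)
                if ($i ~ /^LN:/) len = substr($i, 4)
            }
            if (name ~ pat && len != "") print name, 0, len
        }'
}

# Hypothetical wrapper: extract all matching references into one BAM.
# Usage: extract_refs in.bam '^scaffold' out.bam
extract_refs() {
    bed=$(mktemp) || return 1
    samtools view -H "$1" | header_to_bed "$2" > "$bed"
    # -L keeps only reads overlapping the regions listed in the BED file
    samtools view -b -L "$bed" "$1" > "$3"
    status=$?
    rm -f "$bed"
    return "$status"
}
```

Because the whole selection goes through a single samtools call, the output has one header regardless of how many scaffolds match.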

 

 


Hi dariober. I've tested your commands, and they work. Thanks for your reply.

written 3.4 years ago by l0o0
3.4 years ago by
l0o0190
China
l0o0190 wrote:

I tried the command from my question, and the output does contain both scaffolds:

samtools view in.bam scaffold1 scaffold2 -b > scaffold1_2.bam

So it works.
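For anyone wondering how to check this, one possible way is to list the distinct reference names in column 3 of the alignment records. The sketch below uses a hypothetical function `refs_in_sam` and made-up records so it runs without a BAM file; it is not necessarily how the check above was done.

```shell
# Hypothetical helper: read SAM records (no header) on stdin and print
# each distinct reference name found in column 3 (RNAME).
refs_in_sam() {
    cut -f3 | sort -u
}

# Made-up alignment records for illustration; with a real file use:
#   samtools view scaffold1_2.bam | refs_in_sam
found=$(printf 'r1\t0\tscaffold1\t10\t60\t5M\t*\t0\t0\tACGTA\tIIIII\nr2\t0\tscaffold2\t20\t60\t5M\t*\t0\t0\tACGTA\tIIIII\nr3\t16\tscaffold1\t30\t60\t5M\t*\t0\t0\tACGTA\tIIIII\n' | refs_in_sam)
printf '%s\n' "$found"
```

After indexing the output with `samtools index`, `samtools idxstats scaffold1_2.bam` should also report per-reference read counts.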
