samtools view extract from .bed list to .bam file
2.5 years ago

Hi -

I am attempting to extract certain regions from a large .bam file into a smaller subsetted .bam file using samtools view. I'm doing so in order to have a smaller file to down/upload for viewing in IGV.

I have a tab-delimited .bed file with the regions of interest and no extra whitespace (my "chromosomes" are actually specific genes in this case, so each region starts at 0 and is comparatively short). Below are the first 3 rows. The total number of regions/rows/genes is between 60 and 90.

52283_0_001313  0   895
52283_0_001c02  0   3102
52283_0_001309  0   2203

I have used samtools view to attempt to extract these regions of interest:

samtools view -h -b -L regions.bed input.bam > output.bam

However, the output.bam file contains only the first region from the .bed file (e.g., "52283_0_001313" only). I have double-checked that the "chromosome" names match in my .bam file and .bed file using samtools view -H input.bam.

I am using samtools version 1.10.

Any suggestions on why this is happening or how to remedy this issue? Thank you.


Check the output of samtools view -c input.bam "52283_0_001c02:1-3102"
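To run the same check for every row of the BED file rather than one gene, the region strings can be generated from the file. This is only a sketch: `bed_to_regions` is a made-up helper name, and it assumes a tab-delimited BED with Unix line endings. BED starts are 0-based and samtools region strings are 1-based, so the start is shifted by one (which is why row `0  3102` becomes `1-3102` above).

```shell
# Hypothetical helper: turn BED rows (chrom, 0-based start, end) into
# samtools-style region strings "chrom:start-end" with a 1-based start.
# Usage: bed_to_regions regions.bed
bed_to_regions() {
    awk -F'\t' 'NF >= 3 { printf "%s:%d-%s\n", $1, $2 + 1, $3 }' "$1"
}
```

Each printed region could then be passed to `samtools view -c input.bam "<region>"` in a `while read` loop to see which genes actually have reads in the input.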


Thank you for your reply! The output of the above was 3378, but GenoMax's solution fixed the issue.

2.5 years ago
GenoMax 141k

Can you check your BED file with cat -vet your.bed to make sure your line endings are unix-compatible? Did you make this file on a PC/Mac and then transfer it to your server? If so, running dos2unix your.bed may fix the problem.
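If reading `cat -vet` output by eye is awkward for a longer file, the same check can be approximated by counting carriage returns and line feeds. This is a rough sketch, not a standard tool: `detect_eol` is a hypothetical name, and the CRLF case is only a guess based on equal counts.

```shell
# Hypothetical helper: report the likely line-ending style of a text file.
# Usage: detect_eol your.bed
detect_eol() {
    file=$1
    # Count carriage returns (\r) and line feeds (\n) in the file.
    cr=$(tr -cd '\r' < "$file" | wc -c | tr -d ' ')
    lf=$(tr -cd '\n' < "$file" | wc -c | tr -d ' ')
    if [ "$cr" -eq 0 ]; then
        echo "unix (LF)"
    elif [ "$lf" -eq 0 ]; then
        echo "classic Mac (CR only)"
    elif [ "$cr" -eq "$lf" ]; then
        echo "likely DOS/Windows (CRLF)"
    else
        echo "mixed"
    fi
}
```

Anything other than "unix (LF)" suggests the file should be converted before it is given to `samtools view -L`.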


Thank you for your quick reply! That was indeed the problem. I made the file on a Mac and transferred it to the server. I used mac2unix myfile.bed and that fixed it. Thank you very much!

Just in case others will find use in seeing the output:

> cat -vet myfile.bed

52283_0_001313^I0^I895^M52283_0_001c02^I0^I3102^M52283_0_001309^I0^I2203^M

Then

> mac2unix myfile.bed 
> cat -vet myfile.bed

52283_0_001313^I0^I895$
52283_0_001c02^I0^I3102$
52283_0_001309^I0^I2203$
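
For anyone without mac2unix installed: in the first `cat -vet` output the rows are separated by `^M` (carriage return) with no `$` line ends, so Unix tools see the whole file as a single line, which is presumably why only the first region was picked up. A rough equivalent of the conversion using only `tr` might look like this. `fix_eol` is a hypothetical helper name, and it writes a new file rather than editing in place.

```shell
# Hypothetical helper: write a copy of a text file with Unix (LF) line endings.
# Usage: fix_eol myfile.bed myfile.unix.bed
fix_eol() {
    in=$1
    out=$2
    lf=$(tr -cd '\n' < "$in" | wc -c | tr -d ' ')
    if [ "$lf" -gt 0 ]; then
        # File already contains LF characters: assume CRLF and drop the CRs.
        tr -d '\r' < "$in" > "$out"
    else
        # No LF at all: assume classic Mac endings and turn each CR into LF.
        tr '\r' '\n' < "$in" > "$out"
    fi
}

# After converting, the original command can be re-run on the new file:
# samtools view -h -b -L myfile.unix.bed input.bam > output.bam
```

Running `cat -vet` on the converted file should then show one region per line, each ending in `$`.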