samtools view extract from .bed list to .bam file
11 months ago

Hi -

I am attempting to extract certain regions from a large .bam file into a smaller subsetted .bam file using samtools view. I'm doing so in order to have a smaller file to down/upload for viewing in IGV.

I have a .bed file with the regions of interest, tab-delimited, with no extra whitespace (my "chromosomes" are actually specific genes in this case, so each region starts at 0 and is comparatively short). This is a snippet of the first 3 rows. The total number of regions/rows/genes is between 60 and 90.

52283_0_001313  0   895
52283_0_001c02  0   3102
52283_0_001309  0   2203


I have used samtools view to attempt to extract these regions of interest:

samtools view -h -b -L regions.bed input.bam > output.bam


However, the output.bam file only consists of the first region from the .bed file (e.g., "52283_0_001313" only). I have double-checked that the "chromosome" names match in my .bam file and .bed file using samtools view -H input.bam

I am using samtools version 1.10.

Any suggestions on why this is happening or how to remedy this issue? Thank you.


Check the output of samtools view -c input.bam "52283_0_001c02:1-3102"
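To run the same check for every region rather than one at a time, something like the following might work. It is only a sketch: `demo_regions.bed` and the `bed_to_regions` helper are made up for illustration, and the samtools loop assumes an indexed `input.bam` in the current directory. BED intervals are 0-based and half-open, while samtools region strings are 1-based and inclusive, hence the `+ 1` on the start.

```shell
# Hypothetical demo file standing in for regions.bed (Unix line endings).
printf '52283_0_001313\t0\t895\n52283_0_001c02\t0\t3102\n' > demo_regions.bed

# Convert BED rows (0-based, half-open) to samtools region strings
# (1-based, inclusive): name<TAB>0<TAB>895 becomes name:1-895.
bed_to_regions() {
    awk -F'\t' 'NF >= 3 { printf "%s:%d-%d\n", $1, $2 + 1, $3 }' "$1"
}

regions=$(bed_to_regions demo_regions.bed)
echo "$regions"

# Only attempted when samtools and input.bam are actually present.
if command -v samtools >/dev/null 2>&1 && [ -f input.bam ]; then
    for r in $regions; do
        printf '%s\t%s\n' "$r" "$(samtools view -c input.bam "$r")"
    done
fi
```

If every region reports a non-zero count here but the `-L` output still contains only one region, the BAM is probably fine and the BED file itself is the more likely culprit.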


Thank you for your reply! The output of the above was 3378, but GenoMax's solution fixed the issue.

11 months ago
GenoMax 120k

Can you check your BED file with cat -vet your.bed to make sure your line endings are unix-compatible? Did you make this file on a PC/mac and then transfer it to your server? If so, running dos2unix your.bed (or mac2unix your.bed if the lines end in a bare carriage return) may fix the problem.
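If the conversion tools are not installed on the server, the same check and fix can be approximated with `tr`. This is a rough sketch rather than a replacement for the real tools: `demo_mac.bed` and `demo_unix.bed` are hypothetical file names, and the demo file is written with CR-only line endings purely to illustrate the problem.

```shell
# Hypothetical demo file with CR-only (classic Mac) line endings.
printf '52283_0_001313\t0\t895\r52283_0_001c02\t0\t3102\r52283_0_001309\t0\t2203\r' > demo_mac.bed

# Count carriage returns and line feeds; a Unix file has LFs and no CRs.
cr_count=$(tr -cd '\r' < demo_mac.bed | wc -c | tr -d ' ')
lf_count=$(tr -cd '\n' < demo_mac.bed | wc -c | tr -d ' ')
echo "CR=$cr_count LF=$lf_count"

if [ "$cr_count" -gt 0 ] && [ "$lf_count" -eq 0 ]; then
    # CR-only endings: turn each CR into an LF.
    tr '\r' '\n' < demo_mac.bed > demo_unix.bed
elif [ "$cr_count" -gt 0 ]; then
    # CRLF endings: drop the CRs and keep the LFs.
    tr -d '\r' < demo_mac.bed > demo_unix.bed
else
    # Already Unix-style; copy unchanged.
    cp demo_mac.bed demo_unix.bed
fi

# Number of lines after conversion.
wc -l < demo_unix.bed
```

A file with CR-only endings contains no newline at all, so line-oriented tools see the whole BED file as a single line, which would be consistent with only the first region being picked up.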


Thank you for your quick reply! That was indeed the problem. I made the file on a Mac and transferred it to the server. I used mac2unix myfile.bed and that fixed it. Thank you very much!

Just in case others will find use in seeing the output:

> cat -vet myfile.bed

52283_0_001313^I0^I895^M52283_0_001c02^I0^I3102^M52283_0_001309^I0^I2203^M


Then

> mac2unix myfile.bed
> cat -vet myfile.bed

52283_0_001313^I0^I895$
52283_0_001c02^I0^I3102$
52283_0_001309^I0^I2203$
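For anyone who wants to catch this before running samtools, a small pre-flight check along these lines might help. It is a sketch under assumptions: `check_bed` and the two demo files are hypothetical, and the rule it applies (at least three tab-separated fields, integer start and end, start less than end) is a simplification of the BED format, not a full validator.

```shell
# Hypothetical demo files: one with Unix endings, one with CR-only endings.
printf '52283_0_001313\t0\t895\n52283_0_001c02\t0\t3102\n' > demo_ok.bed
printf '52283_0_001313\t0\t895\r52283_0_001c02\t0\t3102\r' > demo_cr.bed

# Returns success only if every line has at least three tab-separated
# fields with integer start and end coordinates and start < end.
check_bed() {
    awk -F'\t' '
        NF < 3 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $2 + 0 >= $3 + 0 {
            printf "line %d looks malformed\n", NR
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

check_bed demo_ok.bed && echo "demo_ok.bed: OK"
check_bed demo_cr.bed || echo "demo_cr.bed: malformed"
```

With stray carriage returns the end coordinate is no longer a plain integer, so the check fails on the affected line.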