bedtools coverageBed vs samtools mpileup, speed
8.6 years ago
tonja.r ▴ 600

I would like to use bedtools because it seems to offer many more options for input and output files. However, bedtools appears to be much slower than samtools.

I have multiple BAM files (sorted and indexed) and multiple regions, and I want per-base coverage. For this test, gene_coord_red.bed contains only one region. I ran bedtools coverageBed as follows:

time coverageBed -a gene_coord_red.bed -b reads.sort.bam -d > bedtools.txt
real    2m17.861s
user    1m57.572s
sys    0m19.926s

I ran samtools mpileup:

time samtools mpileup -Q 0 -l gene_coord_red.bed reads.sort.bam > mpileup.txt
real    0m23.595s
user    0m22.808s
sys    0m0.301s

Why is bedtools running so slowly? Can I speed it up somehow?

sequencing • 3.7k views

Bedtools usually has a -sorted option or something like that, which tends to speed things up when the BED/BAM files are sorted.


It does. However, if I sort my BAM file with samtools sort, bedtools reports:

ERROR: Sort order was unspecified, and file `sorted_out.bam` is not sorted lexicographically.

If I convert the .bam file to .bed and then run bedtools with -sorted, the run time is 1m11.249s, which is still much slower than mpileup.
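One possible reading of that error is that samtools sort keeps chromosomes in BAM header order (for example chr1, chr2, chr10), while bedtools with -sorted assumes lexicographic order unless a genome file is passed with -g. The sketch below is only an illustration under that assumption: header_to_genome and demo_header are hypothetical helper names, the header is toy data, and the real-data commands in the trailing comments are not run here.

```shell
# Hypothetical helper: read SAM header text on stdin and print a bedtools
# genome file (chromosome<TAB>length), keeping the header's chromosome order.
header_to_genome() {
    awk -F'\t' '$1 == "@SQ" {
        name = ""; len = ""
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^SN:/) name = substr($i, 4)
            if ($i ~ /^LN:/) len = substr($i, 4)
        }
        print name "\t" len
    }'
}

# Toy header standing in for the output of `samtools view -H` (made-up lengths).
demo_header() {
    printf '@HD\tVN:1.6\tSO:coordinate\n'
    printf '@SQ\tSN:chr1\tLN:1000\n'
    printf '@SQ\tSN:chr2\tLN:900\n'
    printf '@SQ\tSN:chr10\tLN:800\n'
}

# Header order, as samtools sort would keep it: chr1, chr2, chr10
demo_header | header_to_genome

# Lexicographic order, as -sorted assumes without -g: chr1, chr10, chr2
demo_header | header_to_genome | cut -f1 | LC_ALL=C sort

# With real data (not run here; file names taken from the question):
#   samtools view -H reads.sort.bam | header_to_genome > genome.txt
#   bedtools coverage -sorted -g genome.txt -a gene_coord_red.bed -b reads.sort.bam -d
```

If the BED file holds more than one chromosome, it would need to follow the same chromosome order as the genome file for -sorted to accept it.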


I'm not sure that bedtools takes advantage of being able to randomly jump around in the BAM file to get the alignments intersecting each entry in the BED file. This would explain why you have similar times for BAM and BED files, since the latter also don't allow random access (well, without using something like tabix, but bedtools isn't using that).
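To make the random-access idea concrete, here is a rough sketch of querying an indexed BAM one region at a time. It assumes a tab-separated BED file and an index next to the BAM; bed_to_regions is a hypothetical helper, and the samtools loop in the trailing comments is not run here. The only step actually executed is the coordinate conversion, since BED is 0-based half-open while samtools region strings are 1-based inclusive.

```shell
# Hypothetical helper: convert BED intervals on stdin to samtools region strings.
# BED start is 0-based, so add 1; BED end is already the 1-based inclusive end.
bed_to_regions() {
    awk -F'\t' '!/^(#|track|browser)/ && NF >= 3 {
        printf "%s:%d-%d\n", $1, $2 + 1, $3
    }'
}

# Toy interval (made-up coordinates): prints chr1:100-200
printf 'chr1\t99\t200\tgeneA\n' | bed_to_regions

# With real data (not run here; file names taken from the question):
#   bed_to_regions < gene_coord_red.bed | while read -r region; do
#       samtools depth -a -r "$region" reads.sort.bam
#   done > depth.txt
```

The per-base counts may not match coverageBed -d exactly, because samtools depth applies its own default read filters.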


Hello tonja.rand!

It appears that your post has been cross-posted to another site: http://seqanswers.com/forums/showthread.php?t=62741

This is typically not recommended as it runs the risk of annoying people in both communities.
