Question: bedtools, merge function: avoid merging intervals if separated by a single base
Asked 9 weeks ago by gabri

Hi All,

I'm using bedtools v2.26.0 to combine overlapping intervals of a BED file into “merged” intervals. I have a problem with some SNP features (same start and end coordinate). Here are my BED file and command line:

input.bed

chr1  70833  70833  a
chr1  70837  70837  b
chr1  70839  70839  c
chr1  71001  71001  d

$ bedtools merge -i input.bed -c 4 -o collapse > output.bed

output.bed

chr1  70833  70833  a
chr1  70836  70840  b,c
chr1  71001  71001  d

By default, overlapping and/or "book-ended" features are combined.

For my analysis, I need to be very accurate, so I only want to merge features that truly overlap. Features that are separated by one or two bases must remain separate. In this case, the output should be identical to the input because no intervals overlap:

output.bed

chr1  70833  70833  a
chr1  70837  70837  b
chr1  70839  70839  c
chr1  71001  71001  d

Is there a way to obtain this kind of sensitivity with bedtools?

Thanks


Hello gabri,

The output of bedtools is interesting. I'm not sure whether this is a bug or by design.

Nevertheless, I think your BED file doesn't represent the positions you think it does. BED uses 0-based, half-open intervals: counting starts at 0 instead of 1, and the end position, given in the third column, is not included. Consequently, none of your intervals contains any bases.
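
To make the coordinate convention concrete, here is a minimal sketch. It assumes the start and end columns of the original file both hold a 1-based SNP position, which the question does not state explicitly. In BED, a record covers `end - start` bases, so a 1-based position p becomes start p-1 and end p. The helper names `pos_to_bed` and `bed_length` are made up for this illustration and are not part of bedtools.

```shell
# Hypothetical helper: turn records whose start and end both hold a
# 1-based position into 0-based, half-open BED intervals.
# Assumes four whitespace-separated columns: chrom, start, end, name.
pos_to_bed() {
  awk 'BEGIN { OFS = "\t" } { print $1, $2 - 1, $3, $4 }' "$@"
}

# Hypothetical helper: report how many bases each BED record covers.
# In a half-open interval this is simply end minus start.
bed_length() {
  awk 'BEGIN { OFS = "\t" } { print $4, $3 - $2 }' "$@"
}
```

Something like `bed_length input.bed` would report a length of 0 for every record in the original file, while `pos_to_bed input.bed | bed_length` would report 1 for each SNP.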

I guess your BED file should look like this:

chr1  70832  70833  a
chr1  70836  70837  b
chr1  70838  70839  c
chr1  71000  71001  d
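
If merging should happen only when features share at least one base, the rule can be sketched independently of bedtools. The following is a rough awk illustration, not bedtools' own algorithm. It assumes input sorted by chromosome and start, four columns, and a made-up function name `merge_strict`. bedtools documents the `-d` option of `merge` as the maximum distance allowed between merged features (default 0). Whether a negative value is accepted in v2.26.0 to require real overlap is an assumption worth checking with `bedtools merge -h`.

```shell
# Hypothetical helper: merge only intervals that truly overlap
# (next start < running end). Book-ended intervals, where the next
# start equals the running end, are kept separate.
# Assumes input sorted by chromosome, then by start.
merge_strict() {
  awk 'BEGIN { OFS = "\t" }
    # Print the interval collected so far, if there is one.
    function emit() { if (have) print chrom, cur_start, cur_end, names }
    {
      if (have && $1 == chrom && ($2 + 0) < (cur_end + 0)) {
        # Strict overlap: extend the running interval if needed
        # and collapse the name column with commas.
        if (($3 + 0) > (cur_end + 0)) cur_end = $3
        names = names "," $4
      } else {
        # No shared base: emit the previous interval, start a new one.
        emit()
        chrom = $1; cur_start = $2; cur_end = $3; names = $4; have = 1
      }
    }
    END { emit() }' "$@"
}
```

With the corrected coordinates above, no record shares a base with its neighbor, so all four features would come out unmerged.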

fin swimmer

Answered 9 weeks ago by finswimmer