bedtools slop set start to -1
1
0
Entering edit mode
4.8 years ago
Medhat 9.7k

Hi!

I am using bedtools slop for padding bed file with 10k.
bedtool version bedtools v2.28.0

Command:

bedtools slop -header -i gene_annotations.bed -g genome_mainchrs.genome -b 10000 > gene_annotations_padding_10000.bed

Input ex:

1   11868   12227   ENSG00000223972  .....
1   11868   14409   ENSG00000223972  .....
1   11868   14409   ENSG00000223972  ....
1   12009   12057   ENSG00000223972  ......

Output:

1   -1  1868    ENSG00000223972 
1   -1  1868    ENSG00000223972
1   -1  1868    ENSG00000223972 
1   -1  2009    ENSG00000223972

According to bedtools slop

Starts will be set to 0 if options would force it below 0.

why there is -1?

and also should not the end increase with the value of 10k ?

Update: The issue was in the genome_mainchrs.genome it contains chr[number] ex chr5 while the bed file contains just [number] 5

bed gtf bedtools slop • 2.0k views
ADD COMMENT
0
Entering edit mode

Could be a version mismatch?

ADD REPLY
0
Entering edit mode

Version mismatch of the tool and documentation or what?

ADD REPLY
0
Entering edit mode

Sorry, yes - maybe the version you're using doesn't match up to the version the doc refers to.

ADD REPLY
0
Entering edit mode

No, the issue was chromosome naming.

ADD REPLY
2
Entering edit mode
4.8 years ago

Here's an alternative method:

$ bedops --range 10000 --everything genes.bed > genes.pad10k.bed

This assumes genes.bed is sorted per sort-bed. If not:

$ bedops --range 10000 --everything <(sort-bed genes.bed) > genes.pad10k.bed

No genome bounds are specified, but that can be fixed by adding a fetchChromSizes command to the one-liner:

$ bedops --range 10000 --everything genes.bed | bedops --element-of 100% - <(fetchChromSizes hg38 | sort-bed - | awk -vOFS="\t" "{ print $1, 0, $2 }") > genes.pad10k.boundedByHg38.bed

This removes any padded intervals that fall outside the bounds of hg38. This can be modified to any genome bounds.

ADD COMMENT
0
Entering edit mode

Thanks, and I mentioned also why there was an error in my case (chromosome naming), the method you mentioned here better when you do not have .genome file.

ADD REPLY

Login before adding your answer.

Traffic: 1596 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6