bedtools intersect mistakes
6.4 years ago
schelarina ▴ 30

Hello,

I am using the following command:

bedtools intersect -wb -a file1.bed -b file2.gff3 > output.txt

The output contains more entries than file1.bed, including some that are not present in file1.bed at all!

I have tried sorting the files and also changing the extension of file2.gff3 to .bed, but the output is the same.

What is the problem?

Is there another tool I can use to do the same thing, or can it be done with awk?

Thank you

bedtools • 9.1k views
6.4 years ago
bedtools intersect -wb -a file1.bed -b file2.gff3 > output.txt

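This writes one line for every overlap between A and B: the overlapping portion of the A interval, followed by the original B entry. Because only the overlapping portion of A is reported unless -wa is given, the coordinates in the A columns can differ from those in file1.bed.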
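To make that concrete, here is a rough awk sketch of the same idea. It is not how bedtools is implemented, just a naive all-against-all comparison on made-up data (peak1, exon1 and exon2 are hypothetical, and both inputs are assumed to be BED-like with 0-based, half-open coordinates). One A interval overlapping two B features gives two output lines, each with clipped A coordinates:

```shell
# Hypothetical toy data, not the poster's files.
a=$(mktemp); b=$(mktemp)
printf 'chr1\t100\t200\tpeak1\n' > "$a"
printf 'chr1\t150\t160\texon1\nchr1\t180\t250\texon2\n' > "$b"

# Load every A interval, then for each B line print the overlapping
# part of A followed by the original B line (one line per overlap).
result=$(awk 'BEGIN { FS = OFS = "\t" }
  NR == FNR { chrom[NR] = $1; astart[NR] = $2 + 0; aend[NR] = $3 + 0; n = NR; next }
  {
    bstart = $2 + 0; bend = $3 + 0
    for (i = 1; i <= n; i++) {
      if (chrom[i] != $1) continue
      s = (astart[i] > bstart) ? astart[i] : bstart   # later of the two starts
      e = (aend[i] < bend) ? aend[i] : bend           # earlier of the two ends
      if (s < e) print chrom[i], s, e, $0
    }
  }' "$a" "$b")
printf '%s\n' "$result"
rm -f "$a" "$b"
```

Neither output line has the coordinates 100-200 from the toy A file, which is probably what looks like "entries that are not in file1.bed".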

To drop exact duplicate lines, so that each overlapping B entry is listed once per distinct overlap with A, use this:

bedtools intersect -wb -a file1.bed -b file2.gff3 | sort | uniq > output.txt

If you are interested in A and want each A entry that overlaps B listed only once, use this:

bedtools intersect -wa -a file1.bed -b file2.gff3 | sort | uniq > output.txt
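A small illustration of what the sort | uniq step does here, using made-up -wa style lines rather than real bedtools output (peak1 and peak2 are hypothetical names). As far as I know, bedtools intersect also has a -u option that reports each A entry once if it has any overlap, which should give the same set of lines without the extra pipe.

```shell
# Hypothetical -wa style output: peak1 overlapped two B features,
# so its original line was written twice.
wa_output=$(printf 'chr1\t100\t200\tpeak1\nchr1\t100\t200\tpeak1\nchr1\t300\t400\tpeak2\n')

# sort -u is shorthand for sort | uniq: identical lines collapse to one.
unique_a=$(printf '%s\n' "$wa_output" | sort -u)
printf '%s\n' "$unique_a"
```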

If you want the number of base pairs of overlap between each A entry and each B entry (-wao also reports A entries with no overlap, with an overlap of 0), use this:

bedtools intersect -wao -a file1.bed -b file2.gff3 | sort | uniq > output.txt
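The overlap size in the last column is just the distance between the later start and the earlier end. A minimal sketch of that arithmetic, assuming 0-based, half-open coordinates (overlap_bp is a hypothetical helper, not part of bedtools):

```shell
# Usage: overlap_bp a_start a_end b_start b_end
overlap_bp() {
  awk -v a1="$1" -v a2="$2" -v b1="$3" -v b2="$4" 'BEGIN {
    a1 += 0; a2 += 0; b1 += 0; b2 += 0   # force numeric comparison
    s = (a1 > b1) ? a1 : b1              # later of the two starts
    e = (a2 < b2) ? a2 : b2              # earlier of the two ends
    n = e - s
    if (n < 0) n = 0                     # no overlap at all
    print n
  }'
}

overlap_bp 100 200 180 250   # intervals share positions 180..200
overlap_bp 100 200 300 400   # disjoint intervals
```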
6.4 years ago

That doesn't sound like a mistake; it sounds like you're getting the correct output. You get one line of output per overlap, so a line in file1.bed that overlaps multiple entries in file2.gff3 produces one line for each of them, and a line that overlaps nothing produces none. Since you're intersecting with a GFF file, where the same region is usually annotated by several features (gene, transcript, exon and so on), it would be surprising not to see this sort of behaviour, and any interval-overlap tool will and should act like this.

Perhaps you just want to intersect with unique exons.
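One way that could look, as a sketch only: pull the exon rows out of the GFF3 and collapse exons that share the same coordinates and strand before intersecting. The toy file below is made up (tx1, tx2 and gene1 are hypothetical), and it assumes the standard GFF3 column order where column 3 is the feature type, columns 4 and 5 are start and end, and column 7 is the strand.

```shell
# Hypothetical toy GFF3: the same exon is listed under two transcripts.
gff=$(mktemp)
printf 'chr1\tsrc\texon\t101\t200\t.\t+\t.\tParent=tx1\n' > "$gff"
printf 'chr1\tsrc\texon\t101\t200\t.\t+\t.\tParent=tx2\n' >> "$gff"
printf 'chr1\tsrc\tgene\t101\t500\t.\t+\t.\tID=gene1\n' >> "$gff"

# Keep only exon rows, reduce them to seqid/start/end/strand, and
# drop duplicates. Coordinates stay 1-based inclusive as in GFF3;
# they are NOT converted to BED here.
unique_exons=$(awk 'BEGIN { FS = OFS = "\t" } $3 == "exon" { print $1, $4, $5, $7 }' "$gff" | sort -u)
printf '%s\n' "$unique_exons"
rm -f "$gff"
```

The result would still need to be turned into a proper BED or GFF file before being passed to -b.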
