bedtools intersect + subtract not adding up to original file?
7.6 years ago
asperlea • 0

Hi,

I have a BED file of genomic features and want to count how many of their bases overlap exons and how many do not. I used bedtools intersect to get a BED file of the intersection of my features with an exon annotation from GENCODE, and bedtools subtract to get a BED file of the bases in my features that are not in exons. I then used awk -F'\t' 'BEGIN{SUM=0}{ SUM+=$3-$2 }END{print SUM}' to count the bases covered by each of these files. Strangely, the bases in the intersection plus the bases in the subtraction do not add up to the number of bases in the original file: they come to slightly more than the original.

Am I making some basic logic error, misunderstanding what intersect and subtract do, or is something very strange going on?

Thanks,

Adriana

bedtools intersect subtract BED • 3.2k views

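Figured it out. Some exons in the annotation overlap each other, so bases covered by more than one exon record were reported in the intersection more than once. Merging the exon annotation first made the intersection and subtraction add up to the original.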
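For anyone hitting the same thing, here is a minimal sketch of the double counting on made-up coordinates. The file names and intervals are hypothetical, and the awk merge below only stands in for bedtools merge (which expects sorted input and would be the usual choice on a real annotation). Imagine a single feature at chr1:100-200 that contains both exons: intersecting with the raw exons counts 40 + 30 = 70 bases, subtracting leaves 40, and 70 + 40 = 110 is more than the 100 bases in the feature. With merged exons the intersection is 60 bases and the totals match.

```shell
# Toy exon annotation (hypothetical): two exon records that overlap each other.
printf 'chr1\t120\t160\nchr1\t150\t180\n' > exons.bed

# Sum interval lengths the same way as in the question.
bases() { awk -F'\t' 'BEGIN{SUM=0}{ SUM+=$3-$2 }END{print SUM}' "$1"; }

# Collapse overlapping or touching intervals into one record each.
# The input is sorted by chromosome, then start, before merging.
sort -k1,1 -k2,2n exons.bed | awk -F'\t' 'BEGIN{OFS="\t"}
  NR==1 { c=$1; s=$2+0; e=$3+0; next }
  $1==c && $2+0<=e { if ($3+0>e) e=$3+0; next }
  { print c,s,e; c=$1; s=$2+0; e=$3+0 }
  END { if (NR>0) print c,s,e }' > exons.merged.bed

raw=$(bases exons.bed)            # overlapping bases counted once per exon record
merged=$(bases exons.merged.bed)  # each base counted once
echo "raw exon bases: $raw, merged exon bases: $merged"
```

On this toy input the raw annotation sums to 70 bases and the merged one to 60; the 10 bp difference is the region shared by the two exons (150-160), which is exactly the excess in the totals.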
