Question: bedtools intersect + subtract not adding up to original file?
2.6 years ago, asperlea wrote:

Hi,

I have a BED file of genomic features, and I want to count how many of their bases overlap exons and how many don't. I used bedtools intersect to get a BED file of the intersection of my features with a GENCODE exon annotation, and bedtools subtract to get a BED file of the feature bases that are not in exons. I then ran awk -F'\t' 'BEGIN{SUM=0}{ SUM+=$3-$2 }END{print SUM}' on each file to count the bases it covers. The strange part is that the intersection total plus the subtraction total does not equal the number of bases in the original file: the sum comes out slightly larger than the original.

Am I making some basic logic error, misunderstanding how intersect and subtract work, or is something very strange going on?

Thanks,

Adriana

subtract intersect bed bedtools

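Figured it out. Some exons in the annotation overlap each other, so the part of a feature that falls in more than one exon is reported once per exon and gets counted twice in the intersection total. Merging the exon annotation before intersecting fixed it, and the totals now add up.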
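For anyone who wants to see the effect on a small case, here is a rough sketch using only sort and awk on made-up data. The coordinates, the file names, and the helper functions overlap_sum and merge_exons are all hypothetical and only meant to mimic the behaviour described above (one overlap record per feature/exon pair); they are not the actual bedtools implementation or the commands used in the original analysis.

```shell
# Hypothetical toy data: one 100 bp feature and two exons that overlap each other.
tmp=$(mktemp -d)
printf 'chr1\t100\t200\n' > "$tmp/features.bed"
printf 'chr1\t120\t160\nchr1\t140\t180\n' > "$tmp/exons.bed"

# Sum the overlap of every feature/exon pair, one contribution per pair.
# First file read is the feature file, second is the exon file given as $1.
overlap_sum() {
  awk -F'\t' '
    NR == FNR { fc[NR] = $1; fs[NR] = $2 + 0; fe[NR] = $3 + 0; n = NR; next }
    {
      for (i = 1; i <= n; i++) if ($1 == fc[i]) {
        s = ($2 + 0 > fs[i]) ? $2 + 0 : fs[i]
        e = ($3 + 0 < fe[i]) ? $3 + 0 : fe[i]
        if (e > s) sum += e - s
      }
    }
    END { print sum + 0 }' "$tmp/features.bed" "$1"
}

# Collapse overlapping or touching exons after sorting by chromosome and start.
merge_exons() {
  sort -k1,1 -k2,2n "$1" | awk -F'\t' -v OFS='\t' '
    NR == 1 { c = $1; s = $2 + 0; e = $3 + 0; next }
    $1 == c && $2 + 0 <= e { if ($3 + 0 > e) e = $3 + 0; next }
    { print c, s, e; c = $1; s = $2 + 0; e = $3 + 0 }
    END { if (NR) print c, s, e }'
}

# Total feature length, using the same $3-$2 sum as in the question.
total=$(awk -F'\t' '{ s += $3 - $2 } END { print s + 0 }' "$tmp/features.bed")

raw=$(overlap_sum "$tmp/exons.bed")
merge_exons "$tmp/exons.bed" > "$tmp/exons.merged.bed"
merged=$(overlap_sum "$tmp/exons.merged.bed")

# Feature bases outside any exon (feature length minus the merged overlap).
outside=$((total - merged))

echo "feature bases:                $total"
echo "exonic, unmerged annotation:  $raw (+ $outside non-exonic = $((raw + outside)))"
echo "exonic, merged annotation:    $merged (+ $outside non-exonic = $((merged + outside)))"
```

With the unmerged exons, the 20 bp where the two exons overlap is counted twice, so the totals exceed the feature length. With bedtools itself, the corresponding step would presumably be to sort the annotation (sort -k1,1 -k2,2n) and run bedtools merge -i on it before running intersect and subtract.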

written 2.6 years ago by asperlea