Classifying Query Set Vs Truth Set Overlapping Genomic Ranges
10.8 years ago

I would like to classify a set of query genomic ranges against a set of truth genomic ranges, using a minimum overlap rule: the intersection of the two ranges divided by their union must be greater than one half. I call an overlap above this threshold a successful overlap.
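
For readers following along, this rule can be read as a Jaccard index (intersection over union) greater than 0.5. Below is a minimal sketch, assuming half-open integer coordinates (start, end) on a single chromosome. The names `jaccard` and `is_successful_overlap` are hypothetical helpers for illustration, not part of any library.

```python
def jaccard(a, b):
    """Intersection over union of two half-open intervals (start, end)."""
    a_start, a_end = a
    b_start, b_end = b
    # Length of the shared segment; zero if the intervals are disjoint.
    intersection = max(0, min(a_end, b_end) - max(a_start, b_start))
    # Union length by inclusion-exclusion.
    union = (a_end - a_start) + (b_end - b_start) - intersection
    return intersection / union if union > 0 else 0.0


def is_successful_overlap(query, truth, threshold=0.5):
    """True when intersection/union is strictly greater than the threshold."""
    return jaccard(query, truth) > threshold
```

Note that with this rule a query that overlaps half of a truth range exactly does not pass, because the comparison is strict.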

If a query range, e.g. Q1, successfully overlaps one of the truth ranges (e.g. T12), I classify Q1 as a True Positive. If it doesn't, I classify it as a False Positive.

But I am unsure how to classify a case where two query ranges, e.g. Q1 and Q2, both successfully overlap the same truth range, e.g. T3:

Example:

T3   |----------------------------|
Q1       |-------------------------|
Q2    |--------------------------|

How would people classify Q1 and Q2? Both as True Positives? One as True Positive and the other as False Positive?


That entirely depends on your question. You can define any number of rules with or without biological backing, but even with a biological basis, there will be gray areas.

I think that even if you gave more info, including the exact problem you're trying to address, there would be no single, clear answer.


It depends on what question you're trying to answer. Perhaps Q2 would be a True Positive and Q1 a False Positive on the basis of Q2's overlap with T3 being longer in extent than Q1's. Or perhaps you are just categorizing overlaps with T3 above some threshold, which would label Q1 and Q2 as True Positives.
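
To make the two options concrete, here is a rough sketch of both policies. It assumes half-open (start, end) coordinates and ranks competing queries by intersection over union, which is one possible choice; ranking by overlap length gives the same winner in this example. The coordinates are made up to mirror the diagram in the question, and `classify` and its `one_to_one` flag are hypothetical names.

```python
def jaccard(a, b):
    """Intersection over union of two half-open intervals (start, end)."""
    intersection = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


def classify(queries, truths, threshold=0.5, one_to_one=False):
    """Return {query_name: 'TP' or 'FP'} under one of two policies."""
    # For each query, keep its best-scoring truth range above the threshold.
    best_hit = {}
    for q_name, q in queries.items():
        hits = [(jaccard(q, t), t_name) for t_name, t in truths.items()]
        hits = [h for h in hits if h[0] > threshold]
        if hits:
            best_hit[q_name] = max(hits)

    if not one_to_one:
        # Policy A: every query with a successful overlap is a True Positive.
        return {q: ("TP" if q in best_hit else "FP") for q in queries}

    # Policy B: each truth range credits only its highest-scoring query;
    # other queries hitting the same truth range are counted as False Positives.
    winner = {}
    for q_name, (score, t_name) in best_hit.items():
        if t_name not in winner or score > winner[t_name][0]:
            winner[t_name] = (score, q_name)
    winners = {q_name for _, q_name in winner.values()}
    return {q: ("TP" if q in winners else "FP") for q in queries}


# Hypothetical coordinates chosen to resemble the diagram in the question.
truths = {"T3": (0, 30)}
queries = {"Q1": (4, 31), "Q2": (1, 29)}

print(classify(queries, truths))                   # both labelled TP
print(classify(queries, truths, one_to_one=True))  # Q2 is TP, Q1 is FP
```

Policy B here is a simple greedy assignment, not an optimal matching. Which policy is appropriate still depends on whether duplicate calls against the same truth range should be penalized.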
