Question

Blog:Clarifying the concept of clinical annotation vs bioinformatic annotation

7.6 years ago

LayneSadler ▴ 90

The Concept of Annotation in Comparison to Assessment

===

At my company, which has both clinical and bioinformatic leaders, there is confusion about the concept of “annotation.” I believe this stems from the fact that annotation can mean one thing in a clinical context and something completely different in a bioinformatics context.

When a bioinformatician says, “I need to annotate a variant,” what they really mean is, “I have a column of variants in my spreadsheet and I need to pull in many additional columns from reference datasets that tell me more about each variant so that I can make a well-founded assessment.”

Thus, the phrase “reference data” is synonymous with “annotation” in that annotation means supplemental information about any genetic data point that is used to make a decision.
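As a rough illustration of this bioinformatic sense of the word, annotation can be thought of as a lookup that adds columns to each variant. The sketch below is not how any particular tool works; the variant keys, dataset names, and values are all made up for illustration.

```python
# Hypothetical variants, keyed by (chromosome, position, ref allele, alt allele).
variants = [
    ("chr1", 12345, "A", "G"),
    ("chr2", 67890, "C", "T"),
]

# Hypothetical reference datasets. The values are invented placeholders,
# not real frequencies or classifications.
population_frequency = {("chr1", 12345, "A", "G"): 0.012}
clinical_significance = {("chr1", 12345, "A", "G"): "pathogenic"}

sources = {
    "population_frequency": population_frequency,
    "clinical_significance": clinical_significance,
}


def annotate(variant, sources):
    """Return one spreadsheet-like row: the variant plus one column per source."""
    row = {"variant": variant}
    for column, lookup in sources.items():
        # A variant missing from a reference dataset gets None in that column.
        row[column] = lookup.get(variant)
    return row


annotated = [annotate(v, sources) for v in variants]
for row in annotated:
    print(row)
```

Nothing in this step decides whether a variant matters. It only gathers the supplemental columns that a later assessment would draw on.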

The diagram in the link below demonstrates how Berkeley’s open-source software, Varant, pulls data from various genomic knowledgebases in order to annotate a VCF file. The sources of annotation (on the right) are considered to be integrated into the software. https://imgur.com/KACAU1v

===

Having established this, let’s change gears to examine annotation from a clinical perspective. As demonstrated by the chart below [removed for IP purposes], a clinical geneticist would examine many forms of annotation about a variant before making an assessment as to whether or not it is pathogenic. Taking a look at the first annotation row, you can see that geneticists have indicated that they are in agreement that this variant should be considered “pathogenically strong” according to the information in the _ annotation category.

This is where it gets confusing. The dictionary definition of annotate is “to add notes to a text in order to give an explanation or comment.” So when a doctor makes an assessment by commenting on the condition of a patient in their medical record, they are performing annotation in the literal sense. Hence, in the clinical context, “making an assessment” is synonymous with “annotating the patient’s condition.” To take this a step further, when clinicians either submit or gather their assessments of variants into a knowledgebase, their assessments become part of a reference dataset. The flow is: I review annotation; I make an assessment; I submit my assessment to a knowledgebase; the curators of the knowledgebase determine whether or not it is significant enough to be valid; and, if it is, it becomes part of the annotation that the next reviewer sees.
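The loop described above, in which an assessment is submitted, curated, and then served back as annotation, can be sketched in a few lines. This is only a toy model: the `Knowledgebase` class, the variant and clinician names, and especially the curation rule (accept a classification once two distinct submitters agree) are assumptions made for illustration, not a description of how any real knowledgebase curates submissions.

```python
from dataclasses import dataclass, field


@dataclass
class Assessment:
    variant: str
    classification: str
    submitter: str


@dataclass
class Knowledgebase:
    pending: list = field(default_factory=list)   # submitted assessments
    curated: dict = field(default_factory=dict)   # variant -> accepted classification

    def submit(self, assessment):
        self.pending.append(assessment)

    def curate(self, min_submitters=2):
        # Hypothetical rule: accept a classification when at least
        # `min_submitters` distinct submitters agree on it. Conflicting
        # classifications are not handled in this sketch.
        support = {}
        for a in self.pending:
            support.setdefault((a.variant, a.classification), set()).add(a.submitter)
        for (variant, classification), submitters in support.items():
            if len(submitters) >= min_submitters:
                self.curated[variant] = classification

    def annotation_for(self, variant):
        # What a later reviewer would see as "annotation" for this variant.
        return self.curated.get(variant)


kb = Knowledgebase()
kb.submit(Assessment("variant_1", "pathogenic", "clinician_a"))
kb.curate()
before = kb.annotation_for("variant_1")   # one submitter is not enough under this rule

kb.submit(Assessment("variant_1", "pathogenic", "clinician_b"))
kb.curate()
after = kb.annotation_for("variant_1")    # the assessment is now served as annotation

print(before, after)
```

The point of the sketch is the change of role: the same piece of information is an assessment when it enters the knowledgebase and annotation when it is read back out.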

===

Somewhere along the way, as confidence in public reference datasets grew, the term annotation evolved from “notes about a patient’s condition” to “reference data” in the vernacular of the greater genomic community.

In conclusion, it is crucial that the community make the distinction between annotation and assessment, not only in our literature but also in our product interfaces.

variant-annotation reference-data • 3.3k views

updated 2.4 years ago by Ram 45k • written 7.6 years ago by LayneSadler ▴ 90

score 0 · Answer 1 · 2017-11-17

Everyone involved is performing an annotation in the literal sense in English. They're just annotating things with information relevant to their step in the process.

"Reference data" is not synonymous with "annotation". You can annotate something using reference data, and that reference data might or might not itself be an annotation. Variant databases are annotations. Genomic sequences are not (e.g., you could annotate a VCF with the surrounding genetic sequence taken from a reference genome, in which case the reference data would not be an annotation).
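To make the VCF example concrete, here is a minimal sketch of annotating a variant position with its surrounding sequence. The reference here is raw sequence, not an annotation, yet the result of the lookup is one. The chromosome name, the sequence, and the `flanking_sequence` helper are made up for illustration; the only fact relied on is that VCF positions are 1-based.

```python
# Made-up reference sequence; a real one would be read from a FASTA file.
reference = {"chr1": "ACGTACGTACGTACGT"}


def flanking_sequence(reference, chrom, pos, flank=3):
    """Return the reference bases around a 1-based position, clipped at the ends."""
    seq = reference[chrom]
    start = max(pos - 1 - flank, 0)       # convert 1-based pos to a 0-based slice start
    end = min(pos + flank, len(seq))      # slice end is exclusive
    return seq[start:end]


context = flanking_sequence(reference, "chr1", 6)
edge_context = flanking_sequence(reference, "chr1", 1)
print(context, edge_context)
```

Here `context` is an annotation on the variant, while `reference` is just the data it was drawn from.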

I imagine that this wording is confusing to non-native speakers, but it is all correct usage of the word "annotate".