Blog: Clarifying the concept of clinical annotation vs bioinformatic annotation
2.4 years ago by
syntax60 wrote:

The Concept of Annotation in Comparison to Assessment


Working at a company with both clinical and bioinformatics leaders, I see frequent confusion about the concept of “annotation.” I believe this stems from the fact that annotation can mean one thing in a clinical context and something completely different in a bioinformatics context.

When a bioinformatician says, “I need to annotate a variant,” what they really mean is, “I have a column of variants in my spreadsheet and I need to pull in additional columns from reference datasets that tell me more about each variant so that I can make a well-founded assessment.”

Thus, the phrase “reference data” is synonymous with “annotation” in that annotation means supplemental information about any genetic data point that is used to make a decision.
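To make the spreadsheet picture concrete, here is a minimal Python sketch of annotation in the bioinformatic sense: a join between variant rows and reference tables. Everything in it is an assumption for illustration only: the variants, the two lookup tables (`population_db`, `clinical_db`), the column names, and the `annotate` helper are hypothetical placeholders, not the format of any real knowledgebase or tool.

```python
# Hypothetical variant rows; coordinates and alleles are placeholders.
variants = [
    {"chrom": "1", "pos": 1000, "ref": "A", "alt": "G"},
    {"chrom": "2", "pos": 2000, "ref": "C", "alt": "T"},
]

# Hypothetical reference datasets: each maps a variant key to extra columns.
population_db = {("1", 1000, "A", "G"): {"allele_freq": 0.01}}
clinical_db = {("1", 1000, "A", "G"): {"clinical_label": "example_label"}}


def variant_key(row):
    """Build the lookup key shared by the variant rows and the reference tables."""
    return (row["chrom"], row["pos"], row["ref"], row["alt"])


def annotate(rows, reference_tables, columns):
    """Return copies of the rows with extra columns pulled from each reference table."""
    annotated_rows = []
    for row in rows:
        out = dict(row)
        # Start every annotation column empty so unmatched variants stay visible.
        for column in columns:
            out[column] = None
        for table in reference_tables:
            out.update(table.get(variant_key(row), {}))
        annotated_rows.append(out)
    return annotated_rows


annotated = annotate(
    variants,
    [population_db, clinical_db],
    ["allele_freq", "clinical_label"],
)
for row in annotated:
    print(row)
```

Note that the sketch only adds columns. Nothing in it decides whether a variant matters; that decision is the assessment.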

The diagram in the link below demonstrates how Berkeley’s open-source software, Varant, pulls data from various genomic knowledgebases in order to annotate a VCF file. The sources of annotation (on the right) are considered to be integrated into the software.


Having established this, let’s change gears to examine annotation from a clinical perspective. As demonstrated by the chart below [removed for IP purposes], a clinical geneticist would examine many forms of annotation about a variant before making an assessment as to whether or not it is pathogenic. Taking a look at the first annotation row, you can see that geneticists have indicated that they are in agreement that this variant should be considered “pathogenically strong” according to the information in the _ annotation category.

This is where it gets confusing. The dictionary definition of annotate is, “To add notes to a text in order to give an explanation or comment.” So when a doctor is making an assessment by commenting on the condition of a patient in their medical record, they are performing annotation in the literal sense. Hence, in the clinical context, “making an assessment” is synonymous with “annotating the patient’s condition.” To take this a step further, when clinicians either submit or gather their assessments of variants into a knowledgebase, their assessments become a part of a reference dataset. The flow is: I review annotation, I make an assessment, I submit my assessment to a knowledgebase, the curators of the knowledgebase determine whether or not it is significant enough to be valid, and thus it becomes part of the annotation.
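The loop described above, in which an assessment is submitted, curated, and then served back as annotation, can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Knowledgebase` class, its method names, and the acceptance rule passed to `curate` are hypothetical, and how real curators decide what to accept is not specified here.

```python
from dataclasses import dataclass, field


@dataclass
class Knowledgebase:
    """Hypothetical store where accepted assessments become annotation."""

    accepted: dict = field(default_factory=dict)  # variant id -> accepted assessments
    pending: list = field(default_factory=list)   # (variant id, assessment) awaiting curation

    def submit(self, variant_id, assessment):
        """A clinician submits an assessment; it is not annotation yet."""
        self.pending.append((variant_id, assessment))

    def curate(self, is_acceptable):
        """Curators apply their own rule; accepted assessments join the reference data."""
        still_pending = []
        for variant_id, assessment in self.pending:
            if is_acceptable(variant_id, assessment):
                self.accepted.setdefault(variant_id, []).append(assessment)
            else:
                still_pending.append((variant_id, assessment))
        self.pending = still_pending

    def annotation_for(self, variant_id):
        """What the next reviewer sees as annotation for this variant."""
        return list(self.accepted.get(variant_id, []))


kb = Knowledgebase()
before = kb.annotation_for("variant_x")     # nothing accepted yet
kb.submit("variant_x", "pathogenic")        # my assessment
kb.curate(lambda variant_id, assessment: True)  # placeholder rule: accept everything
after = kb.annotation_for("variant_x")      # my assessment is now annotation
print(before, after)
```

The same string is an assessment when it enters through `submit` and annotation when it comes back out of `annotation_for`, which is exactly where the two vocabularies collide.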


Somewhere along the way, as confidence in public reference datasets grew, the term annotation evolved from “notes about a patient’s condition” to “reference data” in the vernacular of the greater genomic community.

In conclusion, it is crucial that the community distinguish between annotation and assessment, not only in our literature but also in our product interfaces.

modified 2.4 years ago by Devon Ryan94k • written 2.4 years ago by syntax60
2.4 years ago by
Devon Ryan94k
Freiburg, Germany
Devon Ryan94k wrote:

Everyone involved is performing an annotation in the literal sense in English. They're just annotating things with information relevant to their step in the process.

"Reference data" is not synonymous with "annotation". You can annotate something using the reference data, which might or might not itself be an annotation. For variant databases, they are annotations. For genomic sequences they aren't (e.g., you could annotate a VCF with surrounding genetic sequence from reference data, wherein the reference data would not be an annotation).
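As a rough illustration of the sequence case, the Python sketch below annotates VCF-like records with surrounding bases taken from a reference sequence. The reference here is raw sequence, not an annotation, yet the act of attaching it to each record is annotating. The toy contig `chr_demo`, the record fields, and the `flanking_sequence` helper are all hypothetical and made up for this example.

```python
# Hypothetical toy reference: one short contig. A real reference genome is far larger.
reference = {"chr_demo": "ACGTACGTACGTACGTACGT"}


def flanking_sequence(reference, chrom, pos, flank):
    """Return the bases around a 1-based position, clipped at the contig ends."""
    seq = reference[chrom]
    start = max(0, pos - 1 - flank)   # convert to 0-based and step left
    end = min(len(seq), pos + flank)  # exclusive end, stepping right
    return seq[start:end]


# Hypothetical VCF-like records (only the fields needed for the example).
vcf_records = [{"chrom": "chr_demo", "pos": 5, "ref": "A", "alt": "G"}]

for record in vcf_records:
    # The new "context" field is the annotation; its source is plain sequence data.
    record["context"] = flanking_sequence(
        reference, record["chrom"], record["pos"], flank=2
    )

print(vcf_records[0]["context"])  # GTACG
```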

I imagine that this wording is confusing to non-native speakers, but it is all correct usage of the word "annotate".


