Tool: ORFanage: by-reference protein annotation and comparison for transcriptome assembly
23 hours ago
Ales ▴ 70

Shamelessly sharing an older tool, ORFanage, for ORF annotation. It might be useful to others who work with transcriptome assemblies and genome annotation.

While longest-ORF, most-upstream-ORF, or de novo prediction approaches often work well, they can sometimes miss biologically relevant isoforms, introduce errors, or be inefficient for larger datasets. Our method addresses these issues by selecting, for each transcript, the ORF that is most consistent with the reference proteins, using an efficient interval-based algorithm.

In short, ORFanage:

  1. Finds the most likely ORF for each transcript in a GTF/GFF file based on maximizing similarity to proteins in one or more reference annotations.
  2. Quantifies frame shifts and other changes relative to the reference. It can also be used to perform exhaustive comparisons of annotated proteins between annotations.
  3. Scales efficiently to very large datasets using an interval-based pseudo-alignment algorithm, avoiding costly sequence comparisons in most cases.
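
To give a rough feel for what an interval-based comparison can look like, here is a minimal Python sketch. It is not ORFanage's actual implementation: the function names (`overlap_length`, `in_frame_overlap`, `best_orf`) and the scoring rule are assumptions made purely for illustration. It assumes plus-strand CDS chains given as sorted, non-overlapping, 1-based inclusive intervals (as in GTF), and scores each candidate ORF by how many coding positions it shares in the same codon phase with a reference CDS, without touching any sequence.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based inclusive genomic coordinates, as in GTF


def overlap_length(a: List[Interval], b: List[Interval]) -> int:
    """Count genomic positions shared by two sorted, non-overlapping chains."""
    i = j = shared = 0
    while i < len(a) and j < len(b):
        start, end = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if start <= end:
            shared += end - start + 1
        # Advance whichever chain's current interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def in_frame_overlap(cand: List[Interval], ref: List[Interval]) -> int:
    """Count shared positions where both chains are in the same codon phase.

    Assumption: both chains are on the plus strand and start at phase 0.
    """
    i = j = shared = 0
    cand_before = ref_before = 0  # coding bases before the current interval
    while i < len(cand) and j < len(ref):
        (cs, ce), (rs, re_) = cand[i], ref[j]
        start, end = max(cs, rs), min(ce, re_)
        if start <= end:
            # The phase difference is constant inside a shared segment,
            # so checking it at the segment start is enough.
            cand_phase = (cand_before + start - cs) % 3
            ref_phase = (ref_before + start - rs) % 3
            if cand_phase == ref_phase:
                shared += end - start + 1
        if ce < re_:
            cand_before += ce - cs + 1
            i += 1
        else:
            ref_before += re_ - rs + 1
            j += 1
    return shared


def best_orf(candidates: List[List[Interval]], ref: List[Interval]) -> List[Interval]:
    """Pick the candidate ORF with the largest in-frame overlap (hypothetical rule)."""
    return max(candidates, key=lambda c: in_frame_overlap(c, ref))


# Toy data: a two-exon reference CDS and two candidate ORFs.
reference = [(101, 200), (301, 400)]
candidate_b = [(102, 200), (301, 400)]  # starts 1 base later: out of frame
candidate_c = [(104, 200), (301, 400)]  # starts 3 bases later: in frame
```

In this toy case `candidate_b` overlaps the reference at more positions than `candidate_c`, but none of them are in frame, so `best_orf` prefers `candidate_c`. A real tool would also need to handle the minus strand, multiple reference proteins per locus, and start/stop codon validation.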

Additionally, we have recently added a small utility, ORFcompare, to perform all-vs-all comparisons of CDS records between multiple annotation sources.

When applied to large RNA-seq assemblies, ORFanage can help identify relevant transcripts and novel proteins, filter out noise, and take raw assemblies several steps closer to complete annotations. It can also highlight inconsistencies or possible corrections in reference annotations, something we observed when applying it to the RefSeq and GENCODE human datasets.

ORFanage and ORFcompare are both available on GitHub: https://github.com/alevar/ORFanage

You can also read more in the published study: https://pmc.ncbi.nlm.nih.gov/articles/PMC10718564/

Hope the methods are useful and easy to use!

orf rna-seq assembly annotation transcriptome