How to remove transcripts that have poor alignment scores in exonerate analysis
6.0 years ago
Ginsea Chen ▴ 130

Dear all

I am a new user of exonerate. I mapped protein evidence to a whole-genome assembly using exonerate with the protein2genome model. After mapping, I want to filter out all transcripts in the exonerate output file that have poor alignment scores. In Liang et al. (Liang C, Mao L, Ware D, et al. Evidence-based gene predictions in plant genomes[J]. Genome research, 2009, 19(10): 1912-1923.), the authors generally use a sequence identity threshold of 90% for same-species alignments and of 30% (protein sequence similarity) for cross-species alignments, but I can only find a raw alignment score (such as 805) in the exonerate output file.

So I don't know how to filter my transcripts based on the exonerate results. In other words, I can't find any sequence identity value (e.g. 90%) in the exonerate output. Is there a way to convert the raw alignment score (e.g. 805) into a sequence identity value (e.g. 90%)?

Thanks all

6.0 years ago

From the manpage:

--ryo <format>
Roll-your-own  output  format.  This allows specification of a printf-esque format line which is used
to specify which information to include in the output, and how it is to be shown.  The  format  field
may contain the following fields:

%[qt][idlsSt]
For  either  {query,target},  report the {id,definition,length,sequence,Strand,type} Sequences
are reported in a fasta-format like block (no headers).
%[qt]a[bels]
For   either   {query,target}   region   which   occurs   in   the   alignment,   report   the
{begin,end,length,sequence}
%[qt]c[bels]
For  either {query,target} region which occurs in the coding sequence in the alignment, report
the {begin,end,length,sequence}
%s     The raw score
%r     The rank (in results from a bestn search)
%m     Model name
--->     %e[tism]
Equivalenced {total,id,similarity,mismatches} (ie. %em == (%et - %ei))
--->     %p[is]
Percent {id,similarity} over the equivalenced portions of the alignment.  (ie. %pi == 100*(%ei / %et))
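
As a rough illustration of how the %pi and %ps fields could be used for filtering, here is a minimal Python sketch. It assumes exonerate was run with a tab-separated --ryo line prefixed with the literal tag "RYO" (the tag, the column order, the file names in the comment, and the example lines are all made up for illustration, not taken from real output). The 90% identity and 30% similarity cut-offs are the ones quoted in the question.

```python
# Hypothetical exonerate call that would produce the lines parsed below
# (proteins.fa, genome.fa and hits.txt are placeholder names):
#   exonerate --model protein2genome --showalignment no --showvulgar no \
#       --ryo "RYO\t%qi\t%ti\t%s\t%pi\t%ps\n" proteins.fa genome.fa > hits.txt


def parse_ryo_line(line):
    """Parse one 'RYO' line into a dict; return None for any other line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6 or fields[0] != "RYO":
        return None
    _, query, target, score, pct_id, pct_sim = fields
    return {
        "query": query,
        "target": target,
        "score": int(score),        # %s, the raw score
        "pct_id": float(pct_id),    # %pi, percent identity
        "pct_sim": float(pct_sim),  # %ps, percent similarity
    }


def filter_hits(lines, min_pct_id=0.0, min_pct_sim=0.0):
    """Keep only alignments that meet both percentage thresholds."""
    kept = []
    for line in lines:
        hit = parse_ryo_line(line)
        if hit is None:
            continue  # skip exonerate's banner and any other non-RYO lines
        if hit["pct_id"] >= min_pct_id and hit["pct_sim"] >= min_pct_sim:
            kept.append(hit)
    return kept


# Made-up example lines, for illustration only.
example = [
    "Command line: [exonerate ...]\n",
    "RYO\tprotA\tscaffold_1\t805\t93.20\t96.10\n",
    "RYO\tprotB\tscaffold_7\t212\t41.50\t58.00\n",
]

same_species = filter_hits(example, min_pct_id=90.0)    # 90% identity
cross_species = filter_hits(example, min_pct_sim=30.0)  # 30% similarity
```

For real data you would pass an open file handle instead of the example list, e.g. `filter_hits(open("hits.txt"), min_pct_id=90.0)`. Note that the raw score itself is not converted; the percentages are computed by exonerate from the equivalenced positions (%ei / %et).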

I get it! Thanks for your suggestions!

Thank you for pointing this out, I was looking for exactly that but was too lazy to go through the --ryo options...