How to remove transcripts that have poor alignment scores in exonerate analysis
5.9 years ago
Ginsea Chen ▴ 130

Dear all

I am a new user of exonerate. I mapped protein evidence to a whole-genome assembly using exonerate with the protein2genome model. After the mapping, I want to filter out the transcripts in the exonerate output file that have poor alignment scores. Liang et al. (Liang C, Mao L, Ware D, et al. Evidence-based gene predictions in plant genomes[J]. Genome Research, 2009, 19(10): 1912-1923.) generally use a sequence identity threshold of 90% for same-species alignments and of 30% (protein sequence similarity) for cross-species alignments, but I only found a raw alignment score (such as 805) in the exonerate output file.

So I don't know how to filter my transcripts based on the exonerate results, because I can't find any sequence identity value (e.g. 90%) in them. Is there a way to convert the raw alignment score (e.g. 805) into a sequence identity value (e.g. 90%)?

Thanks all

genome alignment • 2.0k views
5.9 years ago

From the manpage:

--ryo <format>
              Roll-your-own output format.  This allows specification of a printf-esque format line which is used
              to specify which information to include in the output, and how it is to be shown.  The format field
              may contain the following fields:

              %[qt][idlsSt]
                     For either {query,target}, report the {id,definition,length,sequence,Strand,type}.  Sequences
                     are reported in a fasta-format like block (no headers).
              %[qt]a[bels]
                     For either {query,target} region which occurs in the alignment, report the
                     {begin,end,length,sequence}
              %[qt]c[bels]
                     For either {query,target} region which occurs in the coding sequence in the alignment, report
                     the {begin,end,length,sequence}
              %s     The raw score
              %r     The rank (in results from a bestn search)
              %m     Model name
     --->     %e[tism]
                     Equivalenced {total,id,similarity,mismatches} (ie. %em == (%et - %ei))
     --->     %p[is] Percent {id,similarity} over the equivalenced portions of the alignment.  (ie. %pi == 100*(%ei
                     / %et))
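
As a rough illustration of how the %pi and %ps fields could be used for filtering, here is a minimal Python sketch. It assumes exonerate was run with a format line along the lines of --ryo "RYO %qi %ti %s %pi %ps\n" (plus --showalignment no --showvulgar no to keep the output small). The "RYO" tag, the field order, and the function names parse_ryo_line and filter_hits are inventions of this sketch, not part of exonerate, so adjust them to whatever format line you actually use.

```python
# Hypothetical helper for filtering exonerate hits by percent identity/similarity.
# Assumed (not from exonerate itself): each hit was written with
#   --ryo "RYO %qi %ti %s %pi %ps\n"
# so a hit line has six whitespace-separated fields starting with the tag "RYO".


def parse_ryo_line(line):
    """Return a dict for a tagged ryo line, or None for any other line."""
    parts = line.split()
    if len(parts) != 6 or parts[0] != "RYO":
        return None
    _, query, target, score, pct_id, pct_sim = parts
    return {
        "query": query,
        "target": target,
        "score": int(score),           # %s, the raw score
        "identity": float(pct_id),     # %pi, percent identity
        "similarity": float(pct_sim),  # %ps, percent similarity
    }


def filter_hits(lines, min_identity=0.0, min_similarity=0.0):
    """Keep hits whose %pi and %ps both reach the given thresholds."""
    kept = []
    for line in lines:
        hit = parse_ryo_line(line)
        if hit is None:
            continue  # skip exonerate's banner lines and anything untagged
        if hit["identity"] >= min_identity and hit["similarity"] >= min_similarity:
            kept.append(hit)
    return kept


if __name__ == "__main__":
    # Made-up example lines, only to show the expected shape of the input.
    example = [
        "RYO protA chr1 805 93.20 96.10",
        "RYO protB chr2 300 45.00 60.00",
    ]
    for hit in filter_hits(example, min_identity=90.0):
        print(hit["query"], hit["target"], hit["identity"])
```

With the thresholds mentioned in the question, a same-species run would use min_identity=90.0 and a cross-species run min_similarity=30.0. The query IDs that pass can then be used to select the corresponding records from the full exonerate output.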

I get it! Thanks for your suggestions!


Thank you for pointing this out, I was looking for exactly that but was too lazy to go through the --ryo options...

