Question: Why is there a discrepancy between blastx and blastp outputs?
23 months ago
seta wrote:

Hi all,

I have done a de novo transcriptome assembly for a non-model plant and then predicted ORFs with the TransDecoder tool. Finally, I removed the asterisk characters in "file1.transdecoder.pep" with the command "sed -i 's/*//g' file.fasta" and ran blastp against some databases. However, I found large differences between the blastp output and the blastx output (which I also ran against the same databases). In fact, the maximum ortholog hit ratio (OHR) was 1 for blastx but only 0.5 for blastp. As you know, an OHR close to 1 indicates a full-length transcript and is highly desirable. Could you please share your opinion and let me know what is going wrong with blastp?

Thanks so much for your help.
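
For reference, the asterisk-removal step described above can also be sketched in Python. This is only an illustration under assumptions: the function name `strip_stop_asterisks` and the example record are made up, and the one difference from the sed command is that header lines are left untouched in case they ever contain that character.

```python
def strip_stop_asterisks(lines):
    """Yield FASTA lines with '*' removed from sequence lines only."""
    for line in lines:
        if line.startswith(">"):
            # Header line: pass through unchanged.
            yield line
        else:
            # Sequence line: drop stop-codon asterisks.
            yield line.replace("*", "")


# Hypothetical two-line record, used only to show the behaviour.
example = [">seq1 example*header\n", "MAKGELREAWDKL*\n"]
cleaned = list(strip_stop_asterisks(example))
print("".join(cleaned), end="")
```

To apply it to a real file, the same generator can be fed an open file handle and its output written to a new file, which keeps the original peptide file intact.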



modified 23 months ago • written 23 months ago by seta920
Michael Dondrup
23 months ago
Bergen, Norway
Michael Dondrup wrote:

There isn't really anything wrong with the blastp approach, but this case is a good example of why it is often better to use blastx. ORF predictions can be unreliable on assembled, fragmented transcripts. The contigs can, e.g., contain frameshifts, and 6-frame translations cannot compensate for those. If the stop codon is not included in the contig, things get worse: the ORF predictor might also pick the wrong reading frame.
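
To make the frameshift point concrete, here is a minimal sketch in plain Python. It is not part of TransDecoder or BLAST: the sequence, the inserted base, and the helper `translate` are made up for illustration, and only the three forward frames are shown. A single inserted base leaves the start of the protein in one frame and pushes the rest into another, so an ORF taken from any one frame covers only part of it.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate one forward frame; unknown codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Toy coding sequence (assumption, for illustration only).
cds = "ATGGCTAAAGGTGAACTGCGTGAAGCTTGGGATAAACTGTAA"

# Simulate an assembly error: one extra base after the sixth codon.
shifted = cds[:18] + "A" + cds[18:]

print("intact, frame 0:", translate(cds))
for frame in range(3):
    print(f"shifted, frame {frame}:", translate(shifted, frame))
```

In the shifted contig, frame 0 hits a premature stop shortly after the insertion, while the remainder of the original protein only reads through in frame 1. A peptide predicted from a single frame is therefore truncated, which is one way the hit coverage of a blastp search on predicted ORFs can end up lower.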

written 23 months ago by Michael Dondrup

Thanks for your response. So it comes down to ORF prediction; however, translated contig sequences are required for tasks such as searching for conserved domains. Among the several ORF prediction tools, which software do you recommend for this task?


written 23 months ago by seta920
