7.0 years ago
Bioinfonext
Dear all,
I ran blastx on Trinity-assembled transcripts against the nr database, and there is one thing in the output that I do not understand.
In some alignments the query coordinates run from start to end and the subject coordinates run in the same direction.
Like this:
query start   query end   subject start   subject end
5             1000        8               350
But there are also cases where the direction of one sequence is opposite to that of the other:
query start   query end   subject start   subject end
5             850         250             2
Why does this happen? Are both types of blastx hits correct?
Did you assemble with any strandedness option?
Yes, I have strand-specific RNA-seq libraries, so I used the library type option during the Trinity assembly. Can both types of transcripts be considered protein-coding transcripts?
I am still waiting for an answer. Did you see my question?
Short answer: yes.
Long answer: your question just lacks a lot of information for an educated guess. You didn't say which library prep you used. You didn't tell the parameters used for assembly and blast. You didn't say if you have just a few of such transcripts, or if they comprise a significant proportion of your assembly. You didn't say if you examined the strand specificity of your reads. You didn't say if you looked at some of these transcripts in detail.
For example, the query and subject start and end positions suggest very different lengths, which in turn suggests you used lax parameters for your blastx search to recover more hits, at the cost of a higher number of false positives.
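One way to look at individual hits in detail is to put both coordinate pairs into comparable terms. The sketch below is only an illustration: `describe_hit` is a hypothetical helper, and it assumes 1-based inclusive coordinates as reported by BLAST. Because blastx translates a nucleotide query and aligns it to a protein subject, the query span is divided by three before it is compared with the subject span.

```python
def describe_hit(qstart, qend, sstart, send):
    """Summarise one blastx hit from its 1-based, inclusive coordinates."""
    # Query coordinates are in nucleotides, subject coordinates in residues.
    query_nt = abs(qend - qstart) + 1
    subject_aa = abs(send - sstart) + 1
    # "same" means both coordinate pairs increase (or both decrease).
    same_direction = (qend >= qstart) == (send >= sstart)
    return {
        "orientation": "same" if same_direction else "opposite",
        "query_nt": query_nt,
        "query_aa_equiv": query_nt / 3,  # rough protein-length equivalent
        "subject_aa": subject_aa,
    }


# The two coordinate sets quoted in the question.
for coords in [(5, 1000, 8, 350), (5, 850, 250, 2)]:
    print(coords, describe_hit(*coords))
```

Comparing `query_aa_equiv` with `subject_aa` gives a quick feel for how much of each sequence the alignment covers, independent of which direction the coordinates run.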
Thanks for the reply:
I used the Ovation RNA-Seq Systems 1–16 for Model Organisms kit (Part No. 0351, Arabidopsis) for RNA-seq library preparation. I got a response from NuGEN, and they suggested library type fr-secondstrand, which is equivalent to library type FR in Trinity. So I assembled the paired-end raw reads with Trinity using library type FR. For blastx against the nr database I used an e-value of 1e-30 and a query coverage of 50%, with max_num_hit 1. I have not checked strand specificity since then. I think the opposite-direction transcripts are few in number, but I need to confirm this once blastx has finished for all sequences.
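Once the search has finished, counting how many hits fall into each orientation could look roughly like the sketch below. It assumes the results were written in BLAST+ tabular format (`-outfmt 6`) with the default twelve columns, which the thread does not confirm. `tally_orientations` is a hypothetical helper, and the identifiers, identity values, e-values and bit scores in the example rows are placeholders; only the coordinates come from the question.

```python
import csv
import io
from collections import Counter

# Default column order of BLAST+ tabular output (-outfmt 6); an assumption here.
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def tally_orientations(handle, max_evalue=1e-30):
    """Count hits whose query and subject coordinates run the same or opposite way."""
    counts = Counter()
    reader = csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        # Skip hits weaker than the chosen e-value cutoff.
        if float(row["evalue"]) > max_evalue:
            continue
        q_forward = int(row["qend"]) >= int(row["qstart"])
        s_forward = int(row["send"]) >= int(row["sstart"])
        counts["same" if q_forward == s_forward else "opposite"] += 1
    return counts


# Placeholder rows; in practice, pass an open file handle to the real output.
example = io.StringIO(
    "tx1\tprotA\t90.0\t343\t30\t0\t5\t1000\t8\t350\t1e-120\t400\n"
    "tx2\tprotB\t85.0\t249\t35\t0\t5\t850\t250\t2\t1e-80\t300\n"
)
print(tally_orientations(example))
```

A small "opposite" count relative to the total would support the impression that such transcripts are rare, while a large one would be a reason to revisit the strand-specificity setting.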