Trinity assembled transcripts blastx with nr database
7.0 years ago
Bioinfonext ▴ 460

Dear all,

I ran blastx on Trinity-assembled transcripts against the nr database, and there is one thing in the output that I do not understand.

In some alignments, the query and the subject run in the same direction, like this:

query start    query end    subject start    subject end
5              1000         8                350

But in other cases the direction of one sequence is opposite to that of the other:

query start    query end    subject start    subject end
5              850          250              2

Could you please explain why this happens? Are both kinds of blastx hits correct?

RNA-Seq • 2.0k views

Did you assemble with any strandedness option?

Yes, I have strand-specific RNA-seq libraries, so I used the library type option during the Trinity assembly. Can both types of transcripts be considered protein-coding transcripts?

I am still waiting for an answer. Did you understand my question?

"Can we consider both types of transcripts as protein-coding transcripts?"

Short answer: yes.

Long answer: your question lacks the information needed for an educated guess. You didn't say which library prep you used. You didn't give the parameters used for the assembly and the BLAST search. You didn't say whether you have just a few such transcripts or whether they make up a significant proportion of your assembly. You didn't say whether you examined the strand specificity of your reads. You didn't say whether you looked at some of these transcripts in detail.

For example, the query and subject start and end coordinates suggest very different lengths, which in turn suggests you used lax parameters for your blastx search to recover more hits, at the cost of more false positives.
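
One thing that may be worth checking before reading too much into the lengths: blastx reports query coordinates in nucleotides and subject coordinates in amino acids, so the query span has to be divided by 3 before the two are compared. A minimal Python sketch, using only the two coordinate rows posted in the question (the function name `aligned_spans` is made up for illustration):

```python
def aligned_spans(qstart, qend, sstart, send):
    """Return (query span in codons, subject span in residues) for one blastx hit.

    Assumes query coordinates are nucleotide positions and subject
    coordinates are amino-acid positions, as in blastx output.
    """
    query_nt = abs(qend - qstart) + 1    # aligned nucleotides on the transcript
    subject_aa = abs(send - sstart) + 1  # aligned residues on the protein
    return query_nt / 3, subject_aa


# Coordinates copied from the two example rows in the question.
examples = [(5, 1000, 8, 350), (5, 850, 250, 2)]

for qstart, qend, sstart, send in examples:
    codons, residues = aligned_spans(qstart, qend, sstart, send)
    print(f"query ~{codons:.0f} codons vs subject {residues} residues")
```

For the first row this gives about 332 codons against 343 residues, and for the second about 282 codons against 249 residues, so a like-for-like comparison is needed before judging how different the lengths really are.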

Thanks for the reply.

I used the Ovation RNA-Seq Systems 1–16 for Model Organisms kit (part no. 0351, Arabidopsis) for RNA-seq library preparation. NuGEN replied that the library type to use is fr-secondstrand, which is equivalent to library type FR in Trinity, so I assembled the raw paired-end reads with Trinity using library type FR. For blastx against the nr database I used an E-value of 1e-30 and a query coverage of 50%, with max_num_hit 1. After that, I did not check the strand specificity. I think the opposite-direction transcripts are few in number, but I need to confirm this once blastx has finished for all sequences.
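
A rough sketch of how the opposite-direction hits could be counted once the run has finished, assuming the results were written in the 12-column tabular format (`-outfmt 6` with default columns, where columns 7 to 10 are qstart, qend, sstart and send). The function name and the two sample rows are made up for illustration; the IDs and scores are placeholders, not real results:

```python
import csv
import io
from collections import Counter


def count_orientations(handle):
    """Count transcripts whose first reported hit runs in the same or the
    opposite direction, judged from the start/end coordinates.

    Assumes tab-separated rows with qstart, qend, sstart, send in
    columns 7-10 (the default BLAST tabular layout).
    """
    counts = Counter()
    seen = set()
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        qseqid = row[0]
        if qseqid in seen:
            continue  # only the first row per transcript is counted
        seen.add(qseqid)
        qstart, qend, sstart, send = (int(x) for x in row[6:10])
        same_direction = (qstart <= qend) == (sstart <= send)
        counts["same" if same_direction else "opposite"] += 1
    return counts


# Placeholder rows built from the coordinates in the question;
# identifiers, identities, E-values and bit scores are invented.
sample = (
    "transcript_1\tprotein_A\t80.0\t332\t66\t0\t5\t1000\t8\t350\t1e-50\t400\n"
    "transcript_2\tprotein_B\t75.0\t282\t70\t0\t5\t850\t250\t2\t1e-40\t300\n"
)

counts = count_orientations(io.StringIO(sample))
print(counts["same"], counts["opposite"])
```

With a real results file, the `io.StringIO(sample)` argument would be replaced by an open file handle. If the opposite-direction fraction turns out to be small, those transcripts can then be inspected one by one.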
