Question: Trinity assembled transcripts blastx with nr database
23 months ago by Bioinfonext140, Korea

Dear all,

I have run blastx on Trinity-assembled transcripts against the nr database. There is one thing in the output that I do not understand.

In some alignments, the query coordinates and the subject coordinates both increase, so the two sequences run in the same direction.

Like this:

query start   query end   subject start   subject end
5             1000        8               350

But there are also cases where the direction of one sequence is opposite to that of the other:

query start   query end   subject start   subject end
5             850         250             2

Could you explain why this happens? Are both types of blastx hits correct?

rna-seq • 730 views

Did you assemble with any strandedness option?

written 23 months ago by h.mon24k

Yes, I have strand-specific RNA-seq libraries, so I used the library type option during the Trinity assembly. Can both types of transcripts be considered protein-coding transcripts?

modified 23 months ago • written 23 months ago by Bioinfonext140

I am still waiting for an answer. Did you understand my question?

written 23 months ago by Bioinfonext140

can we consider both types of transcripts as a protein coding transcript?

Short answer: yes.

Long answer: your question lacks too much information for an educated guess. You didn't say which library prep you used. You didn't give the parameters used for assembly and blast. You didn't say whether you have just a few such transcripts or whether they make up a significant proportion of your assembly. You didn't say whether you examined the strand specificity of your reads. You didn't say whether you looked at some of these transcripts in detail.

For example, the query and subject start and end suggest very different lengths, which in turn suggests you used lax parameters for your blastx search to recover more hits, at the cost of more false positives.
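When comparing the two spans, it may help to keep in mind that blastx reports query coordinates in nucleotides and subject coordinates in amino acids, so the query span has to be divided by 3 first. A minimal sketch using the coordinates quoted in the question (the function name `aligned_lengths` is only illustrative):

```python
def aligned_lengths(qstart, qend, sstart, send):
    """Return (query span converted to amino acids, subject span in amino acids).

    Assumes blastx-style coordinates: query positions are nucleotides,
    subject positions are amino acids. Coordinates may be given in either order.
    """
    query_nt = abs(qend - qstart) + 1      # aligned nucleotides on the transcript
    subject_aa = abs(send - sstart) + 1    # aligned residues on the protein
    return query_nt / 3, subject_aa


# The two coordinate sets quoted in the question.
for coords in [(5, 1000, 8, 350), (5, 850, 250, 2)]:
    query_aa, subject_aa = aligned_lengths(*coords)
    print(coords, f"query ~{query_aa:.0f} aa, subject {subject_aa} aa")
```

If the two numbers are close for a hit, the spans cover a similar stretch of protein. A large gap between them would point to many gaps in the alignment or to a misread column.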

written 23 months ago by h.mon24k

Thanks for the reply.

I used the Ovation RNA-Seq Systems 1–16 for Model Organisms (PART NOS. 0351, ARABIDOPSIS) kit for RNA-seq library preparation. I got a response from the NuGEN company; they suggest using library type fr-secondstrand, which is equivalent to library type FR in Trinity. So I assembled the paired-end raw reads with Trinity using library type FR. For blastx against the nr database, an E-value of 1e-30 and a query coverage of 50% were used, along with max_num_hit 1. After that I did not check strand specificity. I think the opposite-direction transcripts are few in number, but I need to confirm this once blastx has completed for all sequences.
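Once the search has finished, the opposite-direction hits could be counted with a short script. The sketch below assumes the results were written in BLAST tabular format with the default column order (`-outfmt 6`); that format, the function names, and the two example rows are assumptions made for illustration, not details taken from the run described above.

```python
from collections import Counter

# Default column order of BLAST tabular output (-outfmt 6); assumed here.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def hit_orientation(qstart, qend, sstart, send):
    """Return 'same' if query and subject coordinates run in the same
    direction, 'opposite' otherwise."""
    query_forward = qend >= qstart
    subject_forward = send >= sstart
    return "same" if query_forward == subject_forward else "opposite"


def count_orientations(lines):
    """Tally orientations over tab-separated hit lines.

    Blank lines and comment lines starting with '#' are skipped.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        row = dict(zip(COLUMNS, line.split("\t")))
        counts[hit_orientation(int(row["qstart"]), int(row["qend"]),
                               int(row["sstart"]), int(row["send"]))] += 1
    return counts


# Two made-up rows that reuse the coordinates quoted in the question.
example = [
    "tx1\tprotA\t90.0\t332\t10\t0\t5\t1000\t8\t350\t1e-50\t400",
    "tx2\tprotB\t85.0\t249\t20\t0\t5\t850\t250\t2\t1e-40\t300",
]
print(count_orientations(example))
```

For a real result file, an open file handle can be passed in place of `example`, since the function only iterates over lines. Note that in blastx tabular output a hit on the reverse strand of the transcript typically shows up as query start greater than query end, so it is worth checking which pair of columns is actually reversed.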

modified 23 months ago • written 23 months ago by Bioinfonext140