Question: Trinity assembled transcripts blastx with nr database
2.9 years ago, Bioinfonext (Korea) wrote:

Dear all,

I ran blastx on Trinity-assembled transcripts against the nr database, and there is one thing in the output that I do not understand.

In some alignments, the query and the subject coordinates both run in the same direction, from a lower start to a higher end.

Like this:

query start    query end    subject start    subject end
5              1000         8                350

But there are also cases where the direction of one sequence is opposite to that of the other:

query start    query end    subject start    subject end
5              850          250              2

Could you explain why this happens? Are both types of blastx hits correct?

rna-seq • 948 views

Did you assemble with any strandedness option?

written 2.9 years ago by h.mon

Yes, I have strand-specific RNA-seq libraries, so I used the library type option during Trinity assembly. Can both types of transcripts be considered protein-coding transcripts?

modified 2.9 years ago • written 2.9 years ago by Bioinfonext

I am still waiting for an answer. Did you understand my question?

written 2.9 years ago by Bioinfonext

"Can we consider both types of transcripts as protein-coding transcripts?"

Short answer: yes.

Long answer: your question lacks a lot of the information needed for an educated guess. You didn't say which library prep you used. You didn't give the parameters used for assembly and blast. You didn't say whether you have just a few such transcripts or whether they make up a significant proportion of your assembly. You didn't say whether you examined the strand specificity of your reads. You didn't say whether you looked at some of these transcripts in detail.

For example, the query and subject start and end suggest very different lengths, which in turn suggests you used lax parameters for your blastx search to recover more hits, at the cost of increasing the number of false positives.

written 2.9 years ago by h.mon
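
A note on comparing the aligned lengths: blastx translates the nucleotide query, so query coordinates are counted in nucleotides while subject coordinates are counted in amino-acid residues. A minimal sketch of that unit conversion, applied to the two coordinate sets quoted in the question, might look like this. The helper name `aligned_spans` is hypothetical and only for illustration.

```python
def aligned_spans(qstart, qend, sstart, send):
    """Return (query span in codons, subject span in residues).

    Assumes blastx-style coordinates: query positions in nucleotides,
    subject positions in amino acids. abs() makes the result independent
    of which end is reported first.
    """
    query_nt = abs(qend - qstart) + 1    # aligned nucleotides on the query
    subject_aa = abs(send - sstart) + 1  # aligned residues on the subject
    return query_nt / 3, subject_aa


# Coordinates taken from the two examples in the question
for coords in [(5, 1000, 8, 350), (5, 850, 250, 2)]:
    q_codons, s_aa = aligned_spans(*coords)
    print(coords, "query codons:", round(q_codons), "subject residues:", s_aa)
```

Dividing the query span by three puts both spans on the same scale, which makes it easier to judge how different the aligned lengths really are.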

Thanks for the reply.

I used the Ovation RNA-Seq Systems 1–16 for Model Organisms kit (Part No. 0351, Arabidopsis) for RNA-seq library preparation. I got a response from Nugen suggesting library type fr-secondstrand, which is equivalent to library type FR in Trinity. So I assembled the paired-end raw reads with Trinity using library type FR. For blastx against the nr database I used an E-value of 1e-30 and a query coverage of 50%, with max_num_hit 1. I did not check strand specificity after that. I think the opposite-direction transcripts are few in number, but I need to confirm this once blastx completes for all sequences.

modified 2.9 years ago • written 2.9 years ago by Bioinfonext
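
One way to confirm how many hits run in the opposite direction once the search finishes is to count them directly from the report. The sketch below assumes the results were saved in BLAST+ tabular format (`-outfmt 6`) with the default column order. The function names and the example rows are hypothetical; only the four coordinates in each row come from the question.

```python
import csv
from collections import Counter

# Default column order of BLAST+ tabular output (-outfmt 6)
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def orientation(qstart, qend, sstart, send):
    """Return 'same' if query and subject coordinates run the same way."""
    same = (qend >= qstart) == (send >= sstart)
    return "same" if same else "opposite"


def count_orientations(lines):
    """Count hits per orientation from an iterable of tab-separated lines."""
    counts = Counter()
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        rec = dict(zip(COLUMNS, row))
        counts[orientation(int(rec["qstart"]), int(rec["qend"]),
                           int(rec["sstart"]), int(rec["send"]))] += 1
    return counts


# Made-up rows for illustration; IDs, identities and scores are invented
example = [
    "tx1\tprotA\t90.0\t332\t10\t0\t5\t1000\t8\t350\t1e-50\t400",
    "tx2\tprotB\t85.0\t249\t20\t0\t5\t850\t250\t2\t1e-40\t300",
]
print(count_orientations(example))
```

For a real report, an open file handle can be passed in place of `example`, since `count_orientations` accepts any iterable of lines.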