Question: Strategies For De Novo Assembly Of A Reference
4.6 years ago by yp.cun wrote:

Dear all,

I am using Oases to assemble 15 paired-end RNA-seq datasets (5 tissues with 3 replicates each, from a species with no reference genome) to generate:

a) a transcriptome for each sample, which means 15 xx.fa files,

b) a reference transcriptome based on all 15 xx.fa files from step a).

Let the tissue types be A, B, C, D, E and the replicate IDs be 1, 2, 3. I used the following two strategies to get the reference transcriptome:

1) Use multiple k-mers (31, 41, 51, 61) to generate a transcriptome for each sample (15 transcripts.fa files), and then assemble these 15 transcripts.fa files into a reference transcriptome with a single k-mer (51). (In the latter step I chose a single k-mer because multiple k-mers did not work.)

2) Use a single k-mer (51) to generate a transcriptome for each sample (15 transcripts.fa files), and then assemble these 15 transcripts.fa files into a reference transcriptome with a single k-mer (51).

In our preliminary analysis, the reference transcriptome generated by strategy 1) seems much better than the one from strategy 2). I am not sure whether strategy 1) is the right way to do this; could anyone give me some suggestions? Many thanks.
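
For readers who want to see the two strategies side by side, here is a rough Python sketch that only builds the command lines; it does not run anything. The velveth/velvetg/oases flags are written from memory of the Velvet and Oases manuals and should be checked against the installed versions. The file names (`A1_1.fq`, `A1_2.fq`), the insert length of 200, the per-sample merge k-mer of 27, and the use of the Oases merge mode for the final 15-file step are all assumptions for illustration, not details given in the post.

```python
def merge_commands(out_dir, fastas, k):
    """Commands to merge several transcripts.fa files with a single k-mer."""
    return [
        ["velveth", out_dir, str(k), "-long", *fastas],
        ["velvetg", out_dir, "-read_trkg", "yes", "-conserveLong", "yes"],
        ["oases", out_dir, "-merge", "yes"],
    ]


def sample_commands(sample, reads_1, reads_2, kmers, merge_k=27, ins_length=200):
    """Commands for one sample: one assembly per k-mer, merged if there are several.

    merge_k and ins_length are placeholder values (assumptions).
    Returns (commands, path of the final transcripts.fa for this sample).
    """
    commands, fastas = [], []
    for k in kmers:
        out_dir = f"{sample}_k{k}"
        commands.append(["velveth", out_dir, str(k), "-shortPaired",
                         "-fastq", "-separate", reads_1, reads_2])
        commands.append(["velvetg", out_dir, "-read_trkg", "yes"])
        commands.append(["oases", out_dir, "-ins_length", str(ins_length)])
        fastas.append(f"{out_dir}/transcripts.fa")
    if len(fastas) == 1:
        # Single k-mer (strategy 2): nothing to merge at the sample level.
        return commands, fastas[0]
    merged_dir = f"{sample}_merged"
    commands += merge_commands(merged_dir, fastas, merge_k)
    return commands, f"{merged_dir}/transcripts.fa"


# Tissues A-E with replicates 1-3, as in the post.
SAMPLES = [f"{tissue}{rep}" for tissue in "ABCDE" for rep in (1, 2, 3)]


def build_plan(kmers, reference_k=51):
    """All commands for one strategy: 15 per-sample assemblies, then the reference."""
    plan, sample_fastas = [], []
    for s in SAMPLES:
        cmds, fasta = sample_commands(s, f"{s}_1.fq", f"{s}_2.fq", kmers)
        plan += cmds
        sample_fastas.append(fasta)
    plan += merge_commands("reference", sample_fastas, reference_k)
    return plan


strategy_1 = build_plan([31, 41, 51, 61])  # multiple k-mers per sample
strategy_2 = build_plan([51])              # single k-mer per sample

for cmd in strategy_2[:3]:
    print(" ".join(cmd))
```

Each entry is an argument list, so it could be passed to `subprocess.run(cmd, check=True)` once the flags and paths have been verified for your setup.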



modified 4.6 years ago by mark.ziemann • written 4.6 years ago by yp.cun

4.6 years ago by mark.ziemann wrote:

The Trinity assembler is a very good tool for the job: it has very few parameters to fine-tune and it "just works". You can boost the accuracy of the assembly by using a read corrector such as BFC.

Also, why do you want to generate individual (partial) transcriptomes for the 15 datasets? The pooled 15-sample assembly will be the most accurate build. You can then map reads from each of the 15 datasets to the new assembly in order to quantify transcript expression.
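
As a minimal sketch of the pooled approach, the snippet below builds a single Trinity command that takes all 15 paired-end datasets at once as comma-separated lists. The read file names, the `50G` memory limit, the CPU count, and the output directory name are hypothetical placeholders; the flags are the commonly documented Trinity options, but older releases used different names for the memory setting, so check `Trinity --help` for your version.

```python
def trinity_command(pairs, max_memory="50G", cpu=8, output="trinity_out"):
    """Build one Trinity command line for a pooled assembly of all read pairs.

    pairs: list of (left_fastq, right_fastq) tuples.
    max_memory, cpu and output are placeholder values (assumptions).
    """
    lefts = ",".join(left for left, _ in pairs)
    rights = ",".join(right for _, right in pairs)
    return [
        "Trinity",
        "--seqType", "fq",
        "--left", lefts,
        "--right", rights,
        "--max_memory", max_memory,
        "--CPU", str(cpu),
        "--output", output,
    ]


# Tissues A-E with replicates 1-3; file names are hypothetical.
samples = [f"{tissue}{rep}" for tissue in "ABCDE" for rep in (1, 2, 3)]
pairs = [(f"{s}_1.fq", f"{s}_2.fq") for s in samples]

pooled_cmd = trinity_command(pairs)
print(" ".join(pooled_cmd))
```

The resulting assembly would then serve as the common target when mapping each sample's reads for quantification; the mapping tool is left open here because the answer does not name one.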

written 4.6 years ago by mark.ziemann

Thanks, Mark.

We chose Oases because it is more computationally efficient and faster, but we also tried Trinity. Because Oases supports multiple k-mers, which is better than a single k-mer both in theory and in practice, we are continuing to use Oases.

Assembling all 15 paired-end datasets at the same time requires a lot of memory, which is why we use the current strategies.

written 4.6 years ago by yp.cun