Question: Should I be using the reference genome?
3.9 years ago by
Biogeek400 wrote:

Hey all,

I have a bit of a predicament. I'm a student currently analyzing RNA-seq data. I usually go for de novo assembly; however, a reference genome has recently become available for our organism. The problem is that the genome is an incomplete draft: about a quarter to a third of it is estimated to be missing. I obtain about 10,000 extra genes with de novo transcriptome assembly, which brings the number of unigenes closer to the estimated total number of genes.

Do I go with de novo assembly using Trinity, based on my deeply sequenced RNA-seq reads, or do I map to the genome even though it is a draft that is quite incomplete?

Please let me know what would be best.


rnaseq genome • 979 views
3.9 years ago by
WouterDeCoster44k wrote:

I would suggest making a de novo assembly of your transcriptome. Use that to estimate the completeness of the genome, and base your decision on that.

3.9 years ago by
Walnut Creek, USA
Brian Bushnell17k wrote:

Sounds to me like de novo would be better if the genome is really only 70% complete, unless the missing portions are just very low-complexity sequence or perfect repeats of what you've already got (which is likely, or else they should have assembled). Do you know what is expected to be missing? Or what is the evidence that so much is missing? What percent of your RNA reads map to the genome?

You could always do a hybrid: map to the genome and assemble what's left over, or assemble first and then map what didn't assemble. But if you don't trust the genome, the simplest approach is just to assemble de novo.
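To put a number on the mapping question and to collect the leftovers for a hybrid approach, something like the following could work. It is only a rough sketch, assuming the aligner output is plain SAM text; the function name `summarize_sam` is made up for illustration. It relies on the SAM FLAG bits 0x4 (unmapped), 0x100 (secondary) and 0x800 (supplementary).

```python
def summarize_sam(lines):
    """Return (percent_mapped, unmapped_read_names) for SAM-format lines.

    Hypothetical helper for illustration. Each primary record is counted
    once, so for paired-end data the two mates are counted separately and
    a read name can appear twice in the unmapped list.
    """
    total = 0
    mapped = 0
    unmapped_names = []
    for line in lines:
        # Skip blank lines and header lines (headers start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        # Ignore secondary (0x100) and supplementary (0x800) alignments
        # so that each read is only counted once.
        if flag & 0x100 or flag & 0x800:
            continue
        total += 1
        if flag & 0x4:  # 0x4 = segment unmapped
            unmapped_names.append(name)
        else:
            mapped += 1
    percent = 100.0 * mapped / total if total else 0.0
    return percent, unmapped_names


# Tiny made-up example: two mapped reads, one unmapped, one secondary hit.
example = [
    "@HD\tVN:1.6",
    "read1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "read3\t16\tchr1\t200\t60\t50M\t*\t0\t0\t*\t*",
    "read3\t256\tchr2\t300\t0\t50M\t*\t0\t0\t*\t*",
]
percent, leftovers = summarize_sam(example)
print(f"{percent:.1f}% of reads mapped; {len(leftovers)} left over")
```

The names in `leftovers` are the reads you would pull back out of the FASTQ files and hand to the assembler. A low mapping percentage would itself be evidence that a lot of the transcribed genome is missing from the draft.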


Thanks for your comments, guys. I've downloaded the proteome encoded by the genome and I'm running a TransRate CRBB coverage analysis against my de novo assembly to determine how much of the genome I'm covering. Would this give me a firm foundation to stand on if someone asks why I opted for a de novo assembly, as long as I can show statistics indicating that I have excellent coverage? Thanks.
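For a sense of what a coverage statistic of this kind looks like, here is a simplified sketch. It is not TransRate's actual calculation: it just assumes you already have hits of assembled transcripts against the reference proteins as (protein ID, start, end) tuples with 1-based inclusive coordinates, and asks what fraction of reference proteins are covered over at least half their length. The function names and the 0.5 threshold are assumptions for illustration.

```python
def covered_length(intervals):
    """Total length covered by 1-based inclusive intervals, merging overlaps."""
    covered = 0
    current_end = 0
    for start, end in sorted(intervals):
        if end <= current_end:
            continue  # interval lies entirely inside what is already covered
        # Only count the part that extends past the current covered end.
        covered += end - max(start, current_end + 1) + 1
        current_end = end
    return covered


def reference_coverage(ref_lengths, hits, min_fraction=0.5):
    """Fraction of reference proteins covered over >= min_fraction of their length.

    ref_lengths: dict mapping protein ID -> protein length (hypothetical input)
    hits: iterable of (protein_id, start, end) alignments on the protein
    """
    by_ref = {}
    for ref_id, start, end in hits:
        by_ref.setdefault(ref_id, []).append((start, end))
    n_covered = 0
    for ref_id, length in ref_lengths.items():
        fraction = covered_length(by_ref.get(ref_id, [])) / length
        if fraction >= min_fraction:
            n_covered += 1
    return n_covered / len(ref_lengths) if ref_lengths else 0.0


# Made-up example: protA is 80% covered, protB 30%, protC has no hits.
ref_lengths = {"protA": 100, "protB": 200, "protC": 150}
hits = [("protA", 1, 50), ("protA", 40, 80), ("protB", 1, 60)]
result = reference_coverage(ref_lengths, hits)
print(f"{result:.2f} of reference proteins covered at >= 50% of their length")
```

A high value here shows the assembly recovers most of what the draft genome annotates, which supports the de novo choice together with the extra genes that the genome lacks.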

Reply written 3.9 years ago by Biogeek