Question: RNA-seq analysis between 2 closely related strains of the same species
16 months ago by
GiantSilverSoy0 wrote:

Hi, I am working on an RNA-seq analysis of a wild type and a gamma-irradiated mutant of a non-model organism. The aim is to identify differentially expressed (DE) genes between them. So far I have generated two separate de novo assemblies, identified the sequences common to both with at least 90% identity using cd-hit-est-2d, and used that common set as the mapping reference for the DE analysis.

My question is whether it is fine to continue with my current workflow, or whether there is a generally accepted workflow for a case like mine. What do you think? Thanks.

rna-seq next-gen assembly • 497 views
modified 16 months ago by kristoffer.vittingseerup830 • written 16 months ago by GiantSilverSoy0
16 months ago by
European Union
kristoffer.vittingseerup830 wrote:

I'm assuming you don't have a reference genome, given the non-model organism comment (if you do have one, you should take a different approach).

The other possible approach would be to do a de novo assembly of the pooled data and then quantify each of your samples against that single assembly.

Both approaches have problems. The drawback of your solution is that you rely on the percent-identity cutoff, and you might get 1:many or many:many relationships between contigs that are hard to untangle. The drawback of my suggestion is that you might assemble something that is not present in the actual samples.
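To get a feel for how tangled the identity-cutoff matches are, you could tabulate them. Below is a minimal Python sketch, assuming the matches between the two assemblies have already been parsed into (wild-type contig, mutant contig) pairs. The function name `classify_matches` and the input format are hypothetical; this does not read cd-hit output directly.

```python
from collections import defaultdict

def classify_matches(pairs):
    """Label each (wt_contig, mut_contig) match as 1:1, 1:many, many:1 or many:many.

    `pairs` is an iterable of (wt_contig, mut_contig) tuples (assumed input format).
    """
    pairs = list(pairs)
    wt_to_mut = defaultdict(set)
    mut_to_wt = defaultdict(set)
    for wt, mut in pairs:
        wt_to_mut[wt].add(mut)
        mut_to_wt[mut].add(wt)

    labels = {}
    for wt, mut in pairs:
        wt_multi = len(wt_to_mut[wt]) > 1    # wt contig matches several mutant contigs
        mut_multi = len(mut_to_wt[mut]) > 1  # mutant contig matches several wt contigs
        if wt_multi and mut_multi:
            labels[(wt, mut)] = "many:many"
        elif wt_multi:
            labels[(wt, mut)] = "1:many"
        elif mut_multi:
            labels[(wt, mut)] = "many:1"
        else:
            labels[(wt, mut)] = "1:1"
    return labels
```

Only the pairs labelled 1:1 can be compared between the two assemblies without further decisions; everything else needs some rule for merging or discarding contigs.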

I think I like mine a little better, because you can be certain that it is the same transcript/gene that you quantify in both samples.
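With a single shared assembly, the per-sample counts line up by transcript ID with no extra matching step. Here is a rough Python sketch, assuming each sample's quantification has been loaded into a dict of transcript ID to read count. The function name, sample names and transcript IDs are made up for illustration.

```python
def build_count_matrix(counts_by_sample):
    """Combine per-sample counts into one table keyed by transcript ID.

    `counts_by_sample` maps sample name -> {transcript_id: count}.
    Transcripts missing from a sample get a count of 0.
    """
    samples = list(counts_by_sample)
    transcripts = sorted({t for counts in counts_by_sample.values() for t in counts})
    matrix = {
        t: [counts_by_sample[s].get(t, 0) for s in samples]
        for t in transcripts
    }
    return samples, matrix


# Hypothetical counts against the same pooled assembly
samples, matrix = build_count_matrix({
    "wild_type": {"tx1": 120, "tx2": 15},
    "mutant": {"tx1": 95, "tx3": 7},
})
```

A transcript that ends up with 0 in every sample would be a hint that the pooled assembly built something that is not really there.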

written 16 months ago by kristoffer.vittingseerup830

Comparing expression when you map different samples to different assemblies is just too hard, so I'd second making a single coassembly. But an alternative approach to assembling all the reads together would be:

1) Assemble the control.
2) Map irradiated sample to control assembly, with fairly loose tolerance for mapping to account for polymorphisms.
3) Assemble unmapped reads.
4) Combine the assemblies and use that as your reference.

That might reduce the number of spurious assembled sequences that are just due to polymorphisms between the samples.
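Step 4 is mostly bookkeeping: the two assemblies may reuse the same contig names, so the IDs need to stay unique in the combined reference. Here is a minimal Python sketch, assuming both assemblies are plain FASTA text. The function name and the prefixes are hypothetical, not part of any assembler.

```python
def combine_fasta(sources):
    """Merge FASTA texts, prefixing every header so contig IDs stay unique.

    `sources` maps a prefix (e.g. "ctrl") to the FASTA text of one assembly.
    """
    out = []
    for prefix, text in sources.items():
        for line in text.splitlines():
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                out.append(f">{prefix}_{line[1:]}")  # tag header with its origin
            else:
                out.append(line)
    return "\n".join(out) + "\n"
```

Calling it with, say, `{"ctrl": control_fasta, "unmapped": unmapped_fasta}` keeps track of which contigs came from the control assembly and which from the assembly of unmapped reads.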

modified 16 months ago • written 16 months ago by Brian Bushnell16k