Question: RNA-seq analysis between 2 closely related strains of the same species
3.6 years ago by
GiantSilverSoy (20) wrote:

Hi, I am working on an RNA-seq analysis of a wild type and a gamma-irradiated mutant of a non-model organism. The aim is to identify genes that are differentially expressed between them. So far I have generated 2 separate de novo assemblies, identified the sequences common to both at 90% identity or higher using cd-hit-est-2d, and used that common set as the mapping reference for the DE analysis.

My question is whether my current workflow is fine to continue with, or whether there is a generally accepted workflow for a case like mine. What do you think? Thanks.

rna-seq next-gen assembly • 956 views
modified 3.6 years ago by kristoffer.vittingseerup (3.5k) • written 3.6 years ago by GiantSilverSoy (20)
3.6 years ago by
European Union
kristoffer.vittingseerup (3.5k) wrote:

I'm assuming you don't have a reference genome, given the non-model organism comment (if you do have one, you should take a different approach).

The other possible approach would be to do a de novo assembly of the pooled data and then quantify that assembly in each of your samples.
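To make the pooled approach concrete, here is a minimal sketch that only builds the command lines (it does not run anything). It assumes paired-end reads, Trinity as the assembler and salmon for quantification; neither tool is named in this thread, so treat them as stand-ins. The sample names, file names, memory/CPU values and the `pooled_index` directory are all hypothetical.

```python
from typing import Dict, List, Tuple


def coassembly_commands(
    samples: Dict[str, Tuple[str, str]],
    transcripts_fasta: str,
    out_dir: str = "trinity_pooled",
) -> List[List[str]]:
    """Build argv lists for: one pooled assembly, one index, one quant per sample.

    samples maps a sample name to its (R1, R2) FASTQ paths.
    transcripts_fasta is wherever the assembler writes its transcripts;
    the exact location depends on the Trinity version, so it is passed in.
    """
    # Trinity accepts comma-separated lists of files for --left / --right,
    # which is how reads from all samples are pooled into one assembly.
    lefts = ",".join(r1 for r1, _ in samples.values())
    rights = ",".join(r2 for _, r2 in samples.values())

    commands = [
        ["Trinity", "--seqType", "fq",
         "--left", lefts, "--right", rights,
         "--max_memory", "50G", "--CPU", "8",   # placeholder resources
         "--output", out_dir],
        # Index the single pooled assembly once.
        ["salmon", "index", "-t", transcripts_fasta, "-i", "pooled_index"],
    ]
    # Quantify every sample against the same reference, so each transcript
    # ID means the same thing in every sample.
    for name, (r1, r2) in samples.items():
        commands.append(
            ["salmon", "quant", "-i", "pooled_index", "-l", "A",
             "-1", r1, "-2", r2, "-o", f"quant_{name}"]
        )
    return commands


# Hypothetical sample sheet for illustration only.
SAMPLES = {
    "wt": ("wt_1.fq", "wt_2.fq"),
    "mut": ("mut_1.fq", "mut_2.fq"),
}

if __name__ == "__main__":
    for cmd in coassembly_commands(SAMPLES, "Trinity.fasta"):
        print(" ".join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` once the paths are real. The point of the design is that there is exactly one reference, so no matching step between assemblies is needed.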

Both approaches have problems. The drawback of your solution is that you rely on the % identity cutoff, and you might get 1:many or many:many relationships that are hard to untangle. The drawback of my suggestion is that you might assemble something that is not present in the actual samples.
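To see how often the 1:many case actually happens, one option is to scan the `.clstr` file that cd-hit writes next to its output. The sketch below assumes the usual layout of that file (a `>Cluster N` header, then one line per sequence, with the representative marked by a trailing `*`); the transcript names and the embedded example are made up. It only flags a representative that attracted more than one sequence from the other assembly, so many:many cases would need the comparison run in both directions.

```python
import re
from typing import Dict, List

# Sequence names sit between ">" and "..." on each member line.
MEMBER = re.compile(r">(.+?)\.\.\.")


def parse_clstr(text: str) -> Dict[str, List[str]]:
    """Map each cluster representative to the other sequences in its cluster."""
    groups: List[List[tuple]] = []
    for line in text.splitlines():
        if line.startswith(">Cluster"):
            groups.append([])          # start a new cluster
            continue
        match = MEMBER.search(line)
        if match and groups:
            is_rep = line.rstrip().endswith("*")
            groups[-1].append((match.group(1), is_rep))

    clusters: Dict[str, List[str]] = {}
    for group in groups:
        reps = [name for name, is_rep in group if is_rep]
        if reps:
            clusters[reps[0]] = [name for name, is_rep in group if not is_rep]
    return clusters


def one_to_many(clusters: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep only representatives matched by more than one sequence."""
    return {rep: hits for rep, hits in clusters.items() if len(hits) > 1}


# Made-up example in the assumed .clstr layout.
EXAMPLE = (
    ">Cluster 0\n"
    "0\t1500nt, >wt_t1... *\n"
    "1\t1480nt, >mut_t1... at +/97.50%\n"
    ">Cluster 1\n"
    "0\t900nt, >wt_t2... *\n"
    "1\t880nt, >mut_t2... at +/95.10%\n"
    "2\t870nt, >mut_t3... at +/93.40%\n"
    ">Cluster 2\n"
    "0\t700nt, >wt_t3... *\n"
)

if __name__ == "__main__":
    print(one_to_many(parse_clstr(EXAMPLE)))
```

In the example, `wt_t2` is the ambiguous one, and `wt_t3` has no partner at all, so it would silently drop out of a "common sequences" reference.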

I think I like mine a little better, because you can be certain that it is the same transcript/gene you quantify in both samples.


Comparing expression when you map different samples to different assemblies is just too hard, so I'd second making a single coassembly. But an alternative approach to assembling all the reads together would be:

1) Assemble the control.
2) Map irradiated sample to control assembly, with fairly loose tolerance for mapping to account for polymorphisms.
3) Assemble unmapped reads.
4) Combine the assemblies and use that as your reference.
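The four steps above can be sketched roughly as follows. This is only an outline under assumptions: single-end reads for brevity, Trinity as a stand-in assembler, and BBMap for step 2, with `minid` (its approximate minimum-identity setting) as the "loose tolerance" knob. The `minid` value, file names and output directories are placeholders. Steps 1 to 3 are returned as command lists, not executed; step 4 is plain Python that prefixes each FASTA header so IDs from the two assemblies cannot collide.

```python
from typing import Dict, List


def staged_commands(
    control_fq: str,
    irradiated_fq: str,
    control_fasta: str,
    unmapped_fq: str = "unmapped.fq",
    minid: float = 0.8,  # placeholder; tune to how divergent the strains are
) -> List[List[str]]:
    """Argv lists for steps 1-3 (assemble, map loosely, assemble leftovers)."""
    return [
        # 1) Assemble the control.
        ["Trinity", "--seqType", "fq", "--single", control_fq,
         "--max_memory", "50G", "--CPU", "8", "--output", "trinity_control"],
        # 2) Map irradiated reads to the control assembly; outu= keeps the
        #    reads that did not map.
        ["bbmap.sh", f"ref={control_fasta}", f"in={irradiated_fq}",
         f"outu={unmapped_fq}", f"minid={minid}"],
        # 3) Assemble only the unmapped reads.
        ["Trinity", "--seqType", "fq", "--single", unmapped_fq,
         "--max_memory", "50G", "--CPU", "8", "--output", "trinity_unmapped"],
    ]


def merge_fasta(assemblies: Dict[str, str]) -> str:
    """4) Combine assemblies, prefixing each header with its source label."""
    out: List[str] = []
    for label, fasta_text in assemblies.items():
        for line in fasta_text.splitlines():
            if line.startswith(">"):
                out.append(f">{label}|{line[1:]}")
            elif line.strip():
                out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    for cmd in staged_commands("control.fq", "irradiated.fq", "control.fasta"):
        print(" ".join(cmd))
    print(merge_fasta({"ctrl": ">t1\nACGT\n", "irr": ">t1\nGGCC\n"}), end="")
```

The merged FASTA is then the single reference that both samples get mapped to and quantified against.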

That might reduce the number of spurious assembled sequences that are just due to polymorphisms between the samples.

modified 3.6 years ago • written 3.6 years ago by Brian Bushnell (17k)