Question: Differential expression: replicates in one condition, no replicates in the other
19 months ago by
IP530
Denmark/University of Copenhagen
IP530 wrote:

Hi Biostars:

I am facing a problem with differential expression analysis where, due to the intrinsic features of the samples, we can't have replicates in one condition.

Study design: We have a patient with a very rare translocation; no similar translocation has been described anywhere in the world, and we expect the expression of the genes surrounding the translocation to be altered. Hence, we have performed RNA-seq on the patient with the translocation and on 4 controls.

How should I proceed?:

Before you kill me for asking "Can I do differential expression without replicates?": I know that edgeR and DESeq2 provide ways to proceed without replicates, and that NOISeq can be used without replicates. However, in this case we have replicates for the controls, from which the biological variance of each gene could be estimated, but not for the patient, as there is no other such individual in the world. So my question is: is there any way of estimating the variance from the control group and then comparing against the expression in a single sample, the patient in this case? Or, better, of estimating a dispersion for the translocation patient?

My options (from the edgeR docs):

  • Use genes and transcripts that are far from the translocation, or on chromosomes other than the one carrying it, to estimate the dispersion for that sample.
  • Use a previously defined dispersion value.

Have any of you faced a similar problem? Furthermore, has anybody tested how well these two options, which edgeR provides for working without replicates, perform?

thanks for reading :)


Coming back to this, I came across the package OUTRIDER, which is able to find DEGs compared to controls in an n=1 situation. I haven't tried it myself, but it might be worth a look:

Paper: https://www.sciencedirect.com/science/article/pii/S0002929718304014

Reply written 9 weeks ago by unawaz40
8 weeks ago by
Gordon Smyth620
Australia
Gordon Smyth620 wrote:

The short answer is that you just proceed as usual. limma, edgeR and DESeq2 have no trouble with this scenario, although the edgeR quasi-likelihood pipeline would be better than the other options. The packages simply estimate variability from groups where you do have replication (controls in your case), and apply the same dispersion estimates to all the samples in all the groups.

The edgeR and DESeq2 pipelines for no replicates are for when none of the groups have any replicates. You, however, do have replicate controls.
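
To make the idea concrete, here is a minimal single-gene sketch in plain Python. It is not what limma, edgeR or DESeq2 actually compute (they fit models across all genes and moderate the variance estimates); it only illustrates the principle of estimating variability from the replicated group and assuming it also applies to the unreplicated sample. The helper name `single_case_t` and the expression values are made up for illustration.

```python
import math
import statistics

def single_case_t(patient_value, control_values):
    """t statistic for one case vs. replicated controls (hypothetical helper).

    The variance is estimated from the controls only and is assumed to
    hold for the patient too, as in a pooled two-group model with n=1
    in one group.
    """
    n = len(control_values)
    if n < 2:
        raise ValueError("need at least two controls to estimate variance")
    mean_c = statistics.mean(control_values)
    sd_c = statistics.stdev(control_values)  # sample SD, n - 1 denominator
    # Standard error of (patient - control mean): sd * sqrt(1/1 + 1/n)
    se = sd_c * math.sqrt(1.0 + 1.0 / n)
    return (patient_value - mean_c) / se, n - 1  # statistic, degrees of freedom

# Made-up log2-CPM values for one gene: four controls and the patient.
controls = [5.0, 5.2, 4.8, 5.0]
t_stat, df = single_case_t(7.0, controls)
```

With only the controls contributing to the variance, the test has n - 1 degrees of freedom (3 here), which is why borrowing information across genes, as the packages do, matters so much at these sample sizes.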

edgeR can be used right down to a two-group comparison with n=2 in one group and n=1 in the other. I'm not saying that such small sample sizes are desirable, but the package will do the best it can with what it gets and will present scientifically defensible results even in that extreme scenario. You can see an example of an n=2 vs n=1 analysis in the discussion to this paper (i.e., my reply to Conrad Burden's first report): https://f1000research.com/articles/5-1438

BTW, the same question has been asked several times on the Bioconductor Support forum, for example: https://support.bioconductor.org/p/63585/ or https://support.bioconductor.org/p/61904/


Hi, thanks for the answer! That is actually what I ended up doing. I had 4 biological replicates for the "healthy" group, and I think things worked fine.

cheers,

Reply written 7 weeks ago (modified 7 weeks ago) by IP530
10 weeks ago by
European Union
kristoffer.vittingseerup1.6k wrote:

I am sorry to tell you, but if you cannot get more patients with the translocation, you cannot make any generalizations to other patients. As you have no idea about the variation in patients with the translocation, you cannot do trustworthy statistics to test the generalization. That said, you can still do the analysis as a case study, which is what a lot of medical doctors do. Alternatively, you can try to create the same translocation in a cell line and make generalizations from that.

10 weeks ago by
unawaz40
Australia
unawaz40 wrote:

I've actually had a similar issue to yours, and the way I resolved it was by downloading more controls from public databases. We were using LCLs, so we got them from Geuvadis.

I also did an outlier detection analysis in which I calculated Z-scores and looked for the genes in my patient that did not look like the controls. More info: https://bioinformatics.stackexchange.com/questions/2180/rnaseq-z-score-intensity-and-resources
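
A rough sketch of that kind of Z-score screen, in plain Python, might look like the following. The function name `outlier_genes`, the gene names, the values and the cutoff of 3 are all assumptions for illustration, not the exact procedure I used; the inputs are assumed to be log-scale, normalized expression values.

```python
import statistics

def outlier_genes(patient, controls, threshold=3.0):
    """Return {gene: z} for genes where the patient's |z| exceeds threshold.

    patient:  {gene: log-expression value}
    controls: {gene: [log-expression values, one per control]}
    The threshold of 3.0 is an arbitrary illustrative cutoff.
    """
    flagged = {}
    for gene, values in controls.items():
        sd = statistics.stdev(values)  # sample SD across controls
        if sd == 0:
            continue  # z-score undefined when controls do not vary
        z = (patient[gene] - statistics.mean(values)) / sd
        if abs(z) > threshold:
            flagged[gene] = z
    return flagged

# Made-up values; gene names are placeholders.
controls = {"geneA": [5.0, 5.2, 4.8, 5.0], "geneB": [3.0, 3.5, 2.5, 3.0]}
patient = {"geneA": 7.0, "geneB": 3.2}
flagged = outlier_genes(patient, controls)
```

With only a handful of controls the per-gene standard deviation is very noisy, which is one reason adding more controls from public data helps.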
