Question: Differential expression: replicates in one condition, no replicates in the other
2.3 years ago by
IP590
Denmark/University of Copenhagen
IP590 wrote:

Hi Biostars:

I am facing a problem with differential expression analysis where, due to the intrinsic features of the samples, we can't have replicates in one condition.

Study design: We have a patient with a very rare translocation; no similar translocation has been described anywhere in the world, and we expect the expression of the genes surrounding the translocation to be altered. Hence, we have performed RNA-seq on the patient with the translocation and on 4 controls.

How should I proceed?:

Before you kill me for asking "Can I do differential expression without replicates?": I know that edgeR and DESeq2 provide ways to proceed without replicates, and that NOISeq can be used without replicates. In this case, however, we have replicates for the controls, from which the biological variance of each gene could be estimated, but not for the patient, as there is no other such individual in the world. So my question is: is there any way of estimating the variance from the control group and then comparing against the expression in one single sample, the patient in this case? Or, better, of doing a dispersion estimation for the translocation patient?

My options (from the edgeR docs):

  • Use the genes and transcripts that are far away from the translocation, or on chromosomes other than the one carrying it, to estimate the dispersion of that sample.
  • Use a previously defined dispersion value.

Have any of you faced a similar problem? Furthermore, has anybody tested how the two options above, which edgeR provides for working without replicates, perform?

thanks for reading :)

sequencing edger rna-seq • 1.4k views

Coming back to this, I came across the package OUTRIDER, which is able to find DEGs compared to controls in an n=1 situation. I haven't tried it myself, but it might be worth looking at:

Paper: https://www.sciencedirect.com/science/article/pii/S0002929718304014

written 10 months ago by unawaz50
9 months ago by
Gordon Smyth1.1k
Australia
Gordon Smyth1.1k wrote:

The short answer is that you just proceed as usual. limma, edgeR and DESeq2 have no trouble with this scenario, although the edgeR quasi-likelihood pipeline would be better than the other options. The packages simply estimate variability from groups where you do have replication (controls in your case), and apply the same dispersion estimates to all the samples in all the groups.

The edgeR and DESeq2 pipelines for no replicates are for when none of the groups have any replicates. You however do have replicate controls.

edgeR can be used right down to a two-group comparison with n=2 in one group and n=1 in the other. I'm not saying that such small sample sizes are desirable, but the package will do the best it can with what it gets and will present scientifically defensible results even in that extreme scenario. You can see an example of an n=2 vs n=1 analysis in the discussion to this paper (i.e., my reply to Conrad Burden's first report): https://f1000research.com/articles/5-1438

BTW, the same question has been asked several times on the Bioconductor Support forum, for example: https://support.bioconductor.org/p/63585/ or https://support.bioconductor.org/p/61904/
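The point made in this answer, that variability is estimated from the replicated group and then applied to every sample, can be illustrated with a small normal-theory sketch. This is not what edgeR, limma or DESeq2 actually compute (they fit their own models and share information across genes); it is only a per-gene analogue on log-scale values. The function name `single_case_t` and the expression values are hypothetical.

```python
import math
import statistics


def single_case_t(controls, case):
    """t-like statistic for one case sample against replicated controls.

    With a single case sample, the fitted case mean equals the observation,
    so its residual is zero and all residual degrees of freedom come from
    the controls.
    """
    n = len(controls)
    if n < 2:
        raise ValueError("need at least two control replicates")
    control_mean = statistics.mean(controls)
    # Residual variance: only the controls contribute (df = n - 1).
    s2 = statistics.variance(controls)
    # Standard error of (case - control mean), assuming a common variance.
    se = math.sqrt(s2 * (1.0 + 1.0 / n))
    return (case - control_mean) / se, n - 1


# Hypothetical log2-CPM values for one gene: 4 controls and the patient.
t_stat, df = single_case_t([5.0, 5.2, 4.9, 5.1], 7.0)
print(f"t = {t_stat:.2f} on {df} residual degrees of freedom")
```

With 4 controls and 1 patient there are 3 residual degrees of freedom, all supplied by the controls. The key assumption, as in the answer above, is that the patient shares the controls' variability.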


Hi, thanks for the answer! That is actually what I ended up doing. I had 4 biological replicates for the "healthy" group, and I think things worked fine.

cheers,

modified 9 months ago • written 9 months ago by IP590
10 months ago by
European Union
kristoffer.vittingseerup2.6k wrote:

I am sorry to tell you, but if you cannot get more patients with the translocation, you cannot make any generalizations to other patients. As you have no idea about the variation in the patient with the translocation, you cannot do trustworthy statistics for testing the generalization. That said, you can still do the analysis as a case study, which is what a lot of medical doctors do. Alternatively, you can try to create the same translocation in a cell line and make generalizations from that.

10 months ago by
unawaz50
Australia
unawaz50 wrote:

I've actually had a similar issue to yours, and the way I resolved it was by downloading more controls from public databases. We were using LCLs, so we got them from Geuvadis.

I also did an outlier detection analysis in which I calculated Z-scores and looked for the genes in my patient that did not look like the controls. More info: https://bioinformatics.stackexchange.com/questions/2180/rnaseq-z-score-intensity-and-resources
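A minimal sketch of this kind of Z-score screen, assuming expression values are already normalized and on a log scale. The function name `zscore_outliers`, the cutoff of 3 and the toy data are all hypothetical, not the settings used in the analysis described above.

```python
import statistics


def zscore_outliers(control_expr, patient_expr, threshold=3.0):
    """Return {gene: z} for genes where the patient lies far from controls.

    control_expr: dict mapping gene -> list of log-scale values in controls
    patient_expr: dict mapping gene -> log-scale value in the patient
    """
    outliers = {}
    for gene, values in control_expr.items():
        sd = statistics.stdev(values)
        if sd == 0:
            continue  # Z-score is undefined when the controls do not vary
        z = (patient_expr[gene] - statistics.mean(values)) / sd
        if abs(z) >= threshold:
            outliers[gene] = z
    return outliers


# Hypothetical toy data: two genes, four controls, one patient.
controls = {"geneA": [5.0, 5.2, 4.9, 5.1], "geneB": [8.0, 8.3, 7.9, 8.2]}
patient = {"geneA": 7.0, "geneB": 8.1}
hits = zscore_outliers(controls, patient)
print(hits)
```

With only a handful of controls the per-gene standard deviation is itself noisy, which is one reason adding public controls, as described above, helps this kind of screen.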
