Question: Differential expression: replicates in one condition, no replicates in the other
Asked 23 months ago by IP (Denmark/University of Copenhagen):

Hi Biostars:

I am facing a problem with differential expression analysis where, due to the intrinsic features of the samples, we can't have replicates in one condition.

Study design: We have a patient with a very rare translocation. No similar translocation has been described anywhere in the world, and we expect the expression of the genes surrounding the translocation to be altered. Hence, we have performed RNA-seq on the translocated patient and 4 controls.

How should I proceed?:

Before you kill me for asking "Can I do differential expression without replicates?", I know that edgeR and DESeq2 provide ways to proceed without replicates, and that NOISeq could be used without replicates. However, in this case we have replicates for the controls, where the biological variance of each gene could be estimated, but not for the patient, as there is no other such individual in the world. So, my question is: is there any way of estimating the variance from the control group and then comparing it to the expression in one single sample, the patient in this case? Or better, of doing a dispersion estimation for the translocation patient?

My options (from the edgeR docs):

  • Use the genes and transcripts that are far away from the translocation, or on chromosomes other than the one carrying it, to estimate the dispersion of that sample.
  • Use a dispersion value defined previously.
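To make the second option concrete, here is a minimal Python sketch (not edgeR itself, and not taken from the edgeR docs) of what "using a preset dispersion" amounts to for one gene: the counts are assumed to follow a negative binomial with variance mu + dispersion * mu^2, and the patient's count is compared against that distribution. The function names, the BCV of 0.4 and the counts are all hypothetical, and the sketch ignores library-size normalisation and the uncertainty in the expected count.

```python
import math

def nb_log_pmf(k, mu, dispersion):
    """Log-probability of count k under a negative binomial with mean mu
    and variance mu + dispersion * mu**2."""
    r = 1.0 / dispersion          # NB "size" parameter
    p = r / (r + mu)              # NB success probability
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log1p(-p))

def nb_upper_tail(k, mu, dispersion):
    """P(X >= k) under the same negative binomial distribution."""
    below = sum(math.exp(nb_log_pmf(j, mu, dispersion)) for j in range(k))
    return max(0.0, 1.0 - below)

# Hypothetical values, chosen only for illustration.
bcv = 0.4                  # assumed biological coefficient of variation
dispersion = bcv ** 2      # dispersion is the squared BCV
expected_count = 100.0     # e.g. the mean count of the gene in the controls
patient_count = 250        # count observed in the single patient

print(nb_upper_tail(patient_count, expected_count, dispersion))
```

The answer depends heavily on the dispersion you plug in, which is the main weakness of this option: a smaller assumed BCV makes the same count look far more extreme.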

Have any of you faced a similar problem? And, furthermore, has anybody tested how the two options mentioned above, which edgeR provides for working without replicates, perform?

Thanks for reading :)

Tags: sequencing, edger, rna-seq

Coming back to this, I came across the OUTRIDER package, which is able to find DEGs compared to controls in an n=1 situation. I haven't tried it myself, but it might be worth looking at:

Paper: https://www.sciencedirect.com/science/article/pii/S0002929718304014

Reply written 6 months ago by unawaz
Answer written 5 months ago by Gordon Smyth (Australia):

The short answer is that you just proceed as usual. limma, edgeR and DESeq2 have no trouble with this scenario, although the edgeR quasi-likelihood pipeline would be better than the other options. The packages simply estimate variability from groups where you do have replication (controls in your case), and apply the same dispersion estimates to all the samples in all the groups.
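As a rough illustration of that principle only (this is not what limma, edgeR or DESeq2 actually compute, since they add empirical Bayes moderation and, for the count-based packages, negative binomial models), here is a minimal Python sketch of a pooled-variance t-statistic for one gene on log-expression values. With a single patient, the pooled variance comes entirely from the controls. The function name and the numbers are hypothetical.

```python
import math
import statistics

def single_case_t(controls, patient):
    """Pooled-variance two-sample t-statistic when one group has n = 1.

    The single sample contributes nothing to the variance estimate, so the
    pooled standard deviation is just the control standard deviation.
    Returns the t-statistic and its degrees of freedom.
    """
    n = len(controls)
    mean_c = statistics.mean(controls)
    sd_c = statistics.stdev(controls)        # sample SD (n - 1 denominator)
    se = sd_c * math.sqrt(1.0 / n + 1.0)     # SE of (patient - control mean)
    return (patient - mean_c) / se, n - 1

# Hypothetical log2-expression values for one gene: 4 controls, 1 patient.
t_stat, df = single_case_t([5.1, 4.9, 5.3, 5.0], 7.2)
print(t_stat, df)
```

With 4 controls there are only 3 degrees of freedom per gene, which is why borrowing information across genes, as the packages above do, matters so much in this design.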

The edgeR and DESeq2 pipelines for no replicates are for when none of the groups have any replicates. You, however, do have replicate controls.

edgeR can be used right down to a two-group comparison with n=2 in one group and n=1 in the other. I'm not saying that such small sample sizes are desirable, but the package will do the best it can with what it gets and will present scientifically defensible results even in that extreme scenario. You can see an example of an n=2 vs n=1 analysis in the discussion to this paper (i.e., my reply to Conrad Burden's first report): https://f1000research.com/articles/5-1438

BTW, the same question has been asked several times on the Bioconductor Support forum, for example: https://support.bioconductor.org/p/63585/ or https://support.bioconductor.org/p/61904/


Hi, thanks for the answer! That is actually what I ended up doing. I had 4 biological replicates for the "healthy" group, and I think things worked fine.

cheers,

Reply written 5 months ago by IP
Answer written 6 months ago by kristoffer.vittingseerup (European Union):

I am sorry to tell you, but if you cannot get more patients with the translocation, you cannot make any generalizations to other patients. As you have no idea about the variation in the patient with the translocation, you cannot do trustworthy statistics for testing the generalization. That said, you can still do the analysis as a case study, which is what a lot of medical doctors do. Alternatively, you can try to create the same translocation in a cell line and make generalisations from that.

Answer written 6 months ago by unawaz (Australia):

I've actually had a similar issue to yours, and the way I resolved it was by downloading more controls from public databases. We were using LCLs, so we downloaded them from GEUVADIS.

I also did an outlier detection analysis in which I calculated Z-scores and looked for the genes in my patient that did not look like the controls. More info: https://bioinformatics.stackexchange.com/questions/2180/rnaseq-z-score-intensity-and-resources
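A Z-score screen of this kind can be sketched in a few lines of Python. This is only a guess at the general shape of such an analysis, not the exact procedure used above: it assumes normalised log-expression values, and the gene names, the numbers and the cutoff of 3 are all hypothetical.

```python
import statistics

def patient_z_scores(control_expr, patient_expr):
    """Z-score of the patient's value for each gene, relative to the controls.

    control_expr maps gene -> list of control values; patient_expr maps
    gene -> the patient's value. Genes with zero control variance are skipped.
    """
    scores = {}
    for gene, values in control_expr.items():
        sd = statistics.stdev(values)
        if sd == 0:
            continue
        scores[gene] = (patient_expr[gene] - statistics.mean(values)) / sd
    return scores

# Hypothetical log2-expression values (5 controls, 1 patient).
controls = {
    "GENE_A": [5.0, 5.2, 4.8, 5.1, 4.9],
    "GENE_B": [8.0, 8.3, 7.9, 8.1, 8.2],
}
patient = {"GENE_A": 6.5, "GENE_B": 8.2}

scores = patient_z_scores(controls, patient)
# Flag genes whose patient value lies at least 3 control SDs from the mean.
outliers = {gene: z for gene, z in scores.items() if abs(z) >= 3}
print(outliers)
```

The more controls you have, the more stable the per-gene mean and SD become, which is one reason downloading extra public controls helps with this approach.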
