RNA-seq enrichment analysis with only one sample
7.3 years ago

What is the possibility of performing gene enrichment analysis when RNA-seq data from only one sample is available? This is for a hibernation study on an organism for which no previous studies exist. Can the data be compared to existing work, in bear for example? I am trying to see whether certain genes that enable hibernation in bear can be found in my data, and in what quantity. I read here that using up/down regulation is a misconception and that I should use gene enrichment instead. Thank you.

RNA-Seq gene • 2.0k views

Where did it say that using up/down regulation is a misconception? In your case, that would actually not be possible since you only have one group, so you might be misinterpreting that.


It says that the term is misleading in this post: "A: What Does Up-Regulated And Down-Regulated Mean?". If I have only one group, I guess I will not be able to determine the genes that may be expressed. Someone suggested using GFOLD. I am not sure whether the analysis is possible, which is why I am asking. Thank you for your response.


As it says in that post:

When comparing transcript concentrations in two samples on an array, all one can really say is that the transcript is enriched in one relative to the other. Whether up-regulation or down regulation is going on, is a hypothesis for further exploration in subsequent experiments.

I agree with that statement. However, they are questioning the proper term to use, not the general concept. In gene expression studies, you are comparing one group relative to another group. If you have one group, that is not possible.

I am assuming this is the GFOLD you are referring to: https://www.ncbi.nlm.nih.gov/pubmed/22923299

We present the GFOLD (generalized fold change) algorithm to produce biologically meaningful rankings of differentially expressed genes from RNA-seq data. GFOLD assigns reliable statistics for expression changes based on the posterior distribution of log fold change. In this way, GFOLD overcomes the shortcomings of P-value and fold change calculated by existing RNA-seq analysis methods and gives more stable and biological meaningful gene rankings when only a single biological replicate is available.

Notice that GFOLD uses the distribution of fold changes. There is no fold change with only one group.
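To make the point concrete, here is a minimal sketch of a plain log2 fold change. This is not the GFOLD algorithm (which works with a posterior distribution); the function name, the pseudocount, and the counts are hypothetical and only illustrate that the calculation needs a value from a second condition.

```python
from math import log2

def log2_fold_change(expr_a, expr_b, pseudocount=1.0):
    """Plain log2 ratio of normalized expression in condition A vs. condition B.

    The pseudocount (an assumption, not a GFOLD parameter) avoids division by zero.
    """
    return log2((expr_a + pseudocount) / (expr_b + pseudocount))

# Hypothetical normalized counts for a single gene.
hibernating = 400.0
active = 100.0  # the second condition is exactly what a one-sample study lacks

print(log2_fold_change(hibernating, active))
```

With only the `hibernating` value there is nothing to pass as the second argument, so no fold change can be computed for that gene.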

7.3 years ago

Enrichment analysis is performed on sets. So if you can define some sets of genes in your data, you can test whether these sets are enriched in genes known to be involved in your process of interest, using for example the hypergeometric distribution.


Thank you for your comment. So you mean that if I can define some sets of genes in my data, and get some genes that are enriched during hibernation in other hibernating organisms, I should be able to use the hypergeometric distribution to determine the enrichment level of my selected genes? I am not sure how to go about doing that. That means it is going to be a pass/fail test for the genes. Is that what you are implying?

ADD REPLY
0
Entering edit mode

Let's say you've selected 10 genes from your data using whatever criterion. You can now count how many of these 10 genes are orthologs of, for example, bear hibernation genes. Let's say you find 3. The hypergeometric distribution can tell you how likely it is to find 3 genes of interest in a group of 10 by chance.
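The calculation described above can be sketched in a few lines of standard-library Python. Only the 10 selected genes and the overlap of 3 come from the example; the background size `N` and the number of hibernation orthologs `K` are made-up numbers that you would replace with counts from your own annotation, and the function name is hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k): chance of at least k genes of interest among n genes
    drawn without replacement from N genes, K of which are of interest."""
    total = comb(N, n)
    hits = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(max(k, 0), min(K, n) + 1)
    )
    return hits / total

N = 20000  # assumed: number of genes in the background (e.g. all annotated genes)
K = 200    # assumed: background genes that are orthologs of bear hibernation genes
n = 10     # genes selected from your data (from the example above)
k = 3      # selected genes that are hibernation orthologs (from the example above)

p_value = hypergeom_upper_tail(k, N, K, n)
print(f"P(overlap >= {k}) = {p_value:.3g}")
```

The upper tail (at least 3, rather than exactly 3) is what is usually reported as the enrichment p-value. Note that the result depends strongly on what you use as the background `N`, so that choice should be stated along with the result.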


Gotcha. I will try that and see how it turns out. I will give an update if that works. I am assuming that I will be able to get the probability of those genes getting expressed in my organism. This is just a preliminary study for further analysis.
