A Question About Gene Expression Time-Series In A Two-Sample Problem Setup.
13.7 years ago
User 0179 ▴ 50

When we have two gene expression time-series, each consisting of, say, 10 time-points, one from control and one from treatment, the problem of deciding whether they truly differ despite noise is called a two-sample problem (as in samples from both cases).

Is that also what people refer to as a two-sample experimental setup?

If the two samples were directly hybridized on a microarray, do they form a one-sample time-series? Thank you.

gene microarray • 3.6k views

I think you will have to rephrase the question more clearly. I assume that when you talk about two-sample gene expression data, you mean the type where you simultaneously hybridize two samples on a chip. But it is not clear to me what you mean by "separate time series from control and treatment": do you mean that control was in one channel (e.g. Cy3) and treatment in the other (e.g. Cy5)? Or do you mean that control and treatment time courses were run separately with a constant, common reference in the second channel? And what do you mean later by "two directly hybridized samples"?


My bad. The two samples do not refer to time-points. The data are measurements of differences in the expression levels between treated and control samples of N genes at n time-points. This setup can be realized by directly hybridizing the two samples on microarrays and repeating the hybridization at different time-points after the treatment. The result is a time-series of differences of length n, and deciding whether this profile is differentially expressed with a low p-value depends only on that single time-series. Would that be called a one-sample problem?
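To make the setup concrete, here is a minimal sketch of a one-sample test on a single series of treated-vs-control differences. It uses an exact sign-flip (permutation) test of whether the mean difference is zero; this is only one possible choice, not a method named in this thread. The function name `sign_flip_p_value` and the log-ratio values are hypothetical, and the test treats the time-points as exchangeable, which ignores any correlation between neighbouring time-points.

```python
from itertools import product
from statistics import mean


def sign_flip_p_value(diffs):
    """Exact two-sided sign-flip test of H0: differences are symmetric about 0."""
    observed = abs(mean(diffs))
    n = len(diffs)
    hits = 0
    # Enumerate all 2**n sign assignments (feasible for small n such as 10).
    for signs in product((1, -1), repeat=n):
        flipped = abs(mean(s * d for s, d in zip(signs, diffs)))
        # Small tolerance guards against floating-point ties.
        if flipped >= observed - 1e-12:
            hits += 1
    return hits / 2 ** n


# Hypothetical log2(treated/control) values for one gene at n = 10 time-points.
log_ratios = [0.8, 1.1, 0.9, 1.4, 1.2, 0.7, 1.0, 1.3, 0.6, 0.9]
p = sign_flip_p_value(log_ratios)
print(f"sign-flip p-value: {p:.5f}")
```

Only one series enters the test, which is why this would usually be described as a one-sample problem on the differences.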

13.7 years ago
Michael 54k

If it is just about whether or not you can call this a time-series: the minimum requirement would be that you measured >= 2 time-points to have a series. Even that would sound exaggerated, because gene expression time-series analyses often have many more time-points, and general time-series analysis (e.g. on financial data) has many more still.

However, in a direct comparison of two time-points on a two-channel microarray you get only a single measurement, normally the normalized log fold-change between the channels, which leaves you with one relative measurement. Thus, I wouldn't call that a time-series design. Anyway, language is flexible, but to me "one-sample time-series" makes no sense.

Of course, there can also be, depending on your definition, the trivial cases of a time-series of length 1 or even the empty time series.
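As a rough illustration of the point about relative measurements: one two-channel array gives one log fold-change per gene, and a series only appears once the hybridization is repeated at several time-points. The sketch below assumes background-corrected intensities and skips normalization entirely; the helper `log2_ratio` and all intensity values are made up for the example.

```python
import math


def log2_ratio(cy5, cy3):
    """Log2 fold-change between the two channels of one spot."""
    return math.log2(cy5 / cy3)


# One array: a single relative measurement for this gene.
single_array = log2_ratio(1200.0, 300.0)

# Repeating the hybridization at several time-points yields a series of
# relative measurements (hypothetical Cy5, Cy3 intensity pairs).
arrays = [(400.0, 400.0), (800.0, 400.0), (1600.0, 400.0)]
series = [log2_ratio(cy5, cy3) for cy5, cy3 in arrays]

print(single_array, series)
```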


My bad. The two samples do not refer to time-points. The data are measurements of differences in the expression levels between "treated" and "control" samples of N genes at n time-points. This setup can be realized by the direct hybridization of two samples on cDNA microarrays and the repetition of the hybridization process at different time points after the treatment.



I see, now you clearly have a time-series design.

13.7 years ago

As far as I can tell, this is a semantic question and not a technical question. You have a sample that is either treated or not. You take 10 timepoint measurements for each condition, for a total of 20 measurements. The "two-sample problem" part of your question refers, I assume, to the statistical approach you'll take (e.g. see http://en.wikipedia.org/wiki/Statistical_hypothesis_testing ). A two-sample test such as Student's t-test asks whether the observed difference between the two samples is larger than you would expect if both were drawn from the same population. There are various approaches for time-series analysis to determine if there is a significant difference in how genes behave over time (e.g. Storey's EDGE, http://www.genomine.org/edge/) but you wouldn't call this a "one-sample" problem.
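For illustration only, a two-sample comparison in the Student's t-test sense could be sketched as below. It computes the pooled-variance t statistic and its degrees of freedom using the standard library; turning that into a p-value needs a t-distribution (for example from a statistics package), which is left out here. The function name `student_t` and the expression values are hypothetical, and treating the 10 time-points as independent replicates ignores the time ordering, which dedicated time-course methods are designed to handle.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance two-sample t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled estimate of the common variance.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical log2 expression of one gene, 10 time-points per condition.
control = [7.1, 7.3, 6.9, 7.0, 7.2, 7.4, 7.1, 6.8, 7.0, 7.2]
treated = [7.9, 8.1, 7.8, 8.3, 8.0, 7.7, 8.2, 8.0, 7.9, 8.1]

t, df = student_t(treated, control)
print(f"t = {t:.2f} on {df} degrees of freedom")
```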

I think the confusion comes from your proposal to reformulate the problem by representing the result of the time series as the difference between timepoints rather than the actual timepoint values. This would preserve your ability to detect differences in change (e.g. gene A is changed in treatment but not control) but not differences in overall expression level (e.g. gene A is downregulated in treatment, but the change over time is identical to control).
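A small sketch of that last point, with made-up numbers: if the treated profile is the control profile shifted down by a constant, the differences between successive time-points are identical, so the shift in overall level is invisible once you work with differences. The helper `successive_differences` is a hypothetical name introduced for the example.

```python
from statistics import mean


def successive_differences(series):
    """Change between each time-point and the next."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical log2 expression: treated is control shifted down by 2.
control = [5.0, 5.5, 6.0, 6.5, 7.0]
treated = [x - 2.0 for x in control]

# The change over time is the same in both conditions...
same_change = successive_differences(control) == successive_differences(treated)
# ...but the overall expression level is not.
level_shift = mean(treated) - mean(control)

print(same_change, level_shift)
```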
