
I have a table like this:

```
       Age5  Age5  Age5  Age22  Age22  Age22
Gene1   1.2   2.3   4.5    3.4    4.5    1.3
Gene2   2.4  -2.3   1.3    1.2    3.4    4.5
```

i.e. two age groups (5 and 22) for multiple genes, where the values are log2-transformed gene expression data.

For one gene, for example, I did a linear regression (so the x axis is age and the y axis is the log2 expression value). The statistics output from that regression are:

```
equation type: linear
co-efficient = -0.127
intercept = 4.85
data transformation = log2
% change between age group 5 and 22 = 51%
```

The problem is that I do not understand how they calculated the % change as 51% from this information.

For example, I said: y = b0 + b1(x)

For expression data at age 5:

```
y = 4.85 + (5)(-0.127)
y = 4.85-0.635
y = 4.215
```

But since y (gene expression) is log2 transformed, I log2-transformed 4.215, so the expression at age 5 (i.e. y) is 2.075.

Then I did the same for age 22:

```
y = 4.85 + (22)(-0.127)
y = 4.85 - 2.794
y = 2.056
```

Similarly, since y is log2 transformed, I log2-transformed 2.056, so the expression at age 22 (i.e. y) is 1.040.

I cannot seem to combine the two expression values (i.e. 2.075 and 1.040) in a way that gives me a 51% change in gene expression between the two age groups as calculated from a linear regression. Can someone show me how this calculation is done?

In case anyone is interested, this is where I got all the above numbers used in my calculations from.

Sorry, but I cannot follow your question exactly. Did you do the linear regression yourself? If yes, please show your code and data. If not, what exactly do you want to know? How to transform log2 data back to non-transformed data?

You can undo the log2 transformation as follows:
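A minimal Python sketch of the back-transformation (the thread itself works in R, where the equivalent is `2^x`). It assumes the slope and intercept quoted in the question and that the fitted values are already on the log2 scale; the function and variable names are illustrative only.

```python
# Slope and intercept as quoted in the question
# (log2 expression regressed on age in months).
intercept = 4.85
slope = -0.127


def fitted_log2(age):
    """Fitted log2 expression at a given age."""
    return intercept + slope * age


def undo_log2(value):
    """Back-transform a log2 value to the original (linear) scale."""
    return 2 ** value


log2_age5 = fitted_log2(5)    # 4.85 - 0.635
log2_age22 = fitted_log2(22)  # 4.85 - 2.794

linear_age5 = undo_log2(log2_age5)
linear_age22 = undo_log2(log2_age22)

# Percentage change between the two ages on the linear scale.
pct_change_linear = (linear_age22 / linear_age5 - 1) * 100

print(round(log2_age5, 3), round(log2_age22, 3))
print(round(pct_change_linear, 1))
```

Under these assumptions the change on the linear scale is roughly a 78% decrease, so back-transforming alone does not appear to reproduce the quoted 51%.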

Thank you for taking the time to reply. I downloaded a table from a database, the Digital Ageing Atlas. One example of a gene is here. For the gene in the example link, they did a linear regression for 5- and 22-month-old mice and found the slope (coefficient) of the regression to be -0.127 and the intercept to be 4.85. The expression data were log2 transformed, and they report a 51% decrease in expression for this gene between the ages of 5 and 22 months (all of this information is in the link). I do not understand how they calculated the 51%. I checked in R that I get the same values (i.e. slope and intercept) when I do the linear regression myself (let me know if posting the code would make a difference).

Can you show me how they used the above numbers (e.g. the slope, intercept, ages, log2) to calculate a 51% decrease in expression?
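One arithmetic route that lands close to 51%, offered as a guess rather than as the database's documented method: take the percentage change between the two fitted values directly on the log2 scale, without back-transforming. The Python sketch below assumes only the slope, intercept, and ages quoted above; nothing in this thread confirms that the site computes it this way.

```python
# Values quoted in the thread; variable names are illustrative.
intercept = 4.85
slope = -0.127
age_young, age_old = 5, 22

# Fitted values, left on the log2 scale (an assumption about the site's method).
fitted_young = intercept + slope * age_young
fitted_old = intercept + slope * age_old

# Relative change between the fitted log2 values, as a percentage.
pct_change_log_scale = (fitted_old - fitted_young) / fitted_young * 100

print(round(pct_change_log_scale, 1))
```

A percentage taken on log2 values is not a percentage change in expression, so the match should be treated with caution.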

I think it would help if you show your code and data.

So the code to conduct the linear regression (and F test) looks like this:

The table is too long to post, as it is a gene expression matrix: the columns hold log2-transformed gene expression data, the rows are 19,000 genes, and the column names are the age categories, e.g. "Age5 Age5 Age5 Age22 Age22 Age22". Regardless, is it possible to calculate the 51% using only the information provided in the link I have shown (i.e. intercept, slope, etc.)?

It would help to give the values for your gene of interest, but I think the 51% is the difference between the means of the groups, like a fold change.
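As a sketch of what a fold change between group means looks like on log2 data, the Python snippet below uses the Gene1 row from the example table in the question, not the gene behind the 51% figure (whose values are not shown in this thread). Variable names are illustrative.

```python
# Gene1 row from the example table in the question (log2 expression).
age5_log2 = [1.2, 2.3, 4.5]
age22_log2 = [3.4, 4.5, 1.3]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# A difference of log2 means is a log2 fold change.
log2_fold_change = mean(age22_log2) - mean(age5_log2)

# Back-transform to a fold change and a percentage on the linear scale.
fold_change = 2 ** log2_fold_change
pct_change = (fold_change - 1) * 100

print(round(log2_fold_change, 3), round(fold_change, 3), round(pct_change, 1))
```

Because the means are taken on log2 values, the back-transformed ratio compares geometric means of the two groups on the linear scale.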

Thank you for the reply. So I understand that the 51% is the difference between the groups, yes.

My question is: in my example, knowing ONLY the elements of the linear regression, i.e. the slope (-0.127), the intercept (4.85), the age range (5-22 months), and the data transformation for the gene expression (log2), is it possible to calculate the 51% value WITHOUT needing the raw data? And if this is possible, what is the calculation?

No, you need the means of both groups.

OK great, thanks, I appreciate that!

I was told that I could calculate the 51% using only the numbers above. So instead, take this gene (exactly the same as the previous example, except with a 10% increase instead of 51%; I have only changed it because I can quickly find the gene expression data for it).

I have a gene expression file like this (line 1 is sample names, line 2 is sample ages, and line 3 is log2 gene expression for this particular gene):

I put this line into the above code and get exactly the slope, intercept, p values, etc. described in the hyperlink in this comment. Can you tell me how I could change the R code I've given above to extract the percentage change with age? Note that in this particular case it is not the difference in means between two groups, but rather a change with age overall (and I know that the answer should be a 10% increase in gene expression for this gene with age).
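The R code and the expression line for this gene are not shown in the thread, so the following is only a Python sketch of one way to get a percentage change from a fitted slope. It uses the Gene2 row of the example table as stand-in data, fits an ordinary least-squares line by hand, and converts the fitted log2 change across the age range to a percentage on the linear scale. That conversion is an assumption about what "percentage change with age" means, not the database's confirmed method.

```python
# Stand-in data: Gene2 row from the example table (log2 expression),
# with the age of each sample in months.
ages = [5, 5, 5, 22, 22, 22]
log2_expr = [2.4, -2.3, 1.3, 1.2, 3.4, 4.5]


def least_squares(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


slope, intercept = least_squares(ages, log2_expr)

# Fitted change in log2 expression across the observed age range.
age_span = max(ages) - min(ages)
log2_change = slope * age_span

# Back-transform the log2 change to a percentage on the linear scale.
pct_change_linear = (2 ** log2_change - 1) * 100

print(round(slope, 4), round(intercept, 4))
print(round(pct_change_linear, 1))
```

With only two distinct ages, the fitted change over the age range equals the difference between the two group means, so the regression and the group-mean fold change agree; with more age points they would generally differ.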

It seems to me that the website you are referring to has a poor description of its methods. If you don't know how they arrive at a percentage or value, you also don't know whether you can use it. So my advice would be to analyze the data yourself, so that you know what you are doing. Good luck!