I have four samples (A, B, C, D) and two classes of gene lists (E, F).
For example, one gene belonging to E was positive (referred to as **1**) in A but negative (referred to as **0**) in B, C and D. How can I use glm to predict the role of the samples and the gene list on the values (1 or 0)?
The data look like this:

```
pc list id gene
1 E A AIFM1
0 E B AIFM1
1 E C AIFM1
NA E D AIFM1
0 F A ARAF
0 F B ARAF
1 F C ARAF
1 F D ARAF
```

Thanks in advance!

So sorry, let me clarify.

I have 4 samples (A, B, C and D) and 2 gene lists (E and F). For example, the expression value of one gene in the E cluster was 1.1, 2.0, 0.5 and 0 in samples A, B, C and D, respectively. If the expression value was less than 1.0, I assigned 0; otherwise, 1. This gene is therefore coded 1, 1, 0, 0.

My question: how can I use glm or other methods to predict the role of the samples and the gene list on the values (1 or 0)?

Do you mean a model like
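One such model, assuming a logistic regression with main effects only for gene list and sample (the coefficient symbols are illustrative, and A and E are taken as reference levels), could be written as:

```latex
\[
\operatorname{logit} P(\mathrm{pc} = 1)
  = \beta_0
  + \beta_{F}\,\mathbb{1}[\mathrm{list} = F]
  + \sum_{s \in \{B, C, D\}} \beta_{s}\,\mathbb{1}[\mathrm{id} = s]
\]
```

Here $\beta_F$ is the log-odds difference between list F and list E, and each $\beta_s$ is the log-odds difference between sample $s$ and sample A.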

If that's the case, you can just run glm in R with

Here `data` is the data.frame. R should automatically dummy-code the categorical variables.
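A minimal sketch of what such a call might look like, using the eight example rows from the question. The formula `pc ~ list + id` (main effects only) is an assumption, not necessarily the formula the answer had in mind. With so few rows the fit is not meaningful, and R will likely warn about fitted probabilities of 0 or 1.

```r
# Example rows copied from the question; `data` is the name used in the answer.
data <- data.frame(
  pc   = c(1, 0, 1, NA, 0, 0, 1, 1),
  list = factor(c("E", "E", "E", "E", "F", "F", "F", "F")),
  id   = factor(rep(c("A", "B", "C", "D"), times = 2)),
  gene = rep(c("AIFM1", "ARAF"), each = 4)
)

# Assumed model: logistic regression with main effects for list and sample.
# The row with pc = NA is dropped by the default na.action.
fit <- glm(pc ~ list + id, family = binomial, data = data)
summary(fit)

# The dummy (treatment) coding R builds for the two factors:
# one column for list F, and three columns for samples B, C and D.
model.matrix(~ list + id, data = data)

# Likelihood-ratio test for each factor as a whole, rather than
# level-by-level Wald tests against the reference level.
drop1(fit, test = "Chisq")
```

The `drop1()` table gives one test per factor, which is usually the more natural question for a four-level variable like the sample than the per-level p-values in `summary(fit)`.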

Thanks a lot. I did that, but the result showed that the sample was not significant. I am not sure whether glm is suitable for two variables containing 2 and 4 categories.

I hope your dataset is larger than 8 rows. I would not dichotomize. Do not switch from the raw values to this zero-one coding: you lose power, and that is not the only drawback.
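As a rough sketch of modelling the raw values instead, the following uses simulated (not real) expression values. The column name `expr`, the 20 genes per list, and the exponential distribution are all assumptions made for illustration. It also ignores that the four samples of one gene are repeated measurements.

```r
set.seed(1)

# Simulated data: 40 hypothetical genes, each measured in samples A-D.
sim <- expand.grid(id = c("A", "B", "C", "D"), gene = paste0("g", 1:40))
gene_index <- as.integer(sub("g", "", sim$gene))
sim$list <- ifelse(gene_index <= 20, "E", "F")   # first 20 genes in E, rest in F
sim$expr <- rexp(nrow(sim), rate = 1)            # made-up expression values

# Model the raw expression values directly.
fit_raw <- lm(expr ~ list + id, data = sim)
summary(fit_raw)

# For comparison: the dichotomized version with the 1.0 cutoff from the question.
sim$pc <- as.integer(sim$expr >= 1)
fit_bin <- glm(pc ~ list + id, family = binomial, data = sim)
summary(fit_bin)
```

The first model uses all of the information in the measurements, while the second only knows on which side of 1.0 each value fell.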

Indeed, you need a larger dataset.