**0** wrote:

I have been trying to work out how current methods for studying gene interactions handle multiallelic SNPs, and I am struggling to find any explicit published discussion of the issue. I believe the joint genotype distribution for two interacting biallelic SNPs can be laid out in the following table:

```
                SNP2
             BB     Bb     bb
SNP1   AA    AABB   AABb   AAbb
       Aa    AaBB   AaBb   Aabb
       aa    aaBB   aaBb   aabb
```

To give a concrete example, consider two biallelic SNPs: rs1200 (A and G variants) and rs801 (C and G variants). The joint distribution for these SNPs is therefore:

```
                 rs801
               CC     CG     GG
rs1200   AA    AACC   AACG   AAGG
         AG    AGCC   AGCG   AGGG
         GG    GGCC   GGCG   GGGG
```

Suppose we now want to compare rs1200 with a triallelic SNP, rs1029256, which has variants A, C and T. I believe the following joint distribution is required for unphased genotypes:

```
                 rs1029256
               AA     AC     AT     CC     CT     TT
rs1200   AA    AAAA   AAAC   AAAT   AACC   AACT   AATT
         AG    AGAA   AGAC   AGAT   AGCC   AGCT   AGTT
         GG    GGAA   GGAC   GGAT   GGCC   GGCT   GGTT
```
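The tables above can be generated mechanically, which makes the growth in cell count easier to see. A SNP with k alleles has k(k+1)/2 unphased genotypes, so a biallelic SNP gives 3 and a triallelic SNP gives 6, and the joint table has 3 x 6 = 18 cells. The following is only a minimal sketch: the function names `unphased_genotypes` and `joint_table` are made up for illustration and do not come from any particular package.

```python
from itertools import combinations_with_replacement

def unphased_genotypes(alleles):
    """Return all unordered allele pairs for one SNP (hypothetical helper)."""
    # Sorting the alleles means each unordered pair is written one way only,
    # e.g. "AG" and never "GA".
    return ["".join(pair)
            for pair in combinations_with_replacement(sorted(alleles), 2)]

def joint_table(alleles1, alleles2):
    """Build the joint genotype table: rows for SNP 1, columns for SNP 2."""
    rows = unphased_genotypes(alleles1)
    cols = unphased_genotypes(alleles2)
    return [[r + c for c in cols] for r in rows]

# rs1200 (A/G) against the triallelic rs1029256 (A/C/T)
table = joint_table("AG", "ACT")
n_cells = sum(len(row) for row in table)

for row in table:
    print(" ".join(row))
print("cells:", n_cells)
```

With two triallelic SNPs the same function would give a 6 x 6 table, so the number of cells to estimate grows quickly relative to the 3 x 3 biallelic case.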

The number of possible genotype combinations grows quickly with the number of alleles, and I imagine many methods cannot handle such SNPs in the same way as biallelic ones. Are multiallelic SNPs generally dropped from the analysis, or are they re-coded so that all minor alleles are grouped together?
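As a rough illustration of the re-coding option, the sketch below collapses every non-major allele into a single class, so that a triallelic SNP is reduced to the usual 0/1/2 coding. This is an assumption about what such a re-coding might look like, not a description of any published method; the function name `collapse_minor_alleles` and the sample genotypes are invented for the example.

```python
from collections import Counter

def collapse_minor_alleles(genotypes):
    """Recode genotypes as the number of non-major alleles (0, 1 or 2).

    Hypothetical helper: the major allele is taken to be the most frequent
    allele in the sample, with ties broken alphabetically.
    """
    allele_counts = Counter(a for g in genotypes for a in g)
    major = max(sorted(allele_counts), key=allele_counts.get)
    return [sum(a != major for a in g) for g in genotypes]

# Invented unphased genotypes at a triallelic SNP (alleles A, C, T)
sample = ["AA", "AC", "AT", "CT", "AA", "TT"]
coded = collapse_minor_alleles(sample)
print(coded)
```

Under this coding, genotypes such as AC and AT become indistinguishable, so any difference in effect between the C and T alleles is lost.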

Thanks for any help you can provide.