My data looks as follows:

```
1 rs987435 0 2 1 2 2 1 1
2 rs345783 0 0 0 0 0 0 0
```

I removed the snps with MAF < 5% using the following code

data <- datasnp[rowSums(datasnp==2)/ncol(datasnp) > 0.05, ] , is this correct ??

now I want to test for HWE

The data for the HWE exact should be as follows

```
0 1 2
rs987435 #of zero #of 1 #of 2
```

and so on for the rest of snps, I would like to have a code in R to transform the data as I mentioned can any one help

so what is the correct code for keeping the snps with MAF > 5% ?

What do you think it is?

I'm not sure what did you mean by adding the heterozygous counts? do you mean replace the rowSum(datasnp==2) with 0.5*rowSums(datasnp==1)

My presumption is that

`0`

indicates homozygous for the reference,`1`

heterozygous, and`2`

homozygous for the alternate allele. So assuming the reference is the major allele, then the MAF is the number of`2`

s plus have the number of`1`

s divided by the number of columns.Thank you, my question is after I calculate the MAF the correct way , I will remove the row (SNP) according to the code below, right??

data <- datasnp[rowSums(datasnp==2)+0.5*rowSums(datasnp==1) /ncol(datasnp) > 0.05, ]

Note the extra set of parentheses.

Thank you for answer, I need to understand why the MAF is calculate this way could you please send me a link that explains why

Thank you again

I guess you could just google around for "allele frequency". But frankly this is simply the definition.