Question: GWAS Simulation
int11ap1 (Barcelona) wrote:

Hi people,

I have my PLINK files and would like to simulate a GWAS. These are the steps I follow:

1) Obtain a list of causal SNPs (here, 10 SNPs are selected at random as causal).

awk '{print $2}' plink.map | shuf -n 10 > CAUSAL_LIST

2) Simulate phenotypes from the causal list.

./bin/gcta64 --bfile plink8 --simu-qt --simu-causal-loci CAUSAL_LIST --simu-hsq 0.5 --simu-rep 3 --out qphenotype

3) Write the simulated phenotypes into the plink.ped file.

awk 'FNR==NR{a[NR]=$3;next}{$6=a[FNR]}1' qphenotype.phen plink.ped > temp.ped

cp temp.ped plink.ped
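
The awk one-liner in step 3 pairs the two files by line number, so the phenotypes only land on the right samples if qphenotype.phen and plink.ped list the individuals in the same order. A minimal sketch of a variant that matches on family and individual ID instead, assuming columns 1 and 2 of both files are FID and IID and that the first replicate (column 3 of the .phen file) is the one wanted. merge_pheno_by_id is a hypothetical helper name, not part of PLINK or GCTA:

```shell
# Sketch: copy the first simulated phenotype (column 3 of the .phen file)
# into column 6 of the .ped file, matching rows on FID and IID.
# merge_pheno_by_id is a hypothetical helper, not a PLINK or GCTA command.
merge_pheno_by_id() {
    phen_file=$1
    ped_file=$2
    awk '
        # First file (.phen): store the phenotype keyed by FID and IID.
        FNR == NR { pheno[$1 SUBSEP $2] = $3; next }
        # Second file (.ped): overwrite column 6; samples without a
        # simulated value get -9, the PLINK missing-phenotype code.
        {
            key = $1 SUBSEP $2
            $6 = ((key in pheno) ? pheno[key] : -9)
            print
        }
    ' "$phen_file" "$ped_file"
}
```

It would be called as `merge_pheno_by_id qphenotype.phen plink.ped > temp.ped`, in place of the awk line above.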

4) Association analysis.

./bin/plink --noweb --file plink --assoc --allow-no-sex

This generates a new file with the p-values, which I plot as -log10(p) in R. Sorting the p-values in ascending order, I see many false positives: some of my causal SNPs are at the top, but others are not. Applying an FDR threshold of 0.05 does not help much.

My question is: why aren't all of my causal SNPs at the top?
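
The ranking check described here can be sketched in the shell as well. This assumes the association output has a header row containing SNP and P columns (with a quantitative phenotype, PLINK 1.x --assoc should write these to plink.qassoc) and that GNU sort is available for the -g option. rank_causal is a hypothetical helper name:

```shell
# Sketch: list SNPs by ascending p-value and flag the ones in CAUSAL_LIST.
# rank_causal is a hypothetical helper, not a PLINK or GCTA command.
rank_causal() {
    assoc_file=$1
    causal_list=$2
    awk '
        # Header row: find the SNP and P columns by name.
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "SNP") s = i
                if ($i == "P") p = i
            }
            next
        }
        # Keep SNP ID and p-value, skipping untested SNPs.
        $p != "NA" { print $s, $p }
    ' "$assoc_file" |
    LC_ALL=C sort -g -k2,2 |
    awk '
        # First input: the causal SNP IDs.
        FNR == NR { causal[$1] = 1; next }
        # Second input (stdin): rank, SNP, p-value, causal flag.
        { print FNR, $1, $2, (($1 in causal) ? "causal" : "non-causal") }
    ' "$causal_list" -
}
```

For example, `rank_causal plink.qassoc CAUSAL_LIST | head -n 20` shows where the causal SNPs fall among the top 20 hits.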

written 5.8 years ago by int11ap1

How many SNPs and samples do you have? What is the lowest p-value you observe?

written 5.8 years ago by Maxime Lamontagne

400 SNPs. The lowest: 3.4·10^-11.

written 5.8 years ago by int11ap1

If you repeat the analysis with 10 new SNPs, do you still get false positives?

written 5.8 years ago by Maxime Lamontagne

Yes, I still get false positives. However, a PCA showed that there was stratification in my population, so I have removed this bias. I then ran another GWAS simulation (for the quantitative trait) without it, but now no SNP is significant. Might this be because of my sample size (120 individuals)?

modified 5.8 years ago • written 5.8 years ago by int11ap1

Yes, it's possible. 120 individuals is a small dataset for a GWAS. Maybe you could try different options when you simulate the phenotypes.

written 5.8 years ago by Maxime Lamontagne
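
A rough back-of-the-envelope check of the sample-size point, using the numbers from the thread (120 individuals, 400 SNPs, --simu-hsq 0.5, 10 causal SNPs). This is only a sketch under stated assumptions: the heritability is split equally across the causal SNPs (the simulated effect sizes will generally differ between SNPs), the significance cutoff is a Bonferroni correction at 0.05, and the 1-df test's non-centrality parameter is approximated as n·q²/(1−q²), where q² is the variance explained by one SNP. gwas_power_sketch is a hypothetical helper name:

```shell
# Sketch: Bonferroni threshold and approximate non-centrality parameter.
# Arguments: sample size, number of SNPs tested, heritability, number of
# causal SNPs. Assumes equal variance explained per causal SNP.
gwas_power_sketch() {
    awk -v n="$1" -v m="$2" -v h="$3" -v c="$4" 'BEGIN {
        q2 = h / c                                   # variance explained per causal SNP
        printf "bonferroni_alpha=%g\n", 0.05 / m     # per-SNP significance cutoff
        printf "per_snp_variance=%g\n", q2
        printf "approx_ncp=%.2f\n", n * q2 / (1 - q2)
    }'
}

gwas_power_sketch 120 400 0.5 10
```

A small non-centrality parameter relative to the chi-square value needed to reach the Bonferroni cutoff means that many causal SNPs are expected to miss significance, which is in line with the small-dataset explanation above.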