Question: How does the "missing" column (3rd column) of the SNPTEST sample file affect the results?
5.8 years ago, lanjinglingxiaoni (Singapore) wrote:

I have a question regarding the "missing" column (3rd column) of the sample file for SNPTEST.

The SNPTEST web page (http://www.stats.ox.ac.uk/~marchini/software/gwas/file_format.html) says:

The sample file has three parts (a) a header line detailing the names of the columns in the file, (b) a line detailing the types of variables stored in each column, and (c) a line for each individual detailing the information for that individual. Here is an example of the start of a sample file for reference 

ID_1 ID_2 missing cov_1 cov_2 cov_3 cov_4 pheno1 bin1
0 0 0 D D C C P B
1 1 0.007 1 2 0.0019 -0.008 1.233 1
2 2 0.009 1 2 0.0022 -0.001 6.234 0
3 3 0.005 1 2 0.0025 0.0028 6.121 1
4 4 0.007 2 1 0.0017 -0.011 3.234 1
5 5 0.004 3 2 -0.012 0.0236 2.786 0
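To make the three-part layout concrete, here is a minimal sketch of reading such a file in Python. The function name `read_sample_file` is hypothetical and not part of SNPTEST; the sketch only assumes whitespace-delimited columns, as in the example above.

```python
import io

# First rows of the example sample file quoted above.
SAMPLE_TEXT = """\
ID_1 ID_2 missing cov_1 cov_2 cov_3 cov_4 pheno1 bin1
0 0 0 D D C C P B
1 1 0.007 1 2 0.0019 -0.008 1.233 1
2 2 0.009 1 2 0.0022 -0.001 6.234 0
"""


def read_sample_file(handle):
    """Split a sample file into column names, column types, and per-individual records.

    Hypothetical helper: line 1 holds the names, line 2 the types,
    and every following line describes one individual.
    """
    lines = [line.split() for line in handle if line.strip()]
    if len(lines) < 2:
        raise ValueError("a sample file needs a header line and a type line")
    names, types, rows = lines[0], lines[1], lines[2:]
    for row in rows:
        if len(row) != len(names):
            raise ValueError(f"expected {len(names)} fields, got {len(row)}: {row}")
    records = [dict(zip(names, row)) for row in rows]
    return names, dict(zip(names, types)), records


names, types, records = read_sample_file(io.StringIO(SAMPLE_TEXT))
# The third column is the per-individual "missing" value.
missing = [float(record["missing"]) for record in records]
```

Values are kept as strings in `records`, so each column can be converted according to its type code later.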

 

The "missing" column holds each sample's missing data proportion (one minus the sample call rate), computed over some set of SNPs.
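As a rough illustration of what such a value represents, the sketch below computes a per-sample missing proportion from a list of genotype calls. The encoding (using `None` for an uncalled genotype) and the function name `missing_proportion` are assumptions made for this example, not SNPTEST conventions.

```python
def missing_proportion(calls):
    """Fraction of uncalled genotypes for one sample.

    Assumption for this sketch: `calls` holds one entry per SNP,
    and None marks a genotype that could not be called.
    """
    if not calls:
        raise ValueError("need at least one SNP to compute a proportion")
    n_missing = sum(1 for call in calls if call is None)
    return n_missing / len(calls)


# Hypothetical toy data: sample ID -> genotype calls at four SNPs.
genotypes = {
    "1": [0, 1, None, 2],
    "2": [0, 0, 1, 2],
}
proportions = {sample_id: missing_proportion(calls)
               for sample_id, calls in genotypes.items()}
```

The sample call rate is then simply one minus this proportion.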

How does "missing" affect the association results?

When handling large datasets, you often split the data into 22 chromosomes, and the missing value for a sample then varies from chromosome to chromosome.
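If a single value per sample is wanted, one option is to pool the per-chromosome values, weighting each by the number of SNPs on that chromosome. The sketch below shows the arithmetic only; whether SNPTEST expects a genome-wide value rather than a per-chromosome one is not stated here, and the function name and toy numbers are hypothetical.

```python
def pooled_missing(per_chromosome):
    """Combine per-chromosome missing proportions into one genome-wide value.

    `per_chromosome` is a list of (missing_proportion, n_snps) pairs for one
    sample. Weighting by n_snps gives the same result as counting missing
    genotypes over all SNPs at once.
    """
    total_snps = sum(n_snps for _, n_snps in per_chromosome)
    if total_snps == 0:
        raise ValueError("no SNPs to pool over")
    total_missing = sum(prop * n_snps for prop, n_snps in per_chromosome)
    return total_missing / total_snps


# Hypothetical sample: 1% missing over 1000 SNPs, 2% missing over 3000 SNPs.
genome_wide = pooled_missing([(0.01, 1000), (0.02, 3000)])
```

A plain average of the per-chromosome values would give a different number whenever the chromosomes carry different numbers of SNPs, which is why the weights matter.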

If "missing" does affect the results, which value should we use?

If "missing" does not affect the results, why does SNPTEST require it for analysis?

 

 


Hi, did you manage to calculate this? I don't know how to calculate the "missing" value when creating a sample file.

written 2.5 years ago by jfertaj90