7.9 years ago
ludongsheng • 0
I downloaded the GEO processed matrix file for GSE66157 and extracted the GType column for each individual. The proportion of missing genotypes (NC in the GType column) for each individual reaches 69%, which is implausibly high for the Illumina HumanOmni1-Quad BeadChip.
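For anyone who wants to reproduce the number, here is a minimal sketch of how the per-sample NC rate can be tallied. It assumes a tab-delimited matrix whose genotype columns end in `.GType` (for example `S1.GType`) and that a no-call is written as `NC`. The column suffix, the demo sample names, and the helper names `nc_rate_per_sample` and `nc_rate_from_gz` are illustrative assumptions, so check the actual header of the file before relying on it.

```python
import csv
import gzip
import io


def nc_rate_per_sample(handle, suffix=".GType", missing="NC"):
    """Return {genotype_column: fraction of rows whose call equals `missing`}.

    Assumes a tab-delimited table with one header line; `suffix` is an
    assumed naming convention for the genotype columns.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # Positions of the genotype columns, identified by the assumed suffix.
    cols = [i for i, name in enumerate(header) if name.endswith(suffix)]
    missing_counts = [0] * len(cols)
    total = 0
    for row in reader:
        if len(row) < len(header):
            continue  # skip truncated or blank lines
        total += 1
        for k, i in enumerate(cols):
            if row[i].strip() == missing:
                missing_counts[k] += 1
    if total == 0:
        return {}
    return {header[i]: missing_counts[k] / total for k, i in enumerate(cols)}


def nc_rate_from_gz(path):
    """Same tally, reading directly from a gzip-compressed matrix file."""
    with gzip.open(path, "rt", newline="") as fh:
        return nc_rate_per_sample(fh)


# Tiny in-memory table in the assumed layout (made-up values, not GEO data).
demo = io.StringIO(
    "ID_REF\tS1.GType\tS1.Score\tS2.GType\tS2.Score\n"
    "rs1\tAA\t0.91\tNC\t0.10\n"
    "rs2\tNC\t0.12\tNC\t0.08\n"
    "rs3\tAB\t0.85\tBB\t0.77\n"
    "rs4\tBB\t0.80\tNC\t0.05\n"
)
demo_rates = nc_rate_per_sample(demo)
print(demo_rates)
```

On the real data this would be called as `nc_rate_from_gz("GSE66157_1M_Matrix_processed.txt.gz")` after downloading the file. The rows are streamed rather than loaded at once, since a 1M-marker matrix with many samples can be large.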
My questions are:
- Why is the missing rate so high? The authors reported less than 1.5% missing genotypes per individual in the original publication. My guess is that the authors submitted the 'signal_intensities' file to GEO, and GEO used it as input to recompute the matrix file with a default threshold. That said, I have no experience submitting genotypes to GEO.
- If I still want the individual genotypes, how can I set a lower threshold to obtain appropriate calls? Or can I call the genotypes myself from the signal_intensities file?
The processed matrix file: ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE66nnn/GSE66157/suppl/GSE66157_1M_Matrix_processed.txt.gz
The signal_intensities file: ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE66nnn/GSE66157/suppl/GSE66157_1M_Matrix_signal_intensities.txt.gz