69% missing genotypes in GEO processed matrix file
7.9 years ago

Dear all,

I downloaded the GEO processed matrix file for GSE66157 and extracted the GType column for each individual. The proportion of missing genotypes (NC in the GType column) reaches 69% for each individual, which is implausibly high for an Illumina HumanOmni1-Quad BeadChip.

My questions are,

  1. Why is the missing rate so high? The authors reported less than 1.5% missing genotypes per individual in the original publication. My guess is that the authors submitted the 'signal_intensities' file to GEO, and GEO used it as input to recalculate the matrix file with a default threshold. That said, I have no experience submitting genotype data to GEO.
  2. If I still want the individual genotypes, how can I set a lower threshold to obtain appropriate calls? Or can I call the genotypes myself from the signal_intensities file?

The processed matrix file: ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE66nnn/GSE66157/suppl/GSE66157_1M_Matrix_processed.txt.gz

The signal_intensities file: ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE66nnn/GSE66157/suppl/GSE66157_1M_Matrix_signal_intensities.txt.gz
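
For anyone who wants to check the number, here is a minimal sketch of how a per-individual missing rate of this kind can be computed. It assumes the matrix is tab-delimited with a single header row, and that each sample's genotype column has a header ending in `.GType` with `NC` marking a no-call. The function name, the column suffix, and the toy data are illustrative assumptions, not taken from the actual file, so adjust them to the real header.

```python
import csv
import io


def missing_rate_per_sample(handle, suffix=".GType", no_call="NC", delimiter="\t"):
    """Return {genotype column name: fraction of no-calls} for a text matrix.

    Assumptions (hypothetical, check against the real file): one header row,
    genotype columns whose names end with `suffix`, and `no_call` as the
    missing-genotype code.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    # Map column index -> column name for every genotype column.
    cols = {i: name for i, name in enumerate(header) if name.endswith(suffix)}
    missing = {name: 0 for name in cols.values()}
    total = 0
    for row in reader:
        if not row:
            continue  # skip blank lines
        total += 1
        for i, name in cols.items():
            # A truncated row (column absent) is counted as missing too.
            if i >= len(row) or row[i].strip() == no_call:
                missing[name] += 1
    if total == 0:
        return {name: 0.0 for name in missing}
    return {name: count / total for name, count in missing.items()}


# Toy data standing in for the real matrix (made up for illustration).
demo = io.StringIO(
    "ID_REF\tS1.GType\tS1.Score\tS2.GType\tS2.Score\n"
    "rs1\tAA\t0.9\tNC\t0.1\n"
    "rs2\tNC\t0.2\tNC\t0.1\n"
    "rs3\tAB\t0.8\tBB\t0.9\n"
    "rs4\tBB\t0.9\tNC\t0.0\n"
)
rates = missing_rate_per_sample(demo)
for sample, rate in sorted(rates.items()):
    print(f"{sample}\t{rate:.2%}")
```

For the downloaded file, the same function could be fed a handle from `gzip.open(path, "rt")` instead of the in-memory toy data.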
