Question: What does SNP mean in data from an Illumina MetaboChip final report?

I am analyzing some data from 2015 and the original postdoc has moved on. The file is named "072312_FinalReport.txt" and has ~197k SNP measurements for 60 people. I have figured out the meaning of all of the columns, except for the last column.

Can someone explain what "SNP" in the last column ([T/G] et cetera) means? I can't find documentation for what this file format is (the "txt" file extension is not very helpful). Thank you!

      SNP Name Sample ID Allele1 - Top Allele2 - Top GC Score   SNP
chr1:109457160         2             C             C   0.8609 [T/G]
chr1:109457233         2             C             C   0.7725 [T/G]
chr1:109457614         2             -             -   0.0000 [T/C]
chr1:109457618         2             A             A   0.5787 [T/C]

UPDATE: In case a bigger example would be helpful, here are 30 random rows:

       SNP Name Sample ID Allele1 - Top Allele2 - Top GC Score   SNP
     rs10838158        11             A             A   0.9435 [A/G]
 chr1:170643885        83             A             G   0.8454 [A/G]
     rs17868322        13             G             G   0.8427 [T/C]
      rs1983076         8             G             G   0.8717 [C/G]
 chr17:65847329        83             G             G   0.9091 [T/C]
     rs12123144        59             G             G   0.8699 [T/C]
  chr6:43861303         8             G             G   0.8048 [T/C]
 chr10:44140519        40             A             A   0.9412 [T/C]
  chr7:71808639         5             G             G   0.6323 [G/C]
 chr15_60123726        77             C             C   0.5374 [T/G]
      rs4821651        98             A             A   0.8674 [A/G]
 chr16:19623531        83             A             A   0.7984 [A/G]
      rs4856138        62             A             A   0.8110 [A/G]
      rs1918761        24             G             G   0.9187 [A/G]
     rs13074345        80             G             G   0.9386 [T/C]
  chr1:63006481         3             -             -   0.0000 [T/C]
     rs12349196        74             G             G   0.9474 [A/G]
chr11:100142349        91             A             A   0.9136 [T/C]
 chr6:134235603        98             A             A   0.9410 [T/C]
 chr7:150279883         2             G             G   0.8980 [T/C]
  chr3:12220516        43             C             C   0.4148 [T/G]
 chr1:109619829       102             -             -   0.0000 [T/G]
 chr5:156402386        34             G             G   0.9539 [A/G]
 chr11:47962357        11             G             G   0.9013 [A/G]
 chr11:46698484        96             A             C   0.9002 [T/G]
       rs217454        65             A             G   0.8505 [T/C]
 chr11:27530273        77             A             A   0.8543 [T/C]
     rs13201824        37             C             C   0.7432 [T/G]
 chr6:118655523        36             A             A   0.9507 [A/G]
      rs7791822        13             A             A   0.9600 [T/C]


I heard back from Illumina:

"You will need the manifest file in order to determine the SNP. The manifest describes the SNP or probe content on a BeadChip.

Since the data that you have is from an older array the manifest is not currently listed online. In the link below I have sent the manifest file to you."

So the "SNP" column describes the probe content: it lists the alleles each probe was designed to detect, which can be checked against the "TopGenomicSeq" column in the cardio-metabo_chip_11395247_a.csv manifest file provided by Illumina. Thanks!
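
In case it helps someone with a similar file, here is a rough Python sketch of that lookup: it reads the manifest and attaches "TopGenomicSeq" to each final-report row by SNP name. Only the "TopGenomicSeq" column and the report column names come from the files discussed above. The `[Assay]` / `[Controls]` section markers and the `Name` column are assumptions about how the manifest CSV is laid out, the function names are made up for illustration, and the manifest text below is a toy stand-in with a placeholder sequence, not real manifest content. It also assumes the report is tab-delimited with any header block already removed.

```python
import csv
import io


def read_manifest_assays(handle):
    """Return {Name: row} for rows in the (assumed) [Assay] section of a manifest CSV."""
    assays = {}
    header = None
    in_assay = False
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        first = row[0].strip()
        if first.startswith("["):
            # A section marker such as [Heading], [Assay] or [Controls].
            in_assay = first == "[Assay]"
            header = None
            continue
        if not in_assay:
            continue
        if header is None:
            header = row  # first line inside [Assay] holds the column names
            continue
        record = dict(zip(header, row))
        assays[record["Name"]] = record
    return assays


def annotate_report(report_handle, assays):
    """Yield final-report rows with the manifest's TopGenomicSeq attached."""
    for row in csv.DictReader(report_handle, delimiter="\t"):
        assay = assays.get(row["SNP Name"])  # None if the SNP is not in the manifest
        row["TopGenomicSeq"] = assay["TopGenomicSeq"] if assay else None
        yield row


# Toy manifest: layout is an assumption, sequence is a placeholder.
manifest_text = (
    "Illumina, Inc.\n"
    "[Heading]\n"
    "Descriptor File Name,placeholder.bpm\n"
    "[Assay]\n"
    "IlmnID,Name,SNP,TopGenomicSeq\n"
    "placeholder-id,chr1:109457160,[T/G],PLACEHOLDER_SEQ\n"
    "[Controls]\n"
    "placeholder control row\n"
)

# Two rows copied from the final report excerpt above.
report_text = (
    "SNP Name\tSample ID\tAllele1 - Top\tAllele2 - Top\tGC Score\tSNP\n"
    "chr1:109457160\t2\tC\tC\t0.8609\t[T/G]\n"
    "chr1:109457233\t2\tC\tC\t0.7725\t[T/G]\n"
)

assays = read_manifest_assays(io.StringIO(manifest_text))
annotated = list(annotate_report(io.StringIO(report_text), assays))
for row in annotated:
    print(row["SNP Name"], row["SNP"], row["TopGenomicSeq"])
```

For real files, the two `io.StringIO(...)` objects would be replaced with `open(...)` handles on the manifest and the final report. Rows whose SNP name is missing from the manifest get `None` rather than raising an error, so gaps are easy to spot.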

6 months ago
wrote:

The last column gives the reference and alternative alleles. For example, for the first SNP (chr1:109457160) see:

