Illumina GSA array cM map file errors
7.8 years ago
jamesm2 ▴ 30

Hi all. Using Illumina GenomeStudio 2.0.2 software, specifically the Genotyping 2.0.2 module, I have independently processed data from three different SNP arrays:

  • HumanOmniExpress-24v1.1
  • HumanOmniExpress-24v1.2
  • GSA arrays

Using the PLINK Input Report Plug-in v2.1.4 for the GenomeStudio Genotyping module, I have generated output files from my SNP array data: .bat, .map, .ped, .phenotype, and .script files.

The .map file is the concern here. It contains:

  • Column 1 -> Chromosome
  • Column 2 -> SNP name
  • Column 3 -> Genetic distance (in cM)
  • Column 4 -> Physical position of the SNP (according to hg19, as this is the manifest file that I used when loading my GenomeStudio project)

The problem that I have encountered:

Regardless of the SNP array being used, there is a section of markers on each chromosome where the cM positions suddenly start decreasing. As far as I can tell, when markers are sorted by physical position, the cM values should only increase or stay the same (i.e. if two SNPs are in close proximity).
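A quick way to confirm this kind of problem is to scan the .map file for markers whose cM value drops relative to the previous marker on the same chromosome. The following is only a minimal sketch, assuming the four whitespace-separated columns described above; the function name `find_cm_decreases` and the SNP names in the example are made up for illustration.

```python
def find_cm_decreases(map_lines):
    """Return (chrom, prev_snp, snp, prev_cm, cm) for each marker whose cM
    value is lower than that of the preceding marker on the same chromosome,
    after sorting markers by physical position."""
    by_chrom = {}
    for line in map_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chrom, snp = fields[0], fields[1]
        cm, bp = float(fields[2]), int(fields[3])
        by_chrom.setdefault(chrom, []).append((bp, cm, snp))

    problems = []
    for chrom, markers in by_chrom.items():
        markers.sort(key=lambda m: m[0])  # sort by physical position only
        for (_, cm0, snp0), (_, cm1, snp1) in zip(markers, markers[1:]):
            if cm1 < cm0:
                problems.append((chrom, snp0, snp1, cm0, cm1))
    return problems


# Hypothetical .map content: rsC has a lower cM than rsB despite a higher position.
example = """\
3 rsA 10.0 1000
3 rsB 10.5 2000
3 rsC 9.8 3000
3 rsD 11.0 4000
""".splitlines()

for row in find_cm_decreases(example):
    print(row)
```

For a real file, the lines could be read with `open("yourfile.map")` and passed in directly, since the function accepts any iterable of lines.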

-> This has occurred in output files from independent projects of different array types.

-> I have tried outputting the data again on another occasion, with the same issue occurring.

-> I have output files exactly like these from a HumanOmniExpress-24v1.1 chip that were generated using a PLINK plug-in with the previous version of GenomeStudio, and this issue was not evident. Is there a problem with Illumina's PLINK plug-in?

Has anyone using the Illumina GSA arrays in particular encountered this, or found any clever ways around it? Thanks in advance.

J.

7.8 years ago
jamesm2 ▴ 30

Here is some feedback from Illumina for anyone else who finds this cM issue or wants to run a linkage analysis using the GSA arrays with GenomeStudio and the PLINK plug-in:

Unfortunately, we are unable to provide updates to the PLINK Input Report plug-in in the near term. However, perhaps the customer can modify the supplied genetic distance file, and then change the setting in the file PLINKInputReport.dll.config to point to the updated genetic distance file.

Here's some info on where the genetic distance file comes from. The genetic distance file included in the PLINK Input Report is based on the stsMap data from UCSC Genome Browser tables / deCODE Genetics that was put into the public domain at some point in the past, based on build 36. I made a newer version for build 37, again based on the same method, a number of years ago. It's included with ver 2.1.4 of the PLINK Input Report plug-in.

Here's an old link to the raw genetic map data. It may not be current, but a Google search might help in finding the source data. (Actually, I just re-tested the link, and it still works.)

http://genome.ucsc.edu/cgi-bin/hgTables?hgsid=338840383&clade=mammal&org=&db=hg19&hgta_group=map&hgta_track=stsMap&hgta_table=stsMap&hgta_regionType=genome&position=&hgta_outputType=primaryTable&hgta_outFileName=out.txt

I hope this helps. Please let us know if we can provide additional help. Thank you.

7.8 years ago
jamesm2 ▴ 30

Example of the cM position decreasing while the physical position of the SNP on chr3 increases.

Here is an example of what I am talking about. Has anyone else looked at their .map files and seen this problem at all? Or does anyone have a .map file from their GSA arrays, exported from GenomeStudio with the PLINK plug-in, that they could send me to compare? I've looked at 4 chromosomes so far and the same issue arises on each one.

Cheers


Just in case anyone is following this, I think I have a basic solution. I took the GSA Physical and Genetic Coordinates .txt file from the Illumina website and merged it in R with the PLINK plug-in output, replacing the incorrect cM values in the plug-in's .map file with the correct ones from the .txt file.
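For anyone who wants to reproduce this kind of merge, here is a rough sketch of the idea in Python rather than R. It is not the exact script used above. The column names `Name` and `cM` in the reference file are assumptions (check the header of the Illumina .txt file and adjust), and the function names and example SNPs are hypothetical.

```python
import csv


def load_reference(lines, name_col="Name", cm_col="cM"):
    """Build a SNP name -> cM lookup from a tab-delimited reference file.
    The column names are assumptions; adjust them to the real header."""
    reader = csv.DictReader(lines, delimiter="\t")
    return {row[name_col]: row[cm_col] for row in reader}


def replace_cm(map_lines, cm_by_snp):
    """Return (fixed_lines, missing_snps). Row order is preserved because the
    genotype columns in the .ped file follow the marker order of the .map file."""
    fixed, missing = [], []
    for line in map_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chrom, snp, cm, bp = fields[:4]
        if snp in cm_by_snp:
            cm = cm_by_snp[snp]  # take the cM value from the reference file
        else:
            missing.append(snp)  # keep the original value, but report it
        fixed.append("\t".join([chrom, snp, cm, bp]))
    return fixed, missing


# Hypothetical data for illustration only.
reference = ["Name\tChr\tMapInfo\tcM", "rsA\t3\t1000\t10.0", "rsC\t3\t3000\t10.7"]
map_lines = ["3 rsA 10.0 1000", "3 rsC 9.8 3000", "3 rsX 0 5000"]

fixed, missing = replace_cm(map_lines, load_reference(reference))
print("\n".join(fixed))
print("not in reference:", missing)
```

Markers missing from the reference file are left unchanged and listed separately, so they can be checked by hand rather than silently dropped.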


Hi James, I have been struggling with this for a while now and am glad to see this thread. Could you please share the merged/updated file?


Sure, leave your email address and I'll send it through.


I installed the PLINK plug-in, but it doesn't seem to be in the drop-down menu in the report as suggested on the Illumina resource page, so how do we export these formats?

.bat, a batch file that runs PLINK with a default set of input parameters listed in the script file.

.map, the map file, showing base pair and position for each marker.

.ped, the LINKAGE format input file.

.phenotype, a text file which lists the quantitative trait data for each sample.

.script, a text file which lists the input parameters to be used by the PLINK executable.