14 months ago
vasank1958
I am trying to convert an Illumina final report generated from GSA3.0 (Build 38) into 23andMe format.
The Illumina GSA3.0 report has the following columns:
['SNP Name',
'Sample ID',
'Allele1 - Top',
'Allele2 - Top',
'GC Score',
'Sample Name',
'Sample Group',
'Sample Index',
'SNP Index',
'SNP Aux',
'Allele1 - Forward',
'Allele2 - Forward',
'Allele1 - Design',
'Allele2 - Design',
'Allele1 - AB',
'Allele2 - AB',
'Allele1 - Plus',
'Allele2 - Plus',
'Chr',
'Position',
'GT Score',
'Cluster Sep',
'SNP',
'ILMN Strand',
'Customer Strand',
'Top Genomic Sequence',
'Plus/Minus Strand',
'Theta',
'R',
'X',
'Y',
'X Raw',
'Y Raw',
'B Allele Freq',
'Log R Ratio',
'CNV Value',
'CNV Confidence']
I have managed to map most of the data to the 23andMe format as follows:
genotype -> 'Allele1 - Plus' and 'Allele2 - Plus' combined
chromosome -> 'Chr'
position -> 'Position', converted from hg38 to hg19 with pyliftover
rsid -> ? (this is where I am stuck)
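A minimal sketch of the three mappings above that already work, in plain Python. The function names `combine_genotype` and `to_23andme_fields` are hypothetical, and `lift` is an assumed callable that takes (chromosome, hg38 position) and returns the hg19 position or None; it stands in for the pyliftover step so the sketch runs without any download.

```python
def combine_genotype(allele1, allele2):
    """Join the two plus-strand alleles into one genotype string."""
    return f"{allele1.strip()}{allele2.strip()}"


def to_23andme_fields(row, lift):
    """Map one final-report row (a dict keyed by column name) to
    (chromosome, hg19 position, genotype).

    `lift` is a hypothetical callable: (chrom, hg38_pos) -> hg19_pos or None.
    Returns None when the position cannot be lifted over.
    """
    chrom = row["Chr"].strip()
    pos38 = int(row["Position"])
    pos19 = lift(chrom, pos38)
    if pos19 is None:
        return None
    genotype = combine_genotype(row["Allele1 - Plus"], row["Allele2 - Plus"])
    return chrom, pos19, genotype


# Toy example: a fake liftover that just shifts every position by 100.
example_row = {
    "Chr": "1",
    "Position": "1000",
    "Allele1 - Plus": "A",
    "Allele2 - Plus": "G",
}
fields = to_23andme_fields(example_row, lambda chrom, pos: pos - 100)
print(fields)  # ('1', 900, 'AG')
```

With pyliftover, `lift` could wrap `LiftOver('hg38', 'hg19').convert_coordinate('chr' + chrom, pos - 1)`. As far as I understand, pyliftover works with 0-based coordinates, so 1 is subtracted before the call and added back to the result, and the call can return None or an empty list when a position does not map.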
There is a file that maps Illumina SNP names to rsIDs; please refer to "Loci Name to Rsid".
The problem is that some SNP names have more than one rsID, and some SNP names have '-'. How should these cases be handled? Can anyone please advise?
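To make the two problem cases concrete, here is one possible policy as a sketch, not a recommendation. It assumes that multiple rsIDs in the mapping file are comma-separated and that a missing rsID appears as a placeholder such as '-' or '.'; both are assumptions about the file layout. The function name `choose_rsid` is hypothetical. When several rsIDs are listed it picks the lowest-numbered one as an arbitrary but deterministic choice, and when none is valid it falls back to the Illumina SNP name so the row is not lost.

```python
import re

# A well-formed rsID: "rs" followed by digits only.
RSID_PATTERN = re.compile(r"^rs\d+$")


def choose_rsid(snp_name, raw_rsid, separator=","):
    """Pick a single identifier for one SNP.

    snp_name  -- the Illumina SNP name from the final report
    raw_rsid  -- the rsID field from the mapping file; may hold several
                 rsIDs joined by `separator`, or a placeholder like '-'
    """
    candidates = [part.strip() for part in raw_rsid.split(separator)]
    valid = [c for c in candidates if RSID_PATTERN.match(c)]
    if not valid:
        # No usable rsID: keep the Illumina name as the identifier.
        return snp_name
    # Several rsIDs: take the one with the smallest number.
    return min(valid, key=lambda rsid: int(rsid[2:]))


print(choose_rsid("GSA-rs200", "rs200,rs100"))  # rs100
print(choose_rsid("1:12345-A-G", "-"))          # 1:12345-A-G
```

Whether dropping such SNPs, keeping the Illumina name, or writing one row per rsID is the right choice depends on what the downstream tool expects, which is the part I am unsure about.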