Question

GWAS data in .txt

3.6 years ago

Sam ▴ 10

I received some GWAS data as a .txt file, but the layout is one I am unfamiliar with.

SNP Sample_ID Chr Position X Y B_Allele_Freq Log_R_Ratio
200003 05T80 9 139026180 1.047 0.049 0.0009 0.1329
200006 05T80 9 139046223 1.058 0.918 0.5016 0.0155
200047 05T80 2 219793146 0.577 0.074 0.0132 0.2509
200050 05T80 2 219797929 0.009 1.414 1.0000 0.2460
200052 05T80 2 219783037 0.000 0.980 1.0000 0.0843

Can anyone tell me what this format is, and how I should go about converting it to .map and .ped format?

Thanks

SNP • 756 views

ADD COMMENT • link updated 13 months ago by Ram 43k • written 3.6 years ago by Sam ▴ 10

score 0 · Answer 1 · 2020-10-02

3.6 years ago

Kevin Blighe 87k

This format is, to me, 'no format', and there is no way to convert it to MAP or PED format in its current form. Why not ask the person who gave this to you about the origin of the data and for an explanation of it? The days when a bioinformatician is handed some data and told to make magic from it should be long in the past.

Some things worth asking about:

program / script used to produce this data
information on the sample cohort
genome build used
desired analysis to perform

Kevin

ADD COMMENT • link 3.6 years ago by Kevin Blighe 87k

That's what I was afraid of. I was expecting to receive the IDAT files, but I got this instead. Apparently the IDAT files are long gone.

ADD REPLY • link 3.6 years ago by Sam ▴ 10

I see, so they just deleted the files? To even start to do anything here, you would need to know:

the reference base at each position
the Illumina array type and version used
what are X and Y?

Even with all of this, some reverse engineering would be needed. The data that you have presented are basically signal data of the kind prepared for, for example, copy number profiling, so they do not say anything directly about the underlying genotypes.
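For what it is worth, the positional columns are the one part that lines up with PLINK's MAP layout (chromosome, variant ID, genetic distance, base-pair position) without any reverse engineering. The following is only a minimal sketch under assumptions: the `SNP`, `Chr` and `Position` columns are taken at face value, the genome build is still unknown, the genetic distance is written as 0 because it is not in the file, and the helper names `read_rows` and `map_lines` are made up for illustration. It does nothing for the PED side, which needs genotypes.

```python
import io

# Sample rows copied from the question; a real file would be opened with open(path).
SAMPLE = """\
SNP Sample_ID Chr Position X Y B_Allele_Freq Log_R_Ratio
200003 05T80 9 139026180 1.047 0.049 0.0009 0.1329
200006 05T80 9 139046223 1.058 0.918 0.5016 0.0155
200047 05T80 2 219793146 0.577 0.074 0.0132 0.2509
"""


def read_rows(handle):
    """Yield one dict per data row, keyed by the header names."""
    header = handle.readline().split()
    for line in handle:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        yield dict(zip(header, fields))


def map_lines(rows):
    """Build MAP-style lines: chromosome, variant ID, genetic distance, bp position."""
    seen = set()
    out = []
    for row in rows:
        snp = row["SNP"]
        if snp in seen:  # a SNP may repeat once per sample; keep the first
            continue
        seen.add(snp)
        # Genetic distance is not available here, so 0 is written (assumption).
        out.append(f'{row["Chr"]}\t{snp}\t0\t{row["Position"]}')
    return out


lines = map_lines(read_rows(io.StringIO(SAMPLE)))
print("\n".join(lines))
```

The positions are only meaningful once the genome build is confirmed, so the output should be treated as provisional until then.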

ADD REPLY • link 3.6 years ago by Kevin Blighe 87k

This data was generated a long time ago, and the IDAT files were probably deleted to make space for something else. I cannot ask anyone, because the people who handled this data have long since left the lab.

I know which chip was used and the version number. If I were to hazard a guess, X and Y refer to the intensities for allele X and allele Y. I was hoping I could use this data to call genotypes for each sample. It seems to be a job for GenomeStudio, but that does not take .txt files.
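One rough way to get at genotype classes without the IDAT files might be to lean on the B_Allele_Freq column, since values near 0, 0.5 and 1 conventionally correspond to AA, AB and BB on Illumina arrays. The sketch below is an assumption-heavy illustration, not a replacement for GenomeStudio clustering: the 0.25 and 0.75 cut-offs and the function name `call_from_baf` are hypothetical, copy-number changes would shift the values, and the A/B labels would still need to be translated to nucleotides with the array manifest before anything could go into a PED file.

```python
def call_from_baf(baf, low=0.25, high=0.75):
    """Return 'AA', 'AB' or 'BB' for a B allele frequency, or None if invalid.

    The low/high thresholds are hypothetical cut-offs chosen for illustration.
    """
    if not 0.0 <= baf <= 1.0:
        return None  # treat out-of-range values as no-calls
    if baf < low:
        return "AA"
    if baf > high:
        return "BB"
    return "AB"


# (SNP, B_Allele_Freq) pairs copied from the rows in the question.
rows = [
    ("200003", 0.0009),
    ("200006", 0.5016),
    ("200047", 0.0132),
    ("200050", 1.0000),
    ("200052", 1.0000),
]

calls = {snp: call_from_baf(baf) for snp, baf in rows}
print(calls)
```

Checking the calls against a few samples of known genotype, if any exist, would show whether fixed thresholds like these are good enough for this chip.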

ADD REPLY • link 3.6 years ago by Sam ▴ 10