Question: Convert Genotype SNP Matrix Into Plink Format, allelic form
Kian40 wrote:

I have a genotype matrix (about 3,000 animals, with 50,000 SNPs in columns), coded as 0/1/2 or NA. I want to convert it into PLINK's allelic format, for example 0 to 0 0, 1 to 1 1, and 2 to 2 2, so that I can run quality control on my data in PLINK. What's the best way to do this in R?

snp plink allelic format R • 278 views
modified 3 months ago • written 3 months ago by Kian40

What data format do you have: dosage, PED, or something else?

written 3 months ago by Bioinformatics_NewComer300

Thanks for the response. I have a PED file, but with codes 0, 1, and 2. PLINK needs genotypes in allelic format, so for example 0 should be 0 0, 1 should be 1 1, and 2 should be 2 2. I don't have access to the allelic format, so my question is: how can I prepare this PED file?

written 3 months ago by Kian40
University College London Cancer Institute
Kevin Blighe21k wrote:

Why should 1 be 11 and 2 be 22? You currently have the data in 012 format, which relates to:

  • 0 (zero) minor alleles (ref)
  • 1 minor allele (het)
  • 2 minor alleles (hom)

To produce PLINK data in 012 format, you first have to recode it using the 012 flag (see HERE), i.e., within Plink itself. So, from where did you get the file? You (or the source from where you got it) should already have the data in the format that you require.
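As a rough illustration of what the 012 coding means (this is not PLINK's own code), the sketch below counts copies of the minor allele in a two-allele genotype. The function name `additive_code` and the choice of "0" as the missing-allele symbol are assumptions made for the example.

```python
from typing import Optional


def additive_code(allele1: str, allele2: str, minor: str,
                  missing: str = "0") -> Optional[int]:
    """Count copies of the minor allele in one genotype.

    Returns 0, 1 or 2, or None when either allele is missing.
    `missing` is assumed to be "0" here; adjust it to match your file.
    """
    if allele1 == missing or allele2 == missing:
        return None
    # Each comparison is a bool; adding two bools gives an int in 0..2.
    return (allele1 == minor) + (allele2 == minor)


# With G as the minor allele and A as the major allele:
print(additive_code("A", "A", "G"))  # 0
print(additive_code("A", "G", "G"))  # 1
print(additive_code("G", "G", "G"))  # 2
```

Note that the identity of the two alleles is lost in this step: only the count survives.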

In PLINK PED format, genotypes can be encoded numerically or as characters, as follows:

  • A=1
  • C=2
  • G=3
  • T=4
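The letter-to-number correspondence above can be written as a small lookup. This is only a sketch: the names `ALLELE_TO_NUMERIC` and `to_numeric_alleles` are made up for the example, and it assumes alleles are separated by whitespace and that any other symbol (such as a missing-allele code) should pass through unchanged.

```python
# Letter-to-number allele codes, as listed above.
ALLELE_TO_NUMERIC = {"A": "1", "C": "2", "G": "3", "T": "4"}


def to_numeric_alleles(genotypes: str) -> str:
    """Translate whitespace-separated letter alleles to numeric codes.

    Symbols not in the table (e.g. a missing-allele code) are kept as-is;
    that pass-through behaviour is an assumption of this sketch.
    """
    return " ".join(ALLELE_TO_NUMERIC.get(a, a) for a in genotypes.split())


print(to_numeric_alleles("A G T T 0 0"))  # 1 3 4 4 0 0
```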

-----------------------------

So, as you can see, in order to connect the 012 format to the original PED format, you need mapping information in order to understand which allele (ACGT or 1234) was the minor allele and which was the major. Without that mapping, you cannot convert back. You need that extra information.
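To make the point concrete, here is a minimal sketch of the reverse conversion, which only works once the minor and major allele of every SNP are known. The function names, the `(minor, major)` tuple layout, and the use of "0" for missing alleles are all assumptions for illustration; the allele pairs in the usage example are invented, not taken from the data in this thread.

```python
from typing import List, Optional, Tuple


def dosage_to_pair(dosage: Optional[int], minor: str, major: str,
                   missing: str = "0") -> Tuple[str, str]:
    """Turn a minor-allele count (0/1/2) into a pair of alleles."""
    if dosage is None:
        return (missing, missing)
    pairs = {0: (major, major), 1: (minor, major), 2: (minor, minor)}
    if dosage not in pairs:
        raise ValueError(f"unexpected genotype code: {dosage!r}")
    return pairs[dosage]


def row_to_ped_genotypes(dosages: List[Optional[int]],
                         alleles: List[Tuple[str, str]]) -> str:
    """Build the genotype part of one PED line.

    `alleles` holds one (minor, major) tuple per SNP: this is the extra
    mapping information that the 012 matrix alone does not contain.
    """
    if len(dosages) != len(alleles):
        raise ValueError("need exactly one (minor, major) pair per SNP")
    out = []
    for dosage, (minor, major) in zip(dosages, alleles):
        out.extend(dosage_to_pair(dosage, minor, major))
    return " ".join(out)


# Hypothetical mapping for four SNPs; None stands for NA in the matrix.
mapping = [("G", "A"), ("T", "C"), ("C", "T"), ("A", "G")]
print(row_to_ped_genotypes([0, 1, 2, None], mapping))  # A A T C C C 0 0
```

The same lookup could be written in R with a per-SNP table of minor and major alleles; without such a table there is nothing to put into `mapping`.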

...of course, as I have already mentioned, 012 format is produced from PED (or BED) in Plink itself. So, either you or your source has the original file that you need.

Kevin

modified 3 months ago • written 3 months ago by Kevin Blighe21k

Thanks, dear Kevin, for the response. This is an example of the file that I have; I think the markers should be converted to the allelic format that PLINK requires.

   id         rs147433  rs146888  rs146888  rs146888  rs146887
1  0200s1           -9         1         2         1         2
2  0200s1005        -9         0         1         2         2
3  0200s1021        -9         1         1         1         0
4  0200s1028        -9         0         1         1         2
5  0200s103         -9         0         1         1         1
written 8 weeks ago by Kian40