recode SNPs ATGC illumina
5.8 years ago
ami23.sarr • 0

Hello, I have a SNP genotyping file whose genotypes are in ATGC format, and I want to convert them to 0, 1, 2 (genotypes must be coded as the dosage of allele 'b': 0, 1, 2). Your help is welcome.

Genotype calls:

  • 0: A1A1
  • 1: A1A2 or A2A1
  • 2: A2A2
  • 5: missing

I use PLINK and my file is in .csv format. Is it possible to load .csv or .txt files into PLINK?

snp plink • 1.9k views

How to add images to a Biostars post

It would be helpful to show an example of the data you have.


I have shown how to use imgbb in that post. Please follow the guide carefully.

  1. Please do not add answers unless you're answering the top level question.
  2. Edit your post and make this image URL change there.

The 012 encoding is based on the tabulation of the number of minor alleles. Do you know which allele is the minor allele for each of your variants?
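
The tabulation itself can be sketched in a few lines of Python. This is only a minimal sketch under assumptions: the function name `recode_genotype` is hypothetical, each genotype is assumed to be a two-letter string such as "AG", and the code 5 for a missing call is taken from the coding scheme in the question. Whichever allele is passed as `a2` is the one that gets counted.

```python
def recode_genotype(genotype, a1, a2, missing=5):
    """Return the number of copies of allele a2 (0, 1 or 2).

    Any call that is not made of exactly two alleles drawn from
    a1 and a2 (for example "--" or an empty string) is returned
    as the missing code.
    """
    a1, a2 = a1.upper(), a2.upper()
    alleles = genotype.strip().upper()
    if len(alleles) != 2 or any(base not in (a1, a2) for base in alleles):
        return missing
    return alleles.count(a2)
```

For example, with `a1="A"` and `a2="G"`, the calls "AG" and "GA" both count one copy of G, and "--" is treated as missing.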


Allele 1 is the minor allele and allele 2 is the major allele. If you also know how to load a CSV file into PLINK, I would be grateful. Thanks.

5.8 years ago

Okay, that's a good start. You appear to have enough information to create a PLINK object.

PLINK will not directly accept a CSV file, though. You will have to reformat your data to create a PED and MAP file, and then you will be able to create a PLINK object. Have you taken a look here:

To assist you, your PED file should begin with the following six columns, followed by the genotype columns (two alleles per variant). Do not include these names as a header; the file should be 'headerless':

FID IID PID MID SEX PHENO
  • FID, family ID
  • IID, individual (sample) ID
  • PID, paternal ID
  • MID, maternal ID
  • SEX, gender (1, male; 2, female)
  • PHENO, phenotype (1, control; 2, case)

The information in this file can be space- or tab-delimited.
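
As a rough illustration of that layout, one PED line per sample could be assembled like this in Python. It is a sketch under assumptions, not a tested converter for your file: the function name `make_ped_line` is hypothetical, genotypes are assumed to arrive as two-letter strings such as "AG", and unknown values use PLINK's usual placeholders (0 for parents and sex, -9 for phenotype, "0 0" for a missing genotype). Reusing the sample ID as the family ID is also just a choice made here.

```python
def make_ped_line(iid, genotypes, fid=None, pid="0", mid="0",
                  sex="0", pheno="-9"):
    """Build one tab-delimited PED line for a single sample."""
    # Six leading columns: FID IID PID MID SEX PHENO
    fields = [fid if fid is not None else iid, iid, pid, mid, sex, pheno]
    for call in genotypes:
        call = call.strip().upper()
        if len(call) == 2 and all(base in "ACGT" for base in call):
            # Each variant contributes two allele columns
            fields.extend([call[0], call[1]])
        else:
            # Anything else is written as a missing genotype
            fields.extend(["0", "0"])
    return "\t".join(fields)
```

The genotypes must be listed in the same order as the variants in the MAP file.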

Your MAP file should have:

  • Chromosome code (can be anything...)
  • Variant identifier (e.g. SNP rs ID)
  • Position in morgans or centimorgans (can just leave as 0)
  • Base-pair coordinate

This can also be space- or tab-delimited.
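
A small Python sketch of writing those four columns follows. The function name `make_map_lines` and the input shape (a list of chromosome, variant ID and base-pair tuples) are assumptions for illustration; the genetic distance column is simply left as 0.

```python
def make_map_lines(variants):
    """Turn (chromosome, variant_id, base_pair) tuples into MAP lines."""
    lines = []
    for chrom, snp_id, bp in variants:
        # Columns: chromosome, variant ID, genetic distance (0), bp position
        lines.append(f"{chrom}\t{snp_id}\t0\t{int(bp)}")
    return lines
```

Each returned string is one line of the MAP file, and the lines must appear in the same order as the genotype columns of the PED file.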

If you then have both of these files, you can create a PLINK object with:

plink --ped MyData.tsv --map MyDataMap.tsv --make-bed --out PlinkDataSet

If you only have the IID, then you can add the following parameters to the command above, choosing one or more of them depending on what information you have:

--no-fid --no-parents --no-sex --no-pheno
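
If you drive PLINK from a script, the choice of these flags can be made explicit. The helper below is a hypothetical sketch (the name `build_plink_args` and its parameters are not part of PLINK): it only assembles an argument list from the flags mentioned above plus `--make-bed` and `--out`, and it assumes the corresponding columns are really absent from the PED file whenever a `--no-*` flag is added.

```python
def build_plink_args(ped, map_file, out, has_fid=True, has_parents=True,
                     has_sex=True, has_pheno=True):
    """Assemble a PLINK argument list for a PED/MAP pair."""
    args = ["plink", "--ped", ped, "--map", map_file]
    # Add a --no-* flag for every column group missing from the PED file
    if not has_fid:
        args.append("--no-fid")
    if not has_parents:
        args.append("--no-parents")
    if not has_sex:
        args.append("--no-sex")
    if not has_pheno:
        args.append("--no-pheno")
    args += ["--make-bed", "--out", out]
    return args
```

The returned list can be passed to `subprocess.run(args, check=True)`, assuming `plink` is on your PATH.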

Kevin


Hello Kevin, I cannot create the PED and MAP files. I spent all of Sunday on it without making any progress. Could I send you one of my files by email so that you can help me? I will manage the rest. I am really stuck.


Okay, please feel free to join the Biostars Slack channel: biostar.slack.com: Chat for the biostars community

I am usually active there each day, and you could send me a private message.


I sent an email to your Outlook account from my student account. Thanks.


I was able to create the PLINK dataset, and I have sent the information to you.

  • lfile.lfile was produced from TYP_CRG.csv
  • Map.map was produced from Manifest_ext_5+v3.xlsx
  • FranceWorldCup.fam was produced from lfile.lfile using just sample IDs

Missing genotypes are encoded as '0'

The final command to create the plink dataset was:

/Programs/plink1.90/plink --lfile FranceWorldCup --lgen lfile.lfile --map Map.map --cow --fam FranceWorldCup.fam
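
For reference, the long-format file given to --lgen has one line per sample and variant: family ID, sample ID, variant ID, and the two alleles. The Python sketch below shows the idea only. The layout of TYP_CRG.csv is not shown in this thread, so the input shape (a dict of variant ID to two-letter genotype per sample) and the function name `make_lgen_lines` are assumptions; the family ID is set to 0 and missing calls are written as '0', as described above.

```python
def make_lgen_lines(sample_id, calls, fid="0"):
    """Build long-format genotype lines for one sample.

    calls maps variant ID -> two-letter genotype string, e.g. "AG".
    """
    lines = []
    for snp_id, call in calls.items():
        call = call.strip().upper()
        if len(call) == 2 and all(base in "ACGT" for base in call):
            allele1, allele2 = call[0], call[1]
        else:
            # Missing genotypes are encoded as '0'
            allele1, allele2 = "0", "0"
        lines.append(f"{fid} {sample_id} {snp_id} {allele1} {allele2}")
    return lines
```

The family and sample IDs written here have to match those in the FAM file.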

Your next task should be to populate the FAM file. The columns are:

  • Family ID (can be 0 for everything)
  • Sample ID ('Individual ID')
  • Paternal ID
  • Maternal ID
  • SEX / Gender (1, male; 2, female)
  • Phenotype (1, control; 2, case)
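
If you would rather fill in the FAM file with a script than by hand, a line can be built as in this Python sketch. The function name `make_fam_line` is hypothetical, and the defaults are assumptions based on PLINK's usual placeholders for unknown values (0 for family, parents and sex, -9 for phenotype).

```python
def make_fam_line(iid, fid="0", pid="0", mid="0", sex="0", pheno="-9"):
    """Build one space-delimited FAM line for a single sample."""
    # Columns: family ID, sample ID, paternal ID, maternal ID, sex, phenotype
    return " ".join([fid, iid, pid, mid, sex, pheno])
```

For example, `make_fam_line("S2", sex="2", pheno="1")` describes a female control whose parents are unknown.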

Once you have edited your FAM file, you can re-run the PLINK command above so that the sample information is linked to your data for all future commands. If you populate the FAM file only after creating the PLINK objects, you will have to specify it manually in every future PLINK command with --fam FranceWorldCup.fam
