Question: recode SNPs ATGC illumina
ami23.sarr wrote, 2.7 years ago:

Hello, I have a SNP typing file whose genotypes are in ATGC format, and I want to convert them to 0, 1, 2 (genotypes must be coded as dosage of allele 'b': 0, 1, 2). Your help is welcome.

Genotype calls:
0: A1A1
1: A1A2 or A2A1
2: A2A2
5: missing

I use PLINK and my file is in .csv format. Is it possible to load files in .csv or .txt format into PLINK?

snp plink • 1.0k views
modified 2.6 years ago by Kevin Blighe • written 2.7 years ago by ami23.sarr

How to add images to a Biostars post

It would be helpful to show an example of the data you have.

written 2.7 years ago by WouterDeCoster

[image not shown]

modified 2.6 years ago • written 2.6 years ago by ami23.sarr

I have shown how to use imgbb in that post. Please follow the guide carefully.

written 2.6 years ago by Ram

[image not shown]

written 2.6 years ago by ami23.sarr
  1. Please do not add answers unless you're answering the top level question.
  2. Edit your post and make this image URL change there.
written 2.6 years ago by Ram

The 012 encoding is based on a count of the number of minor alleles in each genotype. Do you know which is the minor allele for each of your variants?
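As a rough illustration of that counting, here is a minimal Python sketch. The function name `count_allele`, the two-letter genotype strings, and the use of 5 as the missing code (taken from the question) are assumptions for illustration; this is not something PLINK itself provides.

```python
def count_allele(genotype, counted_allele, missing_code=5):
    """Return how many copies of counted_allele a two-letter genotype carries."""
    calls = genotype.strip().upper()
    # Anything that is not exactly two A/C/G/T letters is treated as missing.
    if len(calls) != 2 or any(base not in "ACGT" for base in calls):
        return missing_code
    return sum(1 for base in calls if base == counted_allele.upper())


# Hypothetical SNP with alleles A and G, counting copies of G.
print([count_allele(g, "G") for g in ["AA", "AG", "GA", "GG", "--"]])
```

Which allele to pass as `counted_allele` depends on the convention wanted: the minor allele for a minor-allele count, or allele 2 for the A1A1 = 0, A2A2 = 2 coding described in the question.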

written 2.6 years ago by Kevin Blighe

Allele 1 is the minor allele and allele 2 is the major allele. Also, if you know how to load a CSV file into PLINK, I would be grateful. Thanks.

written 2.6 years ago by ami23.sarr
Kevin Blighe (Republic of Ireland) wrote, 2.6 years ago:

Okay, that's a good start. You appear to have enough information to create a PLINK object.

PLINK will not directly accept a CSV file, though. You will have to reformat your data to create a PED and MAP file, and then you will be able to create a PLINK object. Have you taken a look here:

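To assist you, your PED file should begin with the following six columns, followed by the genotype columns (two alleles per variant, in the same order as the variants in the MAP file). Do not include a header; the file should be 'headerless':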

FID IID PID MID SEX PHENO
  • FID, family ID
  • IID, individual (sample) ID
  • PID, paternal ID
  • MID, maternal ID
  • SEX, gender (1, male; 2, female)
  • PHENO, phenotype (1, control; 2, case)

The information in this file can be space- or tab-delimited.
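To make the PED layout concrete, here is a hedged Python sketch that builds one headerless, tab-delimited PED line. The function name `ped_row`, the sample ID, and the two-letter genotype strings are hypothetical; the defaults use '0' for unknown FID, parents and sex, '-9' for a missing phenotype, and '0 0' for a missing genotype.

```python
def ped_row(sample_id, genotypes, fid="0", pid="0", mid="0", sex="0", pheno="-9"):
    """Build one PED line: six leading columns, then two alleles per variant."""
    fields = [fid, sample_id, pid, mid, sex, pheno]
    for call in genotypes:
        call = call.strip().upper()
        if len(call) == 2 and all(base in "ACGT" for base in call):
            fields.extend(call)          # "AG" becomes the two columns "A" and "G"
        else:
            fields.extend(["0", "0"])    # missing genotype
    return "\t".join(fields)


# Hypothetical sample with three variants, the last one uncalled.
print(ped_row("sample1", ["AG", "GG", "--"]))
```

One such line per sample, with genotypes in the same order as the variants in the MAP file, would make up the PED file.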

Your MAP file should have:

  • Chromosome code (can be anything...)
  • Variant identifier (e.g. SNP rs ID)
  • Position in morgans or centimorgans (can just leave as 0)
  • Base-pair coordinate

This can also be space- or tab-delimited.
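The four MAP columns could be written out along these lines. This is only a sketch: the `manifest` rows, SNP names, and coordinates are made up for illustration, and the genetic position is left as 0 as suggested above.

```python
def map_line(chrom, snp_id, bp, cm="0"):
    """Build one headerless MAP line: chromosome, variant ID, genetic position, base-pair coordinate."""
    return "\t".join([str(chrom), snp_id, str(cm), str(bp)])


# Hypothetical manifest rows: (chromosome, SNP ID, base-pair coordinate).
manifest = [("1", "snp_a", 12345), ("2", "snp_b", 67890)]
map_text = "\n".join(map_line(c, s, b) for c, s, b in manifest) + "\n"
print(map_text, end="")
```

The variants should appear in the same order as the genotype columns of the PED file.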

If you then have both of these files, you can create a PLINK object with:

plink --file PlinkDataSet --ped MyData.tsv --map MyDataMap.tsv

If you only have the IID, then you can specify the following parameters along with the command (above), or choose one or more of these depending on what information you have:

--no-fid --no-parents --no-sex --no-pheno

Kevin

written 2.6 years ago by Kevin Blighe

Hello Kevin, I cannot create the PED and MAP files. I spent all of Sunday on it without making progress. If I send you one of my files by email, could you help me? I will manage the rest. I am really stuck.

written 2.6 years ago by ami23.sarr

Okay, please feel free to join the Biostars Slack channel: biostar.slack.com: Chat for the biostars community

I am usually active there each day, and you could send me a private message.

modified 2.6 years ago • written 2.6 years ago by Kevin Blighe

I sent an email to your Outlook account from my student account. Thanks.

written 2.6 years ago by ami23.sarr

I was able to create the PLINK dataset, and I have sent the information to you.

  • lfile.lfile was produced from TYP_CRG.csv
  • Map.map was produced from Manifest_ext_5+v3.xlsx
  • FranceWorldCup.fam was produced from lfile.lfile using just sample IDs

Missing genotypes are encoded as '0'
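For readers with a similar CSV, the sketch below shows one possible way to produce a long-format (LGEN) file, which has one line per genotype call: family ID, sample ID, SNP ID, allele 1, allele 2. It is not necessarily how the file above was made. The wide layout (one row per sample, one column per SNP, sample ID in the first column) and the names in `example` are assumptions.

```python
import csv
import io


def wide_csv_to_lgen(csv_text, fid="0"):
    """Turn a wide genotype table into LGEN lines (FID IID SNP allele1 allele2)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    snp_ids = header[1:]                 # first column is assumed to hold the sample ID
    lines = []
    for row in reader:
        sample_id, calls = row[0], row[1:]
        for snp_id, call in zip(snp_ids, calls):
            call = call.strip().upper()
            if len(call) == 2 and all(base in "ACGT" for base in call):
                a1, a2 = call
            else:
                a1, a2 = "0", "0"        # missing genotype, encoded as '0'
            lines.append(" ".join([fid, sample_id, snp_id, a1, a2]))
    return lines


# Hypothetical two-sample, two-SNP table.
example = "sample,snp_a,snp_b\ns1,AG,GG\ns2,--,AA\n"
for line in wide_csv_to_lgen(example):
    print(line)
```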

The final command to create the plink dataset was:

/Programs/plink1.90/plink --lfile FranceWorldCup --lgen lfile.lfile --map Map.map --cow --fam FranceWorldCup.fam

Your next task should be to populate the FAM file. The columns are:

  • Family ID (can be 0 for everything)
  • Sample ID ('Individual ID')
  • Paternal ID
  • Maternal ID
  • SEX / Gender (1, male; 2, female)
  • Phenotype (1, control; 2, case)
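A FAM file built from sample IDs alone might be sketched as follows. The function name `fam_line` and the sample IDs are hypothetical; the defaults use '0' for unknown family, parents and sex, and '-9' for a missing phenotype, to be replaced with real values once they are known.

```python
def fam_line(sample_id, fid="0", pid="0", mid="0", sex="0", pheno="-9"):
    """Build one FAM line: FID, sample ID, paternal ID, maternal ID, sex, phenotype."""
    return " ".join([fid, sample_id, pid, mid, sex, pheno])


# Hypothetical sample IDs; the second has a known sex and phenotype.
print(fam_line("s1"))
print(fam_line("s2", sex="2", pheno="1"))
```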

When you edit / create your FAM file, you can then re-run the PLINK command (above) so that it will be linked to your data for all future commands. If you populate it after you create the PLINK objects, then, in all future PLINK commands, you will have to specify the FAM manually with --fam FranceWorldCup.fam

written 2.6 years ago by Kevin Blighe