Question: convert the file to a .ped file
mms140130 wrote:

I have the following data

ID  Affection   rs3117294   rs2747453   rs2747454   rs2747457   rs3131888
D0024949    0   C_C A_G A_G A_A A_G
D0024302    0   A_C A_A A_G A_A A_A
D0023151    0   C_C A_G A_A A_A G_G
D0022042    0   A_C A_A G_G A_A A_A
D0021275    0   C_C A_G A_G A_A A_G
D0021163    0   A_A A_A G_G A_A A_A
D0020795    0   A_A A_A G_G A_C A_G
D0020691    0   A_A A_A G_G A_C A_G
D0019121    0   A_A A_A G_G C_C G_G

and I want to create a .ped file for PLINK. How can I do that?

Thanks

modified 3 months ago by Kevin Blighe • written 3 months ago by mms140130
Kevin Blighe wrote:

This is possible but you are missing a lot of information, namely:

  • Family ID (FID)
  • Paternal ID (PID)
  • Maternal ID (MID)
  • gender/sex

You are also missing a map file. See the map file format here.

You can create a temporary (and incomplete) map with the following code, which is specific to your dataset:

head -1 plink.raw | sed 's/ \+/\n/g' | sed '1,2d' | awk '{print "0\t"$0"\t0\t0"}' > plink.map

cat plink.map
0   rs3117294   0   0
0   rs2747453   0   0
0   rs2747454   0   0
0   rs2747457   0   0
0   rs3131888   0   0

Next, you have to edit your main data to get it into a pseudo-PED format. Read about PED files here, and their input here.

sed '1d' plink.raw | sed 's/_/ /g' > plinkv2.raw

cat plinkv2.raw 
D0024949    0   C C A G A G A A A G
D0024302    0   A C A A A G A A A A
D0023151    0   C C A G A A A A G G
D0022042    0   A C A A G G A A A A
D0021275    0   C C A G A G A A A G
D0021163    0   A A A A G G A A A A
D0020795    0   A A A A G G A C A G
D0020691    0   A A A A G G A C A G
D0019121    0   A A A A G G C C G G

Then you can input your data, but you have to specify that you're missing FID, PID, MID, and gender/sex.

/Programs/plink1.90/plink --file --ped plinkv2.raw --map plink.map --no-fid --no-sex --no-parents

PLINK v1.90b3.38 64-bit (7 Jun 2016)       https://www.cog-genomics.org/plink2
(C) 2005-2016 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to plink.log.
Options in effect:
  --file
  --map plink.map
  --no-fid
  --no-parents
  --no-sex
  --ped plinkv2.raw

15037 MB RAM detected; reserving 7518 MB for main workspace.
.ped scan complete (for binary autoconversion).
Performing single-pass .bed write (5 variants, 9 people).
--file: plink.bed + plink.bim + plink.fam written.
modified 3 months ago • written 3 months ago by Kevin Blighe

Thank you for your help, but when I apply the following code to my data, it gives me an empty file:

head -1 plink.raw | sed 's/ +/\n/g' | sed '1,2d' | awk '{print "0\t"$0"\t0\t0"}' > plink.map

written 3 months ago by mms140130

Are you using a Mac? I use Linux (Ubuntu).

All you need for the map is a list of your SNP IDs, surrounded by 1 column of zeros on the left and 2 columns of zeros on the right.
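The empty file on a Mac would be consistent with BSD sed not interpreting `\+` and `\n` the way GNU sed does, although that is an inference and was not confirmed in this thread. A minimal sketch that builds the same map layout with awk alone, assuming a whitespace-delimited header whose first two fields are ID and Affection (the file names `example.raw` and `example.map` are hypothetical):

```shell
# Hypothetical two-line example input with the same layout as the question.
printf 'ID\tAffection\trs3117294\trs2747453\trs2747454\n' > example.raw
printf 'D0024949\t0\tC_C\tA_G\tA_G\n' >> example.raw

# Read only the header line, skip the first two fields (ID, Affection), and
# print one map row per SNP: chromosome 0, SNP ID, genetic distance 0, position 0.
awk 'NR == 1 { for (i = 3; i <= NF; i++) print "0\t" $i "\t0\t0"; exit }' example.raw > example.map

cat example.map
```

Because awk splits on any run of spaces or tabs by default, this should behave the same whether the header is space- or tab-delimited.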

What is the aim of your analysis, by the way? - are you just doing association testing?

written 3 months ago by Kevin Blighe

Yes, I'm trying to do association testing.

written 3 months ago by mms140130

Yes, I use a Mac. Anyway, I solved the .map issue.

written 3 months ago by mms140130

After I solved the .map issue, I got the following error:

A problem with line 1 in [ plinkv2.raw ] Expecting 2 + 2 * 199 = 400 columns, but found 392

What should I do here?

written 3 months ago by mms140130

It looks like your 'PED' file is incomplete. PLINK found 199 variants in your map, and therefore expected 2 * 199 allele columns (two per genotype) in the PED file, plus 2 extra columns for sample ID and phenotype.

Just look over the files to ensure that there are no formatting issues. If you only have 199 SNPs, this should not take long.
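Rather than checking by eye, the column arithmetic from the error message can be reproduced with awk. This is only a sketch on made-up data: the file names are hypothetical, and the "NA" cell is one guessed cause (a genotype without an underscore stays as one column instead of two after the underscore-to-space step), not something confirmed for the real dataset.

```shell
# Hypothetical example: 3 SNPs in the map, but the second sample has one
# genotype written as a single token ("NA") instead of two alleles.
printf '0\trs1\t0\t0\n0\trs2\t0\t0\n0\trs3\t0\t0\n' > example.map
printf 'S1 0 A A C C G G\nS2 0 A A NA G G\n' > example.ped

# Expected column count: 2 leading columns (sample ID, phenotype) plus
# 2 allele columns per variant listed in the map.
expected=$(awk 'END { print 2 + 2 * NR }' example.map)
echo "expected columns: $expected"

# Report every line whose column count differs from the expected value.
awk -v want="$expected" 'NF != want { print "line " NR ": " NF " columns" }' example.ped > bad_lines.txt
cat bad_lines.txt
```

Any line reported here is a candidate for closer inspection in the original, pre-conversion file.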

written 3 months ago by Kevin Blighe

I think the PED file is incomplete, as you said, but I don't know why. I followed your code but I am still having this issue.

modified 3 months ago • written 3 months ago by mms140130

My code is only based on the small sample that you provided, which may not be applicable to the entire dataset.

One more thing that you could try is opening your file with the vi editor and checking to see if there is a ^M at the end of each line. In that case, use the dos2unix command to get rid of these, and then retry the code.
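If dos2unix is not installed, the same check and clean-up can be approximated with awk and tr. A minimal sketch on a made-up file (the file names are hypothetical):

```shell
# Hypothetical example file with Windows (CRLF) line endings.
printf 'S1 0 A A\r\nS2 0 A G\r\n' > example_crlf.ped

# Count lines that end in a carriage return (displayed as ^M in vi).
awk '/\r$/ { n++ } END { print n + 0 }' example_crlf.ped

# Strip every carriage return, which is roughly what dos2unix does for
# CRLF files, then repeat the count on the cleaned copy.
tr -d '\r' < example_crlf.ped > example_unix.ped
awk '/\r$/ { n++ } END { print n + 0 }' example_unix.ped
```

A non-zero count on the original file means the line endings need converting before the sed and awk steps are rerun.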

written 3 months ago by Kevin Blighe
Hussain Ather (National Institutes of Health, Bethesda, MD) wrote:

plink --bfile input --recode --tab --out output
written 3 months ago by Hussain Ather

It gave me the following error:

No file [data1pr3.fam] exists

written 3 months ago by mms140130