Question: convert the file to a .ped file
mms140130 wrote:

I have the following data

ID  Affection   rs3117294   rs2747453   rs2747454   rs2747457   rs3131888
D0024949    0   C_C A_G A_G A_A A_G
D0024302    0   A_C A_A A_G A_A A_A
D0023151    0   C_C A_G A_A A_A G_G
D0022042    0   A_C A_A G_G A_A A_A
D0021275    0   C_C A_G A_G A_A A_G
D0021163    0   A_A A_A G_G A_A A_A
D0020795    0   A_A A_A G_G A_C A_G
D0020691    0   A_A A_A G_G A_C A_G
D0019121    0   A_A A_A G_G C_C G_G

I want to create a .ped file for PLINK. How can I do that?

Thanks

Kevin Blighe wrote:

This is possible but you are missing a lot of information, namely:

  • Family ID (FID)
  • Paternal ID (PID)
  • Maternal ID (MID)
  • gender/sex

You are also missing a map file. See the map file format here.

You can create a temporary (and incomplete) map with the following code, which is specific for your dataset:

head -1 plink.raw | sed 's/ \+/\n/g' | sed '1,2d' | awk '{print "0\t"$0"\t0\t0"}' > plink.map

cat plink.map
0   rs3117294   0   0
0   rs2747453   0   0
0   rs2747454   0   0
0   rs2747457   0   0
0   rs3131888   0   0

Next, you have to edit your main data to get it into a pseudo-PED format. Read about PED files here, and their input here.

sed '1d' plink.raw | sed 's/_/ /g' > plinkv2.raw

cat plinkv2.raw 
D0024949    0   C C A G A G A A A G
D0024302    0   A C A A A G A A A A
D0023151    0   C C A G A A A A G G
D0022042    0   A C A A G G A A A A
D0021275    0   C C A G A G A A A G
D0021163    0   A A A A G G A A A A
D0020795    0   A A A A G G A C A G
D0020691    0   A A A A G G A C A G
D0019121    0   A A A A G G C C G G

Then you can input your data, but you have to specify that you're missing FID, PID, MID, and gender/sex.

/Programs/plink1.90/plink --file --ped plinkv2.raw --map plink.map --no-fid --no-sex --no-parents

PLINK v1.90b3.38 64-bit (7 Jun 2016)       https://www.cog-genomics.org/plink2
(C) 2005-2016 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to plink.log.
Options in effect:
  --file
  --map plink.map
  --no-fid
  --no-parents
  --no-sex
  --ped plinkv2.raw

15037 MB RAM detected; reserving 7518 MB for main workspace.
.ped scan complete (for binary autoconversion).
Performing single-pass .bed write (5 variants, 9 people).
--file: plink.bed + plink.bim + plink.fam written.
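
As an alternative to passing --no-fid, --no-parents, and --no-sex, the missing columns could be filled with placeholders so that the file has the standard six leading PED columns (FID, IID, PID, MID, sex, phenotype). The following is only a sketch under assumptions: the file names example_v2.raw and example.ped are hypothetical, the sample ID is reused as the family ID, and 0 is used for unknown parents and unknown sex.

```shell
# Two sample rows in the reduced layout used above: IID, phenotype, then alleles.
printf 'D0024949 0 C C A G\nD0024302 0 A C A A\n' > example_v2.raw

# Emit FID (assumption: reuse the IID), IID, PID=0, MID=0, sex=0 (unknown),
# the phenotype, and then the allele columns unchanged.
awk '{
    printf "%s %s 0 0 0 %s", $1, $1, $2
    for (i = 3; i <= NF; i++) printf " %s", $i
    printf "\n"
}' example_v2.raw > example.ped

cat example.ped
```

A file built this way would be loaded without the three --no-* flags, since every expected column is present.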

Thank you for your help, but when I apply the following code to my data, it gives me an empty file:

head -1 plink.raw | sed 's/ +/\n/g' | sed '1,2d' | awk '{print "0\t"$0"\t0\t0"}' > plink.map

written 21 days ago by mms140130

Are you using a Mac? I use Linux (Ubuntu).

All you need for the map is a list of your SNP IDs, surrounded by 1 column of zeros on the left and 2 columns of zeros on the right.
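
One possible cause of the empty file on a Mac is that the BSD sed shipped with macOS does not treat `\+` and `\n` the way GNU sed does. The header then stays on a single line and is deleted by `sed '1,2d'`. The same happens if the columns are separated by tabs rather than spaces. A sketch of a map-building step that avoids sed entirely is shown here. It assumes a whitespace-separated header whose first two columns are the sample ID and phenotype, and the file names example.raw and example.map are hypothetical.

```shell
# Small stand-in for the header line of the genotype file.
printf 'ID Affection rs3117294 rs2747453 rs2747454\n' > example.raw

# From the header only, skip the first two columns and print one map row
# per SNP: chromosome 0, SNP ID, genetic distance 0, base-pair position 0.
awk 'NR == 1 { for (i = 3; i <= NF; i++) printf "0\t%s\t0\t0\n", $i; exit }' example.raw > example.map

cat example.map
```

Because awk splits on any run of spaces or tabs by default, this should behave the same on macOS and Linux.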

What is the aim of your analysis, by the way? - are you just doing association testing?

written 21 days ago by Kevin Blighe

Yes, I'm trying to do association testing.

written 21 days ago by mms140130

Yes, I use a Mac. Anyway, I solved the .map issue.

written 21 days ago by mms140130

After I solved the .map issue, I got the following error:

A problem with line 1 in [ plinkv2.raw ] Expecting 2 + 2 * 199 = 400 columns, but found 392

What should I do here?

written 21 days ago by mms140130

Looks like your 'PED' file is incomplete. PLINK found 199 variants in your map, and therefore expected 2 * 199 genotypes in the PED file, plus 2 extra columns for sample ID and phenotype.

Just look over the files to ensure that there are no formatting issues. If you only have 199 SNPs, this should not take long.
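
Rather than checking by eye, the column counts can be tabulated. This is a sketch on made-up data: the file names example_check.raw and bad_rows.txt are hypothetical, and the expected count follows the 2 + 2 * (number of variants) rule from the error message. One thing worth looking for is a genotype cell with no underscore (for example a missing-value code), since it becomes one column instead of two after the underscore substitution.

```shell
# Sample file with one deliberately short row (hypothetical data, 2 variants).
printf 'S1 0 A A G G\nS2 0 A A G\nS3 0 C C G G\n' > example_check.raw

# Tabulate how many rows have each column count;
# a well-formed file shows a single distinct count.
awk '{ print NF }' example_check.raw | sort -n | uniq -c

# Expected columns = 2 + 2 * number of variants (2 variants here, so 6).
# Record the line number and column count of every row that deviates.
awk -v expected=6 'NF != expected { print NR ": " NF " columns" }' example_check.raw > bad_rows.txt

cat bad_rows.txt
```

For the real data, `expected` would be set to 400 to match the 199 variants in the map.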

written 21 days ago by Kevin Blighe

I think the PED file is incomplete, as you said, but I don't know why. I followed your code but still have this issue.

modified 21 days ago • written 21 days ago by mms140130

My code is only based on the small sample that you provided, which may not be applicable to the entire dataset.

One more thing that you could try is opening your file with the vi editor and checking to see if there is a ^M at the end of each line. In that case, use the dos2unix command to get rid of these, and then retry the code.
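
The same check can be done from the command line without opening an editor. This is a sketch on a made-up file (example_crlf.raw and example_unix.raw are hypothetical names), and `tr -d '\r'` is used here as a stand-in for dos2unix in case that tool is not installed.

```shell
# Build a sample file with Windows (CRLF) line endings.
printf 'S1 0 A A\r\nS2 0 G G\r\n' > example_crlf.raw

# Count lines that contain a carriage return; 0 means the file is clean.
cr=$(printf '\r')
grep -c "$cr" example_crlf.raw

# Strip carriage returns into a new file, leaving the original untouched.
tr -d '\r' < example_crlf.raw > example_unix.raw

# Re-check; grep exits non-zero when the count is 0, hence the "|| true".
grep -c "$cr" example_unix.raw || true
```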

written 21 days ago by Kevin Blighe
Hussain Ather wrote:

plink --bfile input --recode --tab --out output

It gave me the following error:

No file [data1pr3.fam] exists

written 21 days ago by mms140130