Question: PCA using plink files
MAPK wrote:

I have .bed, .bim and .fam files. I have tried the shellfish.py program from

http://www.stats.ox.ac.uk/~davison/software/shellfish/shellfish.php

using this command:

/shellfish/shellfish.py --pca --numpcs 10 --maxprocs 8 --file myfile --out qcvcf --ignore-sge

which keeps producing this error:

16:58:39 256 16:58:39 shellfish error: The command 'which lines' exited with code 256: please ensure that the program 'lines' is present in the current directory (or on your $PATH)

Are there any other ways to do PCA using a set of .bed, .bim and .fam files? Also, what could be causing the error with shellfish.py? I would prefer to use shellfish.py if possible.

written 2.6 years ago by MAPK • modified 2.6 years ago by chrchang523

The error message is clear :-) It seems you have to install (or add to your PATH) a program called "lines".

An alternative to shellfish is smartpca (part of the EIGENSOFT package).

written 2.6 years ago by abascalfederico

I agree the message is clear, but the program "lines" is already in the shellfish package, and I still get this error.

written 2.6 years ago by MAPK

I'd try adding "." to the PATH, or just the full path to the directory that contains "lines".

written 2.6 years ago by abascalfederico
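
The PATH suggestion above could look roughly like this. It is only a sketch: SHELLFISH_DIR is a hypothetical variable, and /shellfish is a guess taken from the command in the question, on the assumption that "lines" sits in the same directory as shellfish.py.

```shell
# Hypothetical location of the shellfish package; adjust to your install.
SHELLFISH_DIR="${SHELLFISH_DIR:-/shellfish}"

# Prepend it to PATH so that `which lines` can find the helper program.
export PATH="$SHELLFISH_DIR:$PATH"

# Check whether the shell can now resolve `lines`.
if command -v lines >/dev/null 2>&1; then
    echo "lines found at: $(command -v lines)"
else
    echo "lines still not found; check that $SHELLFISH_DIR/lines exists and is executable"
fi
```

If the check still fails, the file is probably missing from that directory or lacks execute permission, rather than this being a PATH problem.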
chrchang523 wrote:

plink 2.0 (http://www.cog-genomics.org/plink/2.0) has efficient PCA implementations. For smaller datasets (let's say <5000 samples),

plink2 --bfile myfile --pca 10 --out qcvcf

works; larger datasets can be handled with

plink2 --bfile myfile --pca approx 10 --out qcvcf
written 2.6 years ago by chrchang523
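
To make the choice between the two commands automatic, something like the following might work. It is a sketch, not part of plink: choose_pca_mode is a hypothetical helper, it relies on the .fam file having one line per sample, and the 5000 cutoff is just the rough figure from the answer above.

```shell
# Hypothetical helper: print the --pca flags to use for a given .fam file.
# Assumes one line per sample in the .fam file; 5000 is only the rough
# cutoff suggested in the answer, not a hard limit.
choose_pca_mode() {
    fam_file="$1"
    n_samples=$(awk 'END { print NR }' "$fam_file")
    if [ "$n_samples" -lt 5000 ]; then
        echo "--pca 10"          # exact PCA for smaller datasets
    else
        echo "--pca approx 10"   # approximate PCA for larger datasets
    fi
}

# Run only if plink2 and the input files are actually available.
if command -v plink2 >/dev/null 2>&1 && [ -f myfile.fam ]; then
    # The command substitution is left unquoted on purpose so that the
    # flags are split into separate arguments.
    plink2 --bfile myfile $(choose_pca_mode myfile.fam) --out qcvcf
fi
```

Either way, plink2 should write the components to qcvcf.eigenvec and the eigenvalues to qcvcf.eigenval.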
sbk wrote:

@MAPK,

I haven't used shellfish.py, so I can't comment on it. You can probably use the GCTA tool to compute PCs. Using the plink files, first generate the GRM (genetic relationship matrix) files, and then use the GRM files to compute PCs.

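# $bfile is the prefix of the plink .bed/.bim/.fam files (e.g. bfile=myfile)

# compute GRM files from autosomal SNPs with MAF >= 0.01
gcta64 --bfile "$bfile" --autosome --maf 0.01 --make-grm-bin --out "$bfile" --thread-num 28

# compute the first 20 principal components from the GRM using GCTA
gcta64 --grm-bin "$bfile" --pca 20 --out "$bfile" --thread-num 28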

Here is the link to the GCTA tool: http://cnsgenomics.com/software/gcta/

written 2.6 years ago by sbk
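
For plotting or for use as covariates, it can help to put a header on the PC file. The sketch below assumes that the .eigenvec file written by the second GCTA command is whitespace-delimited with no header line, with columns FID, IID, then one column per PC; check this against your own output first. add_eigenvec_header is a hypothetical helper name.

```shell
# Hypothetical helper: print an .eigenvec file as tab-separated text with
# a header row. Assumes the input has no header and its columns are
# FID, IID, PC1, PC2, ... (verify against your own GCTA output).
add_eigenvec_header() {
    awk -v OFS='\t' '
        NR == 1 {
            # Build the header from the number of columns in the first row.
            printf "FID\tIID"
            for (i = 3; i <= NF; i++) printf "\tPC%d", i - 2
            printf "\n"
        }
        { $1 = $1; print }   # re-join the fields with tabs
    ' "$1"
}

# Run only if the GCTA output from the commands above exists.
if [ -n "${bfile:-}" ] && [ -f "$bfile.eigenvec" ]; then
    add_eigenvec_header "$bfile.eigenvec" > "$bfile.eigenvec.tsv"
fi
```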