Question: Creating a SnpMatrix object from a flat file
5.5 years ago, Paula Sanchez wrote:

Dear all,

I am fairly new to genomics and have just received a genotype file. The R commands I tried first were too slow, so I decided to use the snpStats package, but I have not been able to read my file into it.

My file is a data frame with 10,000 rows (animals) and 600,000 columns (SNPs), with genotypes coded as 0, 1 and 2. I found several functions for creating snpStats objects, but none of them fits my case; for example, read.snps.long expects one call per row.

Any help for me to get started?

Thanks in advance.

Tags: snp, R

What is the objective of your analysis? 

written 5.5 years ago by alesssia

I want to compute the genomic relationship matrix, run a PCA and make genomic predictions. Thanks.

written 5.5 years ago by Paula Sanchez

If you want to use snpStats, you should format the data as a pedigree file or as a PLINK file, which is a de facto standard for genomic analysis. Converting the file requires a bit of scripting (in any language: R, bash, Python...). Note, however, that to the best of my knowledge snpStats only deals with diallelic data.
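As a rough illustration of the scripting step, here is a minimal Python sketch that streams a 0/1/2 table into PLINK text format (a PED file plus a MAP file). Several things are assumptions, not facts from the question: the input layout (whitespace-delimited, a header row of SNP names, animal ID in the first column), the placeholder allele labels "A" and "B" (the flat file carries allele counts, not allele letters), and the zeroed chromosome and position fields in the MAP file. The function name flat_to_ped is hypothetical.

```python
import io

# Assumed coding: allele count 0, 1 or 2; any other token is treated as missing.
# "A" and "B" are placeholder allele labels, not the real alleles.
GENOTYPE_TO_PED = {"0": "A A", "1": "A B", "2": "B B"}
MISSING = "0 0"  # PLINK's text-format code for a missing genotype


def flat_to_ped(src, ped_out, map_out):
    """Stream a 0/1/2 genotype table into PLINK PED/MAP text.

    Assumed (hypothetical) layout: header row "id snp1 snp2 ...",
    then one whitespace-delimited row per animal.
    Returns the number of animals written.
    """
    snp_names = src.readline().split()[1:]

    # MAP columns: chromosome, SNP id, genetic distance, base-pair position.
    # Chromosome and positions are unknown here, so 0 is a placeholder.
    for name in snp_names:
        map_out.write(f"0 {name} 0 0\n")

    n_animals = 0
    for line in src:  # one row at a time, so the full table never sits in memory
        fields = line.split()
        if not fields:
            continue
        animal_id, calls = fields[0], fields[1:]
        if len(calls) != len(snp_names):
            raise ValueError(
                f"{animal_id}: expected {len(snp_names)} calls, got {len(calls)}"
            )
        genotypes = " ".join(GENOTYPE_TO_PED.get(c, MISSING) for c in calls)
        # PED columns: family id, individual id, sire, dam, sex, phenotype.
        # Sire/dam/sex are set to unknown (0) and phenotype to missing (-9).
        ped_out.write(f"{animal_id} {animal_id} 0 0 0 -9 {genotypes}\n")
        n_animals += 1
    return n_animals


# Tiny in-memory example; with real data, pass open file handles instead.
example = io.StringIO("id snp1 snp2 snp3\ncow1 0 1 2\ncow2 2 NA 0\n")
ped, map_ = io.StringIO(), io.StringIO()
n_written = flat_to_ped(example, ped, map_)
print(ped.getvalue(), end="")
```

Reading and writing one animal at a time matters with 10,000 x 600,000 genotypes, because the whole table never needs to be held in memory. The placeholder positions in the MAP file should be replaced with real ones if a marker map is available.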

Other software can generate a GRM (e.g., GCTA, PLINK, LDAK), and some can also run a PCA (e.g., PLINK). However, I think all of them require diallelic data (it is worth checking).
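To make concrete what a GRM is, here is a toy pure-Python sketch of one common definition (the first method of VanRaden, 2008): centre each SNP column by twice its allele frequency, then divide the cross-product by 2 * sum(p * (1 - p)). This is only an illustration under assumptions: it expects complete 0/1/2 data with no missing calls, estimates frequencies from the sample itself, and is far too slow for 600,000 SNPs. It is not necessarily the formula used by GCTA, PLINK or LDAK, and the function name vanraden_grm is hypothetical.

```python
def vanraden_grm(genotypes):
    """Toy GRM from rows of 0/1/2 allele counts (animals x SNPs).

    Assumes no missing genotypes and at least one polymorphic SNP.
    """
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])

    # Allele frequency per SNP: mean allele count divided by 2.
    freqs = [
        sum(row[j] for row in genotypes) / (2 * n_animals)
        for j in range(n_snps)
    ]

    # Scaling factor 2 * sum(p * (1 - p)) over SNPs.
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all SNPs are monomorphic; the GRM is undefined")

    # Centre each genotype by its expected value 2p.
    centered = [
        [row[j] - 2 * freqs[j] for j in range(n_snps)]
        for row in genotypes
    ]

    # G = Z Z' / scale, built one animal pair at a time.
    return [
        [sum(a * b for a, b in zip(zi, zk)) / scale for zk in centered]
        for zi in centered
    ]


# Three animals, three SNPs.
toy = [[0, 1, 2], [2, 1, 0], [1, 1, 1]]
grm = vanraden_grm(toy)
for row in grm:
    print([round(value, 3) for value in row])
```

For data of the size described in the question, a dedicated tool or a matrix library is the realistic route; the loop above only shows the arithmetic.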

modified 13 months ago by _r_am • written 5.5 years ago by alesssia