Question: How to handle 30000 SNPs genotyping data of 500 individuals in R?
8 months ago, blacktomato27 (United States) wrote:

Dear all, good morning.

I have genotyping data for 30,000 SNPs on 500 individuals. The file is quite large for me (approx. 80 MB), and my machine has only 4 GB of RAM and a 1 TB HDD. My final objective is a diversity analysis: grouping these individuals based on their similarity. I tried to calculate genetic distance in R with the poppr package, but I get an insufficient-memory error. I also tried to convert my data to a genlight object, without success. Is there any R package that can handle data of this size? I would be grateful for your expertise and suggestions on how to solve this. Any help is highly appreciated.

This is my example data file: https://www.dropbox.com/s/phq1szmqlttsdja/test.txt?dl=0

Thanks in advance. Regards

snp R • 327 views
Question written 8 months ago by blacktomato27

Yes, the resulting distance matrix would be enormous. What exact commands are you using, and what exact error messages are you receiving? There are ways of increasing the memory available to R, but there is always going to be a limit somewhere along the line. I will pick this up in the morning (it is night here).

Reply written 8 months ago by Kevin Blighe

Dear Kevin, good morning, and thank you very much for your reply. I am using the poppr package to calculate genetic distance. The code itself does not raise any error, but I then get a message like "can not allocate 24.5GB data". Thank you.

Reply written 7 months ago by blacktomato27

Then it is trying to allocate 24.5 GB of RAM, which your computer most likely does not have. Do you have access to a high-performance computing cluster?

Reply written 7 months ago by Kevin Blighe
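
For a sense of scale, here is a rough sketch of the memory arithmetic, using only the dimensions given in the question. It assumes every value is stored as an 8-byte double (R's default numeric type); the variable names are illustrative and the actual layout of the poster's data is not known.

```r
n_ind <- 500            # individuals (from the question)
n_snp <- 30000          # SNPs (from the question)
bytes_per_double <- 8   # ASSUMPTION: values held as R doubles

# Genotype matrix, individuals x SNPs
geno_mb <- n_ind * n_snp * bytes_per_double / 1024^2       # about 114.4 MB

# Full square distance matrix between individuals (500 x 500)
dist_ind_mb <- n_ind^2 * bytes_per_double / 1024^2         # about 1.9 MB

# Full square matrix between SNPs (30000 x 30000), which is what
# results if rows and columns end up swapped
dist_snp_gb <- n_snp^2 * bytes_per_double / 1024^3         # about 6.7 GB

cat(sprintf("genotypes: %.1f MB\n", geno_mb))
cat(sprintf("individual-by-individual distances: %.1f MB\n", dist_ind_mb))
cat(sprintf("SNP-by-SNP distances: %.1f GB\n", dist_snp_gb))
```

Under these assumptions the pairwise matrix between 500 individuals is small, so a 24.5 GB request may well come from intermediate objects or from how the data were read in rather than from the final result. Without the exact command, that remains a guess.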

Dear Kevin, unfortunately no. Thanks for your reply.

Reply written 7 months ago by blacktomato27

Please paste the exact command that you're using. Perhaps I can modify the code such that it could work.

Reply written 7 months ago by Kevin Blighe

Dear Kevin, good afternoon, and sorry for my late reply. I am using the following code to calculate genetic distance (GD); my short script is here: https://www.dropbox.com/s/6wlftpd253fw2x3/code.txt?dl=0 and my example data are here: https://www.dropbox.com/s/jw509iylda59xfv/example%20data.csv?dl=0 With this memory limitation I am not even able to run DAPC (discriminant analysis of principal components) to identify groups, that is, to see how these individuals cluster based on their relatedness. Thank you very much for your help. Regards

Reply written 6 months ago by blacktomato27
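
One low-memory route to grouping individuals by similarity can be sketched in base R alone. This is not the poppr or DAPC workflow discussed above: it swaps in `dist()` and `hclust()` from the stats package. It assumes genotypes are coded as allele counts (0, 1, 2) with one row per individual, and it uses simulated data because the layout of the linked files is not known. The number of groups (`k = 3`) is purely illustrative.

```r
set.seed(42)  # reproducible simulated data

n_ind <- 500    # individuals, as in the question
n_snp <- 30000  # SNPs, as in the question

# ASSUMPTION: allele-count coding (0/1/2), rows = individuals.
# Simulated here; replace with the real genotype matrix.
geno <- matrix(
  sample(0:2, n_ind * n_snp, replace = TRUE),
  nrow = n_ind,
  dimnames = list(paste0("ind", seq_len(n_ind)), NULL)
)

# The matrix is stored as integers (4 bytes per value).
print(object.size(geno), units = "MB")

# Manhattan distance between rows sums the allele-count differences;
# dividing by 2 * n_snp rescales it to a proportion between 0 and 1.
d <- dist(geno, method = "manhattan") / (2 * n_snp)

# Average-linkage (UPGMA) hierarchical clustering on those distances.
hc <- hclust(d, method = "average")

# k = 3 is an arbitrary choice for illustration only.
groups <- cutree(hc, k = 3)
print(table(groups))
```

`dist()` stores only the lower triangle of the individual-by-individual matrix, so the result stays small. On real data, missing genotypes should be coded as `NA`, which `dist()` excludes pair by pair.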