5.0 years ago
tothepoint ▴ 760
I am currently working on a co-expression analysis for a data set of around 60,000 genes and running the job on a server. I am waiting for the TOM calculation (matrix multiplication with BLAS) to complete; it has already taken a few days and is still running. If anyone has an idea of how long this generally takes, please share your experience. Pardon me if I have not expressed my question clearly. I am new to R and RNA-seq data analysis.
The largest number of genes I have ever used for WGCNA was about 9,000 (my laptop can't handle bigger matrices), and that takes less than 5 min to calculate the TOM.
You may want to consider reducing your dataset. For example, are all of those 60,000 genes expressed at a high level? Have you done any level of filtering?
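To make the filtering idea concrete, here is a minimal sketch in plain Python (not WGCNA or any R package). The function name `filter_low_expression`, the thresholds `min_count=10` and `min_fraction=0.5`, and the toy counts are all hypothetical, chosen only for illustration; suitable cut-offs depend on your data and normalization.

```python
def filter_low_expression(counts, min_count=10, min_fraction=0.5):
    """Keep genes with >= min_count reads in >= min_fraction of samples.

    `counts` maps a gene name to a list of per-sample read counts.
    Both thresholds are illustrative assumptions, not recommended values.
    """
    kept = {}
    for gene, values in counts.items():
        # Number of samples in which this gene reaches the count threshold.
        n_pass = sum(1 for v in values if v >= min_count)
        if values and n_pass / len(values) >= min_fraction:
            kept[gene] = values
    return kept


# Toy data: three genes measured in four samples (made-up numbers).
toy_counts = {
    "geneA": [120, 98, 143, 110],
    "geneB": [0, 1, 0, 2],
    "geneC": [15, 3, 22, 11],
}

kept = filter_low_expression(toy_counts)
print(sorted(kept))
```

In this toy case the gene with near-zero counts in every sample is dropped, and the network is then built only on the genes that remain.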
WGCNA builds its network based on correlation distances, but to do this it has to compute the distance between each pairwise gene combination, meaning 60,000 × 60,000 = 3,600,000,000 data points.
Dear Kevin Blighe, thanks for your suggestion. The data I received was already filtered, and the job ran for over 5 days; the rest of the analysis begins now. Thank you for your key suggestions and input.
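As a rough back-of-the-envelope check of that arithmetic, the sketch below estimates the size of one dense gene-by-gene matrix. It assumes 8 bytes per value (double precision, as R uses for numeric matrices); the function name `dense_matrix_gib` is hypothetical, and the cubic scaling factor assumes a standard dense matrix multiplication, so it is only an order-of-magnitude guide and not a runtime prediction.

```python
def dense_matrix_gib(n_genes, bytes_per_value=8):
    """Size in GiB of one dense n_genes x n_genes matrix.

    bytes_per_value=8 assumes double-precision storage.
    """
    return n_genes * n_genes * bytes_per_value / 2**30


def cubic_scale_factor(n_small, n_large):
    """How many times more work a dense O(n^3) matrix product needs (rough)."""
    return (n_large / n_small) ** 3


for n in (9_000, 20_000, 60_000):
    pairs = n * n
    print(f"{n:>6} genes: {pairs:>13,} entries, "
          f"{dense_matrix_gib(n):6.2f} GiB per matrix")

print(f"Work ratio 60,000 vs 9,000 genes: "
      f"{cubic_scale_factor(9_000, 60_000):.0f}x")
```

Several such matrices (correlation, adjacency, TOM) may need to be held at once, so the actual memory footprint can be a multiple of the single-matrix figure.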
Hi tothepoint! I know it's been a long time, but I couldn't find an answer to this question and I'm now having the same question. So do you remember how long it took approximately to run your 60,000 genes?
I remember I was working on a high-memory cluster and it took around a week to finish. However, I took Kevin's suggestion into account, filtered the reads again, and reran the analysis on a subset of the data. I hope this answers your question.