I am using DESeq2 for differential gene expression analysis. I aligned the reads with the STAR aligner against the GRCh38 reference genome and generated count files with HTSeq. I have 3 controls and 4 cases, and each count file is between 8 GB and 11 GB, for a total of 73 GB of data (7 count files).
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~ condition)
Here, directory is the folder where I kept the count files.
This command has been running for more than 12 hours and has consumed 46 GB of memory. My system has a maximum of 64 GB of memory.
Could anyone help me solve this problem? Screenshot: https://ibb.co/3s48HRw
That is not likely to be correct. I assume these are the original BAM alignment files. Even with 10 samples, a featureCounts count matrix file for mouse samples is < 30 MB.
Something is not right here. Please share all relevant command lines. Is this normal RNA-seq?
Yes. This is a normal RNA-seq.
As genomax says, these files are way too big for count matrices. The loading process should actually take only a couple of seconds and should be possible on any standard laptop. Check that the file paths are correct and really point to the count matrices, not to the BAM files.
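A quick way to run that check before calling DESeqDataSetFromHTSeqCount is sketched below in base R. This is only a rough sanity check under stated assumptions: check_count_files is a hypothetical helper (not part of DESeq2), the 100 MB size limit is an arbitrary threshold, and the pattern assumes default htseq-count output with one feature ID, a tab, and an integer count per line.

```r
# Hypothetical helper (not part of DESeq2): flag files that are missing,
# suspiciously large, or do not look like plain-text htseq-count output.
# max_mb = 100 is an arbitrary assumption; real count files are usually far smaller.
check_count_files <- function(directory, files, max_mb = 100) {
  paths <- file.path(directory, files)
  size_mb <- file.size(paths) / 1024^2  # NA when a file does not exist

  # Assumed format: "<feature id><TAB><integer count>" on the first line.
  looks_like_counts <- vapply(paths, function(p) {
    if (!file.exists(p)) return(FALSE)
    first <- tryCatch(readLines(p, n = 1, warn = FALSE),
                      error = function(e) character(0))
    length(first) == 1 && grepl("^[^\t]+\t[0-9]+$", first, useBytes = TRUE)
  }, logical(1), USE.NAMES = FALSE)

  data.frame(
    file = files,
    size_mb = round(size_mb, 2),
    looks_like_counts = looks_like_counts,
    ok = !is.na(size_mb) & size_mb < max_mb & looks_like_counts
  )
}
```

With the objects from the question, this could be called as check_count_files(directory, sampleTable[[2]]), since the second column of sampleTable holds the count file names. Any row with ok = FALSE is worth inspecting by hand before loading.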
Thanks, the problem is solved.
Please add the solution to the answer field if others encounter the same problem.