DESeqDataSetFromHTSeqCount function taking a long time and using a lot of RAM
2.2 years ago

I am using DESeq2 for differential gene expression analysis. I aligned the reads with STAR against the GRCh38 reference genome and generated count files with HTSeq. I have 3 controls and 4 cases, and each count file is between 8 GB and 11 GB, for a total of 73 GB of data (7 count files).

I then ran:

    ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~ condition)

where directory is the folder in which I kept the count files. This command has been running for more than 12 hours and has consumed 46 GB of memory. My system has a maximum of 64 GB of memory.

Could anyone help me solve this problem?

![Screenshot is given here...][1]

[1]: https://ibb.co/3s48HRw

RNA-Seq DESeq2 R Tool • 855 views

> size of each count file varies from 8 GB to 11 GB

That is not likely to be correct. I assume these are the original BAM alignment files. Even with 10 samples, a count matrix file from featureCounts for mouse samples is < 30 MB.
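
To put a rough number on how small a per-sample count file should be, here is a minimal sketch that writes a simulated HTSeq-style file (one `gene_id<TAB>count` line per gene) and measures its size. The gene count of 60,000, the ID format, and the count distribution are illustrative assumptions, not values taken from this thread.

```r
# Simulate an HTSeq-count-style file: one "gene_id<TAB>count" line per gene.
# n_genes, the ID format and the count distribution are illustrative assumptions.
set.seed(1)
n_genes <- 60000
sim <- data.frame(
  gene_id = sprintf("ENSG%011d", seq_len(n_genes)),
  count   = rnbinom(n_genes, mu = 500, size = 0.5)
)

# Write it the way a count file is laid out: tab-separated, no header, no quotes.
tmp <- tempfile(fileext = ".txt")
write.table(sim, tmp, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Report the size on disk in megabytes.
size_mb <- file.size(tmp) / 1024^2
cat(sprintf("Simulated count file: %.2f MB\n", size_mb))
```

A file of this shape is on the order of a megabyte, so a multi-gigabyte "count file" is almost certainly something else.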


Something is not right here. Please share all relevant command lines. Is this normal RNA-seq?


Yes, this is normal RNA-seq.

    library("DESeq2")

    directory <- "~/Desktop/deseq2/"
    setwd(directory)
    sampleNames <- c("AN_d83", "AN_d85", "AN_d86", "Tum_t83", "Tum_t84", "Tum_t85", "Tum_t86")
    # NOTE: sampleFiles was not defined in the posted snippet. It must name the
    # HTSeq count files (not the BAM files), in the same order as sampleNames.
    sampleCondition <- c("Adjacent Normal", "Adjacent Normal", "Adjacent Normal", "Tumor", "Tumor", "Tumor", "Tumor")
    sampleTable <- data.frame(sampleName = sampleNames, fileName = sampleFiles, condition = sampleCondition)
    treatments <- c("Adjacent Normal", "Tumor")
    ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = ~ condition)

As genomax says, these files are way too big to be count matrices. Loading should take only a couple of seconds and should be possible on any standard laptop. Check that the file paths are correct and really point to the count files, not to the BAM files.
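
One way to run that check before calling `DESeqDataSetFromHTSeqCount` is sketched below. `check_count_files` is a hypothetical helper written for this thread, not part of DESeq2, and the 50 MB threshold is an arbitrary assumption. It only tests that each file exists, is small, and starts with two tab-separated columns whose second column is numeric. The demo files at the end are made up for illustration.

```r
# Hypothetical helper (not part of DESeq2): flag files that do not look like
# HTSeq count output. max_mb is an arbitrary, assumed size threshold.
check_count_files <- function(directory, file_names, max_mb = 50) {
  paths   <- file.path(directory, file_names)
  exists  <- file.exists(paths)
  size_mb <- file.size(paths) / 1024^2          # NA for missing files

  # Read only the first few lines and check for "id <TAB> numeric count".
  two_cols <- vapply(paths, function(p) {
    if (!file.exists(p)) return(NA)
    first <- tryCatch(
      suppressWarnings(
        read.table(p, sep = "\t", nrows = 5, stringsAsFactors = FALSE)
      ),
      error = function(e) NULL
    )
    !is.null(first) && ncol(first) == 2 && is.numeric(first[[2]])
  }, logical(1), USE.NAMES = FALSE)

  data.frame(
    file           = file_names,
    exists         = exists,
    size_mb        = round(size_mb, 2),
    plausible_size = !is.na(size_mb) & size_mb <= max_mb,
    two_columns    = two_cols
  )
}

# Demo on made-up files in a temporary directory.
d <- tempfile("counts_")
dir.create(d)
writeLines(c("geneA\t10", "geneB\t0", "geneC\t7"), file.path(d, "good.txt"))
writeLines("not a count file", file.path(d, "bad.txt"))
res <- check_count_files(d, c("good.txt", "bad.txt", "missing.txt"))
print(res)
```

With a real sample table, the call would be something like `check_count_files(directory, sampleTable$fileName)`. Any row where `plausible_size` or `two_columns` is not `TRUE` deserves a closer look before loading.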


Thanks, the problem is solved.


Please add the solution to the answer field in case others encounter the same problem.
