Question: HT-seq count memory error
2.9 years ago, s1469060 wrote:

Hi all

I have been trying to run htseq-count on paired-end RNA-seq data but keep hitting a memory error, which seems to come from HTSeq itself rather than from the directory. Does anyone have a solution to this? I am using Python 2.7.1, and the input is sorted by position; I have also tried sorting by name, to no avail.

Command:

htseq-count --mode=union --stranded=yes --order=pos Mutant1_align_filtered_sorted.sam genes.gtf > list 

Output:
100000 GFF lines processed.
200000 GFF lines processed.
300000 GFF lines processed.
400000 GFF lines processed.
500000 GFF lines processed.
600000 GFF lines processed.
671983 GFF lines processed.
100000 SAM alignment record pairs processed.
200000 SAM alignment record pairs processed.
300000 SAM alignment record pairs processed.
400000 SAM alignment record pairs processed.
500000 SAM alignment record pairs processed.
600000 SAM alignment record pairs processed.
700000 SAM alignment record pairs processed.
800000 SAM alignment record pairs processed.
900000 SAM alignment record pairs processed.
1000000 SAM alignment record pairs processed.
1100000 SAM alignment record pairs processed.
1200000 SAM alignment record pairs processed.
1300000 SAM alignment record pairs processed.
1400000 SAM alignment record pairs processed.
1500000 SAM alignment record pairs processed.
1600000 SAM alignment record pairs processed.
1700000 SAM alignment record pairs processed.
1800000 SAM alignment record pairs processed.
Error occured when processing SAM input (line 5693558 of file Mutant1_align_filtered_sorted.sam):

  [Exception type: MemoryError, raised in _HTSeq.pyx:1398]

Thanks!


While you wait for someone to provide a solution I suggest that you give featureCounts a try. It is much faster and will take sorted or unsorted BAM/SAM files.

Reply written 2.9 years ago by genomax
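
For readers who want to try the featureCounts route, here is a minimal sketch of a call that mirrors the htseq-count settings in the question (paired-end, stranded). The wrapper name `count_with_featurecounts` and the output file name are made up for illustration; the flags are standard featureCounts options from the Subread package, but check them against your installed version.

```shell
#!/usr/bin/env bash
# count_with_featurecounts INPUT GTF OUT
# Hypothetical wrapper; all file names are placeholders.
count_with_featurecounts() {
    local input="$1" gtf="$2" out="$3"
    # -p   : the input contains paired-end reads
    # -s 1 : forward-stranded library (counterpart of htseq-count --stranded=yes)
    # -a   : annotation file (GTF), -o : output count table
    featureCounts -p -s 1 -a "$gtf" -o "$out" "$input"
}

# Example call (assumes featureCounts is on PATH):
# count_with_featurecounts Mutant1_align_filtered_sorted.sam genes.gtf Mutant1_counts.txt
```

As far as I know, in Subread 2.0.2 and later `-p` only declares the data as paired-end, and `--countReadPairs` has to be added to count fragments rather than individual reads.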

Hi genomax

Thanks for the tip. featureCounts ran much faster and the counting worked fine, but I can't figure out how to load the count output file into DESeq2 downstream. I've tried something along these lines, but to be honest I really can't work it out:

countsTable <- DESeqDataSetFromMatrix(countData="/Volumes/igmm/hill-lab/Zoe/RNA-seq/E10.5_G2-67/DEseq/FeatureCountAll", colData= colData, design =  ~ genotype)
Error in DESeqDataSet(se, design = design, ignoreRank) : 
  some values in assay are negative
Reply modified 2.9 years ago by genomax • written 2.9 years ago by s1469060

Read the counts with counts <- read.table() and examine the data with summary(counts). With some luck the problem will stand out easily.

Reply written 2.9 years ago by h.mon
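
One thing worth checking before the table goes into R: raw featureCounts output starts with a `#` comment line and carries five annotation columns (Chr, Start, End, Strand, Length) between the gene ID and the counts, while `DESeqDataSetFromMatrix` expects a matrix of non-negative integers rather than a file path. The sketch below, with hypothetical helper names `fc_to_matrix` and `check_counts`, trims the file to gene IDs plus count columns and flags any cell that is not a non-negative integer. It is a pre-check under those assumptions, not a diagnosis of the exact error above.

```shell
#!/usr/bin/env bash
# fc_to_matrix IN OUT
# Hypothetical helper: drop the leading "#" comment line, then keep
# column 1 (Geneid) and columns 7 onward (one count column per input file).
fc_to_matrix() {
    local in="$1" out="$2"
    grep -v '^#' "$in" | cut -f1,7- > "$out"
}

# check_counts FILE
# Hypothetical helper: print every cell below the header that is not a
# non-negative integer; exit status is 1 if any such cell was found.
check_counts() {
    awk -F'\t' '
        NR > 1 {
            for (i = 2; i <= NF; i++)
                if ($i !~ /^[0-9]+$/) {
                    printf "line %d, column %d: %s\n", NR, i, $i
                    bad = 1
                }
        }
        END { exit bad }
    ' "$1"
}

# Example call (file names are placeholders):
# fc_to_matrix FeatureCountAll counts_matrix.tsv && check_counts counts_matrix.tsv
```

The trimmed file can then be read with read.table(header = TRUE, row.names = 1) and inspected with summary(), as suggested above.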
2.9 years ago, s1469060 wrote:

Just to post the solution to this: it worked fine once I switched to BAM input and sorted by name. I'm not sure why this mattered, seeing as HTSeq should work with the order specified as pos or name and with SAM or BAM input, but it worked!

Thanks. Zoe
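
The fix described above (BAM input, sorted by name) might look like the sketch below. The wrapper name `count_name_sorted` and the `.namesorted.bam` suffix are made up for illustration, and the `-o` form of `samtools sort` assumes samtools 1.3 or later; the htseq-count options are the ones from the original command plus `--format=bam` and `--order=name`.

```shell
#!/usr/bin/env bash
# count_name_sorted SAM GTF OUT
# Hypothetical wrapper; file names are placeholders.
count_name_sorted() {
    local sam="$1" gtf="$2" out="$3"
    local bam="${sam%.sam}.namesorted.bam"
    # Sort by read name (-n) and write the result as BAM (-o).
    samtools sort -n -o "$bam" "$sam" || return 1
    # Tell htseq-count the input is BAM and ordered by name.
    htseq-count --format=bam --order=name --mode=union --stranded=yes \
        "$bam" "$gtf" > "$out"
}

# Example call (assumes samtools and htseq-count are on PATH):
# count_name_sorted Mutant1_align_filtered_sorted.sam genes.gtf list
```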

2.9 years ago, h.mon (Brazil) wrote:

Although using featureCounts, as genomax suggested, is better, if you want to use HTSeq I suggest you sort your SAM file by name and set --order accordingly. This memory error for position-sorted files is a known, long-standing issue.

P.S.: you said which version of Python, but not which version of HTSeq. Don't you think this is important as well?


Hi h.mon

The HTSeq version is 0.7.2. I have sorted by name and set the order parameter, but it still seems to hit HTSeq issues sporadically.

Reply written 2.9 years ago by s1469060