Question: HT-seq count memory error
s1469060 wrote (24 months ago):

Hi all

I have been trying to run htseq-count on paired-end RNA-seq data but keep running into a memory error, which seems to come from HTSeq itself rather than from the directory. Does anyone have a solution to this? I am using Python 2.7.1, and the input is sorted by position; I have also tried sorting by name, to no avail.

Command:

htseq-count --mode=union --stranded=yes --order=pos Mutant1_align_filtered_sorted.sam genes.gtf > list 

Output:
100000 GFF lines processed.
200000 GFF lines processed.
300000 GFF lines processed.
400000 GFF lines processed.
500000 GFF lines processed.
600000 GFF lines processed.
671983 GFF lines processed.
100000 SAM alignment record pairs processed.
200000 SAM alignment record pairs processed.
300000 SAM alignment record pairs processed.
400000 SAM alignment record pairs processed.
500000 SAM alignment record pairs processed.
600000 SAM alignment record pairs processed.
700000 SAM alignment record pairs processed.
800000 SAM alignment record pairs processed.
900000 SAM alignment record pairs processed.
1000000 SAM alignment record pairs processed.
1100000 SAM alignment record pairs processed.
1200000 SAM alignment record pairs processed.
1300000 SAM alignment record pairs processed.
1400000 SAM alignment record pairs processed.
1500000 SAM alignment record pairs processed.
1600000 SAM alignment record pairs processed.
1700000 SAM alignment record pairs processed.
1800000 SAM alignment record pairs processed.
Error occured when processing SAM input (line 5693558 of file Mutant1_align_filtered_sorted.sam):

  [Exception type: MemoryError, raised in _HTSeq.pyx:1398]

Thanks!


While you wait for someone to provide a solution I suggest that you give featureCounts a try. It is much faster and will take sorted or unsorted BAM/SAM files.

Reply written 24 months ago by genomax

Hi genomax

Thanks for the tip. featureCounts ran much faster and the counting worked fine, but I can't figure out how to load the count output file into DESeq2 downstream. I've tried something along these lines, but to be honest I really can't figure it out:

countsTable <- DESeqDataSetFromMatrix(countData="/Volumes/igmm/hill-lab/Zoe/RNA-seq/E10.5_G2-67/DEseq/FeatureCountAll", colData= colData, design =  ~ genotype)
Error in DESeqDataSet(se, design = design, ignoreRank) : 
  some values in assay are negative
Reply written 24 months ago by s1469060 (modified 24 months ago by genomax)

Read the counts with counts <- read.table() and examine the data with summary(counts). With some luck the problem will stand out easily.

Reply written 24 months ago by h.mon
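The check suggested in the reply above can also be done outside R before anything is handed to DESeq2. The sketch below is only an illustration: it assumes the standard featureCounts table layout (a leading `#` comment line, then a header with Geneid, Chr, Start, End, Strand, Length, followed by one count column per alignment file), and the sample names and values in `EXAMPLE` are made up. Note that `DESeqDataSetFromMatrix` expects a numeric count matrix for `countData`, not a file path, so the annotation columns have to be dropped first.

```python
import csv
import io


def read_featurecounts(handle):
    """Return (sample_names, {gene_id: [counts...]}) from a featureCounts table."""
    # Skip the leading '#' line that records the program and command used.
    rows = csv.reader(
        (line for line in handle if not line.startswith("#")), delimiter="\t"
    )
    header = next(rows)
    # Columns 0-5 are Geneid, Chr, Start, End, Strand, Length; counts follow.
    samples = header[6:]
    counts = {}
    for row in rows:
        if not row:
            continue
        counts[row[0]] = [int(value) for value in row[6:]]
    return samples, counts


def summarize(samples, counts):
    """Per-sample min, max and total, similar in spirit to R's summary()."""
    report = {}
    for index, sample in enumerate(samples):
        column = [values[index] for values in counts.values()]
        report[sample] = {
            "min": min(column),
            "max": max(column),
            "total": sum(column),
        }
    return report


# Made-up table for illustration only (hypothetical samples and counts).
EXAMPLE = (
    "# Program:featureCounts (example comment line)\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tMutant1.bam\tWT1.bam\n"
    "geneA\tchr1\t100\t900\t+\t801\t12\t30\n"
    "geneB\tchr2\t50\t450\t-\t401\t0\t7\n"
)

if __name__ == "__main__":
    samples, counts = read_featurecounts(io.StringIO(EXAMPLE))
    for sample, stats in summarize(samples, counts).items():
        print(sample, stats)
```

A negative minimum or a failed integer conversion here would point at the input table rather than at DESeq2 itself.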
s1469060 wrote (23 months ago):

Just to post the solution to this: I'm not sure why it was a problem, seeing as HTSeq should work with the order specified as pos or name and with SAM or BAM input, but it worked fine once I switched to BAM input and sorted by name. Not sure why this mattered, but it worked!

Thanks. Zoe
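For anyone wanting to reproduce the fix described above, here is a minimal sketch of the two steps (name-sort to BAM, then count with `--format=bam --order=name`). The exact commands the poster ran are not given in the thread, so this is an assumption about how it could be done; the output file names are hypothetical, and `samtools` 1.3 or later is assumed for the `-o`/`-O` options.

```python
import shlex
import subprocess


def build_commands(sam_path, gtf_path, bam_path):
    """Return argument lists for name-sorting and then counting."""
    # samtools sort -n sorts by read name; -O bam forces BAM output.
    sort_cmd = ["samtools", "sort", "-n", "-O", "bam", "-o", bam_path, sam_path]
    # Same mode and strandedness as the original command, but with
    # BAM input and --order=name to match the new sort order.
    count_cmd = [
        "htseq-count",
        "--format=bam",
        "--order=name",
        "--mode=union",
        "--stranded=yes",
        bam_path,
        gtf_path,
    ]
    return sort_cmd, count_cmd


def run_pipeline(sam_path, gtf_path, bam_path, counts_path):
    """Run both steps, writing htseq-count's stdout to counts_path."""
    sort_cmd, count_cmd = build_commands(sam_path, gtf_path, bam_path)
    subprocess.run(sort_cmd, check=True)
    with open(counts_path, "w") as out:
        subprocess.run(count_cmd, stdout=out, check=True)


if __name__ == "__main__":
    # Only print the commands; call run_pipeline() to actually execute them.
    sort_cmd, count_cmd = build_commands(
        "Mutant1_align_filtered_sorted.sam",
        "genes.gtf",
        "Mutant1_namesorted.bam",  # hypothetical output name
    )
    for cmd in (sort_cmd, count_cmd):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the command construction separate from execution makes it easy to inspect what will run before committing to a long sort.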

h.mon wrote (24 months ago):

Although using featureCounts, as genomax suggested, is better, if you want to use HTSeq I suggest you sort your SAM file by name and set --order accordingly (--order=name). This memory error for position-sorted files is a long-known issue.

P.S.: you said which version of Python, but not which version of HTseq. Don't you think this is important as well?
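To see why sort order can matter for memory with paired-end data, consider a simplified model: a counter that pairs mates on the fly must hold each read until its mate shows up. The toy sketch below is not HTSeq's actual code, and the read names and positions are invented; it only compares how many reads would be waiting at peak under each ordering.

```python
def peak_buffer_size(read_names):
    """Peak number of reads held while waiting for their mates."""
    waiting = set()
    peak = 0
    for name in read_names:
        if name in waiting:
            # Second mate arrived: the pair is complete and can be released.
            waiting.remove(name)
        else:
            # First mate seen: hold it until the second one arrives.
            waiting.add(name)
            peak = max(peak, len(waiting))
    return peak


# Invented pairs: read name -> positions of the two mates (far apart).
pairs = {
    "r1": (100, 5000),
    "r2": (200, 5100),
    "r3": (300, 5200),
    "r4": (400, 5300),
}
records = [(pos, name) for name, positions in pairs.items() for pos in positions]

by_position = [name for pos, name in sorted(records)]
by_name = [name for pos, name in sorted(records, key=lambda rec: rec[1])]

print(peak_buffer_size(by_position))  # 4: every first mate waits for its partner
print(peak_buffer_size(by_name))      # 1: mates are adjacent, so at most one waits
```

With name-sorted input the two mates of a pair sit next to each other, so the buffer stays tiny regardless of file size; with position-sorted input it grows with the number of pairs whose mates map far apart.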


Hi h.mon

The HTSeq version is 0.7.2. I have sorted by name and set the --order parameter, but it still seems to sporadically encounter HTSeq issues.

Reply written 24 months ago by s1469060