Question: RSeQC: Problems calling on a directory
gravatar for fr2259
6.1 years ago by
United States
fr22590 wrote:

I'd like to use the script in RseQC to analyze some bam files. Because I have a lot of bam files, I'd like to be able to do this in a batch. The geneBody_coverage documentation states that the script can be called on an entire directory containing bam files and will generate a single graph showing all gene body coverage curves. I have tested the script on an individual bam file as:

$  -r /Users/fr2259/Py_lib/hg19.bed -i accepted_hits_0.bam  -o output


This ran with no errors and produced the appropriate graph.

When I try to run the script on a directory containing all the bam files (and nothing but bam files), as directed by the documentation, I run into an error. Here's the command:

$  -r /Users/fr2259/Py_lib/hg19.bed -i /Users/fr2259/SeqData/bamFiles/  -o output 

And here's the error:

Traceback (most recent call last):

  File "/Users/fr2259/Py_lib/RSeQC-2.3.9/scripts/", line 86, in <module>


  File "/Users/fr2259/Py_lib/RSeQC-2.3.9/scripts/", line 70, in main

    obj = SAM.ParseBAM(options.input_file)

  File "/Library/Python/2.7/site-packages/RSeQC-2.3.9-py2.7-macosx-10.9-intel.egg/qcmodule/", line 2320, in __init__

    self.samfile = pysam.Samfile(inputFile,'r')

  File "csamtools.pyx", line 597, in pysam.csamtools.Samfile.__cinit__ (pysam/csamtools.c:6532)

  File "csamtools.pyx", line 760, in pysam.csamtools.Samfile._open (pysam/csamtools.c:8303)

ValueError: file header is empty (mode='r') - is it SAM/BAM format?


In case this was related to one file having a problem with its header, I called the script as part of a for loop. No errors came up, but each  resulting gene body coverage graph was overwritten by the subsequent one. This is likely a problem of me being a novice at running commands from the terminal.


Why would the same script work perfectly fine when given an individual file, but throw an error that suggests that there is a problem with the input file formats when called to operate on a batch?


rna-seq • 4.4k views
ADD COMMENTlink modified 3.9 years ago by Biostar ♦♦ 20 • written 6.1 years ago by fr22590
gravatar for Coryza
6.1 years ago by
Coryza390 wrote:

You could try to email the author of the script;


Edit: Have you tried this?


get a clean install of python 2.7
install cython by: easy_install cython
download pysam-0.7 from here
unpack pysam
Patch csamtools.pyx like described here:
run python install in the pysam directory
This will install a version of pysam that can read ion torrent files.

Then install RSeQC 2.3.4. Then go to the python site-packages directory of RSeQC and delete all modules that are from the pysam install of RSeQC so RSeQC uses the psam version we installed into the python installation. (just delete the files that are also present in your python dist-packages directory)"




ADD COMMENTlink modified 6.1 years ago • written 6.1 years ago by Coryza390
gravatar for fr2259
6.1 years ago by
United States
fr22590 wrote:

Thanks for the reply. I will actually shoot the author an email. 

I think that the problem is related to the version of RSeQC that I was using (2.4). There seem to be problems associated with calling the script on multiple files. Maybe a bug related to running it on my OS (OSX 10.9.4)? It works just fine with a single files, so I'll see what I can do with that. =P

ADD COMMENTlink written 6.1 years ago by fr22590
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 852 users visited in the last hour