Error using HTSEq_count
0
0
Entering edit mode
15 months ago
felipead66 ▴ 120

Hello,

I am trying to use HTSeq_count but when I run the command I get the following error:

> Traceback (most recent call last):
  File "/usr/local/bin/htseq-count", line 5, in <module>
    HTSeq.scripts.count.main()
  File "/usr/local/lib/python3.7/site-packages/HTSeq/scripts/count.py", line 473, in main
    args = _parse_sanitize_cmdline_arguments()
  File "/usr/local/lib/python3.7/site-packages/HTSeq/scripts/count.py", line 465, in _parse_sanitize_cmdline_arguments
    _check_sam_files(args.samfilenames)
  File "/usr/local/lib/python3.7/site-packages/HTSeq/scripts/count.py", line 158, in _check_sam_files
    with pysam.AlignmentFile(sam_filename, "r") as sf:
  File "pysam/libcalignmentfile.pyx", line 751, in pysam.libcalignmentfile.AlignmentFile.__cinit__
  File "pysam/libcalignmentfile.pyx", line 1000, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file has no sequences defined (mode='r') - is it SAM/BAM format? Consider opening with check_sq=False

The command I used is:

htseq-count -s no -a 10 -i gene_id sample1.sam genes.gtf > sample1.txt

I have to say that the command worked perfectly with previous runs until I started messing with Python issues I needed for another task. I have both Python 2.6 and Python 3.7 installed and when I type

python --version

I get 3.7.6

Any ideas?

HTSeq python • 2.0k views
ADD COMMENT
0
Entering edit mode

can you (double) check that both the SAM and gtf file are as expected and valid? (eg, same contig IDs and such)

ADD REPLY
0
Entering edit mode

I am pretty sure they are OK because the exact same SAM and gtf files were working perfectly before I started messing with Python and nothing changed since. Moreover, I run the command with previously created SAM files (which were used without problems) and I get the same error

ADD REPLY
0
Entering edit mode

Can you check which python and then using that full path (not just name) checkand show the --version.

featureCounts may be easier to use and will generate a matrix of counts from a set of BAM files.

ADD REPLY
0
Entering edit mode

When I type which python I get usr/local/bin/python

How should I modify the command

htseq-count -s no -a 10 -i gene_id sample1.sam genes.gtf > sample1.txt

based on this?

I would like to avoid using featureCounts as I would like to compare with existing results generated with htseq_counts

ADD REPLY
0
Entering edit mode

Can you show us what you get with /usr/local/bin/python --version? Just want to make sure that is actually the v.3.x python.

ADD REPLY
0
Entering edit mode

I get 3.7.6

ADD REPLY
0
Entering edit mode

So in that case the problem must be with your SAM file. Have you confirmed that the file is in right format? You could also try adding check_sq=False as suggested in message above.

ADD REPLY
0
Entering edit mode

since the python will be called from inside the htseq-count shell script it will invoke python via #!/usr/bin/env so to check the python version run:

/usr/bin/env python --version

Now if you have several pythons the scripts can get flaky, as different versions may come sooner in the path and some scripts may even override the import paths. In that case a common workaround to ensure you run the "right python with the right module" is to run the package directly (well-designed Python packages will support this behavior), in this case the command would be:

python -m HTSeq.scripts.count

In the above case, we are asking the Python we run to figure out the import path to the package, bypassing whatever may be encoded in the shell launching script. This way you could launch different Pythons and each will look in the right location.

But it is a also a good idea to do a samtools flagstat on the BAM file to make sure it is correct.

Finally, I will also mention that unless you used some sort of virtual environment, either venv or conda, installing multiple Pythons can lead to endless recurring problems. So if you haven't done the above make sure to do it like that, otherwise, problems will only compound ....

ADD REPLY
0
Entering edit mode
  1. When I run /usr/bin/env python --version I get Python 3.7.6

  2. When I run python -m HTSeq.scripts.count -s no -a 10 -i gene_id sample1.sam genes.gtf > sample1.txt I get

Traceback (most recent call last): File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 183, in _run_module_as_main mod_name, mod_spec, code = _get_module_details(mod_name, _Error) File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 109, in _get_module_details __import__(pkg_name) File "/Users/Desktopi/HTSeq-0.6.1/HTSeq/__init__.py", line 161 raise ValueError, "The attribute string seems to contain mismatched quotes." ^ SyntaxError: invalid syntax

ADD REPLY
0
Entering edit mode

It looks like you have installed htseq into a desktop directory, why is that? Next time use proper installation approaches. This line here

/Users/Desktopi/HTSeq-0.6.1/HTSeq/__init__.py

indicates that you have a really old version 0.6 vs the current 0.11

Version 0.6 appears to be a version that only runs with Python 2 ... and you are running it via python 3 so clearly, the problems will abound

as I said, learn about environment management with conda otherwise, problems like this will keep on coming

ADD REPLY
0
Entering edit mode

Is there a way to completely uninstall HTSeq from my Mac (and then install it back with conda) ?

ADD REPLY
0
Entering edit mode

The first 15 lines of the sam file are the following:

VH00375:1:AAAMYHTM5:1:1109:15978:29303  0   chr1    4315    50  89M *   0   0   TCTTAAGAACACAGTGGCACAGGCTGGGTGGAGCCGTCCCCCCATGGAGCACAGGCAGACAGAAGTCCCCGCCCCAGCTGTGTGGCCTCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC  XA:i:1  MD:Z:18G70  NM:i:1  XS:A:-  NH:i:1
ADD REPLY
0
Entering edit mode

sorry while editing for format, I accidentally cut out the data, just make sure to format it correctly and add it back in

ADD REPLY
0
Entering edit mode

felipead66 : I assume this output is from a simple more/head/cat etc? If so it looks like your SAM file is missing a header. That may be the problem.

If you did it using samtools view (which would not be needed for text format SAM file) let us know.

ADD REPLY

Login before adding your answer.

Traffic: 1946 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6