Entering edit mode
9 weeks ago
felipead66 ▴ 110
I am trying to use HTSeq_count but when I run the command I get the following error:
> Traceback (most recent call last): File "/usr/local/bin/htseq-count", line 5, in <module> HTSeq.scripts.count.main() File "/usr/local/lib/python3.7/site-packages/HTSeq/scripts/count.py", line 473, in main args = _parse_sanitize_cmdline_arguments() File "/usr/local/lib/python3.7/site-packages/HTSeq/scripts/count.py", line 465, in _parse_sanitize_cmdline_arguments _check_sam_files(args.samfilenames) File "/usr/local/lib/python3.7/site-packages/HTSeq/scripts/count.py", line 158, in _check_sam_files with pysam.AlignmentFile(sam_filename, "r") as sf: File "pysam/libcalignmentfile.pyx", line 751, in pysam.libcalignmentfile.AlignmentFile.__cinit__ File "pysam/libcalignmentfile.pyx", line 1000, in pysam.libcalignmentfile.AlignmentFile._open ValueError: file has no sequences defined (mode='r') - is it SAM/BAM format? Consider opening with check_sq=False
The command I used is:
htseq-count -s no -a 10 -i gene_id sample1.sam genes.gtf > sample1.txt
I have to say that the command worked perfectly with previous runs until I started messing with Python issues I needed for another task. I have both Python 2.6 and Python 3.7 installed and when I type
can you (double) check that both the SAM and gtf file are as expected and valid? (eg, same contig IDs and such)
I am pretty sure they are OK because the exact same SAM and gtf files were working perfectly before I started messing with Python and nothing changed since. Moreover, I run the command with previously created SAM files (which were used without problems) and I get the same error
Can you check
which pythonand then using that full path (not just name) checkand show the
featureCountsmay be easier to use and will generate a matrix of counts from a set of BAM files.
When I type
which pythonI get
How should I modify the command
based on this?
I would like to avoid using featureCounts as I would like to compare with existing results generated with htseq_counts
Can you show us what you get with
/usr/local/bin/python --version? Just want to make sure that is actually the v.3.x python.
So in that case the problem must be with your SAM file. Have you confirmed that the file is in right format? You could also try adding
check_sq=Falseas suggested in message above.
since the python will be called from inside the
htseq-countshell script it will invoke python via
#!/usr/bin/envso to check the python version run:
Now if you have several pythons the scripts can get flaky, as different versions may come sooner in the path and some scripts may even override the import paths. In that case a common workaround to ensure you run the "right python with the right module" is to run the package directly (well-designed Python packages will support this behavior), in this case the command would be:
In the above case, we are asking the Python we run to figure out the import path to the package, bypassing whatever may be encoded in the shell launching script. This way you could launch different Pythons and each will look in the right location.
But it is a also a good idea to do a samtools flagstat on the BAM file to make sure it is correct.
Finally, I will also mention that unless you used some sort of virtual environment, either
conda, installing multiple Pythons can lead to endless recurring problems. So if you haven't done the above make sure to do it like that, otherwise, problems will only compound ....
When I run
/usr/bin/env python --versionI get
When I run
python -m HTSeq.scripts.count -s no -a 10 -i gene_id sample1.sam genes.gtf > sample1.txtI get
It looks like you have installed htseq into a desktop directory, why is that? Next time use proper installation approaches. This line here
indicates that you have a really old version 0.6 vs the current 0.11
Version 0.6 appears to be a version that only runs with Python 2 ... and you are running it via python 3 so clearly, the problems will abound
as I said, learn about environment management with
condaotherwise, problems like this will keep on coming
Is there a way to completely uninstall HTSeq from my Mac (and then install it back with conda) ?
The first 15 lines of the sam file are the following:
sorry while editing for format, I accidentally cut out the data, just make sure to format it correctly and add it back in
felipead66 : I assume this output is from a simple
more/head/catetc? If so it looks like your SAM file is missing a header. That may be the problem.
If you did it using
samtools view(which would not be needed for text format SAM file) let us know.