Issue with htseq-count in Ubuntu
13 months ago

I am trying to run the following code:

htseq-count \
-m intersection-nonempty \
-s no \
--samout=${file}.aligned.genecount.sam \
${file}.aligned.sam \
mm10.ncbiRefSeq.gtf \
> ${file}.aligned.sam.genecount


But I receive the following error:

Error occured when processing input (record #108 in file /media/sf_UbuntuSharing/CS1_R1.aligned.sam):
'NoneType' object has no attribute 'encode'
[Exception type: AttributeError, raised in _HTSeq.pyx:1379]


For reference, this is HTSeq version 0.13.5 installed on Ubuntu.


Have you looked at line #108 of that file? Can you get that file to work outside your loop? If you make a version of that sam file that is only 75 lines long, will it run?


Thank you for the clever idea. I think I identified the problem.

The same error was produced outside of the loop so that wasn't the issue.

It took a while to find, but I think it was actually the 109th read that was causing the problem: when I truncated the SAM file at a certain point, htseq-count ran and processed 108 alignments, but including one more alignment produced the error. I am working with a non-standard sequencing experiment in which some reads may be empty after adapter trimming and processing. The 109th read of that SAM file turned out to be empty:

ReadID  4   *   0   0   *   *   0   0   *   *   YT:Z:UU YF:Z:LN
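
For anyone who wants to locate such records without truncating the file by hand, here is a minimal sketch in plain Python. It assumes an uncompressed, tab-separated SAM file and treats a SEQ field of `*` (or an empty one) as an "empty" read. The function name `find_empty_sequence_records` is made up for illustration. The record numbers count alignment lines only (headers are skipped), so they may be off by one relative to what HTSeq reports, as seen above.

```python
import sys


def find_empty_sequence_records(sam_lines):
    """Yield (record_number, read_name) for records whose SEQ field is '*' or empty.

    record_number counts alignment records only, starting at 1;
    header lines (starting with '@') are skipped.
    """
    record_number = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        record_number += 1
        fields = line.split("\t")
        # SEQ is the 10th mandatory SAM field (index 9)
        seq = fields[9] if len(fields) > 9 else ""
        if seq in ("", "*"):
            yield record_number, fields[0]


# Usage (hypothetical script name): python find_empty.py CS1_R1.aligned.sam
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for number, name in find_empty_sequence_records(handle):
            print(f"record #{number}: {name} has no sequence")
```

If this prints nothing, the file has no records without a sequence and the cause is probably something else.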


I was using HISAT2 for the alignment and hadn't considered that it would try to process these reads; I assumed it would simply discard them. It looks like I will have to remove empty reads before running HISAT2.

Thanks again for making me look at this more closely!
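
One possible way to drop the empty reads before alignment is sketched below. This is only an illustration under some assumptions: the input is an uncompressed, single-end FASTQ file with strict 4-line records, and the function name `drop_empty_reads` is invented here. For paired-end data the mates would have to be filtered together, which this sketch does not do.

```python
import sys


def drop_empty_reads(fastq_lines):
    """Yield 4-line FASTQ records (as tuples) whose sequence line is non-empty.

    Assumes strict 4-line records: header, sequence, '+', qualities.
    """
    record = []
    for line in fastq_lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            # record[1] is the sequence line; skip the record if it is blank
            if record[1].strip():
                yield tuple(record)
            record = []


# Usage (hypothetical script name): python drop_empty.py in.fastq out.fastq
if __name__ == "__main__" and len(sys.argv) > 2:
    with open(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:
        for rec in drop_empty_reads(src):
            dst.write("\n".join(rec) + "\n")
```

Many trimming tools can also discard short or empty reads themselves (for example, cutadapt has a `--minimum-length` option), which may be simpler than a separate filtering step.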


For anyone looking for how to remove blank reads from a FASTA file, see: Removing All Empty Fasta Sequences From A File (Was: Editing The Headers Of The Fasta Format Sequence)


Well, now you know. When an error message tells you exactly where to look for the problem...go look there!


This seems like a valid unmapped SAM record, and HTSeq should not complain about it; it certainly looks like a bug. Maybe you could open an issue at the HTSeq development repository ( https://github.com/htseq/htseq ).


How did you install HTSeq? Can you check the version of pysam? The underlying issue may be the pysam parsing of the SAM file.
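
A quick way to check both versions is sketched below, using only the standard library. It assumes the packages are installed under the distribution names `HTSeq` and `pysam`, and that the script is run with the same Python interpreter that `htseq-count` uses. The helper name `installed_version` is made up for this example.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a distribution, or None if not found."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Distribution names are assumptions; adjust if your install uses different ones
for name in ("HTSeq", "pysam"):
    print(name, installed_version(name) or "not installed in this environment")
```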