Dear Fellows,
I have RNA-seq data from mycobacteria. I removed the rRNA reads and then aligned the remaining reads against the genome of Mycobacterium tuberculosis using SOAP Aligner. I converted the aligned output file to SAM. When I try to use htseq-count to get read counts, I get the following error:
Lucy@Lucy:~/Documents/programs$ htseq-count -m intersection-nonempty -s no -t gene -i ID -o /home/Lucy/Documents/FOR_SOAP/S2_samout /home/Lucy/Documents/FOR_SOAP/S2_merged/mapped_MTB/O2_S2_MTB.sam /home/Lucy/Documents/GFF_FILES/MTB_transcripts.gff3
23962 GFF lines processed.
Error occured when reading first line of sam file.
Error: ("Malformed SAM line: MRNM == '*' although flag bit &0x0008 cleared", 'line 1 of file /home/joas/Documents/FOR_SOAP/S2_merged/mapped_MTB/O2_S2_MTB.sam')
[Exception type: ValueError, raised in _HTSeq.pyx:1321]
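In case it helps with diagnosis, here is a minimal Python sketch of how I understand the check behind this error. It is my reading of the error message and the SAM flag definitions, not HTSeq's actual code. As far as I can tell, the line is rejected when the FLAG says the read is paired (bit 0x0001) and the mate is not marked as unmapped (bit 0x0008 cleared), yet the mate reference column (MRNM/RNEXT, column 7) is '*'. The function name and the example lines are hypothetical and are not taken from my file.

```python
PAIRED = 0x0001         # SAM flag bit: read is paired in sequencing
MATE_UNMAPPED = 0x0008  # SAM flag bit: mate is unmapped


def has_mate_inconsistency(sam_line):
    """Return True if an alignment line is flagged as paired with a mapped
    mate but has '*' in the mate reference column (MRNM/RNEXT, column 7).

    Expects an alignment line, not a header line starting with '@'.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])   # column 2: FLAG
    rnext = fields[6]       # column 7: MRNM / RNEXT
    return bool(flag & PAIRED) and not (flag & MATE_UNMAPPED) and rnext == "*"


# Hypothetical alignment lines for illustration only.
# Flag 65 = 0x1 + 0x40: paired, first in pair, mate NOT marked unmapped.
bad = "read1\t65\tchr\t100\t30\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII"
# Flag 0: unpaired read, so '*' in column 7 is consistent.
ok = "read2\t0\tchr\t100\t30\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII"

print(has_mate_inconsistency(bad))
print(has_mate_inconsistency(ok))
```

If this reading is right, the first alignment line of my SAM file would have a FLAG with bit 0x0001 set and bit 0x0008 cleared, together with '*' in column 7.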
Please help me find a solution. P.S. I am a biologist and have just started working with RNA-seq.
Thanks