Dejian, 10.4 years ago
        
    
When I run htseq-count on BAM files generated by STAR, I repeatedly hit the same error message (see the examples below). I extracted the corresponding records from the BAM files and found that they all contain soft clipping. I thought htseq-count handled soft clipping correctly (http://www-huber.embl.de/users/anders/HTSeq/doc/alignments.html#cigar-strings). Has anybody else encountered this problem, and how did you solve it?
EXAMPLE 1:
Error occured when processing SAM input (record #66220 in file ../SRR1974799.sorted.dedup.bam):
  unsigned byte integer is less than minimum
  [Exception type: OverflowError, raised in csamtools.pyx:2308]
samtools view ../SRR1974799.sorted.dedup.bam | sed -n '66220p'
SRR1974799.1020660.1    147     chr1    1549493 255     66M9S   =       1549429 -130    TGAACAGCAGGTACTCAATCATGAAGAGCTAAGCCTGATTTCATCACGACAGCTGTGAAAGTTGCACCCATGTAC     <FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFFFFFFFFFAAAAA    RG:Z:SRR1974799 NH:i:1  HI:i:1  jI:B:i,-1       jM:B:c,-1       nM:i:0  AS:i:139
EXAMPLE 2:
Error occured when processing SAM input (record #174801 in file ../SRR1974808.sorted.dedup.bam):
  unsigned byte integer is less than minimum
  [Exception type: OverflowError, raised in csamtools.pyx:2308]
samtools view ../SRR1974808.sorted.dedup.bam | sed -n '174801p'
SRR1974808.1497057.1    83      chr1    40149760        255     67M8S   =       40148296        -1531   CCGTTCTTGTCGAAGGTGCGGAAAGCGTGCTGCGCGAACTTGGAGGCGTCGCCGTAGGGGAAGAACTTGATGTAG    FFFAAFFFFFFFFF7F.FFFFFF7FFF)FFFFFFF<FFF<7FFFFFFF<FFFFAF<FFFFFFFAFFFFFFAA<AA     PG:Z:MarkDuplicates     RG:Z:SRR1974808 NH:i:1  HI:i:1  jI:B:i,-1       jM:B:c,-1     nM:i:0   AS:i:139
EXAMPLE 3:
Error occured when processing SAM input (record #77098 in file ../SRR1974802.sorted.dedup.bam):
  unsigned byte integer is less than minimum
  [Exception type: OverflowError, raised in csamtools.pyx:2308]
samtools view ../SRR1974802.sorted.dedup.bam | sed -n '77098p'
SRR1974802.1214351.1    99      chr1    16045055        255     13S62M  =       16046228        1221    GAGTACATGGGAAGATCACCTGACGCTCTTCCTGACATTGGTGTCCGGGCTAGAGTTCATTCGTTCCGAGCTGGA    A)AAA)AFA.FF)FFF<7.)FFF.F<FFFF..F..F)FA.)F<7FA<F))F<FFFAFF.FFF<F)FA.<FFF7FF     PG:Z:MarkDuplicates     RG:Z:SRR1974802 NH:i:1  HI:i:1  jI:B:i,-1       jM:B:c,-1     nM:i:2   AS:i:103
EXAMPLE 4:
Error occured when processing SAM input (record #153985 in file ../SRR1974806.sorted.dedup.bam):
  unsigned byte integer is less than minimum
  [Exception type: OverflowError, raised in csamtools.pyx:2308]
samtools view ../SRR1974806.sorted.dedup.bam | sed -n '153985p'
SRR1974806.735761.1     99      chr1    45469184        255     68M7S   =       45469375        380     TGTCAGTGTCGATGGCCACGCAGTTGTAGGCCGCATAGCGGAGCTTCTCCTCGCATACCTTGGCACTGGCATAGT    <<AAAFFFFFFFFFFF<)FFFFF<FAFFFAFAFFFFFFFFFFFFAAFF.FAF<<F7<AFFFFFF.<FFFFA7FFA     PG:Z:MarkDuplicates     RG:Z:SRR1974806 NH:i:1  HI:i:1  jI:B:i,-1       jM:B:c,-1     nM:i:0   AS:i:141        XS:A:-
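For reference, the soft clipping in the four records above can be read straight from their CIGAR strings. The snippet below is only an illustrative sketch: `parse_cigar`, `soft_clipped_bases` and `query_length` are hypothetical helper names, and this is not how htseq-count or pysam parse CIGAR strings internally.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def parse_cigar(cigar):
    """Split a CIGAR string such as '66M9S' into (length, op) pairs."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Reject strings with characters the pattern did not consume.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return ops

def soft_clipped_bases(cigar):
    """Total number of soft-clipped (S) bases in the alignment."""
    return sum(n for n, op in parse_cigar(cigar) if op == "S")

def query_length(cigar):
    """Read length implied by the operations that consume query bases."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MIS=X")

if __name__ == "__main__":
    # CIGAR strings copied from the four example records.
    for cigar in ("66M9S", "67M8S", "13S62M", "68M7S"):
        print(cigar, soft_clipped_bases(cigar), query_length(cigar))
```

All four CIGAR strings are well formed under this check, so the soft clipping by itself does not look malformed.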
This is actually a pysam error that I've seen a few others run into (though with different programs). What version of pysam do you have installed and can you try upgrading it?
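A quick way to see which pysam version is installed in the Python environment that htseq-count runs in is sketched below. `installed_version` is a hypothetical helper built on the standard library's `importlib.metadata`, and the distribution names `pysam` and `HTSeq` are assumed to match how the packages were installed.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

if __name__ == "__main__":
    # Assumed distribution names; adjust if your install uses different ones.
    for name in ("pysam", "HTSeq"):
        print(name, installed_version(name) or "not installed")
```

If pysam was installed with pip, `pip install --upgrade pysam` is the usual way to upgrade it; run the check in the same environment that htseq-count uses.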