Error in htseq-count: "Malformed SAM line: MRNM == '*' although flag bit &0x0008 cleared"
1
0
Entering edit mode
9.7 years ago
komal.rathi ★ 4.1k

Hi everyone,

I am working on mouse heart RNAseq data (paired-end) and using GSNAP aligner. Here is my commandline:

output of GSNAP: sample.bam

On the output of gsnap I use samtools fixmate to fix mate information

samtools fixmate sample.bam sample_unsortfix.bam

run htseq-count on sample_unsortfix.bam (using -f 0x0002 for fetching only alignments that are mapped in proper pair)

samtools view -f 0x0002 sample_unsortfix.bam | htseq-count -s reverse - merged.gtf > sample.counts

On running this command I get the following error:

Error occured when processing SAM input (line 1630237):
  ("Malformed SAM line: MRNM == '*' although flag bit &0x0008 cleared", 'line 1630237')
  [Exception type: ValueError, raised in _HTSeq.pyx:1323]

So I printed out the line 1630237 where error is caused:

samtools view -f 0x0002 sample_unsortfix.bam | awk "NR==1630237{print;exit}"

Line 1630237:

D5N1JJN1:213:C4HW1ACXX:6:1101:16916:99540    355    chr2    174330382    2    1S53M3777N10M35H    *    0    128    GGCTGCAGAAGGACAAGCAGGTCTACCGGGCCACGCACCGCCTGCTGCTGCTGGGTGCTGGAGA    CCFFFFFHHHHHGIIJJJIJJJJJJJJJJJIJJJJJJJJJIJJHHHHHFFFFFEDECBDDDBDC    RG:Z:heart_adultmale    MD:Z:63    NH:i:3    HI:i:2    NM:i:0    SM:i:2    XQ:i:40    X2:i:40    XO:Z:CM    XS:A:-

To remove these lines, I changed my command line to this:

samtools view -f 0x0002 sample_unsortfix.bam | awk '!/\\t\\*\\t/' - | htseq-count -s reverse - merged.gtf > sample.counts

It worked before but now it is not working.

Does anybody know why am I getting this error and how can this be fixed?

UPDATE: I added an extra \ in the awk command before. I removed it and now it is working. This is the new command line that works and fixes the error:

samtools view -f 0x0002 sample_unsortfix.bam | awk '!/\t\*\t/' - | htseq-count -s reverse - merged.gtf > sample.counts

(I have added this as an answer too)

htseq-count • 4.6k views
ADD COMMENT
1
Entering edit mode

Line 1630237 from samtools view sample_unsortfix.bam won't be the same as that line from samtools view -f 0x2 sample_unsortfix.bam.

ADD REPLY
0
Entering edit mode

Oh sorry! Let me check and update the question. Thanks for pointing that out.

ADD REPLY
1
Entering edit mode
9.7 years ago
komal.rathi ★ 4.1k

UPDATE: I added an extra \ in the awk command before. I removed it and now it is working. This is the new command line that works and fixes the error:

samtools view -f 0x0002 sample_unsortfix.bam | awk '!/\t\*\t/' - | htseq-count -s reverse - merged.gtf > sample.counts
ADD COMMENT

Login before adding your answer.

Traffic: 3310 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6