Question: Error in htseq-count: "Malformed SAM line: MRNM == '*' although flag bit &0x0008 cleared"
0
gravatar for komal.rathi
4.7 years ago by
komal.rathi3.4k
Children's Hospital of Philadelphia, Philadelphia, PA
komal.rathi3.4k wrote:

Hi everyone,

I am working on mouse heart RNAseq data (paired-end) and using GSNAP aligner. Here is my commandline:

# output of GSNAP: sample.bam

# On the output of gsnap I use samtools fixmate to fix mate information

samtools fixmate sample.bam sample_unsortfix.bam 

# run htseq-count on sample_unsortfix.bam (using -f 0x0002 for fetching only alignments that are mapped in proper pair)

samtools view -f 0x0002 sample_unsortfix.bam | htseq-count -s reverse - merged.gtf > sample.counts

On running this command I get the following error:

Error occured when processing SAM input (line 1630237):
  ("Malformed SAM line: MRNM == '*' although flag bit &0x0008 cleared", 'line 1630237')
  [Exception type: ValueError, raised in _HTSeq.pyx:1323]

# So I printed out the line 1630237 where error is caused:

samtools view -f 0x0002 sample_unsortfix.bam | awk "NR==1630237{print;exit}"

Line 1630237:

D5N1JJN1:213:C4HW1ACXX:6:1101:16916:99540    355    chr2    174330382    2    1S53M3777N10M35H    *    0    128    GGCTGCAGAAGGACAAGCAGGTCTACCGGGCCACGCACCGCCTGCTGCTGCTGGGTGCTGGAGA    CCFFFFFHHHHHGIIJJJIJJJJJJJJJJJIJJJJJJJJJIJJHHHHHFFFFFEDECBDDDBDC    RG:Z:heart_adultmale    MD:Z:63    NH:i:3    HI:i:2    NM:i:0    SM:i:2    XQ:i:40    X2:i:40    XO:Z:CM    XS:A:-

 

To remove these lines, I changed my commandline to this:

samtools view -f 0x0002 sample_unsortfix.bam | awk '!/\\t\\*\\t/' - | htseq-count -s reverse - merged.gtf > sample.counts

It worked before but now it is not working. 

Does anybody know why am I getting this error and how can this be fixed? 

UPDATE: I added an extra '\' in the awk command before. I removed it and now it is working. This is the new commandline that works and fixes the error:

samtools view -f 0x0002 sample_unsortfix.bam | awk '!/\t\*\t/' - | htseq-count -s reverse - merged.gtf > sample.counts

(I have added this as an answer too)

 

 

 

 

htseq-count • 3.0k views
ADD COMMENTlink modified 4.7 years ago • written 4.7 years ago by komal.rathi3.4k
1

Line 1630237 from samtools view sample_unsortfix.bam won't be the same as that line from samtools view -f 0x2 sample_unsortfix.bam.

ADD REPLYlink written 4.7 years ago by Devon Ryan89k

Oh sorry! Let me check and update the question. Thanks for pointing that out.

ADD REPLYlink written 4.7 years ago by komal.rathi3.4k
1
gravatar for komal.rathi
4.7 years ago by
komal.rathi3.4k
Children's Hospital of Philadelphia, Philadelphia, PA
komal.rathi3.4k wrote:

UPDATE: I added an extra '\' in the awk command before. I removed it and now it is working. This is the new commandline that works and fixes the error:

samtools view -f 0x0002 sample_unsortfix.bam | awk '!/\t\*\t/' - | htseq-count -s reverse - merged.gtf > sample.counts

ADD COMMENTlink written 4.7 years ago by komal.rathi3.4k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1118 users visited in the last hour