Question: Cuffquant Error:- BAM error: CIGAR op has zero length
0
gravatar for komal.rathi
4.7 years ago by
komal.rathi3.4k
Children's Hospital of Philadelphia, Philadelphia, PA
komal.rathi3.4k wrote:

Hi everyone, 

I am trying to run cuffquant using the mm10 genome. Here is my command:

cuffquant -o ./cuffquant_out/heart_adultmale -p 4 --frag-bias-correct mm10.fa --multi-read-correct --library-type fr-firststrand Mus_musculus.GRCm38.75.protein_linc.gtf ./gsnap_alignment/heart_adultmale_filter_sort.bam

[09:40:24] Loading reference annotation and sequence.
[09:40:53] Inspecting maps and determining fragment length distributions.
BAM error: CIGAR op has zero length (D5N1JJN1:213:C4HW1ACXX:6:1313:14379:56769)
BAM error: CIGAR op has zero length (D5N1JJN1:213:C4HW1ACXX:6:1311:11078:8414)
BAM error: CIGAR op has zero length (D5N1JJN1:213:C4HW1ACXX:6:1306:6005:41627)
BAM error: CIGAR op has zero length (D5N1JJN1:213:C4HW1ACXX:6:1102:12593:36794)
BAM error: CIGAR op has zero length (D5N1JJN1:213:C4HW1ACXX:6:1105:20343:33163)
BAM error: CIGAR op has zero length (D5N1JJN1:213:C4HW1ACXX:6:1105:3726:86180)
BAM error: CIGAR op has zero length (D5N1JJN1:213:C4HW1ACXX:6:1108:10176:6681)
BAM error: CIGAR op has zero length (D5N1JJN1:213:C4HW1ACXX:6:1108:2918:39583)

My bam file (heart_adultmale_filter_sort.bam) is filtered using bamtools filter to check if "cigar" : "*M*" or "isProperPair" : "true",  just like it is given here: bamtools filter . The file is also sorted using samtools sort command. I don't know what is going on here. I used the same command for other data and it worked fine at that time. 

Update: Thank you for the comments. Surprisingly, cuffquant is still running despite throwing out these errors, so I am letting it run. In the meanwhile I am looking at the alignments and trying to figure out what is wrong.

I tried to look at the alignments of the first read that throws out error:

samtools view heart_adultmale_filter_sort.bam | grep "D5N1JJN1:213:C4HW1ACXX:6:1313:14379:56769"

D5N1JJN1:213:C4HW1ACXX:6:1313:14379:56769    163    chr2    77314799    3    3S68M28H    =    77314867    2095    GGTCATGAGTTTGTTTTTCAGCATCCGTGGGCTGCTCACAATCACCAGTAACTAGTTCCAGGGAATCGGCC    CCFFDFFFHFHHIHIIJIIJJJJJJJIGIJJJJJJJJJJJIJJIIJJJHBFHJIJIIJIJJIIIJJHFHFB    RG:Z:heart_adultmale    MD:Z:68    NH:i:2    HI:i:1    NM:i:0    SM:i:3XQ:i:40    X2:i:40    XO:Z:CM    XS:A:+    CT:Z:2F3S68M28H0T1R14H12M1942N73M
D5N1JJN1:213:C4HW1ACXX:6:1313:14379:56769    419    chr2    77314799    3    3S80M0N2M14S    =    77316821    2098    GGTCATGAGTTTGTTTTTCAGCATCCGTGGGCTGCTCACAATCACCAGTAACTAGTTCCAGGGAATCGGCCCCCTCTTCTTCTCTCACTAGTAGAGAAA    CCFFDFFFHFHHIHIIJIIJJJJJJJIGIJJJJJJJJJJJIJJIIJJJHBFHJIJIIJIJJIIIJJHFHFBCBBDDDDDDDDDDDDDDDCCC:CDDDCD    RG:Z:heart_adultmale    MD:Z:82    NH:i:2    HI:i:2    NM:i:0    SM:i:3    XQ:i:40    X2:i:40    XO:Z:CM    XS:A:+
D5N1JJN1:213:C4HW1ACXX:6:1313:14379:56769    83    chr2    77314867    3    14H12M1942N73M    =    77314799    -2095    CCCTCTTCTTCTCTCACTAGTAGAGAAATCTTTTGAAGGACAGAAGAACTTCTGCTCGGAGGACCCGTTGAACTCTGAGTGTGGC    DDBCCCA@>=>FFFFDFEHEEHHHJIIIHGGJIHJJHIIIJIIIIIIGHHFIIHBFIJIGIHGJJJJJIIHGJHHHHHFFFFF@C    RG:Z:heart_adultmale    MD:Z:85    NH:i:2HI:i:1    NM:i:0    SM:i:3    XQ:i:40    X2:i:40    XO:Z:CM    XS:A:+
D5N1JJN1:213:C4HW1ACXX:6:1313:14379:56769    339    chr2    77316821    3    26S73M    =    77314799    -2098    CCAGGGAATCGGCCCCCTCTTCTTCTCTCACTAGTAGAGAAATCTTTTGAAGGACAGAAGAACTTCTGCTCGGAGGACCCGTTGAACTCTGAGTGTGGC    BCCC>CBB@B@DBDDDBCCCA@>=>FFFFDFEHEEHHHJIIIHGGJIHJJHIIIJIIIIIIGHHFIIHBFIJIGIHGJJJJJIIHGJHHHHHFFFFF@C    RG:Z:heart_adultmale    MD:Z:73    NH:i:2    HI:i:2    NM:i:0    SM:i:3    XQ:i:40    X2:i:40    XO:Z:CM    XS:A:+    XA:Z:73,0

Another read which throws the error:

samtools view heart_adultmale_filter_sort.bam | grep "D5N1JJN1:213:C4HW1ACXX:6:1311:11078:8414"

D5N1JJN1:213:C4HW1ACXX:6:1311:11078:8414    163    chr2    77314841    40    38M0N25M36H    =    77316857    2079    CAGTAACTAGTTCCAGGGAATCGGCCCCCTCTTCTTCTCTCACTAGTAGAGAAATCTTTTGAA    =8B=BBAH?F?FFDHEH>DHG@CAEGDF@DHGB?DHB?<??<?8?99BBB8FHAHCEHAFCAD    RG:Z:heart_adultmale    MD:Z:40G1AGG1G1A1C1C1GTGG1AT1C    NH:i:1    HI:i:1    NM:i:15SM:i:40    XQ:i:40    X2:i:0    XO:Z:CU    XS:A:+    CT:Z:2F38M0N25M36H1953T1R36H63M
D5N1JJN1:213:C4HW1ACXX:6:1311:11078:8414    83    chr2    77316857    40    36H63M    =    77314841    -2079    GGACAGAAGAACTTCTGCTCGGAGGACCCGTTGAACTCTGAGTGTGGCATGCCACTCTCTTGT    @EEA;EAC@77;@3AGG@@?3?98@00@CFD99C?CGACFG@EBGGEBIHE;AABHDB;AD@@    RG:Z:heart_adultmale    MD:Z:63    NH:i:1    HI:i:1    NM:i:0    SM:i:40    XQ:i:40    X2:i:0    XO:Z:CUXS:A:+

Some reads that don't throw the error (with MAPQ=0):

samtools view heart_adultmale_filter_sort.bam | head

D5N1JJN1:213:C4HW1ACXX:6:1105:8310:40979    419    chr1    3018434    0    56M43H    =    3018490    115    CTTTTTTTATTCCTTCCTTGACCAAGGTATCATTGAGAAGAGTGTTGTTCAGTTTC    B@FFFFFHHHHHJJIJJJJJIJIJJJJCFGHJJJJJJJJJIIDFHIJIGHGIJJJJ    RG:Z:heart_adultmale    MD:Z:4C51    NH:i:28    HI:i:15    NM:i:1    SM:i:0    XQ:i:40    X2:i:40    XO:Z:CM
D5N1JJN1:213:C4HW1ACXX:6:1210:9786:30088    419    chr1    3018456    0    56M43H    =    3018512    113    CAAGGTATCATTGAGAAGAGTGTTGTTCAGTTTCCACGTGAATGTTGGCTTTCTAT    @;ADDDDHDDHDFDGFGABB<AHHIGEHIHHHGHIIICHGGGGI?BE@DBBFCHDB    RG:Z:heart_adultmale    MD:Z:56    NH:i:31    HI:i:16    NM:i:0    SM:i:0    XQ:i:40    X2:i:40    XO:Z:CM    PG:Z:A
D5N1JJN1:213:C4HW1ACXX:6:1105:8310:40979    339    chr1    3018490    0    40H59M    =    3018434    -115    CACGTGAATGTTGGCTTTCTATTATTTATGTTGTTATTGAGGATCAGCCTTAGTCCATG    HDJJJJJJJJJJJJGHGEJJJJJJJJJJJJIJJJJJJIJJJJIIJIJGHGHHFFFFFCC    RG:Z:heart_adultmale    MD:Z:40A18    NH:i:28    HI:i:15    NM:i:1    SM:i:0    XQ:i:40    X2:i:40    XO:Z:CM
D5N1JJN1:213:C4HW1ACXX:6:1210:9786:30088    339    chr1    3018512    0    42H57M    =    3018456    -113    TATTTATGTTGTTATTGAAGATCAGCCTTAGTCCATGGTGATCTGATAGGATGCATG    GGGGHGJIHGDF@EHGDFDGGBHCGHBHEIHCGHEHGHEGEBFCDFDDFDDDDDF@C    RG:Z:heart_adultmale    MD:Z:57    NH:i:31    HI:i:16    NM:i:0    SM:i:0    XQ:i:40    X2:i:40    XO:Z:CM    PG:Z:A
D5N1JJN1:213:C4HW1ACXX:6:1208:10624:67054    163    chr1    3019723    0    66M33H    =    3019789    133    GGGCGTCTCTTTCTTTAGATCTGGGAAGTTTTCTTCAATAATTTTGTTGAAGATGTTTGCTGGTCC    CCFFFFFHHHHHJJJJJJIJJJJJJJJJIIJJJJJJIJJJJIJJJJJJJHIJJJJJJJJJJJJJJJ    RG:Z:heart_adultmale    MD:Z:4A15T45    NH:i:71    HI:i:1    NM:i:2    SM:i:0    XQ:i:40    X2:i:40    XO:Z:CMPG:Z:A    CT:Z:2F66M33H0T1R32H67M
D5N1JJN1:213:C4HW1ACXX:6:1208:10624:67054    83    chr1    3019789    0    32H67M    =    3019723    -133    TTTGAGTTGAAAATCTTCATTCTCATCCACTCCTATTATCTGTAGGTTTGGTCTTCTCATTGTGTCC    IIIJJJJIJJJIJIJJJIIIIIHIJJIGDJJJJJJJJIIIJJJJJJJJJJJJJJJHHHHHFFFFFCC    RG:Z:heart_adultmale    MD:Z:67    NH:i:71    HI:i:1    NM:i:0    SM:i:0    XQ:i:40    X2:i:40    XO:Z:CM    PG:Z:A

 

I don't think the MAPQ should be causing any problems. 

cuffquant gsnap bam mm10 ensembl75 • 2.3k views
ADD COMMENTlink modified 3.7 years ago by Biostar ♦♦ 20 • written 4.7 years ago by komal.rathi3.4k
1

this sounds like a problem in the BAM file, find the read that is reported and post the CIGAR string of it

ADD REPLYlink written 4.7 years ago by Istvan Albert ♦♦ 80k

how does your bam file looks like (samtools view)

ADD REPLYlink written 4.7 years ago by Manvendra Singh2.1k

Most probably the problem has to do sth with the way GSNAP stores spliced alignments. Please paste all the records (alignments) for some of these reads. 

ADD REPLYlink written 4.7 years ago by Ashutosh Pandey11k

Check if these are actually unmapped, or poorly mapped reads. If that's the case, just cut them out of the BAM altogether for the purposes of this analysis step.

ADD REPLYlink written 4.7 years ago by Dan D6.8k
1
gravatar for Ashutosh Pandey
4.7 years ago by
Philadelphia
Ashutosh Pandey11k wrote:

My guess: I think the problem has to do with hard-clipped reads. If a read is hard-clipped the corresponding sequence should not appear in the "SEQ" column of the SAM file but Cuffquant doesn't know that. So it is throwing the error. Additionally, Cuffquant only inspects primary alignments with MAPQ >0  for the CIGAR inconsistency (my assumption), the error will only come from such reads or alignments.

Cuffquant simply ignores the secondary alignments and alignments with MAPQ of 0 and won't check them for the CIGAR inconsistency. This is the reason that reads with hard-clipping but MAPQ of 0 don't throw any error.

ADD COMMENTlink modified 4.7 years ago • written 4.7 years ago by Ashutosh Pandey11k

But there are reads that are hard-clipped and do not cause any error like:

D5N1JJN1:213:C4HW1ACXX:6:1105:8310:40979 and D5N1JJN1:213:C4HW1ACXX:6:1210:9786:30088

 

ADD REPLYlink modified 4.7 years ago • written 4.7 years ago by komal.rathi3.4k

But they have MAPQ of zero plus these are secondary alignments. Seems you just skimmed through my answer :-)

ADD REPLYlink written 4.7 years ago by Ashutosh Pandey11k

Oh sorry! I missed the second part of your answer. I just got really curious reading the first part and started looking for a pattern in the CIGAR strings. Thanks for pointing that out.

ADD REPLYlink written 4.7 years ago by komal.rathi3.4k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1487 users visited in the last hour