Question: Why "No real operator (M|I|D|N)" in picard?
4.8 years ago by
WCIP | Glasgow | UK
dariober9.9k wrote:


Using picard CollectAlignmentSummaryMetrics I get the error "No real operator (M|I|D|N) in CIGAR". I guess this happens when an operator other than M, I, D, N is encountered (and in fact I have soft clipped reads). I can override the error by setting VALIDATION_STRINGENCY=SILENT.

If my guess is correct, I would like to know why CollectAlignmentSummaryMetrics/picard is set to throw an error with operators other than M, I, D, N.


If relevant, here is the command and the offending output:

java -jar -Xmx2g ~/applications/picard/picard-tools-1.92/CollectAlignmentSummaryMetrics.jar \
>     INPUT=$bam \
>     OUTPUT=${bam%.bam}.AlnSmryMetr.txt \
Exception in thread "main" net.sf.samtools.SAMFormatException: SAM validation error: ERROR: Read name M00886:11:000000000-A88VV:1:1109:11516:16954, No real operator (M|I|D|N) in CIGAR
    at net.sf.samtools.SAMUtils.processValidationErrors(
    at net.sf.samtools.BAMRecord.getCigar(
    at net.sf.samtools.SAMRecord.getAlignmentEnd(
    at net.sf.samtools.SAMRecord.computeIndexingBin(
    at net.sf.samtools.SAMRecord.isValid(
    at net.sf.samtools.BAMFileReader$BAMFileIterator.advance(
    at net.sf.samtools.BAMFileReader$
    at net.sf.samtools.BAMFileReader$
    at net.sf.samtools.SAMFileReader$
    at net.sf.samtools.SAMFileReader$
    at net.sf.picard.analysis.SinglePassSamProgram.makeItSo(
    at net.sf.picard.analysis.SinglePassSamProgram.doWork(
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(
    at net.sf.picard.cmdline.CommandLineProgram.instanceMainWithExit(
    at net.sf.picard.analysis.CollectAlignmentSummaryMetrics.main(

And this is the problematic read:
samtools view $bam | grep 'M00886:11:000000000-A88VV:1:1109:11516:16954'
M00886:11:000000000-A88VV:1:1109:11516:16954    83    chr10    3012200    33    49M19S    =    3012200    -49    TAAACAAAATTATAACAAACATCAAACTCTAAATTTAAATAAAAGACCTACAAAAAACATACACTAAA    FGGGFGGGGGGGGFGGGGGGFCGGGGGGGGGGGGGGGGGFGFFGGGGGFGGGGFGGGGGGGGECCCCC    NM:i:0    MD:Z:49    AS:i:49    XS:i:46    RG:Z:grm029_pb_DALIHP.140422.DALIHPplas1_S1_L001_R_001_val_    YC:Z:CT    YD:Z:r
M00886:11:000000000-A88VV:1:1109:11516:16954    163    chr10    3012200    33    68S    =    3012200    49    TAAACAAAATTATAACAAACATCAAACTCTAAATTTAAATAAAAGACCTACAAAAAACATACACTAAA    66ACCGGCFGEGGFGFGFGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFFGGGG    AS:i:49    MD:Z:49    NM:i:0    RG:Z:grm029_pb_DALIHP.140422.DALIHPplas1_S1_L001_R_001_val_    XS:i:46    YC:Z:GA    YD:Z:r


picard cigar • 2.7k views
4.8 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum118k wrote:

Second record: a CIGAR of 68S means that the whole read is *ONLY* soft clipped. Soft-clipped bases sit at the 5' or 3' end of the read and are not part of the alignment.

An aligned read with a cigar string `68S` makes no sense.

There should be at least one 'M' or '=' operator.
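
To make the check concrete, here is a minimal sketch in Python of what the validation message describes. It is not Picard's actual code: the set of "real" operators is taken literally from the error text (M, I, D, N), and the function names `parse_cigar`, `has_real_operator` and `reads_without_real_operator` are made up for illustration. It assumes tab-separated SAM text such as the output of `samtools view`.

```python
import re

# One CIGAR element: a length followed by an operator from the SAM spec.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")

# Assumption: "real" operators are exactly those named in the error message.
REAL_OPERATORS = frozenset("MIDN")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operator) tuples."""
    if cigar == "*":
        return []
    elements = CIGAR_ELEMENT.findall(cigar)
    # Reject strings containing anything the pattern did not consume.
    if "".join(n + op for n, op in elements) != cigar:
        raise ValueError("malformed CIGAR: %r" % cigar)
    return [(int(n), op) for n, op in elements]


def has_real_operator(cigar, real_operators=REAL_OPERATORS):
    """True if at least one element uses one of the 'real' operators."""
    return any(op in real_operators for _, op in parse_cigar(cigar))


def reads_without_real_operator(sam_lines):
    """Yield (name, flag, cigar) for mapped records with no real operator."""
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:  # not a complete alignment record
            continue
        name, flag, cigar = fields[0], int(fields[1]), fields[5]
        if flag & 0x4 or cigar == "*":  # unmapped or no CIGAR available
            continue
        if not has_real_operator(cigar):
            yield name, flag, cigar


if __name__ == "__main__":
    # The two CIGAR strings from the question.
    for cigar in ("49M19S", "68S"):
        print(cigar, has_real_operator(cigar))
```

With the two records from the question, `49M19S` passes and `68S` does not. To list every offending read in a file, one could feed `samtools view $bam` into `reads_without_real_operator(sys.stdin)`.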


Thanks! Of course! I was misled by the error message. Completely soft-clipped reads are the result of clipping overlapping pairs: if the overlap is complete, one of the two mates is essentially ignored.

Reply written 4.8 years ago by dariober