Question: In a CIGAR String, What Is the Difference Between 'N' and 'D'?
4.9 years ago
Chen770 wrote:

In a SAM file, the CIGAR string has the following operations:

op    Description
M    Alignment match (can be a sequence match or mismatch)
I    Insertion to the reference
D    Deletion from the reference
N    Skipped region from the reference
S    Soft clip on the read (clipped sequence present in <seq>)
H    Hard clip on the read (clipped sequence NOT present in <seq>)
P    Padding (silent deletion from the padded reference sequence)

I cannot tell the difference between 'D' and 'N' when I analyze split-read mappings. Could someone give me an example to illustrate the difference?

cigar bam sam

Can you show a CIGAR string of the type you encountered?

written 4.9 years ago by Varun Gupta1.1k
4.9 years ago
Russian Federation
lomereiter430 wrote:

Usage of 'N' is explained in SAM format documentation as follows:

For mRNA-to-genome alignment, an N operation represents an intron. For other types of alignments, the interpretation of N is not defined.
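
To make the quoted rule concrete, here is a minimal sketch (not taken from any SAM library) that walks a CIGAR string and reports the reference intervals covered by N operations, treating them as candidate introns. The function name `introns_from_cigar`, the 0-based half-open coordinates, and the start position 100 are assumptions chosen for illustration.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

# Operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")


def introns_from_cigar(start, cigar):
    """Return 0-based, half-open reference intervals covered by N operations.

    `start` is the (hypothetical) 0-based leftmost reference position
    of the alignment.
    """
    introns = []
    ref_pos = start
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            # Only N is reported as a skipped region; D is not.
            introns.append((ref_pos, ref_pos + length))
        if op in REF_CONSUMING:
            ref_pos += length
    return introns


print(introns_from_cigar(100, "3M1D4M13N5M"))  # [(108, 121)]
print(introns_from_cigar(100, "3M1D4M13D5M"))  # []
```

Both D and N move the reference position forward; the only difference in this sketch is that N is reported as a skipped region while D is not.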

3.8 years ago
United Kingdom
amblina70 wrote:
QUERY:    ATC-ATCG-------------ATCAT

The query aligned to the reference would have the CIGAR 3M1D4M13N5M if the N operation were being used. This distinguishes deletions within exons from large skips due to introns, so it only makes sense when you are aligning things like cDNA/expression data. Genomic reads would just have the alignment 3M1D4M13D5M. Does that make things clearer?
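
As a rough check of the example above, the following sketch sums how many query and reference bases each CIGAR string consumes. The helper name `cigar_lengths` is made up for illustration; the sets of query-consuming and reference-consuming operations follow the SAM specification.

```python
import re


def cigar_lengths(cigar):
    """Return (query_length, reference_length) consumed by a CIGAR string."""
    query_len = 0
    ref_len = 0
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        n = int(length)
        if op in "MIS=X":   # operations that consume query bases
            query_len += n
        if op in "MDN=X":   # operations that consume reference bases
            ref_len += n
    return query_len, ref_len


print(cigar_lengths("3M1D4M13N5M"))  # (12, 26)
print(cigar_lengths("3M1D4M13D5M"))  # (12, 26)
```

The two strings cover exactly the same coordinates: 12 query bases spread over 26 reference positions. The choice between D and N changes only how the 13-base gap is interpreted (a deletion versus a skipped region such as an intron), not where the read sits on the reference.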

