Question: In Cigar String, What Is The Difference Between 'N' And 'D' ?
4.9 years ago by
Chen770 wrote:

In a SAM file, the CIGAR string can contain the following operations:

op    Description
M    Alignment match (can be a sequence match or mismatch)
I    Insertion to the reference
D    Deletion from the reference
N    Skipped region from the reference
S    Soft clip on the read (clipped sequence present in <seq>)
H    Hard clip on the read (clipped sequence NOT present in <seq>)
P    Padding (silent deletion from the padded reference sequence)

I cannot tell the difference between 'D' and 'N' when I analyze split-read mappings. Could someone give an example that illustrates the difference?

cigar bam sam • 6.5k views
modified 3.8 years ago by amblina70 • written 4.9 years ago by Chen770

Can you show a CIGAR string of the type you encountered?

written 4.9 years ago by Varun Gupta1.1k
4.9 years ago by
Russian Federation
lomereiter430 wrote:

The use of 'N' is explained in the SAM format documentation as follows:

For mRNA-to-genome alignment, an N operation represents an intron. For other types of alignments, the interpretation of N is not defined.
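
In terms of coordinates, 'D' and 'N' behave the same way: both advance along the reference without using up any bases of the read. The difference is only in interpretation. A minimal sketch of that bookkeeping, assuming the usual SAM convention that M, I, S, = and X consume the query while M, D, N, = and X consume the reference (the function names here are made up for illustration and do not come from any library):

```python
import re

# Operations that advance along the read (query) and along the reference,
# following the convention described in the SAM format documentation.
CONSUMES_QUERY = set("MIS=X")
CONSUMES_REFERENCE = set("MDN=X")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, op) pairs, e.g. '3M1D' -> [(3, 'M'), (1, 'D')]."""
    if not re.fullmatch(r"(\d+[MIDNSHP=X])+", cigar):
        raise ValueError(f"not a valid CIGAR string: {cigar!r}")
    return [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]


def query_length(cigar):
    """Number of read bases the CIGAR accounts for."""
    return sum(n for n, op in parse_cigar(cigar) if op in CONSUMES_QUERY)


def reference_span(cigar):
    """Number of reference bases the alignment stretches over."""
    return sum(n for n, op in parse_cigar(cigar) if op in CONSUMES_REFERENCE)


def skipped_and_deleted(cigar):
    """Return (bases under N, bases under D) so the two can be reported separately."""
    ops = parse_cigar(cigar)
    skipped = sum(n for n, op in ops if op == "N")
    deleted = sum(n for n, op in ops if op == "D")
    return skipped, deleted
```

Swapping an N for a D of the same length changes neither `query_length` nor `reference_span`; only `skipped_and_deleted` tells them apart.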

3.8 years ago by
United Kingdom
amblina70 wrote:
QUERY:    ATC-ATCG-------------ATCAT

The query aligned to the reference would have the CIGAR 3M1D4M13N5M if the N operation was being used. This distinguishes deletions within exons from large skips due to introns, so it only makes sense when you're aligning things like cDNA/expression data. Genomic reads would just have the alignment 3M1D4M13D5M. Does that make things clearer?
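
One practical consequence, sketched below: tools that report aligned blocks (exons) typically split the alignment at each N but keep a D inside the surrounding block. This is only an illustration, assuming 0-based half-open reference coordinates and a hypothetical start position of 0; `aligned_blocks` is a made-up helper, not a function from any SAM library.

```python
import re


def aligned_blocks(start, cigar):
    """Return reference intervals (start, end) covered by the read.

    An N closes the current block and opens a new one after the skipped
    region; a D stays inside the current block. Coordinates are 0-based
    and half-open.
    """
    blocks = []
    block_start = pos = start
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op == "N":
            # Skipped region (intron): finish the block, jump over the gap.
            if pos > block_start:
                blocks.append((block_start, pos))
            pos += length
            block_start = pos
        elif op in "MD=X":
            # Matches and deletions both advance along the reference
            # without breaking the block.
            pos += length
        # I, S, H and P do not move along the reference.
    if pos > block_start:
        blocks.append((block_start, pos))
    return blocks


print(aligned_blocks(0, "3M1D4M13N5M"))  # spliced read: two blocks
print(aligned_blocks(0, "3M1D4M13D5M"))  # genomic read: one block
```

With the N the read is reported as two blocks separated by a 13-base gap; with the D it is a single block that happens to contain deletions.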
