11.8 years ago
Ya
First, the definition of the Sequence Alignment/Map (SAM) format. It is a TAB-delimited text format. Apart from the header lines, which start with the '@' symbol, each alignment line consists of:
Column  Field  Description
1       QNAME  Query template/pair NAME
2       FLAG   bitwise FLAG
3       RNAME  Reference sequence NAME
4       POS    1-based leftmost POSition/coordinate of clipped sequence
5       MAPQ   MAPping Quality (Phred-scaled)
6       CIGAR  extended CIGAR string
7       MRNM   Mate Reference sequence NaMe ('=' if same as RNAME)
8       MPOS   1-based Mate POSition
9       TLEN   inferred Template LENgth (insert size)
10      SEQ    query SEQuence on the same strand as the reference
11      QUAL   query QUALity (ASCII-33 gives the Phred base quality)
12+     OPT    variable OPTional fields in the format TAG:VTYPE:VALUE
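To make the column layout above concrete, here is a minimal Python sketch that splits one alignment line into the eleven mandatory fields plus the optional TAG:VTYPE:VALUE fields, and converts QUAL to Phred scores by subtracting 33 from each character's ASCII code. The names SamRecord, parse_sam_line and phred_scores are made up for this illustration, and the example read is invented; for real work a dedicated library is probably the better choice.

```python
from typing import Dict, List, NamedTuple, Tuple


class SamRecord(NamedTuple):
    """Hypothetical container for one SAM alignment line."""
    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str
    mrnm: str
    mpos: int
    tlen: int
    seq: str
    qual: str
    opt: Dict[str, Tuple[str, str]]  # TAG -> (VTYPE, VALUE), values kept as text


def parse_sam_line(line: str) -> SamRecord:
    """Split a TAB-delimited alignment line (not a '@' header line)."""
    if line.startswith("@"):
        raise ValueError("header line, not an alignment line")
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError(f"expected at least 11 fields, got {len(fields)}")
    opt = {}
    for item in fields[11:]:
        # Split on the first two colons only: the VALUE may itself contain ':'.
        tag, vtype, value = item.split(":", 2)
        opt[tag] = (vtype, value)
    return SamRecord(
        qname=fields[0], flag=int(fields[1]), rname=fields[2],
        pos=int(fields[3]), mapq=int(fields[4]), cigar=fields[5],
        mrnm=fields[6], mpos=int(fields[7]), tlen=int(fields[8]),
        seq=fields[9], qual=fields[10], opt=opt,
    )


def phred_scores(qual: str) -> List[int]:
    """Decode QUAL: ASCII code minus 33 gives the Phred base quality."""
    if qual == "*":  # '*' means qualities were not stored
        return []
    return [ord(ch) - 33 for ch in qual]


# Invented example line, for illustration only.
example = "read1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tII5!\tNM:i:0"
record = parse_sam_line(example)
print(record.rname, record.pos, record.opt)
print(phred_scores(record.qual))
```

Optional values are left as strings here; converting them according to VTYPE (for example 'i' to int) would be a natural next step.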
Let's use this thread to add information on the SAM format that may not always be obvious or well documented.