Question: What's the difference between 'clipping' and 'unmapped' and 'skipped'?
16 months ago, 2012secondseason wrote:

Hi. I am using the Tablet sequence viewer now, but I don't really understand the difference between 'clipping', 'unmapped', and 'skipped' in a CIGAR string. Also, if a CIGAR value is '36S76M', what is going on with bases 37~75? Thank you.

cigar alignment tablet • 374 views
16 months ago, said3427 (Mexico) wrote:

36S76M means the first 36 nucleotides are soft-clipped (nucleotides that are present in the read but not part of the alignment), followed by 76 aligned positions (M covers both matches and mismatches).
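
To make the read coordinates concrete, here is a minimal sketch that splits a CIGAR string into segments along the read. It is only an illustration: `read_segments` is a hypothetical helper name, not part of Tablet or any SAM library, and it assumes the CIGAR is well-formed.

```python
import re

# One CIGAR element is a length followed by an operator from the SAM operator set.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")

# Operators that consume bases stored in SEQ (the read itself).
CONSUMES_READ = set("MIS=X")


def read_segments(cigar):
    """Return (operator, first_base, last_base) tuples in 1-based read coordinates."""
    segments = []
    position = 1
    for length, op in CIGAR_ELEMENT.findall(cigar):
        length = int(length)
        if op in CONSUMES_READ:
            segments.append((op, position, position + length - 1))
            position += length
    return segments


print(read_segments("36S76M"))
# [('S', 1, 36), ('M', 37, 112)]
```

So for 36S76M, read bases 1 to 36 are soft-clipped and bases 37 to 112 are aligned; bases 37~75 fall inside the aligned part.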

16 months ago, d-cameron (Australia) wrote:

I suggest reading the SAM specification document. Pages 1 and 2 have a good example that addresses your question about alignment and CIGAR operators.

16 months ago, Zhixue (China/Shanghai/TJ) wrote:

Unmapped information always shows in FLAG with 0x4, with no information in CIGAR.

Clipping (S/H) and skipped regions (N) are recorded in the CIGAR string.


For example:

  1. If read A is unmapped, its FLAG has bit 0x4 set and its CIGAR is usually '*'.

  2. If read A is mapped, but only part of it is aligned, its FLAG does not have bit 0x4 set and its CIGAR looks like '36S76M' (meaning 36 nt are not aligned to the reference, but these soft-clipped bases are still stored in SEQ).

  • A similar concept is hard clipping (H), where the clipped bases are NOT present in SEQ. H can only be present as the first and/or last operation. S may only have H operations between it and the ends of the CIGAR string, so S sits either at an end of the CIGAR or next to a terminal H (such as 3S89M, 67M43S, 31S56M21S).
  3. If read A is mapped, but its parts align to different positions separated by a long gap on the reference, its CIGAR looks like 56M1200N63M.
  • For mRNA-to-genome alignment, an N operation represents an intron.
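
The clipping placement rules and the N example can be checked with a short script. This is a sketch under stated assumptions, not code from any SAM library: `parse_cigar`, `clipping_is_valid`, and `reference_span` are hypothetical helper names, and the input is assumed to be a syntactically valid CIGAR.

```python
import re

CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")

# Operators that advance the position on the reference.
CONSUMES_REFERENCE = set("MDN=X")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operator) pairs."""
    return [(int(n), op) for n, op in CIGAR_ELEMENT.findall(cigar)]


def clipping_is_valid(cigar):
    """Check where H and S appear relative to the ends of the CIGAR."""
    ops = [op for _, op in parse_cigar(cigar)]
    # H may only be the first and/or last operation.
    if "H" in ops[1:-1]:
        return False
    # After dropping a terminal H on each side, S may only sit at either end.
    core = ops[1:] if ops and ops[0] == "H" else ops
    core = core[:-1] if core and core[-1] == "H" else core
    return "S" not in core[1:-1]


def reference_span(cigar):
    """Number of reference bases covered, including skipped (N) regions."""
    return sum(n for n, op in parse_cigar(cigar) if op in CONSUMES_REFERENCE)


for cigar in ("3S89M", "67M43S", "31S56M21S", "10M5S20M"):
    print(cigar, clipping_is_valid(cigar))

print(reference_span("56M1200N63M"))  # 56 + 1200 + 63 = 1319
```

The 1200N region counts toward the span on the reference even though no read bases are aligned there, which is why a spliced read can cover far more reference than its own length.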

"Unmapped information always shows in FLAG with 0x4, with no information in CIGAR."

If a read is unmapped (0x4), the CIGAR can be any legal CIGAR. The SAM specifications do not require a CIGAR of * for unmapped reads.

If 0x4 is set, no assumptions can be made about RNAME, POS, CIGAR, MAPQ, and bits 0x2, 0x100, and 0x800.
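
In practice this means checking the FLAG before trusting the CIGAR. A minimal sketch of that check follows; `is_unmapped` and `usable_cigar` are hypothetical names chosen for illustration, and the bit values are the ones listed above.

```python
UNMAPPED = 0x4

# Bits that carry no reliable meaning once 0x4 is set (as listed above).
UNRELIABLE_BITS_WHEN_UNMAPPED = 0x2 | 0x100 | 0x800


def is_unmapped(flag):
    """True when the 0x4 bit is set in the SAM FLAG field."""
    return bool(flag & UNMAPPED)


def usable_cigar(flag, cigar):
    """Return the CIGAR only when it describes a real alignment, else None."""
    if is_unmapped(flag) or cigar == "*":
        return None
    return cigar


print(usable_cigar(0, "36S76M"))   # mapped read: CIGAR is meaningful
print(usable_cigar(4, "36S76M"))   # unmapped read: CIGAR is ignored even though it parses
print(usable_cigar(77, "*"))       # unmapped read with no CIGAR
```

The design choice here is to test the flag first and the CIGAR second, so an unmapped record with a leftover CIGAR is never treated as aligned.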
(reply written 16 months ago by d-cameron)