Forum: Review of the CIGAR string format
Posted 4 months ago by Juke-34 (Sweden):

--EDIT-- For a corrected review see next answer below
I had difficulty finding the meaning of R in a CIGAR string produced by the annotation pipeline MAKER. Looking around on the internet, I realised how confusing the information about the CIGAR format is. Different tools use different operators.

Here is the most widely used and best-known one, the one from the SAM format:
[Image not preserved: table of SAM CIGAR operators]
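
Since the table image is not available, here is a minimal Python sketch of the SAM operator set, assuming the nine operators defined in the SAM specification (M, I, D, N, S, H, P, = and X). The function names `parse_cigar` and `consumed_lengths` are illustrative and do not come from any library; tool-specific operators such as R or F are deliberately rejected.

```python
import re

# Per the SAM specification: operators that consume query bases
# and operators that consume reference bases. H and P consume neither.
CONSUMES_QUERY = set("MIS=X")
CONSUMES_REF = set("MDN=X")

_CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a SAM CIGAR string into a list of (length, operator) tuples."""
    ops = _CIGAR_RE.findall(cigar)
    # If re-joining the matched pieces does not give back the input,
    # the string contained something outside the SAM operator set.
    if "".join(n + op for n, op in ops) != cigar:
        raise ValueError(f"not a valid SAM CIGAR string: {cigar!r}")
    return [(int(n), op) for n, op in ops]


def consumed_lengths(cigar):
    """Return (query_length, reference_length) consumed by the CIGAR."""
    ops = parse_cigar(cigar)
    query_len = sum(n for n, op in ops if op in CONSUMES_QUERY)
    ref_len = sum(n for n, op in ops if op in CONSUMES_REF)
    return query_len, ref_len


print(parse_cigar("3M1I2D"))
print(consumed_lengths("4S6M1X4I5D3M"))
```

A string using operators from other dialects, for example `parse_cigar("3R")`, raises `ValueError` in this sketch rather than being silently accepted.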

I'm using Exonerate and here is its CIGAR format:
--EDIT-- THIS is not CIGAR but VULGAR format
[Image not preserved: table of Exonerate alignment operators]

Still no R... so (with help from the MAKER developer) I finally found an old FlyBase resource that describes the CIGAR format like this:
[Image not preserved: table of CIGAR operators from FlyBase]

So, to gather all the information in one place, I took the union of the different operators and ended up with this last table, hoping it will help some lost souls like I was:
--EDIT-- THIS table contains VULGAR format operators and the H from the CIGAR format is missing
[Image not preserved: combined table of operators]

I haven't checked carefully whether some definitions contradict each other (e.g. for F and I), so any comment or correction is very welcome.


Thank you. Could you please post the data as text and not as images, or put it in a text file on GitHub and share the link?

Reply written 4 months ago by zx8754

Here is a link where you can access the tables.

Reply written 4 months ago by Juke-34

Thank you for the post. Could you add a section on how to determine which bases are aligned according to the CIGAR, as shown in this post or this one? For example, to highlight the fact that H and S do not affect the start position of the alignment.

    Ref   :     GACTGTC----GTATGCTC
    Query : ccccGACTGTGAAAA-----CTC
    CIGAR : 4S6M1X4I5D3M

With a starting position of 1 on the reference, the aligned bases cover these reference intervals: (1,7), (13,15). The 5-base deletion accounts for positions 8 to 12.

Reply written 4 months ago by Bastien Hervé
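
To illustrate the reply above, here is a minimal Python sketch that walks a SAM-style CIGAR string and reports the reference intervals covered by aligned bases. It assumes SAM semantics (M, = and X consume both sequences; D and N consume only the reference; I, S, H and P do not consume the reference), 1-based inclusive coordinates, and that adjacent aligned operations such as 6M1X should be merged into one interval. The function name `aligned_ref_blocks` is hypothetical and does not come from any library.

```python
import re


def aligned_ref_blocks(cigar, start=1):
    """Return 1-based inclusive (start, end) reference intervals covered
    by aligned bases (M, = and X). Aligned operations that are contiguous
    on the reference are merged into a single interval."""
    blocks = []
    pos = start  # next reference position to be consumed
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op in "M=X":
            if blocks and blocks[-1][1] == pos - 1:
                # Contiguous with the previous block: extend it.
                blocks[-1] = (blocks[-1][0], pos + length - 1)
            else:
                blocks.append((pos, pos + length - 1))
            pos += length
        elif op in "DN":
            # Deletion or skipped region: advance on the reference only.
            pos += length
        # I, S, H and P do not move the reference position, which is why
        # leading clips leave the alignment start unchanged.
    return blocks


print(aligned_ref_blocks("4S6M1X4I5D3M", start=1))  # [(1, 7), (13, 15)]
```

Because clipping operators never advance `pos`, `4S6M` and `6M` give the same first interval in this sketch, which is the point made about H and S.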
Answer, 3 months ago by Juke-34 (Sweden):

In the end it was much more complex than I thought. I have done a complete review of the format, accessible here: https://github.com/NBISweden/GAAS/blob/master/annotation/CheatSheet/cigar.md

A summary of the review is given in this table: [Image not preserved: summary table of CIGAR operators]


= and X appeared in SAM v1.3 in July 2009, in https://sourceforge.net/p/samtools/mailman/message/23194888/ and the surrounding thread.

Reply written 3 months ago by John Marshall

Thank you for the feedback

Reply written 3 months ago by Juke-34