Forum:Review of the CIGAR string format
2.5 years ago
Juke34 ★ 5.3k

--EDIT-- For a corrected review, see the next answer below
I had difficulty finding the meaning of R in a CIGAR string from the annotation pipeline MAKER. Looking around on the internet, I realised how confusing the information about the CIGAR format is. Different tools use different operators.

Here is the most widely shared and best-known one, the one from the SAM format:
[image: table of SAM CIGAR operators]

I'm using Exonerate and here is its CIGAR format:
--EDIT-- THIS is not CIGAR but VULGAR format
[image: table of Exonerate operators]

Still no R... so (with help from the MAKER developer) I finally found an old resource from FlyBase that describes the CIGAR format like this:
[image: table of CIGAR operators from FlyBase]

So, in order to gather all the information in one place, I took the union of the different operators and ended up with this last table, hoping it will help some lost souls like I was:
--EDIT-- THIS table contains VULGAR format operators and the H from the CIGAR format is missing
[image: combined table of operators]

I haven't checked carefully whether some definitions contradict each other (e.g. for F and I), so any comment or correction is very welcome.

alignment Forum • 1.9k views

Thank you. Could you please post the data as text rather than as images, or put it in a text file on GitHub and share the link?


Here is a link where you can access the tables.


Thank you for the post. Could you add a section on how to extract the aligned bases from the CIGAR, as shown in this post or this one? For example, it could highlight the fact that H and S do not affect the start position of the alignment.

Ref :

    GACTGTC----GTATGCTC

Query :

ccccGACTGTGAAAA-----CTC

CIGAR :

4S6M1X4I5D3M

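With a starting position of 1 on the reference, the reference bases covered by aligned query bases (M, = or X) are: (1,7), (13,15). The 4S at the start does not shift the start position, the 4I consumes no reference bases, and the 5D skips reference positions 8 to 12.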
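
As a rough sketch of how such intervals can be derived, here is a small Python function. It is not taken from any of the tools discussed in this thread; the name `aligned_reference_blocks` is made up for illustration, and it assumes SAM-style operators only (M, I, D, N, S, H, P, =, X) with 1-based, inclusive coordinates. It treats M, = and X as "aligned" and lets M, D, N, = and X advance the position on the reference, so S, H and I never move it.

```python
import re

# Operators that consume reference bases in the SAM CIGAR definition.
REF_CONSUMING = set("MDN=X")
# Operators treated here as aligned (base-to-base) columns.
ALIGNED = set("M=X")


def aligned_reference_blocks(cigar, start=1):
    """Return 1-based, inclusive (start, end) reference intervals
    covered by aligned bases (M, = or X) for a SAM-style CIGAR."""
    blocks = []
    pos = start  # next reference position to be consumed
    # Note: characters that are not valid SAM operators are silently skipped.
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op in ALIGNED:
            end = pos + length - 1
            if blocks and blocks[-1][1] == pos - 1:
                # Contiguous on the reference: extend the previous block.
                blocks[-1] = (blocks[-1][0], end)
            else:
                blocks.append((pos, end))
        if op in REF_CONSUMING:
            pos += length
    return blocks


print(aligned_reference_blocks("4S6M1X4I5D3M"))
```

For the CIGAR above this prints `[(1, 7), (13, 15)]`. Adding a leading hard clip (e.g. `2H4S6M1X4I5D3M`) gives the same result, since neither H nor S consumes reference bases.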

2.5 years ago
Juke34 ★ 5.3k

In the end it was much more complex than I thought. I have written a complete review of the format, available here: https://github.com/NBISweden/GAAS/blob/master/annotation/knowledge/cigar.md

A summary of the review is given in this table:
[image: summary table of operators]


= and X appeared in SAM v1.3 in July 2009, in https://sourceforge.net/p/samtools/mailman/message/23194888/ and the surrounding thread.


Thank you for the feedback
