Question: How to convert extended CIGAR to regular CIGAR
1
3.1 years ago by
artemd10
artemd10 wrote:

I have a SAM file with extended CIGAR strings like this:

17H87=1D12=1D2=3D32=1D4=2D5=2D13=....

I want to convert it to a SAM file with regular CIGAR strings that look like this:

17S83M1D14M1D4M3D31M1D2M2D6M...

(The two strings do not represent the same read; they are only meant to illustrate the two formats.)

Does anyone know of software or a script that converts a SAM/BAM file with extended CIGAR strings into a SAM/BAM file with regular CIGAR strings? It seems like a standard problem that might already have a solution, so I wanted to ask here before solving it the hard way (coding it myself).

sam bam cigar • 1.5k views
modified 3.1 years ago by Pierre Lindenbaum126k • written 3.1 years ago by artemd10

Assuming you have a SAM file named old.sam:

 samtools view -Sh old.sam | awk 'BEGIN {FS = OFS = "\t"} /^@/ {print; next} {gsub(/[=X]/, "M", $6); print}' > new.sam

Disclaimer: I haven't tested this on a real SAM file, as I don't have one available right now.

Argh. No. Wrong. Sorry. Disregard the first version.

Edit: I misread what you wanted at first; the command should now replace = and X in the extended CIGAR with M.
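
One caveat with a plain character substitution: adjacent runs are not merged, so 1M2X5= would become 1M2M5M rather than 8M. Below is a minimal Python sketch (not part of any tool mentioned in this thread) that rewrites = and X as M and then merges neighbouring runs of the same operator. The names to_regular_cigar and convert_sam are made up for illustration, and the sketch assumes tab-delimited SAM text with the CIGAR string in column 6.

```python
import re

# One CIGAR element: a length followed by one operator character.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def to_regular_cigar(cigar):
    """Rewrite = and X as M, then merge adjacent runs of the same operator."""
    if cigar == "*":  # CIGAR not available; leave as-is
        return cigar
    merged = []  # list of [length, operator] pairs
    for length, op in CIGAR_ELEMENT.findall(cigar):
        if op in ("=", "X"):
            op = "M"
        if merged and merged[-1][1] == op:
            merged[-1][0] += int(length)  # extend the previous run
        else:
            merged.append([int(length), op])
    return "".join(f"{n}{op}" for n, op in merged)


def convert_sam(lines, out):
    """Copy SAM text from lines to out, converting column 6 of alignment lines."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Header lines start with '@' and are passed through unchanged.
        if not line.startswith("@") and len(fields) > 5:
            fields[5] = to_regular_cigar(fields[5])
        out.write("\t".join(fields) + "\n")
```

Calling convert_sam(sys.stdin, sys.stdout) (after import sys) would let it act as a filter on SAM text, for example on the output of samtools view -h. All other operators, including H and S, are left untouched.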

modified 3.1 years ago • written 3.1 years ago by cschu1812.1k

Thanks for the reply. It looks very elegant and will probably work well when rapid testing is needed.

written 3.1 years ago by artemd10
7
3.1 years ago by
genomax78k
United States
genomax78k wrote:

reformat.sh from the BBMap suite can do this:

 reformat.sh in=orig.sam out=new.sam sam=1.3

modified 3.1 years ago • written 3.1 years ago by genomax78k

Forgot to say thanks, so thanks. I will be using this. (I have used BBMap for trimming/filtering; strangely, their manual for reformat does not mention CIGAR at all.)

written 3.1 years ago by artemd10
1

Developers probably get to the manual updates last. I think @Brian pretty much single-handedly develops the BBMap suite, so we should cut him a bit of slack :)

written 3.1 years ago by genomax78k

Sure, I'm not complaining or berating. I was just saying that although I have used the tool, I didn't know I could use it for this purpose as well. Great tool overall. Thanks for your help.

written 3.1 years ago by artemd10
4
3.1 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum126k wrote:

I've quickly written one: https://github.com/lindenb/jvarkit/wiki/Biostar234081

 $ cat toy.sam 
@SQ SN:ref  LN:45
@SQ SN:ref2 LN:40
r001    163 ref 7   30  1M2X5=4I4M1D3M  =   37  39  TTAGATAAAGAGGATACTG *   XX:B:S,12561,2,20,112

 $ java -jar dist/biostar234081.jar toy.sam 
@HD VN:1.5  SO:unsorted
@SQ SN:ref  LN:45
@SQ SN:ref2 LN:40
r001    163 ref 7   30  8M4I4M1D3M  =   37  39  TTAGATAAAGAGGATACTG *   XX:B:S,12561,2,20,112
written 3.1 years ago by Pierre Lindenbaum126k

Thanks, I checked the code and it seems like it would handle this problem in a robust fashion. I expected something exactly like this to exist somewhere, and now that you have built it (very fast) it exists here.

Thanks again.

written 3.1 years ago by artemd10