Aligning to a circular sequence
3.3 years ago
yaximik • 0

Regarding the 7-year-old discussion about aligning linear sequences to circular ones: is anyone aware of aligners that can handle such alignments properly? For example, the CLC Genomics Workbench aligner does not seem to be aware that a sequence is circular, although it can display rCRS mtDNA as circular. It cannot properly align a linear sequence that should span positions 16022-origin-1280 of the circular rCRS. Neither can Geneious, for another example.

Yes, I guess they are scratching their heads... I just wanted to know whether anyone in the community is aware of other options, such as open-source tools.

Is it NGS data? I have done this a few times on NGS data with bwa; to make it possible, I just merged two copies of the reference together.

No, this is just an alignment of a contig that was previously assembled separately. CLC Bio developers confirmed that their Genomics Workbench aligner cannot handle circularity. As a workaround, they suggested either using their Map to a Reference tool, which can handle circularity, or aligning to a duplicated circular sequence.
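
For anyone who wants to try the "duplicated circular sequence" workaround by hand, here is a minimal Python sketch of the idea. It is only an illustration: the function names are hypothetical, and an exact string search stands in for a real aligner (bwa, CLC, and so on), whose reported coordinates would be converted back in the same way. It also assumes the query is no longer than the circular reference.

```python
def double_reference(seq: str) -> str:
    """Concatenate the circular sequence with itself so that any window
    crossing the origin exists as one contiguous linear stretch."""
    return seq + seq


def to_circular(pos: int, ref_len: int) -> int:
    """Convert a 1-based position on the doubled reference back to a
    1-based position on the original circular sequence."""
    return (pos - 1) % ref_len + 1


def find_on_circular(query: str, circular_ref: str):
    """Exact-match stand-in for a real aligner (hypothetical helper).

    Returns 1-based, inclusive (start, end) on the circle, or None if the
    query is not found. start > end means the hit spans the origin.
    """
    ref_len = len(circular_ref)
    hit = double_reference(circular_ref).find(query)
    if hit == -1:
        return None
    start = hit + 1            # str.find is 0-based
    end = hit + len(query)     # inclusive end on the doubled reference
    return to_circular(start, ref_len), to_circular(end, ref_len)


# Toy circular reference of length 10; the query "AGAC" wraps the origin.
print(find_on_circular("AGAC", "ACGTTGCAAG"))  # (9, 2)
```

With the rCRS (16569 bp), a hit on the doubled reference covering 16022 to 17849 would convert back to 16022 to 1280, which is the origin-spanning placement described in the question.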

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.

BTW, have you looked at this: Aligning Circular Sequences (old thread but has ideas/tools).

Have you considered Circlator?

2.1 years ago

I know this is an old post, but I wanted to clarify that CLC Genomics Workbench does indeed support mapping reads to circular references. The issue is that (currently) our track viewer is only linear, something we plan to improve on in 2020, as a number of customers have requested Circos-like viewers (for a variety of use cases).

Thus - you can indeed map (correctly) to a circular reference in CLC, and reads that align to the origin will be correctly mapped. But when you export them in SAM/BAM format (for example to visualize in a tool that supports circularized coverage maps) you run into issues with the SAM format not supporting circular references.

We actually have a FAQ on this specific topic. If you map in CLC, export to BAM, and reimport into CLC, you'll see a loss of information because the SAM/BAM format does not support these origin-spanning reads correctly; the original CLC data shown in the same view, however, is correct. Link is here: https://secure.clcbio.com/helpspot/index.php?pg=kb.printer.friendly&id=11#p419
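
To illustrate why a linear coordinate system struggles with origin-spanning hits, here is a small Python sketch that splits a circular interval into the linear pieces a linear-only format would have to store. This is not how CLC or any exporter is implemented; the function name and the convention that start > end signals a wrap through the origin are assumptions made for the example.

```python
def split_at_origin(start: int, end: int, ref_len: int):
    """Split a 1-based, inclusive interval on a circular reference into
    linear pieces. start > end is taken to mean the interval wraps
    through the origin (an assumption of this sketch)."""
    if not (1 <= start <= ref_len and 1 <= end <= ref_len):
        raise ValueError("positions must lie within 1..ref_len")
    if start <= end:
        return [(start, end)]              # ordinary linear interval
    return [(start, ref_len), (1, end)]    # piece before and after the origin


RCRS_LENGTH = 16569  # length of the rCRS mitochondrial reference
print(split_at_origin(16022, 1280, RCRS_LENGTH))
# [(16022, 16569), (1, 1280)]
```

The single contig from the original question ends up as two separate records, so a viewer that only sees the linear pieces has no direct way of knowing they belong to one continuous alignment.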

Happy to answer any other questions on this.