Question: A Map(?) Of Sequence Alignment
6.2 years ago by
Czech Republic
Karol Pal jr.20 wrote:

Hi,

I'm finding it hard to express my problem in a few words (so I may have missed an existing solution while googling for it). I need to align two sequences, but as output I want to know which base of my sequence corresponds to which base of the reference sequence, so I would get a sort of map. I don't need to see two lines of letters one under the other.

To explain: I want to use this map in my code to link information attached to the input sequence with information attached to the reference sequence (SNPs). I have a hunch that this kind of mapping is built somewhere along the way in a classical alignment, but I'd appreciate your advice before I start digging through the source code of aligners.

Nevertheless, I still need to keep this alignment 'classical' in the sense that I know where the gaps and variations are.

Thanks for any hints/suggestions.

sequence aligner alignment • 1.3k views
modified 6.2 years ago by zam.iqbal.genome1.7k • written 6.2 years ago by Karol Pal jr.20

I'm not sure this is what you are looking for, but have a look at Biopython's MultipleSeqAlignment class, which lets you slice alignments. (http://biopython.org/DIST/docs/api/Bio.Align.MultipleSeqAlignment-class.html)

Are you looking for something like a .vcf file? It lists the variants of a sequence relative to a reference. (http://www.1000genomes.org/node/101)

6.2 years ago by
United Kingdom
zam.iqbal.genome1.7k wrote:

So you are just asking for a mapping between coordinates in the two strings? E.g. for

AACGT
AC_TT

is your map (in the mathematical sense)

1->1
2->2
3->?
4->3
5->4

where the left-hand number is the coordinate in AACGT and the right-hand number is the coordinate in ACTT? How do you want to define the value it assigns to 3? It sounds like you could quite easily write something that takes the output of a standard aligner and builds this map.
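As a rough illustration of that idea, here is a minimal Python sketch. It assumes you already have the two gapped rows from some aligner as plain strings of equal length, and it makes one arbitrary choice for the open question above: a base that sits opposite a gap maps to None. The function name alignment_map and the gap characters "-" and "_" are my own assumptions, not part of any aligner's API.

```python
def alignment_map(aligned_query, aligned_ref, gap_chars="-_"):
    """Map 1-based query coordinates to 1-based reference coordinates.

    Both arguments are rows of a pairwise alignment (same length, gaps included).
    A query base aligned to a gap in the reference maps to None (an assumption;
    pick whatever convention suits your SNP lookup).
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned rows must have equal length")

    mapping = {}
    q_pos = 0  # number of query bases seen so far
    r_pos = 0  # number of reference bases seen so far
    for q_char, r_char in zip(aligned_query, aligned_ref):
        q_is_base = q_char not in gap_chars
        r_is_base = r_char not in gap_chars
        if q_is_base:
            q_pos += 1
        if r_is_base:
            r_pos += 1
        if q_is_base:
            # Only query bases get an entry; reference-only columns are skipped.
            mapping[q_pos] = r_pos if r_is_base else None
    return mapping


# The example from this answer: AACGT against AC_TT
coordinate_map = alignment_map("AACGT", "AC_TT")
print(coordinate_map)  # {1: 1, 2: 2, 3: None, 4: 3, 5: 4}
```

Because the gapped rows are kept as they are, you still know where the gaps and mismatches are; the map is just an extra view of the same alignment.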