Why are insertions and deletions represented as gaps?
And why does the gap score depend on the length of the sequence?
The "gap" part comes from comparing the resulting pairwise alignment:
ACGTACGTAGCTAGCT sequence 1
ACGTACG---CTAGCT sequence 2
So a deletion/insertion appears as a gap in the alignment. People tend to use an "affine gap penalty" (a larger cost to open a gap, a smaller cost for each position that extends it) since it tends to give the most biologically reasonable results. Basically, if you don't penalize creating gaps enough, then mismatches turn into gaps in the alignment, which doesn't really make biological sense. But once you've committed to opening a gap, its length matters less, since a single event (e.g., the insertion of some element, or normal variation) can insert or delete many bases at once.
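To make the affine idea concrete, here is a minimal Python sketch. The penalty values (-10 to open, -1 to extend) and the helper names `affine_gap_penalty` and `gap_lengths` are assumptions chosen for illustration, not the defaults of any particular aligner. The convention is also an assumption: the first gap position pays the opening cost and each further position pays the extension cost, whereas some tools charge an extension for the first position as well.

```python
def affine_gap_penalty(length, gap_open=-10, gap_extend=-1):
    """Penalty for one gap of `length` columns (illustrative values).

    The first gapped column pays `gap_open`; each additional column
    pays `gap_extend`.
    """
    if length <= 0:
        return 0
    return gap_open + (length - 1) * gap_extend


def gap_lengths(aligned):
    """Return the lengths of consecutive runs of '-' in an aligned sequence."""
    runs, current = [], 0
    for ch in aligned:
        if ch == "-":
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:  # a gap that runs to the end of the sequence
        runs.append(current)
    return runs


# Sequence 2 from the alignment above: one gap of length 3.
one_long_gap = sum(affine_gap_penalty(n) for n in gap_lengths("ACGTACG---CTAGCT"))

# Hypothetical alternative with the same three gap columns scattered
# as three separate gaps of length 1.
three_short_gaps = sum(affine_gap_penalty(n) for n in gap_lengths("ACG-TACG-CT-AGCT"))

print(one_long_gap, three_short_gaps)
```

With these assumed values, the single 3-base gap is penalized less than three scattered 1-base gaps, which is the behavior described above: opening a gap is expensive, extending it is cheap.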
How would you represent them otherwise?
Gap penalties are based on the length of the gap being introduced (not the length of the sequence per se). Too many gaps will make an alignment meaningless.
How else could you represent it?
And which kind of alignment do you mean? I have never seen the sequence length influence the penalty.