Deletion and Duplication mutations alignment
3.9 years ago
yoosefyud ▴ 40

Hello everyone. We have analysed a family with a rare genetic syndrome and found new mutations. We want to align these results with other similar spices such as chimpanzee, gorilla, etc. For the transition mutations the results are promising. But we also have duplication and deletion mutations, and because the number of nucleotides differs, tools like Clustal Omega can't align them correctly. How should we align deletion/duplication mutations with other spices in order to find out how conserved they are? Thanks a lot.


What do you mean they can't align them correctly? A couple of indels should be no problem for (e.g.) Clustal. Duplications might cause issues, but it would depend on the exact nature of those mutations.
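To illustrate why a single indel is usually unproblematic for a global aligner, here is a minimal Needleman-Wunsch sketch in Python. It is not how Clustal works internally (Clustal builds progressive multiple alignments with more elaborate gap penalties). The function name and the scoring values (match +1, mismatch -1, gap -2) are illustrative assumptions, and the two sequences are the mutant and human exon fragments posted further down this thread.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return the two gapped rows."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner, preferring the diagonal.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1]); row_b.append("-"); i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1]); j -= 1
    return "".join(reversed(row_a)), "".join(reversed(row_b))


mutant = "GTTGGGAGGCTATGTGTTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC"
human = "GTTGGGAGGCTATGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC"

aligned_mutant, aligned_human = needleman_wunsch(mutant, human)
print(aligned_mutant)
print(aligned_human)
```

Because the extra base sits in a run of two Ts, placing the gap against either T scores the same; which one is reported depends only on the tie-breaking order in the traceback. That is the kind of ambiguity duplications can introduce.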


Can you point me to a reference or instructions on how to perform an alignment for a duplication or deletion mutation?
As you can see below, when I align the same exon from different species, the other sequences all have the same length but my mutant exon has one more nucleotide, so I can't show the exact change that the mutation created.

mutant          GTTGGGAGGCTATGTGTTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC    60
gorrila         -GTTGGGAGGCTATGTGTGACTGGAAGGACATCCTGTCGGGTGGCGAGAAGCAGAGAATC    59
chimpanze       -GTTGGGAGGCTGTGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC    59
human           -GTTGGGAGGCTATGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC    59
olive           -GTTGGGAGGCTATGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAAAGAATC    59
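One way to see that the leading gap above is a poor placement is to count identical columns for two candidate placements of the single gap. The sketch below does this in Python for the mutant and human rows; the helper name `identical_columns` and the choice of column 17 for the internal gap are assumptions made for illustration.

```python
mutant = "GTTGGGAGGCTATGTGTTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC"
human = "GTTGGGAGGCTATGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC"


def identical_columns(row_a, row_b):
    """Count columns where both rows carry the same base (gaps never count)."""
    return sum(1 for x, y in zip(row_a, row_b) if x == y and x != "-")


# Placement 1: gap at the very start, as in the alignment posted above.
leading_gap = "-" + human
# Placement 2: gap after the first 16 bases, inside the TT run.
internal_gap = human[:16] + "-" + human[16:]

leading = identical_columns(mutant, leading_gap)
internal = identical_columns(mutant, internal_gap)
print("leading gap :", leading, "identical columns")
print("internal gap:", internal, "identical columns")
```

The internal placement matches every human base, whereas the leading gap shifts the first 17 bases out of register and produces spurious mismatches.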


Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.


Sure, thanks a lot. I'm new to your community.


spices

Surely you mean species

3.9 years ago
Joe 20k

I got a different alignment from Clustal, which (weirdly) placed the gap somewhere else, at one of the nucleotide mismatches, and didn’t pick up the insertion. I’ve heard Clustal typically performs better with proteins than with DNA anyway, but it could perhaps be made to work by tweaking the gap penalties etc. Perhaps try MUSCLE instead; I get what seems to be a sensible alignment from it:

$ muscle -in seqs.fa
MUSCLE v3.8.31 by Robert C. Edgar

http://www.drive5.com/muscle
This software is donated to the public domain.
Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.

seqs 5 seqs, max length 60, avg  length 59
00:00:00     22 MB(2%)  Iter   1  100.00%  K-mer dist pass 1
00:00:00     22 MB(2%)  Iter   1  100.00%  K-mer dist pass 2
00:00:00     23 MB(2%)  Iter   1  100.00%  Align node
00:00:00     23 MB(2%)  Iter   1  100.00%  Root alignment
00:00:00     23 MB(2%)  Iter   2  100.00%  Refine tree
00:00:00     23 MB(2%)  Iter   2  100.00%  Root alignment
00:00:00     23 MB(2%)  Iter   2  100.00%  Root alignment
00:00:00     23 MB(2%)  Iter   3  100.00%  Refine biparts
>gorrila
GTTGGGAGGCTATGTG-TGACTGGAAGGACATCCTGTCGGGTGGCGAGAAGCAGAGAATC
>chimpanze
GTTGGGAGGCTGTGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
>olive
GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAAAGAATC
>mutant
GTTGGGAGGCTATGTGTTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
>human
GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
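To report the position of the extra base programmatically, the aligned FASTA above can be scanned for columns where only the mutant carries a nucleotide. This is a small Python sketch under the assumption that the alignment is available as text; `parse_fasta` and `mutant_only_columns` are hypothetical helper names, not part of MUSCLE.

```python
aligned_fasta = """\
>gorrila
GTTGGGAGGCTATGTG-TGACTGGAAGGACATCCTGTCGGGTGGCGAGAAGCAGAGAATC
>chimpanze
GTTGGGAGGCTGTGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
>olive
GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAAAGAATC
>mutant
GTTGGGAGGCTATGTGTTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
>human
GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
"""


def parse_fasta(text):
    """Return {name: sequence} from FASTA text (sequences may span lines)."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif line and name is not None:
            records[name] += line
    return records


def mutant_only_columns(records, mutant_name="mutant"):
    """1-based columns where the mutant has a base and every other row a gap."""
    others = [seq for name, seq in records.items() if name != mutant_name]
    return [
        col + 1
        for col, base in enumerate(records[mutant_name])
        if base != "-" and all(seq[col] == "-" for seq in others)
    ]


records = parse_fasta(aligned_fasta)
insertion_columns = mutant_only_columns(records)
print("columns present only in the mutant:", insertion_columns)
```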


I really appreciate your help and guidance. I am still trying to get MUSCLE to work. Did you insert the gap (-) into your data yourself, or did MUSCLE insert it? The output you sent me is completely usable in my article.


If you've installed MUSCLE, the command to replicate the output I got above is:

muscle -in seqs.fa


Where seqs.fa looks like:

>mutant
GTTGGGAGGCTATGTGTTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
>gorrila
GTTGGGAGGCTATGTGTGACTGGAAGGACATCCTGTCGGGTGGCGAGAAGCAGAGAATC
>chimpanze
GTTGGGAGGCTGTGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
>human
GTTGGGAGGCTATGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
>olive
GTTGGGAGGCTATGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAAAGAATC


MUSCLE added the gap. It's an aligner. That's literally its job.
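A quick way to convince yourself that the aligner only inserted gap characters and did not alter any bases is to strip the gaps from the aligned rows and compare them with the input. The Python sketch below does this with the sequences from this thread copied in as literals; the dictionary and variable names are illustrative assumptions.

```python
# Input sequences, as in seqs.fa.
unaligned = {
    "mutant": "GTTGGGAGGCTATGTGTTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC",
    "gorrila": "GTTGGGAGGCTATGTGTGACTGGAAGGACATCCTGTCGGGTGGCGAGAAGCAGAGAATC",
    "chimpanze": "GTTGGGAGGCTGTGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC",
    "human": "GTTGGGAGGCTATGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC",
    "olive": "GTTGGGAGGCTATGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAAAGAATC",
}

# Aligned rows, as printed by MUSCLE earlier in the thread.
aligned = {
    "mutant": "GTTGGGAGGCTATGTGTTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC",
    "gorrila": "GTTGGGAGGCTATGTG-TGACTGGAAGGACATCCTGTCGGGTGGCGAGAAGCAGAGAATC",
    "chimpanze": "GTTGGGAGGCTGTGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC",
    "human": "GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC",
    "olive": "GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAAAGAATC",
}

# Removing the gap characters should give back exactly the input sequence.
unchanged = {
    name: aligned[name].replace("-", "") == seq
    for name, seq in unaligned.items()
}
# Number of gap characters the aligner added to each row.
gaps_added = {name: row.count("-") for name, row in aligned.items()}

for name in unaligned:
    print(f"{name:<10} bases unchanged: {unchanged[name]}  gaps added: {gaps_added[name]}")
```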


Also, Clustal Omega marks the conserved columns with the * symbol, but I can't interpret the output reported by MUSCLE. How should I interpret it, or is there a way to show the alignment with the * symbol?


The command is muscle -in seqs.fa -clw. Please spend some time reading MUSCLE's manual and help output. If you want to get rid of the header info about iterations etc., just redirect STDERR to null: muscle -in seqs.fa -clw 2>/dev/null

MUSCLE v3.8.31 by Robert C. Edgar

http://www.drive5.com/muscle
This software is donated to the public domain.
Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.

seqs 5 seqs, max length 60, avg  length 59
00:00:00     22 MB(2%)  Iter   1  100.00%  K-mer dist pass 1
00:00:00     22 MB(2%)  Iter   1  100.00%  K-mer dist pass 2
00:00:00     23 MB(2%)  Iter   1  100.00%  Align node
00:00:00     23 MB(2%)  Iter   1  100.00%  Root alignment
00:00:00     23 MB(2%)  Iter   2  100.00%  Refine tree
00:00:00     23 MB(2%)  Iter   2  100.00%  Root alignment
00:00:00     23 MB(2%)  Iter   2  100.00%  Root alignment
00:00:00     23 MB(2%)  Iter   3  100.00%  Refine biparts
MUSCLE (3.8) multiple sequence alignment

gorrila         GTTGGGAGGCTATGTG-TGACTGGAAGGACATCCTGTCGGGTGGCGAGAAGCAGAGAATC
chimpanze       GTTGGGAGGCTGTGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
olive           GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAAAGAATC
mutant          GTTGGGAGGCTATGTGTTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
human           GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
                *********** **** ************* ********************** ******
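The line of asterisks marks columns in which every sequence carries the same character. A minimal Python sketch of that rule, applied to the aligned rows above, is shown here; `conservation_line` is a hypothetical helper, and it implements only the simple "all identical" rule that applies to this nucleotide alignment, not the additional similarity symbols Clustal-format output can use for proteins.

```python
aligned_rows = [
    "GTTGGGAGGCTATGTG-TGACTGGAAGGACATCCTGTCGGGTGGCGAGAAGCAGAGAATC",  # gorrila
    "GTTGGGAGGCTGTGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC",  # chimpanze
    "GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAAAGAATC",  # olive
    "GTTGGGAGGCTATGTGTTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC",  # mutant
    "GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC",  # human
]


def conservation_line(rows):
    """Return '*' for columns where all rows agree, ' ' otherwise."""
    return "".join("*" if len(set(column)) == 1 else " " for column in zip(*rows))


stars = conservation_line(aligned_rows)
# 1-based positions of the columns that are not fully conserved.
variable_columns = [i + 1 for i, mark in enumerate(stars) if mark == " "]

print(stars)
print("variable columns:", variable_columns)
```

Each blank in the line is either a substitution in one species or the gap column introduced by the mutant's extra base.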


Thanks a lot. Your help completely solved my problem. If I want to report this alignment in my article, how many species are needed for the result to be convincing and worth discussing? Are these three species enough, or would you suggest adding more animals?


As a general rule I’d say the more the merrier, but it really depends on what point you’re trying to make about the sequences. If you just want to show that the mutant has an extra base, that’s probably enough to make the point.


Yes, I only want to show that there is a different nucleotide at that position compared to the other animals. I really appreciate your help and guidance.


Ok great, I’ve gone ahead and moved the comment thread to an answer so you can accept and provide some thread closure, by clicking the green check mark at the side.