Question

Deletion and Duplication mutations alignment

0

5.4 years ago

Yoosef ▴ 60

Hello everyone. We have analysed a family with a rare genetic syndrome and have found new mutations. We want to align these results with other similar spices like chimpanzee, gorilla, etc. For transition mutations the results are promising. But we also have duplication and deletion mutations, and because the number of nucleotides differs, tools like Clustal Omega can't align them correctly. So how should we align deletion/duplication mutations with other spices in order to find out how conserved they are? Thanks a lot.

alignment gene • 1.6k views

ADD COMMENT • link 5.4 years ago by Yoosef ▴ 60

1

What do you mean they can't align them correctly? A couple of INDELS should be no problem for (e.g.) CLUSTAL. Duplications might cause issues, but it would depend on the exact nature of those mutations.

ADD REPLY • link 5.4 years ago by Joe 21k

0

Can you send me a reference or instructions on how I should perform an alignment for a duplication or deletion mutation?
As you can see below, I entered the same exon from different species, which all have the same length, but my mutant exon has one more nucleotide than the other species, so I can't show the exact change that the mutation created.

mutant          GTTGGGAGGCTATGTGTTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC    60
gorrila         -GTTGGGAGGCTATGTGTGACTGGAAGGACATCCTGTCGGGTGGCGAGAAGCAGAGAATC    59
chimpanze       -GTTGGGAGGCTGTGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC    59
human           -GTTGGGAGGCTATGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC    59
olive           -GTTGGGAGGCTATGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAAAGAATC    59

ADD REPLY • link updated 5.4 years ago by Ram 43k • written 5.4 years ago by Yoosef ▴ 60

0

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

ADD REPLY • link 5.4 years ago by Ram 43k

0

Sure, thanks a lot. I'm new to your community.

ADD REPLY • link 5.4 years ago by Yoosef ▴ 60

1

spices

Surely you mean species

ADD REPLY • link 5.4 years ago by Ram 43k

score 2 · Accepted Answer · 2018-11-14

2

5.4 years ago

Joe 21k

I got a different alignment from Clustal, which placed the break at a different position, at one of the nucleotide mismatches (weirdly), and didn’t pick up the insertion. I’ve heard Clustal typically performs better with proteins than DNA anyway, but it could perhaps be made to work by tweaking the gap penalties etc. Perhaps try MUSCLE instead; I get what seems to be a sensible alignment from it:

$ muscle -in seqs.fa 
MUSCLE v3.8.31 by Robert C. Edgar

http://www.drive5.com/muscle
This software is donated to the public domain.
Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.

seqs 5 seqs, max length 60, avg  length 59
00:00:00     22 MB(2%)  Iter   1  100.00%  K-mer dist pass 1
00:00:00     22 MB(2%)  Iter   1  100.00%  K-mer dist pass 2
00:00:00     23 MB(2%)  Iter   1  100.00%  Align node       
00:00:00     23 MB(2%)  Iter   1  100.00%  Root alignment
00:00:00     23 MB(2%)  Iter   2  100.00%  Refine tree   
00:00:00     23 MB(2%)  Iter   2  100.00%  Root alignment
00:00:00     23 MB(2%)  Iter   2  100.00%  Root alignment
00:00:00     23 MB(2%)  Iter   3  100.00%  Refine biparts
>gorrila 
GTTGGGAGGCTATGTG-TGACTGGAAGGACATCCTGTCGGGTGGCGAGAAGCAGAGAATC
>chimpanze 
GTTGGGAGGCTGTGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
>olive
GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAAAGAATC
>mutant
GTTGGGAGGCTATGTGTTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
>human
GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
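
Once the sequences are aligned, the inserted base can be located programmatically rather than by eye. The following is only a minimal sketch: the helper name `find_indel_columns` is made up for illustration, and it assumes the aligned sequences have already been read into a dictionary (here they are pasted in from the MUSCLE output above). It reports columns where the mutant has a base and every other sequence has a gap (insertion), or the reverse (deletion).

```python
def find_indel_columns(alignment, query):
    """Return (insertions, deletions) as lists of 0-based alignment columns.

    insertion: the query has a base where every other sequence has a gap.
    deletion:  the query has a gap where every other sequence has a base.
    """
    query_seq = alignment[query]
    others = [seq for name, seq in alignment.items() if name != query]
    insertions, deletions = [], []
    for col, base in enumerate(query_seq):
        column = [seq[col] for seq in others]
        if base != "-" and all(c == "-" for c in column):
            insertions.append(col)
        elif base == "-" and all(c != "-" for c in column):
            deletions.append(col)
    return insertions, deletions


# Aligned sequences copied from the MUSCLE output above.
alignment = {
    "gorrila":   "GTTGGGAGGCTATGTG-TGACTGGAAGGACATCCTGTCGGGTGGCGAGAAGCAGAGAATC",
    "chimpanze": "GTTGGGAGGCTGTGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC",
    "olive":     "GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAAAGAATC",
    "mutant":    "GTTGGGAGGCTATGTGTTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC",
    "human":     "GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC",
}

insertions, deletions = find_indel_columns(alignment, "mutant")
print(insertions, deletions)  # prints [16] []
```

Column 16 (0-based) is the 17th alignment column. Because the extra T sits next to an identical T, the gap could equally be drawn one column to the right; the aligner's choice between the two is arbitrary.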

ADD COMMENT • link 5.4 years ago by Joe 21k

0

I really appreciate your help and guidance. I am still trying to get muscle working. Did you insert the gap (-) in your data, or did muscle insert it itself? The output you sent me is completely usable in my article.

ADD REPLY • link 5.4 years ago by Yoosef ▴ 60

1

If you've installed muscle, the command to replicate the output I got above is:

muscle -in seqs.fa

Where seqs.fa looks like:

>mutant
GTTGGGAGGCTATGTGTTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
>gorrila
GTTGGGAGGCTATGTGTGACTGGAAGGACATCCTGTCGGGTGGCGAGAAGCAGAGAATC
>chimpanze
GTTGGGAGGCTGTGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
>human
GTTGGGAGGCTATGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
>olive
GTTGGGAGGCTATGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAAAGAATC

muscle added the gap. It's an aligner. That's literally its job.
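
To illustrate how an aligner can place a gap on its own, here is a minimal pairwise sketch using Needleman-Wunsch global alignment. This is not what MUSCLE does internally (MUSCLE builds a multiple alignment progressively and then refines it); the function name and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen only for the example.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return the two gapped strings."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two bases
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


# Unaligned sequences taken from seqs.fa above.
mutant = "GTTGGGAGGCTATGTGTTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC"
human = "GTTGGGAGGCTATGTGTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC"

aligned_mutant, aligned_human = needleman_wunsch(mutant, human)
print(aligned_mutant)
print(aligned_human)
```

With these scores, a single gap in the human sequence costs less than shifting the rest of the sequence out of register, so the gap is introduced by the algorithm and never has to be typed into the input.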

ADD REPLY • link 5.4 years ago by Joe 21k

0

Also, Clustal Omega shows me the alignment with the * symbol, but I can't interpret the results reported by muscle. How should I interpret them, or is there a way to show the alignment with the * symbol?

ADD REPLY • link 5.4 years ago by Yoosef ▴ 60

1

The command is muscle -in seqs.fa -clw. Please spend some time reading MUSCLE's manual and help. If you want to get rid of the header info about iterations etc., just redirect STDERR to null: muscle -in seqs.fa -clw 2>/dev/null

MUSCLE v3.8.31 by Robert C. Edgar

http://www.drive5.com/muscle
This software is donated to the public domain.
Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.

seqs 5 seqs, max length 60, avg  length 59
00:00:00     22 MB(2%)  Iter   1  100.00%  K-mer dist pass 1
00:00:00     22 MB(2%)  Iter   1  100.00%  K-mer dist pass 2
00:00:00     23 MB(2%)  Iter   1  100.00%  Align node
00:00:00     23 MB(2%)  Iter   1  100.00%  Root alignment
00:00:00     23 MB(2%)  Iter   2  100.00%  Refine tree
00:00:00     23 MB(2%)  Iter   2  100.00%  Root alignment
00:00:00     23 MB(2%)  Iter   2  100.00%  Root alignment
00:00:00     23 MB(2%)  Iter   3  100.00%  Refine biparts
MUSCLE (3.8) multiple sequence alignment


gorrila         GTTGGGAGGCTATGTG-TGACTGGAAGGACATCCTGTCGGGTGGCGAGAAGCAGAGAATC
chimpanze       GTTGGGAGGCTGTGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
olive           GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAAAGAATC
mutant          GTTGGGAGGCTATGTGTTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
human           GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC
                *********** **** ************* ********************** ******
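
For anyone wondering how to read the last line: a `*` marks a column where every sequence has the same base, and a blank marks a column with a mismatch or a gap. A rough sketch of that rule in Python, assuming the aligned sequences are pasted in by hand (the function name `conservation_line` is made up for this example and is not part of MUSCLE or Clustal):

```python
def conservation_line(aligned_seqs):
    """Return a string with '*' under fully identical, gap-free columns."""
    marks = []
    for column in zip(*aligned_seqs):
        # A column is conserved if it holds exactly one distinct, non-gap base.
        conserved = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Aligned sequences copied from the CLUSTAL-format output above.
aligned = [
    "GTTGGGAGGCTATGTG-TGACTGGAAGGACATCCTGTCGGGTGGCGAGAAGCAGAGAATC",  # gorrila
    "GTTGGGAGGCTGTGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC",  # chimpanze
    "GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAAAGAATC",  # olive
    "GTTGGGAGGCTATGTGTTGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC",  # mutant
    "GTTGGGAGGCTATGTG-TGACTGGAAGGACGTCCTGTCGGGTGGCGAGAAGCAGAGAATC",  # human
]

line = conservation_line(aligned)
print(line)
```

The blanks fall at the chimpanzee, gorilla and olive substitutions and at the column holding the mutant's extra T, which is the position of interest here.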

ADD REPLY • link 5.4 years ago by Joe 21k

0

Thanks a lot. Your help completely solved my problem. If I want to report this alignment in my article, how many species are needed for clear, defensible results? Are these three species enough, or would you suggest adding more animals?

ADD REPLY • link 5.4 years ago by Yoosef ▴ 60

1

As a general rule I’d say the more the merrier, but it really depends on what point you’re trying to make about the sequences. If you just want to show that the mutant has an extra base, that’s probably enough to make the point.

ADD REPLY • link 5.4 years ago by Joe 21k

1

Yes, I only want to show that there is a different nucleotide at that position compared to other animals. I really appreciate your help and guidance.

ADD REPLY • link 5.4 years ago by Yoosef ▴ 60

1

Ok great, I’ve gone ahead and moved the comment thread to an answer so you can accept and provide some thread closure, by clicking the green check mark at the side.

ADD REPLY • link 5.4 years ago by Joe 21k