Question: What Would You Recommend For Great Examples Of Sequence Alignment In Biology?
12
8.1 years ago by
Singapore
The Original Gtk190 wrote:

In teaching an undergraduate bioinformatics module, I've been contemplating how to engage the students using examples of historic milestones in the application of sequence alignment. I have a few good examples in mind, but I'd be interested in recommendations and suggestions. But here's what I don't want: Examples that show off algorithmic cleverness or bioinformatics computing power as such. These are life sciences students, and I want to provide examples of real biological advances that have utilized sequence alignment.

sequence alignment • 2.4k views
ADD COMMENTlink modified 8.1 years ago by jli99150 • written 8.1 years ago by The Original Gtk190
11
8.1 years ago by
Larry_Parnell16k
Boston, MA USA
Larry_Parnell16k wrote:

I can offer a great example: the discovery of the APOA5 gene, which encodes an important apolipoprotein involved in cholesterol and triglyceride homeostasis. This gene was discovered by aligning the human and mouse genomic sequences and noticing peak regions of higher than expected similarity. These turned out to be the APOA5 exons. The work is elegantly described by Pennacchio et al. (2001), in which the gene's role in triglyceride (TG) homeostasis is elucidated. It has subsequently been shown in numerous populations the world over that variation in the human APOA5 gene leads to differential TG levels. In some populations, the SNP-TG association is modified by intake of certain dietary fats. In other words, the risk allele does not really confer risk until the diet contains too much or too little of a certain component.

It is rare for such an important gene to have been unknown (not simply ill described, but completely unknown) prior to 2001. Furthermore, this is a nice concrete example of how such a gene was discovered by a simple alignment of genomic sequences, and the same approach is the basis for the discovery of regulatory elements (along the lines of ENCODE, Jim Noonan's excellent work and Kate Pollard's HARs).
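
To make the idea of "peak regions of higher than expected similarity" concrete, here is a minimal Python sketch. It is not the method used in the paper, only an illustration: it assumes the two sequences are already aligned (gaps written as `-`), and the function names, window size, threshold and toy sequences are all hypothetical.

```python
def window_identity(aln_a, aln_b, window=10, step=5):
    """Percent identity in sliding windows over two aligned (gapped) sequences.

    Returns a list of (window_start, percent_identity) tuples.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    profile = []
    for start in range(0, len(aln_a) - window + 1, step):
        a = aln_a[start:start + window]
        b = aln_b[start:start + window]
        # A column counts as a match only if both residues agree and are not gaps.
        matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
        profile.append((start, 100.0 * matches / window))
    return profile


def conserved_windows(profile, threshold=80.0):
    """Start positions of windows whose identity reaches the threshold."""
    return [start for start, pid in profile if pid >= threshold]


# Toy (made-up) aligned sequences: conserved ends, diverged middle.
human = "ACGTACGTAC" + "TTTTTTTTTT" + "GGCCGGCCGG"
mouse = "ACGTACGTAC" + "AAAAAAAAAA" + "GGCCGGCCGG"

profile = window_identity(human, mouse, window=10, step=10)
peaks = conserved_windows(profile)
print(profile)
print(peaks)
```

Plotting the profile along the genomic coordinate gives the kind of peak-and-valley picture described above; windows that stand out from the background are candidates for exons or other conserved elements.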

ADD COMMENTlink written 8.1 years ago by Larry_Parnell16k

This is a very nice example, exactly along the lines I was looking for! Thanks!

ADD REPLYlink written 8.1 years ago by The Original Gtk190
7
8.1 years ago by
Casey Bergman18k
Athens, GA, USA
Casey Bergman18k wrote:

There is a pretty famous story from the early 1990s about yeast proteins (like RAD51) that are involved in recombination and DNA repair showing striking similarity to bacterial RecA proteins, providing evidence that these processes share a common origin across eukaryotes and prokaryotes. See for example (there are others that came out around the same time): http://www.ncbi.nlm.nih.gov/pubmed/1581961

I remember being told by Doug Bishop in graduate school that this was one of the first examples of database searches/sequence alignment successfully finding a common biological process across eukaryotes and prokaryotes, and that sequence similarity really drove the biological discovery.

ADD COMMENTlink written 8.1 years ago by Casey Bergman18k
1

I like that. I was thinking about using the first publications (ca. 1984-1985) of inferred proto-oncogene activation of a receptor tyrosine kinase. I think these were the first sequence alignments published in Science.

ADD REPLYlink written 8.1 years ago by The Original Gtk190

Some of the first alignments go back to Margaret Dayhoff and are far earlier than 1984/85.

ADD REPLYlink written 8.1 years ago by Larry_Parnell16k

@Larry, right, alignment itself clearly goes back further, but I think Dayhoff's alignments all assumed homology and didn't generate new hypotheses about common biological processes.

ADD REPLYlink written 8.1 years ago by Casey Bergman18k
6
8.1 years ago by
Philadelphia, PA
Jeremy Leipzig18k wrote:

A counterexample might be the sequence of HIV, which Robert Gallo called "HTLV-III" in 1985, compared with HTLV-I.

Notice how tenuous the alignments are to HTLV-I, even among conserved proteins. We know today HIV has nothing much to do with HTLV-I other than both being retroviruses that infect humans.

You can really feel Gallo's incredible force of will shoved down the throat of reality.

Complete nucleotide sequence of the AIDS virus, HTLV-III.

http://www.ncbi.nlm.nih.gov/pubmed/2578615

http://www.nature.com/nature/journal/v313/n6000/pdf/313277a0.pdf


ADD COMMENTlink written 8.1 years ago by Jeremy Leipzig18k
1

+1 for advocating for learning how not to do science.

ADD REPLYlink written 8.1 years ago by Casey Bergman18k

Yes, a nice example +1. One could even carry this to something contemporary like the E. coli outbreak in Germany this spring/summer and the alignments done to identify the source strains and, more interestingly, how those came together to produce something so deadly.

ADD REPLYlink written 8.1 years ago by Larry_Parnell16k

Terrific. I wish I could put two answers as the best answers to this question!

ADD REPLYlink written 8.1 years ago by The Original Gtk190
2
8.1 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum123k wrote:

The original alignments published by

Needleman and Wunsch (1970): http://genome.crg.es/seminars/Alineator/papers/needleman70.pdf

Smith and Waterman (1981): http://ibi.zju.edu.cn/bioinplant/courses/smithandwaterman1981.pdf

The structure of an RNA viroid. For example, try to process the following sequence:

>gi|341870818|gb|HQ891019.1| Chrysanthemum stunt viroid isolate H5-2, complete genome
CGGGACTTACTTGTGGTTCCTGTGGTGCACTCCTGACCCTGCTGCTTTGAAAGAAAAAGAAATGAGGCGA
AGAAGTCCTTCAGGGATCCCCGGGGAAACCTGGAGGAAGTCCGACGAGATCGCGGCTGGGGCTTAGGACC
CCACTCCTGCGAGACAGGAGTAATCCTAAACAGGGTTTTCACCCTTCCTTTAGTTTCCTTCCTCTCCTGG
AGAGGTCTTCTGCCCTAGCCCGGTCTTCGAAGCTTCCTTTGGCTACTACCCGGTGGAAACAACTGAAGCT
TCAACGCCTTTTTTTCCAATCTTCTTTAGCACCGGGCTAGGGAGTAAGCCCGTGGAACCTTAGTTTTGTT
CCCT

with RNAfold: http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi
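
For readers who have not seen what the global alignment of Needleman and Wunsch (cited above) actually does, a minimal Python sketch follows. It uses the common textbook formulation with a linear gap penalty, which is an assumption here and not necessarily the exact scoring of the 1970 paper; the scoring values and the function name are illustrative only.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment by dynamic programming (linear gap penalty).

    Returns (score, aligned_a, aligned_b).
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


best, top, bottom = needleman_wunsch("ACGT", "AGT")
print(best)
print(top)
print(bottom)
```

The table costs time and memory proportional to the product of the two lengths, which is fine for short teaching examples but not for whole genomes.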

ADD COMMENTlink modified 4 weeks ago by RamRS24k • written 8.1 years ago by Pierre Lindenbaum123k

Pierre, those are all interesting in their own right, but not really what I am looking for. The publications of the N&W and S&W algorithms were certainly milestones, but in my view the actual alignments are not. The alignments in those original papers serve to demonstrate features of the algorithms. There is some interesting discussion of the alignments in N&W, but I still wouldn't consider them biological milestones. RNA folding is also interesting, but brings in a lot of issues apart from alignment (e.g., folding topology, free energy calculations, etc.).

ADD REPLYlink written 8.1 years ago by The Original Gtk190
2
8.1 years ago by
Larry_Parnell16k
Boston, MA USA
Larry_Parnell16k wrote:

Instead of looking gene by gene for nice examples, you can also offer the example of aligning whole genomes, as was done separately for several yeast and Drosophila species in order to, yes, detect new genes and regulatory elements, but more importantly to describe speciation and the degree to which the different species have diverged from one another. This has been extended recently by Pääbo's group in aligning the human and Neandertal genomes and estimating that non-Africans carry roughly 1-4% Neandertal DNA. This is all accomplished with genome-wide alignments.

ADD COMMENTlink written 8.1 years ago by Larry_Parnell16k
1

I understand. However, if you show the human-mouse comparison over the APOA5-APOA4-APOC3-APOA1 (60 kbp) gene region, you will see peaks of different heights, meaning different levels of conservation. That can allow you to touch on (without going into details) evolution, rates of change and so forth.

ADD REPLYlink written 8.1 years ago by Larry_Parnell16k

Given the way the course is currently structured, that would be better later on. These are second-year university students, so I need to make the examples relevant to their backgrounds (which, of course, vary).

ADD REPLYlink written 8.1 years ago by The Original Gtk190

Very nice segue :-)

ADD REPLYlink written 8.1 years ago by The Original Gtk190
1
8.1 years ago by
jli99150
jli99150 wrote:

Perhaps this one (SNPs resulting in premature stop codons):

http://genesdev.cshlp.org/content/25/1/1/F3.expansion.html
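
As a rough illustration of how a single-nucleotide change can create a premature stop codon, here is a short Python sketch. It assumes the standard genetic code's three stop codons and an in-frame coding sequence; the sequences and function names are made up for the example and are not taken from the linked figure.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon(cds):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    for k in range(0, len(cds) - 2, 3):
        if cds[k:k + 3].upper() in STOP_CODONS:
            return k // 3
    return None


def introduces_premature_stop(ref_cds, alt_cds):
    """True if the variant sequence hits a stop codon earlier than the reference."""
    ref_stop = first_stop_codon(ref_cds)
    alt_stop = first_stop_codon(alt_cds)
    return alt_stop is not None and (ref_stop is None or alt_stop < ref_stop)


# Toy coding sequences: a C>T change turns CAG (Gln) into TAG (stop).
reference = "ATGCAGGGATAA"  # ATG CAG GGA TAA
variant = "ATGTAGGGATAA"    # ATG TAG GGA TAA

print(first_stop_codon(reference))
print(first_stop_codon(variant))
print(introduces_premature_stop(reference, variant))
```

In an alignment of the reference and variant alleles, such a change shows up as a single mismatched column, with everything downstream of the new stop left untranslated.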

ADD COMMENTlink written 8.1 years ago by jli99150