Conservation Score Calculator
9.0 years ago
anuragm ▴ 130

I have an alignment in FASTA format and I would like to get the column-wise conservation score for each nucleotide position. The output I want has two columns: the nucleotide position (that is, the column index in the alignment) and its corresponding conservation score. Is there any software or application that can do this for me? This might be a very trivial question, but I am rather new to bioinformatics.

conservation fasta • 7.9k views

FASTA is not an alignment format. Please show us a sample of your data.


FASTA CAN represent an alignment, if gaps are positioned where needed. For instance, emma (a ClustalW wrapper from EMBOSS) returns the MSA in FASTA format.


multi-FASTA maybe?

9.0 years ago
Asaf 8.7k

This is not a trivial question at all: a meaningful conservation score requires a good evolutionary model, and such a model is very hard to estimate correctly. phyloP from the PHAST package can do this (with the -b option), but you will need to generate a model using phyloFit (which requires a phylogenetic tree and an alignment of sequences under neutral evolution) or get a model file from somewhere.
Another option is rate4site, which does something pretty similar but with a simpler model (Jukes-Cantor) and requires an alignment and a phylogenetic tree.
I should mention that you can always compute the entropy of each column and use it as a conservation score, but this would be wrong unless the tree looks close enough to a star; otherwise closely related sequences are effectively counted several times.
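For illustration only, here is a minimal Python sketch of the per-column entropy score mentioned above. It is not what phyloP or rate4site compute, and it ignores the tree entirely. The function names (`read_fasta`, `column_conservation`), the choice to skip gaps and ambiguous characters, and the normalization `1 - H / log2(4)` are all assumptions made for this example.

```python
import math
from collections import Counter


def read_fasta(lines):
    """Parse FASTA records from an iterable of lines into (name, sequence) pairs."""
    records, name, chunks = [], None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def column_conservation(seqs, alphabet="ACGT"):
    """Return (position, score) per column, with score = 1 - H / log2(len(alphabet)).

    Characters outside the alphabet (gaps, N, ...) are ignored (an assumption).
    Columns with no usable character get a score of None.
    """
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences differ in length; input is not an alignment")
    max_entropy = math.log2(len(alphabet))
    scores = []
    for pos, column in enumerate(zip(*seqs), start=1):
        counts = Counter(ch for ch in (c.upper() for c in column) if ch in alphabet)
        total = sum(counts.values())
        if total == 0:
            scores.append((pos, None))
            continue
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        scores.append((pos, 1.0 - entropy / max_entropy))
    return scores


if __name__ == "__main__":
    # Hypothetical toy alignment; replace with open("alignment.fasta") for real data.
    toy = [">s1", "ACGT", ">s2", "ACGA", ">s3", "AC-C", ">s4", "ATGG"]
    sequences = [seq for _, seq in read_fasta(toy)]
    for position, score in column_conservation(sequences):
        print(position, "NA" if score is None else f"{score:.3f}", sep="\t")
```

The output is the two-column table asked for in the question: the 1-based column index and a score between 0 (all four bases equally frequent) and 1 (fully conserved). As noted above, treat it as a rough heuristic unless the sequences are roughly equally related.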

9.0 years ago

If you are dealing with a multi-FASTA alignment file with sequences from multiple species, and the genomic alignments are large, you may want to try GERP:
http://mendel.stanford.edu/SidowLab/downloads/gerp/
(phyloP is another option, mentioned in another answer)

If you are dealing with alignments of sequences from different individuals of the same species or subspecies, one option is to use software that calculates nucleotide substitution values per window (of size 1 or larger). VariScan is one tool for that:
http://www.ub.edu/softevol/variscan/
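To make the per-window idea concrete, here is a rough Python sketch that computes the average proportion of pairwise differences in consecutive, non-overlapping windows. This is not VariScan's implementation or output format; the function name `window_diversity`, the pooling of all pairwise comparisons within a window, and the skipping of gaps and ambiguous characters are assumptions made for this example.

```python
from itertools import combinations


def window_diversity(seqs, window=1, alphabet="ACGT"):
    """Return (start, end, value) per window, 1-based inclusive coordinates.

    value = differing site comparisons / valid site comparisons, pooled over
    all sequence pairs in the window. Comparisons involving a character
    outside the alphabet (gap, N, ...) are skipped; value is None if no
    valid comparison exists in the window.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences differ in length; input is not an alignment")
    seqs = [s.upper() for s in seqs]
    length = len(seqs[0])
    results = []
    for start in range(0, length, window):
        end = min(start + window, length)
        diffs = 0
        compared = 0
        for a, b in combinations(seqs, 2):
            for x, y in zip(a[start:end], b[start:end]):
                if x in alphabet and y in alphabet:
                    compared += 1
                    diffs += x != y
        value = diffs / compared if compared else None
        results.append((start + 1, end, value))
    return results


if __name__ == "__main__":
    # Hypothetical toy alignment of four individuals.
    toy = ["ACGT", "ACGA", "AC-C", "ATGG"]
    for start, end, value in window_diversity(toy, window=2):
        print(start, end, "NA" if value is None else f"{value:.3f}", sep="\t")
```

With `window=1` this gives one value per alignment column; low values indicate conserved positions, so `1 - value` could serve as a crude conservation score.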

9.0 years ago
Josh Herr 5.7k

There are a few ways to do this on a text file at the command line, and there are many programs that can visualize site conservation if you have a curated alignment. Here are some links (List of Alignment Visualization Software and Alignment Tools) with a few you can try.
