Question: Conservation Score Calculator
anuragm (India) wrote, 7.8 years ago:

I have an alignment in FASTA format and I would like to get the column-wise conservation score for each nucleotide position. The output I want has two columns: the nucleotide position (that is, the column index in the alignment) and its corresponding conservation score. Is there any software or application that could do that for me? This might be a very trivial question, but I am rather new to bioinformatics.

Tags: conservation, fasta

FASTA is not an alignment format. Please show us a sample of your data.

written 7.8 years ago by Pierre Lindenbaum

FASTA CAN represent an alignment, if gaps are positioned where needed. For instance, emma (a ClustalW wrapper from EMBOSS) returns the MSA in FASTA format.

written 7.8 years ago by Asaf

multi-FASTA maybe?

written 7.8 years ago by 14134125465346445
Asaf (Israel) wrote, 7.8 years ago:

This is not a trivial question at all, since a reliable score depends on a good evolutionary model, and that is very hard to estimate correctly. phyloP from the PHAST package can do this (with the -b option), but you will need to generate a model using phyloFit (which requires a phylogenetic tree and an alignment of sequences under neutral evolution) or get a model file from somewhere.
Another option is rate4site, which does something pretty similar but with a simpler model (Jukes-Cantor); it requires an alignment and a phylogenetic tree.
I have to mention that you can always compute the entropy of each column and use it as a conservation score, but this would be wrong unless the tree looks close enough to a star, since column entropy treats every sequence as an independent observation and ignores shared ancestry.
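
For what it is worth, the per-column entropy idea can be sketched in a few lines of Python. This is only an illustration, not what phyloP or rate4site do. The function names (`read_fasta`, `column_conservation`), the example sequences, and the choice to ignore gaps and ambiguity codes and to rescale the score as 1 - H / log2(4) are all assumptions made for this sketch, and the star-tree caveat above still applies.

```python
import math
from collections import Counter


def read_fasta(lines):
    """Parse FASTA text lines into a list of (name, sequence) pairs."""
    records, name, chunks = [], None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def column_conservation(seqs, alphabet="ACGT"):
    """Return one score per column: 1 - H / log2(len(alphabet)).

    H is the Shannon entropy of the base frequencies in the column.
    Gaps and ambiguity codes are ignored (an assumption of this sketch);
    a column with no valid bases gets NaN.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences differ in length; not an alignment")
    max_entropy = math.log2(len(alphabet))
    scores = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs if s[i] in alphabet)
        total = sum(counts.values())
        if total == 0:
            scores.append(float("nan"))
            continue
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in counts.values())
        scores.append(1.0 - entropy / max_entropy)
    return scores


# Made-up aligned sequences, for illustration only.
example = """>seq1
ACGT-A
>seq2
ACGTTA
>seq3
ACCTTG
>seq4
ATCT-G
""".splitlines()

seqs = [seq for _, seq in read_fasta(example)]
scores = column_conservation(seqs)

# Two-column output: position (1-based column index) and score.
for pos, score in enumerate(scores, start=1):
    print(f"{pos}\t{score:.3f}")
```

To run it on a real file, pass an open file handle to `read_fasta` instead of the example lines. A score of 1 means the column is invariant among the non-gap bases, and lower values mean more variation.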

14134125465346445 (United Kingdom) wrote, 7.8 years ago:

If you are dealing with a multi-FASTA alignment file with sequences from multiple species, and you have large genomic alignments, you may want to try GERP:
http://mendel.stanford.edu/SidowLab/downloads/gerp/
(phyloP is another option, mentioned in another answer)

If you are dealing with alignments of sequences from different individuals of the same species or subspecies, one option is to use software that calculates nucleotide variation statistics per window (of size 1 or larger). You can use a tool like VariScan for that:
http://www.ub.edu/softevol/variscan/
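
As a rough illustration of a per-window statistic, here is a small Python sketch that computes the average pairwise difference per site and averages it over windows. This is not VariScan's implementation. The statistic chosen, the function names (`site_diversity`, `windowed_diversity`), the handling of gaps, and the example sequences are all assumptions made for this sketch.

```python
from itertools import combinations


def site_diversity(column, alphabet="ACGT"):
    """Fraction of sequence pairs that differ at one column.

    Gaps and ambiguity codes are skipped (an assumption of this sketch).
    Returns None when fewer than two valid bases are present.
    """
    bases = [b for b in column if b in alphabet]
    pairs = list(combinations(bases, 2))
    if not pairs:
        return None
    return sum(a != b for a, b in pairs) / len(pairs)


def windowed_diversity(seqs, window=1, step=None):
    """Return (start, end, mean per-site diversity) for each window.

    Coordinates are 1-based and inclusive. Windows do not overlap
    unless a smaller step is given.
    """
    step = step or window
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences differ in length; not an alignment")
    per_site = [site_diversity([s[i] for s in seqs]) for i in range(length)]
    results = []
    for start in range(0, length - window + 1, step):
        values = [v for v in per_site[start:start + window] if v is not None]
        mean = sum(values) / len(values) if values else None
        results.append((start + 1, start + window, mean))
    return results


# Made-up aligned sequences from four hypothetical individuals.
individuals = ["ACGTAC", "ACGTTC", "ACCTTC", "ATCTTG"]
results = windowed_diversity(individuals, window=3)

for start, end, value in results:
    print(f"{start}\t{end}\t{value:.4f}")
```

With `window=1` this gives one value per alignment column, which is the closest match to the two-column output asked for in the question.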

Josh Herr (University of Nebraska) wrote, 7.8 years ago:

There are a few ways to do this with a text file at the command line, and many programs can visualize site conservation if you have a curated alignment. Here are some links (List of Alignment Visualization Software and Alignment Tools) with a few you can try.
