Question: [Clustal Omega] Posterior probability output file: meaning of the columns
13 months ago by
ReWeeda50
University of Bologna
ReWeeda50 wrote:

Hi!

Some context first: I clustered protein structures by annotated function and architecture (the structural domains along the sequence), and now I want to build an MSA for each cluster to highlight any relationships that emerge at the sequence level.

I'm trying different tools (MAFFT, Clustal Omega, MUSCLE, etc.) from the command line and exploring their available options.

While testing Clustal Omega (v1.2.1), I found this option:

--posterior-out=<file> Posterior probability output file

It is listed in the command-line --help but not in the README file (http://www.clustal.org/omega/README), which probably refers to an older version of the tool, since it still documents old commands that are no longer supported.
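For anyone who wants to reproduce this, here is a minimal sketch of how the call could be assembled from Python. It assumes the `clustalo` executable is on the PATH; the file names are hypothetical placeholders, and only `--posterior-out` comes from the `--help` text quoted above (`-i` and `-o` are the usual input and output options).

```python
import subprocess

def build_clustalo_command(infile, outfile, posterior_file):
    """Return the argument list for a Clustal Omega run that also
    writes the posterior probability file (file names are placeholders)."""
    return [
        "clustalo",
        "-i", infile,                         # unaligned input sequences
        "-o", outfile,                        # resulting alignment
        f"--posterior-out={posterior_file}",  # option discussed in this thread
    ]

def run_clustalo(infile, outfile, posterior_file):
    """Run the command; raises CalledProcessError if clustalo fails."""
    cmd = build_clustalo_command(infile, outfile, posterior_file)
    subprocess.run(cmd, check=True)

# Hypothetical file names for one cluster.
cmd = build_clustalo_command(
    "cluster1.fasta", "cluster1.aln", "cluster1.posterior.txt"
)
print(" ".join(cmd))
```

Calling `run_clustalo(...)` with the same arguments would execute the command; it is kept separate here so the argument list can be inspected first.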

The output file produced by this option has 7 columns, as follows:

1.i  2.name                       3.L1  4.L2  5.sum       6.sum/L1  7.HH
0    3VPG:A|PDBID|CHAIN|SEQUENCE  310   377   304.243683  0.981431  262.894775
1    1T2D:A|PDBID|CHAIN|SEQUENCE  322   377   307.759918  0.955776  252.773529

(These are the first lines of the output file from one of my tests.)
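To work with this file programmatically, a small parser along these lines may help. It is only a sketch: it assumes the file is whitespace-delimited with the seven columns shown above, that data lines start with an integer index, and that anything else (such as a header line) can be skipped. The field names in `PosteriorRow` are my own labels, not names used by Clustal Omega.

```python
from typing import Iterable, List, NamedTuple

class PosteriorRow(NamedTuple):
    index: int        # column 1 (i)
    name: str         # column 2 (sequence name)
    l1: int           # column 3 (L1)
    l2: int           # column 4 (L2)
    prob_sum: float   # column 5 (sum)
    prob_mean: float  # column 6 (sum/L1)
    hh: float         # column 7 (HH)

def parse_posterior_lines(lines: Iterable[str]) -> List[PosteriorRow]:
    """Parse data lines; skip headers, blank lines and malformed lines."""
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) < 7 or not fields[0].isdigit():
            continue
        # The last five fields are numeric; everything between the index
        # and those fields is treated as the name, in case it has spaces.
        name = " ".join(fields[1:-5])
        l1, l2, prob_sum, prob_mean, hh = fields[-5:]
        rows.append(PosteriorRow(int(fields[0]), name, int(l1), int(l2),
                                 float(prob_sum), float(prob_mean), float(hh)))
    return rows

# The two rows quoted in the question, plus the annotated header.
sample = [
    "1.i 2.name 3.L1 4.L2 5.sum 6.sum/L1 7.HH",
    "0   3VPG:A|PDBID|CHAIN|SEQUENCE 310 377 304.243683  0.981431    262.894775",
    "1   1T2D:A|PDBID|CHAIN|SEQUENCE 322 377 307.759918  0.955776    252.773529",
]
rows = parse_posterior_lines(sample)
for row in rows:
    print(row.name, row.prob_sum, row.hh)
```

For a real file, `parse_posterior_lines(open(path))` would work the same way, with `path` pointing at whatever was passed to `--posterior-out`.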

I don't know what columns 5, 6 and 7 represent.

I read the Clustal Omega article (doi:10.1038/msb.2011.75) and the article on HHalign, the alignment engine used by Clustal Omega (doi:10.1093/bioinformatics/bti125), but found no information about this output.

I think column 5 (sum) is the log-sum-of-odds score discussed in the HHalign article, but I have no idea what column 7 (HH) means.

Can anyone help me?

Thanks in advance!

D.

13 months ago by
ReWeeda50
University of Bologna
ReWeeda50 wrote:

In case anyone needs this in the future:

the answer was given to me by the help desk of the Clustal developers.

Column 5: the sum, over the residues of the sequence, of the probability that each residue is aligned to its corresponding position in the HMM. It is computed when the --posterior-out=<file> option is set.

Column 6: the average probability per position (column 5 divided by L1).

Column 7: an HHalign internal measure of how well a sequence aligns back onto the overall alignment (higher is better).
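As a quick sanity check of the column 6 explanation, the two rows from the question can be recomputed by hand. This sketch only uses the numbers quoted above; the tolerance is my own choice to allow for the six-decimal rounding in the file.

```python
import math

# (L1, column 5 sum, column 6 as reported) for the two rows in the question.
sample_rows = [
    (310, 304.243683, 0.981431),
    (322, 307.759918, 0.955776),
]

def mean_posterior(prob_sum: float, length: int) -> float:
    """Average posterior probability per position: sum / L1."""
    return prob_sum / length

for l1, prob_sum, reported in sample_rows:
    recomputed = mean_posterior(prob_sum, l1)
    # The file prints six decimals, so compare with a small absolute tolerance.
    consistent = math.isclose(recomputed, reported, abs_tol=5e-6)
    print(f"L1={l1} recomputed={recomputed:.6f} reported={reported} ok={consistent}")
```

Both rows agree with the reported value to the printed precision, which fits the reading of column 6 as column 5 divided by L1.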
