Question

What does the '?' mean in the output of the 'msa' function from the R package 'msa'?

5.0 years ago

YUCHEN CHANG ▴ 10

Hi everyone,

I have recently been using the R package 'msa', and a few things in the output confuse me.

As I understand it, the last line of the MSA output (labeled 'con') is the consensus sequence. According to the msa manual, a residue is reported in the consensus when its frequency at that position exceeds 80% (the default). What, then, does a question mark in this line mean? Does it indicate that several candidate residues occur at a similarly high frequency at that position?
The conservation score (computed by the function 'msaConservationScore') differs somewhat from the consensus sequence, so how can I tell which parts of the alignment actually match? My guess is that these are the regions where the consensus line shows a symbol rather than a hyphen (-).

Thanks everyone!

R msa sequence alignment consensus sequence • 1.5k views

updated 5.0 years ago by zx8754 11k • written 5.0 years ago by YUCHEN CHANG ▴ 10

Conservation symbols in Clustal-style MSA output generally follow this scheme:

An * (asterisk) indicates positions which have a single, fully conserved residue.
A : (colon) indicates conservation between groups of strongly similar properties - scoring > 0.5 in the Gonnet PAM 250 matrix.
A . (period) indicates conservation between groups of weakly similar properties - scoring <= 0.5 in the Gonnet PAM 250 matrix.

The ? in this case appears to indicate that there is no discernible similarity among the residues at that position.
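To make the idea concrete, here is a minimal sketch of a thresholded consensus rule that produces a "?" wherever no single residue is frequent enough. This is an illustration only, not msa's actual code. The function name `consensus_string`, the toy alignment, and the tie handling (a column gets a letter only if exactly one residue reaches the threshold) are all assumptions made for the example. The default threshold of 0.5 and the "?" placeholder mirror what Biostrings' `consensusString` uses by default as far as I know, but please check the msa and Biostrings help pages for the exact behaviour. Gaps are counted like any other character here.

```python
from collections import Counter


def consensus_string(aligned, threshold=0.5, ambiguity="?"):
    """Hypothetical helper: per-column consensus of equal-length aligned strings.

    A column yields a residue only if exactly one character reaches
    `threshold` (as a fraction of rows); otherwise it yields `ambiguity`.
    """
    if not aligned:
        return ""
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")

    consensus = []
    for column in zip(*aligned):  # iterate over alignment columns
        counts = Counter(column)
        # Residues (or gaps) whose relative frequency reaches the threshold
        winners = [res for res, n in counts.items() if n / len(column) >= threshold]
        consensus.append(winners[0] if len(winners) == 1 else ambiguity)
    return "".join(consensus)


# Toy alignment (made up for illustration)
rows = ["MKTA-", "MKSA-", "MRGAV", "MKHCV"]
print(consensus_string(rows))                 # default threshold 0.5
print(consensus_string(rows, threshold=0.8))  # stricter 80% threshold
```

With the toy alignment, the third column (T, S, G, H) has no residue above 25%, so it becomes "?" at either threshold. Raising the threshold to 0.8 also turns the 75% columns into "?". Under this rule a "?" means no single residue won at that position, which can be due to high diversity or to a tie between two equally frequent residues.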

5.0 years ago by GenoMax 141k