Profile Matrix That Assigns the Highest Possible Score
12.3 years ago
Haluk ▴ 190


I must design a profile matrix that assigns the highest possible score to the motifs ATATA, CATCA, and GGATA. To do so, I used Biopython and got the following result.

   A   C   G   T
X 1.0 1.0 1.0 0.0
X 1.0 0.0 1.0 1.0
X 2.0 0.0 0.0 1.0
X 0.0 1.0 0.0 2.0
A 3.0 0.0 0.0 0.0

But I don't understand what the Xs mean, and I'm not sure about the result. Do you think it is correct?

pssm biopython python • 1.9k views
12.1 years ago
Ahill ★ 1.9k

How are you going to use this PSSM? The matrix values are essentially correct: they count the number of occurrences of each nucleotide at each position in your 3 sequences. For example, 3 out of 3 bases are "A" in the 5th position. But there are a number of different flavours of PSSM that may include substitution scoring or log-ratio transformations, depending on what question you are trying to answer and what software you are using this PSSM with. As it stands, your matrix is a correct count of each base at each position in your sequences.
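To make the counting concrete, here is a minimal sketch in plain Python (not Biopython) that rebuilds the count matrix from the three motifs and then applies one possible log-ratio transformation. The function names, the pseudocount of 1, and the uniform background of 0.25 per base are all assumptions chosen for illustration; other tools may use different conventions.

```python
import math
from collections import Counter

ALPHABET = "ACGT"
motifs = ["ATATA", "CATCA", "GGATA"]


def count_matrix(seqs):
    """Return one {base: count} dict per position (column) of the motifs."""
    rows = []
    for column in zip(*seqs):
        tally = Counter(column)
        rows.append({base: tally[base] for base in ALPHABET})
    return rows


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert counts to log2(observed frequency / background frequency).

    The pseudocount and the uniform background are illustrative assumptions.
    """
    rows = []
    for row in counts:
        total = sum(row.values()) + pseudocount * len(ALPHABET)
        rows.append(
            {
                base: math.log2((row[base] + pseudocount) / total / background)
                for base in ALPHABET
            }
        )
    return rows


def score(seq, matrix):
    """Sum the matrix entries selected by each base of seq."""
    return sum(matrix[i][base] for i, base in enumerate(seq))


counts = count_matrix(motifs)
pssm = log_odds_matrix(counts)

# Print the raw counts in the same A C G T column order as the question.
for row in counts:
    print(" ".join(f"{row[base]:.1f}" for base in ALPHABET))

# Score each motif against the raw counts and against the log-odds matrix.
for motif in motifs:
    print(motif, score(motif, counts), round(score(motif, pssm), 3))
```

Scoring against the raw counts simply adds up how often each base of the motif was seen at its position, so which transformation is appropriate depends on what "highest possible score" is meant to measure.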

I'm no Biopython user, but it looks to me like the "X" in the first 4 positions marks positions where there is ambiguity: no base is shared by all 3 sequences there (positions 3 and 4 have only a 2-out-of-3 majority). The 5th position is the only one where the same base ("A") is present in all 3 sequences, and so it is labelled "A" in the 5th row.
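As a rough illustration of how such row labels could arise, the sketch below computes a threshold-based consensus in plain Python. The `consensus` helper, the 0.7 threshold, and the "X" placeholder are assumptions made for this example; I believe Biopython's older `dumb_consensus` works along similar lines, but I have not checked this against the library.

```python
from collections import Counter

motifs = ["ATATA", "CATCA", "GGATA"]


def consensus(seqs, threshold=0.7, ambiguous="X"):
    """Keep the most common base of a column only if its fraction reaches the threshold.

    Hypothetical helper: the threshold and the ambiguity symbol are assumptions.
    """
    labels = []
    for column in zip(*seqs):
        base, n = Counter(column).most_common(1)[0]
        labels.append(base if n / len(column) >= threshold else ambiguous)
    return "".join(labels)


print(consensus(motifs))                 # strict threshold
print(consensus(motifs, threshold=0.5))  # looser threshold
```

Under the 0.7 assumption, a 2-out-of-3 majority (about 0.67) falls just short, which would explain why positions 3 and 4 are also labelled "X"; with a threshold of 0.5 they would be labelled "A" and "T" instead.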

