Rosalind - Consensus profile
12 weeks ago

Hello all, this is my first post on Biostars, so please bear with me. I am having trouble with the "Consensus and Profile" problem in the Bioinformatics Stronghold on Rosalind, and I cannot work out why my answer is wrong. With the smaller sample dataset my code produces the correct answer, but with the larger dataset Rosalind says my answer is wrong.

Here is my code:

data = open('/Users/danielpintard/Downloads/rosalind_cons (2).txt', 'r').read()

#format data in FASTA file to operate on
if '>' in data:
    data_array = data.split('>')
    for i in data_array:
        if i == '':
            data_array.remove(i)
    for i in data_array:
        data_array[data_array.index(i)] = i.split('\n', 2)

#create profile
prof_sequences = []

for i in data_array:
    data_array[data_array.index(i)] = i[1]
    prof_sequences.append(i[1])

n = len(prof_sequences[0])

profile_matrix = {
    'A': [0]*n,
    'C': [0]*n,
    'G': [0]*n,
    'T': [0]*n,
}

for dna in prof_sequences:
    for position, nucleotide in enumerate(dna):
        profile_matrix[nucleotide][position] += 1

#find consensus string
result = []
for position in range(n):
    max_count = 0
    max_nucleotide = None
    for nucleotide in profile_matrix:
        if profile_matrix[nucleotide][position] > max_count:
            max_count = profile_matrix[nucleotide][position]
            max_nucleotide = nucleotide
    result.append(max_nucleotide)

print(''.join(result))
print('A:', ' '.join(map(str, profile_matrix['A'])))
print('C:', ' '.join(map(str, profile_matrix['C'])))
print('G:', ' '.join(map(str, profile_matrix['G'])))
print('T:', ' '.join(map(str, profile_matrix['T'])))


I think whatever mistake I am making here is beyond my current level of expertise. Even when I compare my code to other people's solutions, I still can't see what is wrong with it.
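One thing that may be worth checking: `i.split('\n', 2)` keeps only the first line after the header in `i[1]`. If the larger dataset wraps each sequence across several lines (an assumption here, since the input file is not shown), every sequence would be cut short. A minimal sketch of a parser that joins wrapped lines follows; `parse_fasta` and the `sample` string are hypothetical names used only for illustration, not part of the code above.

```python
def parse_fasta(text):
    """Return the sequences in a FASTA string, joining wrapped lines per record."""
    sequences = []
    for record in text.split('>'):
        lines = record.strip().splitlines()
        if not lines:
            continue  # skip the empty chunk before the first '>'
        # lines[0] is the header; the remaining lines are pieces of one sequence
        sequences.append(''.join(line.strip() for line in lines[1:]))
    return sequences


# Hypothetical two-record input where each sequence spans two lines
sample = ">Rosalind_1\nATCCAGCT\nGGGCAACT\n>Rosalind_2\nATGGATCT\nAAGCAACC\n"
print(parse_fasta(sample))
```

If this is the cause, using the returned list in place of `prof_sequences` should leave the profile and consensus steps unchanged.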

python rosalind