I need to find the length of each protein sequence in a FASTA file, but I couldn't get it working with the code below. This first piece of code was meant to count how many proteins there are in total and to find the length of each sequence.
The attached image just shows what I want the code to look for. I don't know what is missing in the code.
    arq = open("genoma9.faa")
    conteudo = arq.read()
    print(conteudo)

    fh = open("genoma9.faa")
    n = 0
    for line in fh:
        if line.startswith(">"):
            n += 1
            print(line)
    proteins = line.count(">")
    print("Total of Proteins: " + str(proteins))
I am trying to count the characters in the middle, that is, the sequence lines between the >WP header lines:
    >WP_013277001.1 DNA polymerase III subunit beta [Acetohalobium arabaticum]
    MQIKIDRKNFYDGIQTVRKAISSKSTLPILSGILIETQEKKLKLVGTDLELGIECRVDANIIKDGAIVLPANHLANIVRE
    LPNKELELELKKDNKIEISCGLSQFKIHGSPADEYPLLPEVGSGIEYTLSQEKFQAMINRIKFATSDDESRPFLTGGLLS
you said. FAA File Sequence
please, do so now.
Answer from the other post:
Have you tried running this piece of code? It looks like it has an indentation error.
This post is the same as your previous one, "Print the size of a protein". Stop asking new questions and update your original post.
I reposted because I deleted the other one since I didn't post the code in the old post.
The edit button is for edits; there is no need to delete.
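For what it's worth, here is a minimal sketch of one way to get both numbers, assuming the file is plain FASTA: every record starts with a ">" header line, and the sequence is everything on the following lines up to the next header. The function name `sequence_lengths` and the `sample` list are made up for illustration; the sample only contains the two sequence lines shown in the question, so its length is not necessarily the full length of that protein.

```python
def sequence_lengths(lines):
    """Map each FASTA header to the number of residues in its sequence."""
    lengths = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading ">"
            lengths[header] = 0
        elif header is not None:
            # A sequence line: add its characters to the current record.
            lengths[header] += len(line)
    return lengths


# In-memory sample taken from the question (hypothetical stand-in for the file).
sample = [
    ">WP_013277001.1 DNA polymerase III subunit beta [Acetohalobium arabaticum]\n",
    "MQIKIDRKNFYDGIQTVRKAISSKSTLPILSGILIETQEKKLKLVGTDLELGIECRVDANIIKDGAIVLPANHLANIVRE\n",
    "LPNKELELELKKDNKIEISCGLSQFKIHGSPADEYPLLPEVGSGIEYTLSQEKFQAMINRIKFATSDDESRPFLTGGLLS\n",
]

lengths = sequence_lengths(sample)
print("Total of Proteins: " + str(len(lengths)))
for header, size in lengths.items():
    # header.split()[0] is the accession, e.g. WP_013277001.1
    print(header.split()[0], size)
```

With the real file, the same function should work on the open file handle, for example `with open("genoma9.faa") as fh: lengths = sequence_lengths(fh)`. The main difference from the code in the question is that the count comes from the number of headers seen, not from `line.count(">")`, which only looks at the last line read.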