Problem with Unicode while using BioPython on Windows
3.9 years ago
williamsam • 0

Hi all-

I am a graduate student working remotely with an undergraduate student on a phylogenetics project. I was showing her how to use the Phylo package (BioPython) to open a Newick tree, and we're running into an error. I am on a Mac and she's on a PC, and I have been unable to replicate the error message, so I think it must be an issue with Windows.

Here is the code:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

# Phylo is the part of BioPython designed specifically for working with trees
from Bio import Phylo
import os

# change the working directory to the folder containing the tree
os.chdir("/Users/xxx/Desktop/project/yyy/mafft_phyml")

# read in the rooted tree
tree = Phylo.read("final_tree.nwk", "newick")

print(tree)
str(tree)

# visualize the tree
Phylo.draw(tree)

This works completely fine for me, but she gets an error:

File "C:\Users\nameredacted\anaconda3\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 729: character maps to <undefined>
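For anyone debugging the same traceback: the cp1252.py frame means Python was decoding the file as Windows-1252, which has no character assigned to byte 0x8d. One way such a byte can end up in a text file is as the second byte of a UTF-8 encoded character (for example, "č" is stored as C4 8D), though whether that is what happened here is only an assumption. Below is a minimal stdlib-only sketch for locating the offending bytes. The helper name find_undecodable and the sample tree string are made up for illustration, and the byte-by-byte check is only valid for single-byte encodings such as cp1252.

```python
def find_undecodable(data, encoding="cp1252"):
    """Return (offset, byte value) pairs that a single-byte `encoding` cannot decode."""
    bad = []
    for offset in range(len(data)):
        try:
            # Decode one byte at a time; fine for cp1252, not for multi-byte encodings
            data[offset:offset + 1].decode(encoding)
        except UnicodeDecodeError:
            bad.append((offset, data[offset]))
    return bad


# Hypothetical Newick string with a non-ASCII taxon name, stored as UTF-8 bytes
sample = "(Ko\u010dka:0.1,B:0.2);".encode("utf-8")

# The UTF-8 bytes for "\u010d" are C4 8D; cp1252 accepts C4 but rejects 8D
print(find_undecodable(sample))

# The same bytes decode without error when UTF-8 is requested explicitly
decoded = sample.decode("utf-8")
```

To run this against the real file, read it in binary mode first, e.g. pathlib.Path("final_tree.nwk").read_bytes(), and pass the result to the helper. An offset of 729 in the output would match the position in the traceback.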

I understand that the error is due to some sort of encoding issue. However, every help request I've found on the topic seems to involve either heavy-duty programming (not the basic Python we're doing) or someone using a special character such as the euro sign. I had her retype everything into a new script without copying and pasting, and read in the file by its full path instead of using os.chdir, but nothing has worked so far. It is very clearly the Phylo.read command that is failing, and I'm worried that this issue will make it difficult for her to use any of the BioPython read functions.

She even has the # -*- coding: utf-8 -*- line at the top of the Python script. Does anyone know if this could be something with Windows or even with phyml (the Newick file was written by phyml)? Any help would be much appreciated.

Thanks!

Edited to add: I know that with a normal open() function, I could have her add encoding="utf-8" to the open command. We tried this in the Phylo.read command and it didn't work. Does anyone know of a way to do this within BioPython read commands?
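One possible approach, sketched here for later readers: Phylo.read accepts an already-open file handle as well as a filename, so the file can be opened with an explicit encoding first and the handle passed in. The wrapper name read_tree is hypothetical, and this assumes the file really is UTF-8; if it is not, a different encoding argument would be needed.

```python
def read_tree(path, encoding="utf-8"):
    """Parse a Newick file, decoding it explicitly instead of using the platform default."""
    # Imported inside the function so this module still loads where Biopython is missing
    from Bio import Phylo

    # open() does the decoding; Phylo.read only sees text from the handle
    with open(path, encoding=encoding) as handle:
        return Phylo.read(handle, "newick")
```

With this in place, tree = read_tree("final_tree.nwk") would stand in for the Phylo.read call in the script above, and the result should be the same on Mac and Windows because the encoding no longer depends on the operating system.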

BioPython python phylo phylogenetics Windows • 991 views

You probably need to play with the locale settings e.g. https://stackoverflow.com/questions/7165108/in-os-x-lion-lang-is-not-set-to-utf-8-how-to-fix-it

I believe those errors can also arise if the interpreter was not built correctly, so she may need to (re)install a Python binary.
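Before changing any settings, it may help to check what each machine actually uses by default. When open() is called without an encoding argument, Python falls back to locale.getpreferredencoding(False), which is typically UTF-8 on macOS and often cp1252 on Western-locale Windows installs. A rough sketch (the function name describe_text_defaults is made up):

```python
import locale
import sys


def describe_text_defaults():
    """Collect the settings that decide how open() decodes text by default."""
    return {
        # Encoding used by open() when no encoding= argument is given
        "preferred_encoding": locale.getpreferredencoding(False),
        # True when Python was started with -X utf8 or PYTHONUTF8=1 (Python 3.7+)
        "utf8_mode": bool(sys.flags.utf8_mode),
        # Encoding used for file names, not file contents
        "filesystem_encoding": sys.getfilesystemencoding(),
    }


print(describe_text_defaults())
```

If the two machines report different values for preferred_encoding, that would be consistent with the script working on one and failing on the other. Setting the environment variable PYTHONUTF8=1 before starting Python (3.7 or later) switches the default to UTF-8 without touching the system locale.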


Thank you for the response!
