GTFtools with Homo_sapiens.GRCh38.103.gtf: encoding error in Python
3.0 years ago
Doreen • 0

I adapted GTFtools to work with Python 3. The demo.gtf runs fine, but when I try it with Homo_sapiens.GRCh38.103.gtf from Ensembl I get a decoding error:

f = open(gtf)
f.readline()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

I tried changing how the file is opened. Specifying utf-8 explicitly doesn't help.

f = open(gtf, encoding='utf-8')
f.readline()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

ASCII also fails.

f = open(gtf, encoding="ascii")
f.readline()
UnicodeDecodeError: 'ascii' codec can't decode byte 0x8b in position 1: ordinal not in range(128)

Using encoding='latin-1' works, but the resulting string is gibberish.

Reading raw bytes works, but I need the decoded string.

f = open(gtf, 'rb')
f.readline()

Python 2 also throws an error later in the script, but judging from the printouts it only gets that far because it loads gibberish.

I don't think decompression was the problem. I ran: gunzip -d Homo_sapiens.GRCh38.103.abinitio.gtf.gz
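
One thing that may be worth ruling out: 0x1f 0x8b is the two-byte signature that starts every gzip stream, so a 0x8b at position 1 is consistent with the file being read still being gzip-compressed. Note also that the gunzip command above names the abinitio file rather than Homo_sapiens.GRCh38.103.gtf. The sketch below checks for that signature and opens the file as text either way. The helper names is_gzipped and open_gtf are made up for illustration and are not part of GTFtools, and it assumes the GTF content itself is UTF-8 text.

```python
import gzip

# Every gzip stream starts with these two bytes (the gzip magic number).
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip signature."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def open_gtf(path):
    """Open a GTF file as text, decompressing on the fly if it is gzipped.

    Hypothetical helper, not part of GTFtools. Assumes UTF-8 content.
    """
    if is_gzipped(path):
        # "rt" makes gzip.open return decoded text lines instead of bytes.
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")
```

If is_gzipped(gtf) returns True, the file on disk is still compressed regardless of its .gtf extension. In that case, replacing f = open(gtf) with f = open_gtf(gtf) should give readable lines from f.readline().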

Any ideas on how to get this genome release to work?

Thanks!

GTFtools Encoding python3 GRCh38.103 • 803 views

Homo_sapiens.GRCh38.102.gtf works fine; it's just Homo_sapiens.GRCh38.103.gtf that's failing.
