I adapted GTF tools to work with Python 3. The demo.gtf runs fine, but when I try it with Homo_sapiens.GRCh38.103.gtf from Ensembl I get a decoding error:
f = open(gtf)
f.readline()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
I tried playing around with the loading. Specifying utf-8 doesn't help.
f = open(gtf, encoding='utf-8')
f.readline()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
ASCII also fails.
f = open(gtf, encoding="ascii")
f.readline()
UnicodeDecodeError: 'ascii' codec can't decode byte 0x8b in position 1: ordinal not in range(128)
Using encoding='latin-1' works, but the resulting string is gibberish.
Reading in binary mode works, but I need a string rather than bytes.
f = open(gtf, 'rb')
f.readline()
Python 2 also throws an error later on in the script, but judging by the printouts it has only loaded gibberish to get that far.
I don't think the decompression is the problem; I ran gunzip -d Homo_sapiens.GRCh38.103.abinitio.gtf.gz
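For reference, byte 0x8b at position 1 is the second byte of the gzip magic number (0x1f 0x8b), so the file being opened may still be compressed despite its .gtf name. The sketch below is one way to check that and read the file as text either way. The helper names is_gzipped and open_gtf are made up for illustration and are not part of GTF tools, and the utf-8 encoding is an assumption.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of any gzip stream


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic number."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def open_gtf(path):
    """Open a GTF file in text mode, whether or not it is gzip-compressed.

    Hypothetical helper: assumes the decompressed content is utf-8 text.
    """
    if is_gzipped(path):
        # "rt" makes gzip.open decompress and decode to str in one step
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, encoding="utf-8")


# Usage, with `gtf` being the path from the snippets above:
#     with open_gtf(gtf) as f:
#         print(f.readline())
```

If is_gzipped(gtf) returns True, the file on disk is still a gzip archive, which would be consistent with every text codec failing at position 1.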
Any ideas on how to get this genome release to work?
Thanks!
Homo_sapiens.GRCh38.102.gtf works fine; it's just Homo_sapiens.GRCh38.103.gtf that's failing.