3.7 years ago
whyaberrate
Hello everyone, I'm trying to convert a BED file into a list of gene IDs. I'm using the cruzdb module to convert coordinates into gene names, but I keep running into issues. Here is a snippet of the BED file:
chr3 180984538 180984632 CPEB4_K562_IDR 1000 - 4.97411347309479 6.05868285076205 -1 -1
chr1 203853992 203854048 CPEB4_K562_IDR 1000 + 5.76313501648548 7.95820203088544 -1 -1
chr7 119716099 119716134 CPEB4_K562_IDR 1000 - 5.28139795449803 4.78621165693166 -1 -1
And here's the Python script I'm using:
from cruzdb import Genome
genomedatabase = Genome(db='hg19')
cpeb4_peaks = open("/Users/ya8eb/Documents/Research/cpeb4/rna_binding_protein_data/cpeb4_clipseq_peaks.txt")
for i, line in enumerate(cpeb4_peaks):
    toks = line.split()
    if i == 0:
        print("\t".join(['gene'] + toks))
    else:
        chrom, posns = toks[0].split(":")
        start, end = map(int, posns.rstrip("|").split("-"))
        genes = genomedatabase.bin_query('refGene', chrom, start, end)
        print("\t".join(["|".join(set(g.name2 for g in genes))] + toks))
The error I'm getting is:
Traceback (most recent call last):
  File "/Users/ya8eb/Documents/Research/cpeb4/rna_binding_protein_data/convert_bed_to_genes.py", line 1, in <module>
    from cruzdb import Genome
  File "/Users/ya8eb/opt/anaconda3/lib/python3.7/site-packages/cruzdb/__init__.py", line 5, in <module>
    from . import soup
  File "/Users/ya8eb/opt/anaconda3/lib/python3.7/site-packages/cruzdb/soup.py", line 1, in <module>
    from . import sqlsoup
  File "/Users/ya8eb/opt/anaconda3/lib/python3.7/site-packages/cruzdb/sqlsoup.py", line 458
    except KeyError, ke:
                   ^
SyntaxError: invalid syntax
I'm guessing this has something to do with cruzdb's compatibility with Python 2 vs Python 3, but I'm not sure. Does anyone have a workaround for this, or a suggestion for a different way to convert BED files into a gene list? I'm open to using something else; I just thought this would be the most convenient. Thanks so much for the help!
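For context on the Python 2 vs Python 3 guess: the line flagged in the traceback, "except KeyError, ke:", uses the comma form of except, which Python 3 rejects. Below is a minimal sketch of the difference. The lookup function and the dictionary contents are hypothetical and only there for illustration; this is not cruzdb code and not a fix for the package itself.

```python
# Python 2 only (raises SyntaxError on Python 3, as in the traceback):
#     except KeyError, ke:
# Python 3 form (also accepted by Python 2.6 and later):
#     except KeyError as ke:

def lookup(table, key):
    """Return table[key], or None if the key is missing."""
    try:
        return table[key]
    except KeyError as ke:
        # 'ke' is the exception instance; printing it shows the missing key
        print("missing key:", ke)
        return None


# Hypothetical placeholder data, not real annotations
genes = {"chr3": "GENE_A"}
print(lookup(genes, "chr3"))
print(lookup(genes, "chrX"))
```

If that is indeed the cause, the installed cruzdb was presumably written for Python 2, so it would fail at import time under Python 3.7 regardless of what the rest of my script does.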