Edit: Never mind, I realized that chromosome.number has to be specified as a str ('17' rather than 17). Solved.
Good afternoon, this question is related to usage of pyGeno.
Based on this documentation, we can query for specific details as such:
#even complex stuff
exons = myChromosome.get(Exons, {'start >=' : x1, 'stop <' : x2})
hlaGenes = myGenome.get(Gene, {'name like' : 'HLA'})
sry = myGenome.get(Transcript, { "gene.name" : 'SRY' })
Unfortunately, none of these commands seem to work, while basic commands for getting specific genes based on their ids work:
#in this case both queries will yield the same result
myGene.get(Protein, id = "ENSID...")
myGenome.get(Protein, id = "ENSID...")
In this situation, I am attempting to retrieve a list of genes within a particular set of coordinates on a particular chromosome. To illustrate the problem, I use .get to retrieve p53 (Chr17 in humans):
# getting the gene based on id
gene_example = g.get(Gene, id = 'ENSG00000141510')
# confirming the chromosome - note that I index with [0] because .get returns a single-element list containing the Raba object
print(gene_example[0].chromosome.number)
>17
# now, I get the start and end coords
x1 = gene_example[0].start
x2 = gene_example[0].end
# finally, I test getting the gene using the coords
gene_test = g.get(Gene, {'start >=': x1, 'end <=': x2, 'chromosome.number': 17})
Ultimately, gene_test ends up empty because g.get can't find anything within those coordinates. Even when I tested by replacing x1 and x2 with nearly the entire chromosomal length, no genes were identified.
Would anyone happen to know the correct syntax for this? Perhaps it has changed in recent updates. Thank you!
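For anyone who hits the same thing, here is a minimal sketch of why the integer filter quietly returned nothing. It is plain Python, not pyGeno: the `genes` list, the `find_genes` helper, and all coordinates are made up for illustration. The only assumption carried over from the edit at the top is that the chromosome number is stored as a string, so a filter value of 17 (an int) never equals the stored '17'.

```python
# Toy stand-in for a gene table. This is NOT pyGeno; names and coordinates are invented.
# Assumption (from the edit at the top): chromosome numbers are stored as strings.
genes = [
    {"name": "geneA", "chromosome_number": "17", "start": 100, "end": 200},
    {"name": "geneB", "chromosome_number": "17", "start": 500, "end": 900},
    {"name": "geneC", "chromosome_number": "X", "start": 100, "end": 200},
]


def find_genes(records, start_min, end_max, chromosome_number):
    """Return the names of records inside [start_min, end_max] on one chromosome.

    The chromosome comparison is a strict equality with no type coercion,
    so the int 17 does not match the string "17".
    """
    return [
        r["name"]
        for r in records
        if r["start"] >= start_min
        and r["end"] <= end_max
        and r["chromosome_number"] == chromosome_number
    ]


with_int = find_genes(genes, 100, 200, 17)    # int filter: nothing matches
with_str = find_genes(genes, 100, 200, "17")  # str filter: matches geneA

print(with_int)
print(with_str)

# In pyGeno terms, the call that worked for me only differs in the last value:
# g.get(Gene, {'start >=': x1, 'end <=': x2, 'chromosome.number': '17'})
```

Storing the number as a string would also fit chromosomes such as X, Y, and MT, which have no integer form, though I have not checked how pyGeno handles this internally.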