get gene name from rsID
4 months ago
a3532321 • 0

I have a list of rsIDs in an xlsx file, and I need the gene name for each one. Running this command for a single rsID returns the gene name:

esearch -db snp -query "rs573455" | esummary | xtract -pattern GENE_E -element NAME | sort | uniq
CEP164

But when I run the following Python code, gene names come back for only some of the rsIDs. Why is this happening?

import subprocess

rsIDs = [
    "rs573455",
    "rs7215121",
    "rs2873296",
    "rs6672420",
    "rs6664445"
]

for rsID in rsIDs:
    query = f"esearch -db snp -query {rsID} | esummary | xtract -pattern GENE_E -element NAME | sort | uniq"
    result = subprocess.run(query, shell=True, stdout=subprocess.PIPE, text=True)

    print(f"{rsID}:")
    print(result.stdout)

Output:

rs573455: 
rs7215121:
rs2873296:
rs6672420: RUNX3 RUNX3-AS1
rs6664445: SPOCD1
dbSNP • 431 views
4 months ago
Ram 43k

Not all variants fall in coding regions. rs2873296, for example, is NC_000001.10:g.21721038A>G, which is in a non-coding region.
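Note that rs573455 returned CEP164 when run by hand in the question but came back empty inside the loop, so an empty result may not always mean "no gene". A minimal sketch for telling the two cases apart is below. It is an assumption, not something confirmed in this thread, that a failed request leaves a message on stderr; the exit code is not reliable here because a shell pipeline reports the status of its last command (`uniq`). The helper names (`run_shell`, `gene_names`, `report`) and the `pause` parameter are made up for illustration. The pause is there because NCBI limits unauthenticated E-utilities traffic to roughly 3 requests per second, and each pipeline makes more than one request.

```python
import subprocess
import time

# Same pipeline as in the question, with the rsID quoted.
PIPELINE = ('esearch -db snp -query "{rsid}" | esummary | '
            "xtract -pattern GENE_E -element NAME | sort | uniq")


def run_shell(command):
    """Run a shell pipeline and return (returncode, stdout, stderr)."""
    done = subprocess.run(command, shell=True, capture_output=True, text=True)
    return done.returncode, done.stdout, done.stderr


def gene_names(rsid, runner=run_shell):
    """Return (genes, status) for one rsID.

    `runner` is a hypothetical hook so the logic can be tried without
    network access; by default it calls the real EDirect pipeline.
    """
    code, out, err = runner(PIPELINE.format(rsid=rsid))
    genes = out.split()
    if genes:
        return genes, "ok"
    # Assumption: a failed or throttled request writes something to stderr.
    if code != 0 or err.strip():
        return [], "failed: " + err.strip()
    return [], "no gene annotated"


def report(rsids, runner=run_shell, pause=0.5):
    """Look up each rsID in turn, pausing between lookups."""
    results = {}
    for rsid in rsids:
        results[rsid] = gene_names(rsid, runner)
        time.sleep(pause)
    return results


# Example call (needs EDirect installed and network access):
# for rsid, (genes, status) in report(["rs573455", "rs2873296"]).items():
#     print(rsid, genes, status)
```

If an rsID reports "failed", rerunning it on its own should show whether the gap was a transient request problem; if it reports "no gene annotated", the variant most likely has no gene in its dbSNP summary, as with rs2873296 above.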
