get gene name from rsID
12 weeks ago
a3532321 • 0

I have a list of rsIDs in an xlsx file and need the gene name for each one. Running this command on a single rsID returns the gene name:

esearch -db snp -query "rs573455" | esummary | xtract -pattern GENE_E -element NAME | sort | uniq
CEP164

But when I run the same pipeline from the Python code below, I only get a gene name for some of the rsIDs. Why is this happening?

import subprocess

rsIDs = [
    "rs573455",
    "rs7215121",
    "rs2873296",
    "rs6672420",
    "rs6664445"
]

for rsID in rsIDs:
    query = f"esearch -db snp -query {rsID} | esummary | xtract -pattern GENE_E -element NAME | sort | uniq"
    result = subprocess.run(query, shell=True, stdout=subprocess.PIPE, text=True)

    print(f"{rsID}:")
    print(result.stdout)

Output:

rs573455: 
rs7215121:
rs2873296:
rs6672420: RUNX3 RUNX3-AS1
rs6664445: SPOCD1
12 weeks ago
Ram 43k

Not all variants fall in coding regions. For example, rs2873296 is NC_000001.10:g.21721038A>G, which lies in a non-coding region, so no gene name is reported for it.
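An empty result can therefore mean "no gene annotated", but in the script it could also hide a failed lookup, since stderr is never captured. Here is a minimal sketch of one way to tell the two cases apart. It assumes EDirect is installed and on the PATH. The helper names (`build_pipeline`, `run_shell`, `lookup_genes`) and the `pause` value are hypothetical choices for illustration, not part of EDirect.

```python
import shlex
import subprocess
import time


def build_pipeline(rs_id):
    """Build the EDirect pipeline from the question for a single rsID."""
    # shlex.quote keeps an unexpected character in the ID from being
    # interpreted by the shell.
    return (
        f"esearch -db snp -query {shlex.quote(rs_id)} "
        "| esummary | xtract -pattern GENE_E -element NAME | sort | uniq"
    )


def run_shell(command):
    """Run a shell command and return (exit_code, stdout, stderr)."""
    done = subprocess.run(command, shell=True, capture_output=True, text=True)
    return done.returncode, done.stdout, done.stderr


def lookup_genes(rs_ids, runner=run_shell, pause=0.5):
    """Map each rsID to a list of gene names, [] if none, None on failure.

    `pause` is an assumed, conservative delay between lookups to stay
    under NCBI's request rate limit; tune it for your own setup.
    """
    results = {}
    for rs_id in rs_ids:
        code, out, err = runner(build_pipeline(rs_id))
        genes = out.split()  # names are separated by tabs or newlines
        if genes:
            results[rs_id] = genes
        elif code != 0 or err.strip():
            # The exit code of a pipeline is that of its last command
            # (uniq here), so stderr is usually the more useful signal.
            results[rs_id] = None
        else:
            results[rs_id] = []  # lookup ran cleanly, no gene annotated
        time.sleep(pause)
    return results


# Example call (needs EDirect and network access):
# print(lookup_genes(["rs573455", "rs2873296", "rs6672420"]))
```

If an rsID that works on the command line comes back as `None` here, printing the captured stderr for that ID should show what went wrong. An ID that comes back as `[]` most likely has no gene annotation, as described above.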
