Question: Using cruzdb to retrieve SNP sequence including flanking sequence
16 months ago, yarrowmadrona wrote:

I want to use cruzdb to look up a list of SNPs by rs ID, read from a text file, and retrieve the sequence for each SNP including 200 base pairs of flanking sequence. I can do this in the UCSC Table Browser by setting "Output format" to "sequence". The code below is pieced together from previous posts.

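from cruzdb import Genome
import sys

# Path to a text file with one rs ID per line
file_in = sys.argv[1]

hg19 = Genome(db='hg19')
snp147 = hg19.snp147

with open(file_in) as file_handle:
    for line in file_handle:
        fields = line.split()
        # Skip blank lines and anything that is not an rs ID
        if not fields or not fields[0].startswith("rs"):
            continue
        rs = fields[0]
        # Prints the matching snp147 row, or None if the ID is not found
        print(snp147.filter_by(name=rs).first())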

Unfortunately, the output contains no sequence information. I also came across the SNP sequence table (snp147Seq), but I am not sure how to use it:

hg19.snp147Seq.filter_by(name='rs9923231')
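Since the snp147 row carries coordinates rather than sequence, one way to approach the flanking part is to compute the window first and then fetch that region from whatever sequence source is available. The sketch below is not cruzdb code: `flank_window` and `extract_with_flanks` are hypothetical helpers, and it assumes the coordinates follow the UCSC table convention (0-based start, exclusive end), as in the chromStart/chromEnd columns. The chromosome sequence is passed in as a plain string purely for illustration.

```python
def flank_window(chrom_start, chrom_end, flank=200):
    """Return a 0-based, half-open (start, end) window around a SNP.

    Assumes chrom_start/chrom_end use the UCSC table convention
    (0-based start, exclusive end).
    """
    start = max(0, chrom_start - flank)  # clamp at the start of the chromosome
    end = chrom_end + flank
    return start, end


def extract_with_flanks(chrom_seq, chrom_start, chrom_end, flank=200):
    """Slice the flanked window out of an in-memory sequence string."""
    start, end = flank_window(chrom_start, chrom_end, flank)
    return chrom_seq[start:end]


# Toy example: a 1 bp SNP at 0-based position 5 with 2 bp of flank.
toy_chrom = "AACCGGTTAACC"
print(flank_window(5, 6, flank=2))                    # (3, 8)
print(extract_with_flanks(toy_chrom, 5, 6, flank=2))  # CGGTT
```

The window is not clamped at the chromosome end here; slicing a Python string past its length simply truncates, but a remote sequence service may behave differently.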

15 months ago, yarrowmadrona wrote:

In case anyone is interested, I decided not to use the cruzdb module and to query dbSNP directly instead. Much easier.

https://www.ncbi.nlm.nih.gov/projects/SNP/SNPeutils.htm
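
For anyone who wants to script the dbSNP route, here is a minimal sketch of building an E-utilities efetch request for a batch of rs IDs. It is not the exact query used above. The helper name `dbsnp_efetch_url` is hypothetical, and both the `rettype="fasta"` value and the stripping of the "rs" prefix are assumptions that should be checked against the linked page before use. The code only builds the URL and does not send the request.

```python
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def dbsnp_efetch_url(rs_ids, rettype="fasta"):
    """Build an efetch URL for a batch of rs IDs (hypothetical helper).

    rettype="fasta" is an assumption; confirm the supported values
    in the dbSNP E-utilities documentation.
    """
    # Assumption: db=snp takes numeric IDs, so drop a leading "rs".
    numeric_ids = [rs[2:] if rs.lower().startswith("rs") else rs
                   for rs in rs_ids]
    params = {"db": "snp", "id": ",".join(numeric_ids), "rettype": rettype}
    return EFETCH_URL + "?" + urlencode(params)


print(dbsnp_efetch_url(["rs9923231"]))
```

The resulting URL can be opened in a browser or fetched with any HTTP client; for long lists, splitting the IDs into smaller batches keeps each request manageable.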
