Question

Is there a way to fetch genomic sequences at given coordinates without downloading fasta files?

0

6.6 years ago

ericbrenner • 0

So I have a list of start and stop positions along chromosomes in different species, and I'd like to get the corresponding DNA sequence for each set of coordinates. In the past, I've just downloaded the genome as a FASTA file and then used pyfaidx to extract the sequences at the given positions. But now that I'm working with several species at once, I was wondering if there's a tool in Python or R that can fetch the sequences of interest without downloading a bunch of large files. Thanks

Python R sequence genome DNA • 3.2k views

Updated 6.6 years ago by Matt Shirley 10k • written 6.6 years ago by ericbrenner • 0

0

Please do not cross-post to both BioStars and Stack Exchange: https://bioinformatics.stackexchange.com/questions/2543/way-to-get-genomic-sequences-at-given-coordinates-without-downloading-fasta-file

Reply • 6.6 years ago by Emily 23k

score 2 · Answer 1 · 2017-09-19

6.6 years ago

Satyajeet Khare ★ 1.6k

Here is one way to download the sequences using a DAS server. You can write a loop to fetch the sequence for each set of coordinates.
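The link from this answer is not preserved here, so the following is only a rough sketch of what a DAS request could look like. It assumes the UCSC DAS server and its `dna?segment=chrom:start,end` request form (1-based, inclusive coordinates); the function names and the `hg19` assembly are illustrative choices, not something taken from the answer.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: the UCSC DAS endpoint; other DAS servers use different base URLs.
UCSC_DAS = "https://genome.ucsc.edu/cgi-bin/das"


def das_dna_url(assembly, chrom, start, end, server=UCSC_DAS):
    """Build a DAS 'dna' request URL for one region (1-based, inclusive)."""
    return f"{server}/{assembly}/dna?segment={chrom}:{start},{end}"


def parse_das_dna(xml_bytes):
    """Pull the sequence out of a DASDNA XML reply and drop line breaks."""
    root = ET.fromstring(xml_bytes)
    dna = root.find(".//DNA")
    if dna is None or dna.text is None:
        raise ValueError("no <DNA> element in DAS response")
    return "".join(dna.text.split())


def fetch_das_sequence(assembly, chrom, start, end, server=UCSC_DAS):
    """Fetch one region over the network and return it as a plain string."""
    url = das_dna_url(assembly, chrom, start, end, server)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return parse_das_dna(resp.read())


# Example loop (needs network access), with hypothetical regions:
# for chrom, start, end in [("chr1", 100000, 100100), ("chr2", 5000, 5050)]:
#     print(chrom, start, end, fetch_das_sequence("hg19", chrom, start, end))
```

Each region costs one HTTP request, so this suits a modest list of coordinates better than thousands of intervals.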

score 1 · Answer 2 · 2017-09-20

6.6 years ago

Emily 23k

You can use the Ensembl REST API.
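As a minimal sketch of how that might look from Python, the snippet below calls the `GET sequence/region/:species/:region` endpoint with only the standard library. The helper names (`region_url`, `fetch_sequence`) are made up for illustration, and the species names and coordinates in the example loop are placeholders. The `server` argument is there so the same call can be pointed at the GRCh37 server mentioned in the comments on this answer.

```python
import urllib.request
from urllib.parse import quote

ENSEMBL_REST = "https://rest.ensembl.org"
GRCH37_REST = "https://grch37.rest.ensembl.org"  # human GRCh37 only


def region_url(species, chrom, start, end, strand=1, server=ENSEMBL_REST):
    """Build a sequence/region URL (1-based, inclusive) asking for plain text."""
    if start > end:
        raise ValueError("start must be <= end")
    region = f"{chrom}:{start}..{end}:{strand}"
    return (f"{server}/sequence/region/{quote(species)}/{region}"
            "?content-type=text/plain")


def fetch_sequence(species, chrom, start, end, strand=1, server=ENSEMBL_REST):
    """Fetch one region over the network and return the bases as a string."""
    url = region_url(species, chrom, start, end, strand, server)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode().strip()


# Example loop over several species (needs network access):
# regions = [("human", "X", 1000000, 1000100),
#            ("mus_musculus", "1", 3000000, 3000050)]
# for species, chrom, start, end in regions:
#     print(species, chrom, start, end, fetch_sequence(species, chrom, start, end))
```

The public server is rate limited, so for a long list of coordinates it is worth pausing between requests or batching them.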

0

Does "GET sequence" on the REST API work for previous assemblies/releases of other species? I only know how to do it for GRCh37. Is that the only one? Thanks :)

Reply • 6.6 years ago by ericbrenner • 0

0

Unfortunately not. We've recently started providing REST access to archives, which can get you to previous assemblies, but only recent archives are available, and the only assembly that's changed in that time is pig. So the answer is actually yes for pig and for human GRCh37 (as you've discovered), and no for any other species.

Reply • 6.6 years ago by Emily 23k

score 0 · Answer 3 · 2017-09-20

6.6 years ago

Matt Shirley 10k

There's a project called genomepy that handles the genome downloads and uses pyfaidx as the interface for queries. You could give that a shot, but you'd still be downloading files, just abstracting away the downloading process.
