Question: Is there a way to fetch genomic sequences at given coordinates without downloading fasta files?
0
19 months ago, ericbrenner wrote:

So I have a list of start and stop positions along chromosomes in different species, and I'd like to get the corresponding DNA sequence for each set of coordinates. In the past, I've just downloaded the genome as a FASTA file and then used pyfaidx to extract the sequences at the given positions. But now that I'm working with several species at once, I was wondering if there's a tool in Python or R that can fetch the sequences of interest without downloading a bunch of large files. Thanks

dna R sequence python genome • 1.4k views

Please do not cross-post to BioStars and Stack Exchange: https://bioinformatics.stackexchange.com/questions/2543/way-to-get-genomic-sequences-at-given-coordinates-without-downloading-fasta-file

Reply written 19 months ago by Emily_Ensembl
2
19 months ago, Satyajeet Khare (Pune, India) wrote:

Here is one way to download the sequences using a DAS server. You can write a loop that fetches the sequence for each set of coordinates.
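For illustration, a minimal Python sketch of the loop-and-fetch idea. It assumes the UCSC DAS endpoint (`https://genome.ucsc.edu/cgi-bin/das/<assembly>/dna?segment=chrom:start,end`), which may not be the server this answer had in mind, and it assumes 1-based inclusive coordinates. The function names (`das_url`, `parse_das_dna`, `fetch_das`) are made up for this example.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed endpoint: UCSC's DAS server. Other DAS servers accept the same
# "dna?segment=" style of query but live at a different base URL.
DAS_BASE = "https://genome.ucsc.edu/cgi-bin/das"


def das_url(assembly, chrom, start, end):
    """Build a DAS 'dna' request URL for one region (1-based, inclusive)."""
    return f"{DAS_BASE}/{assembly}/dna?segment={chrom}:{start},{end}"


def parse_das_dna(xml_bytes):
    """Extract the sequence from a DASDNA XML response, dropping line breaks."""
    root = ET.fromstring(xml_bytes)
    dna = root.find(".//DNA")
    if dna is None or dna.text is None:
        raise ValueError("no <DNA> element in DAS response")
    return "".join(dna.text.split())


def fetch_das(assembly, chrom, start, end, timeout=30):
    """Fetch one region over HTTP and return it as a plain string."""
    url = das_url(assembly, chrom, start, end)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_das_dna(resp.read())


# Usage (needs network access; the regions are arbitrary examples):
# regions = [("hg19", "chr1", 100000, 100100), ("mm10", "chr2", 5000000, 5000050)]
# for assembly, chrom, start, end in regions:
#     print(assembly, chrom, start, end, fetch_das(assembly, chrom, start, end))
```

Each region is a separate HTTP request, so this suits a modest list of coordinates; for very large lists, downloading the genome once is likely still faster.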

1
19 months ago, Emily_Ensembl (EMBL-EBI) wrote:

You can use the Ensembl REST API.
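As a rough sketch of what that could look like in Python, the snippet below calls the `GET sequence/region/:species/:region` endpoint and asks for plain text. The helper names (`region_url`, `fetch_region`) are invented for this example, coordinates are taken to be 1-based and inclusive, and the standard library's `urllib` is used only to avoid extra dependencies.

```python
import urllib.parse
import urllib.request

# Main Ensembl REST server. Human GRCh37 is served from a separate host,
# https://grch37.rest.ensembl.org, which can be passed as `server`.
ENSEMBL_REST = "https://rest.ensembl.org"


def region_url(species, chrom, start, end, strand=1, server=ENSEMBL_REST):
    """Build the URL for GET sequence/region/:species/:region as plain text."""
    region = f"{chrom}:{start}..{end}:{strand}"
    path = "/sequence/region/{}/{}".format(
        urllib.parse.quote(species),
        urllib.parse.quote(region, safe=":."),
    )
    return f"{server}{path}?content-type=text/plain"


def fetch_region(species, chrom, start, end, strand=1,
                 server=ENSEMBL_REST, timeout=30):
    """Fetch one region and return the sequence as a string."""
    url = region_url(species, chrom, start, end, strand, server)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("ascii").strip()


# Usage (needs network access; the regions are arbitrary examples):
# coords = [("human", "X", 1000000, 1000100), ("mouse", "2", 5000000, 5000050)]
# for species, chrom, start, end in coords:
#     print(species, chrom, start, end, fetch_region(species, chrom, start, end))
```

The public server is rate-limited, so a long list of coordinates should be fetched with some pacing between requests.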


Does "GET sequence" on the REST API work for previous assemblies/releases of other species? I only know how to do it for GRCh37; is that the only one? Thanks :)

Reply written 19 months ago by ericbrenner (modified 19 months ago)

Unfortunately not. We've recently started making REST archives available, which can get you to previous assemblies, but only recent archives exist, and the only assembly that has changed in that time is pig. So the answer is actually yes for pig and for human GRCh37 (as you've discovered), and no for any other species.

Reply written 19 months ago by Emily_Ensembl
0
19 months ago, Matt Shirley (Cambridge, MA) wrote:

There's a project called genomepy that handles the genome downloads and uses pyfaidx as the interface for queries. You could give that a shot, but you'd still be downloading files, just abstracting away the downloading process.
