Question: Downloading genomes from OmaDB with PyOMADB or OMA API
20 months ago, hannah.muelbaier wrote:

Hello OMA team,

I want to write a tool that automatically downloads sequences from OmaDB, and I would like to use the new PyOMADB library for this. I need not just a few protein sequences but the complete set of protein sequences for the genomes of some species. In the PyOMADB documentation I wasn't able to find a command to download multiple FASTA sequences at once (for example, all proteins from one species). Is it possible to get whole genomes through PyOMADB or the OMA API? Or is there a way to download many sequences with one command, or only a few at a time?

Thank you and kind regards, Hannah

20 months ago, adrian.altenhoff wrote:

Dear Hannah,

there is no way to directly download whole genomes in FASTA format with PyOMADB. However, you can load multiple proteins at once:

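import omadb
c = omadb.Client()
# OMA IDs: five-letter species code plus zero-padded entry number
ids = ["HUMAN{:05d}".format(x) for x in range(1, 55)]
# Bulk lookup; assumes proteins.info accepts a list of IDs
prots = c.proteins.info(ids)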

Note that the number of proteins you can query at once is currently limited to 100. If you would like to download all the sequences from a genome, you could do something like this:

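genome = c.genomes.genome("ECOLI")
nr_ecoli_proteins = genome['nr_entries']
prot_ids = ["ECOLI{:05d}".format(x) for x in range(1, nr_ecoli_proteins + 1)]
res = []
# Query in chunks of 100 to stay within the per-request limit
for x in range(0, len(prot_ids), 100):
    chunk = prot_ids[x:x+100]
    # Assumes proteins.info accepts a list of IDs (bulk lookup)
    res.extend(c.proteins.info(chunk))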
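
To turn the entries collected by such a chunked loop into a FASTA file, a minimal sketch could look like the following. It assumes that each entry returned by the API behaves like a mapping with `omaid` and `sequence` keys (please check this against the actual response). The `fetch` argument is a hypothetical callable standing in for the bulk lookup, so the helpers can be tried without network access.

```python
def chunked(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def to_fasta(records, width=60):
    """Format records as FASTA text, wrapping sequences at `width` characters.

    Assumption: each record is a mapping with 'omaid' and 'sequence' keys.
    """
    lines = []
    for rec in records:
        lines.append(">" + rec["omaid"])
        seq = rec["sequence"]
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "".join(line + "\n" for line in lines)


def download_fasta(prot_ids, fetch, size=100):
    """Call `fetch` (hypothetical) once per chunk of IDs and return FASTA text."""
    records = []
    for chunk in chunked(prot_ids, size):
        records.extend(fetch(chunk))
    return to_fasta(records)
```

With PyOMADB, `fetch` might be something like `c.proteins.info`, provided that method accepts a list of IDs; the result of `download_fasta` can then be written to a file with a plain `open(..., "w")`.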

Best wishes, Adrian


It may be worth mentioning that for this exact use case, obtaining a FASTA file of protein sequences for one or more whole genomes, it is much more efficient to download the file containing all protein sequences in FASTA format and filter it for the genomes of interest. The URL for the latest version of the protein sequences is available from the Download menu.
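
The filtering step could be sketched roughly as follows. This is only an illustration: it assumes that every FASTA header starts with `>` followed (after optional spaces) by an OMA ID whose first five characters are the species code, as in `ECOLI00001`. The function name `filter_fasta` is made up for this example.

```python
def filter_fasta(lines, species_codes):
    """Yield the FASTA lines of all records belonging to `species_codes`.

    Assumption: the header text after '>' begins with an OMA ID such as
    'ECOLI00001', whose first five characters are the species code.
    """
    wanted = set(species_codes)
    keep = False
    for line in lines:
        if line.startswith(">"):
            # Decide once per record whether to keep it
            keep = line[1:].strip()[:5] in wanted
        if keep:
            yield line
```

Because the function only iterates over lines, it works directly on an open file handle, for example one obtained from `open(path)` or, for a compressed download, `gzip.open(path, "rt")`, and the kept lines can be passed to `writelines` on an output file.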

Reply written 20 months ago by adrian.altenhoff