Question: Downloading cDNA sequences of the TrEMBL and nr databases
5.8 years ago by
Pappu1.9k wrote:

Could you please tell me how to download all the cDNA sequences of the entries in trEMBL and nr databases?

python • 1.4k views
5.8 years ago by
Edinburgh, UK
Kashyap Chhatbar70 wrote:


You could use BioMart: choose the Ensembl Genes database and then the dataset for your species. In the Filters section on the left, open Gene and select "Limit to genes..." with UniProtKB/TrEMBL accession(s).

Then, under Attributes, select the Sequences radio button, which offers cDNA sequences as an option, and pick the attributes you want to download.

This is the quick and easy way. You could also use the Ensembl Perl API if you want to customize the query or batch download for multiple species.

PS: This is a targeted search of the Ensembl database, so it may not be fully up to date with the most recent records in UniProtKB/TrEMBL.
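
The same selection can probably be scripted, since BioMart also accepts queries as XML over HTTP. Below is a minimal Python sketch of that idea. The service URL, the dataset name (`hsapiens_gene_ensembl`), the filter name (`with_uniprotsptrembl`) and the attribute names (`ensembl_transcript_id`, `cdna`) are assumptions on my part; the XML button in the BioMart web interface shows the exact query for whatever you selected by hand, so check against that.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; verify it against the current Ensembl BioMart documentation.
MARTSERVICE_URL = "https://www.ensembl.org/biomart/martservice"


def build_query(dataset, filter_name="with_uniprotsptrembl",
                attributes=("ensembl_transcript_id", "cdna")):
    """Return a BioMart XML query string asking for cDNA in FASTA format.

    The filter and attribute names are assumed internal BioMart names.
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "FASTA",
        "header": "0",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Boolean filter: keep only genes that have a UniProtKB/TrEMBL accession.
    ET.SubElement(ds, "Filter", {"name": filter_name, "excluded": "0"})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    prologue = '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'
    return prologue + ET.tostring(query, encoding="unicode")


def fetch_cdna(dataset, out_path):
    """Send the query to the mart service and stream the FASTA reply to a file."""
    url = MARTSERVICE_URL + "?" + urlencode({"query": build_query(dataset)})
    with urlopen(url) as response, open(out_path, "wb") as out:
        for chunk in iter(lambda: response.read(1 << 16), b""):
            out.write(chunk)
```

Calling `fetch_cdna("hsapiens_gene_ensembl", "human_trembl_cdna.fa")` would then write the result to a local file (the output file name is just an example). For several species you would loop over the dataset names, which is the kind of batch download the Perl API is also suited for.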

5.8 years ago by
United Kingdom
hpmcwill1.1k wrote:

For UniProtKB (UniProtKB/Swiss-Prot + UniProtKB/TrEMBL), the set of source coding sequences is equivalent to all the CDS features in EMBL-Bank.

The European Nucleotide Archive (ENA) provides a set of data files for ENA Coding sequences (formerly known as EMBLCDS), which is available from the EMBL-EBI FTP site.

For what it is worth, ENA also provides an equivalent dataset for non-coding RNA features appearing in EMBL-Bank entries.
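
Once you have one of these files locally, reading it needs nothing beyond the standard library. Here is a rough Python sketch, assuming the file you downloaded is in FASTA format and possibly gzip-compressed; the helper name `read_fasta` and the file name in the usage note are made up for illustration, and the ENA files may also come in other formats that this does not handle.

```python
import gzip


def read_fasta(path):
    """Yield (header, sequence) pairs from a plain or gzip-compressed FASTA file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    header, chunks = None, []
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    # Emit the final record.
    if header is not None:
        yield header, "".join(chunks)
```

For example, `sum(1 for _ in read_fasta("cds_release.fasta.gz"))` would count the coding sequences in a (hypothetically named) download without loading the whole file into memory.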
