Can't get all coding sequences from list of protein IDs [Entrez Direct]
2.4 years ago

I have a list of numeric protein IDs (e.g. 25121878) for which I want to retrieve the coding sequences.

This EDirect command works for some IDs, but for others it returns 'NO RESULT':

efetch -db protein -format fasta_cds_na -id **id**

Am I doing something wrong or is there really no way to get a CDS in these instances?

Reproducible example where I get no result:

efetch -db protein -format fasta_cds_na -id 25121878
entrez entrez-direct • 598 views
2.4 years ago
GenoMax 141k

The problem with 25121878 (I assume this is a GI number, which is now deprecated for end-user use) is that the record is a conceptual translation, so there is no nucleotide record associated with it. The COMMENT block of the protein record says so:

COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from
            NP_040350:647-922.
            Related sequence: M20562.
            Method: conceptual translation.
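
Since some IDs will legitimately have no CDS, it may help to run the list in a loop and log the IDs that come back empty instead of treating them as errors. Below is a minimal sketch, assuming Entrez Direct is installed and on PATH; the function name `fetch_cds_batch` and the file names are hypothetical, and the `efetch` call is the same one used in the question. An ID is treated as "no CDS" whenever the output does not start with a FASTA header.

```shell
#!/usr/bin/env bash
# Sketch: fetch CDS FASTA for every protein ID listed in a file (one ID per line)
# and record the IDs for which efetch returns no FASTA record.
# Hypothetical helper; arguments: <id_file> <out_fasta> <missing_ids_file>

fetch_cds_batch() {
    local id_file="$1" out_fasta="$2" missing="$3"
    local id cds

    # Start with empty output files
    : > "$out_fasta"
    : > "$missing"

    while IFS= read -r id || [ -n "$id" ]; do
        # Skip blank lines in the ID list
        [ -z "$id" ] && continue

        # Same call as in the question. stdin is redirected from /dev/null
        # so that efetch cannot consume the rest of the ID list.
        cds=$(efetch -db protein -format fasta_cds_na -id "$id" < /dev/null 2>/dev/null)

        case "$cds" in
            '>'*) printf '%s\n' "$cds" >> "$out_fasta" ;;  # got a FASTA record
            *)    printf '%s\n' "$id"  >> "$missing"   ;;  # nothing usable returned
        esac
    done < "$id_file"
}
```

Usage would look like `fetch_cds_batch ids.txt cds.fna no_cds.txt`. For any ID that ends up in `no_cds.txt`, `efetch -db protein -id <id> -format gp` prints the GenPept record, where a COMMENT block like the one above shows whether the protein is a conceptual translation.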