Entrez Direct to retrieve Protein and Coding sequences from NCBI accession
3.1 years ago
dllopezr ▴ 80

Hi everyone!

I have a bunch of FASTA files of gene sequences that I downloaded from NCBI through the Entrez Direct tool. I am wondering if it is possible to obtain the protein and coding sequences of these genes using the accession, which is in this format: NZ_CP006694.1:1104181-1105143

where the numbers following the colon are the start and end positions of the region where the gene is located.

Can you help me with that?

Thank you so much

entrez ncbi Coding Sequences Retrieve protein • 1.2k views

An R-based solution would be to use the Bioconductor package biomaRt; please see my post here. Since you already have the exact chromosomal position, you can easily convert this to sequences. You can find the appropriate filters (chromosome, start and end position) using listFilters(ensembl), and the attributes (protein / DNA sequence) using listAttributes(ensembl).

3.1 years ago
vkkodali ★ 2.8k

I think you can use EDirect for this as follows:

efetch -db nuccore -id 'NZ_CP006694.1' -seq_start 1104181 -seq_stop 1105143 -format fasta_cds_aa
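
Since the question asks for both protein and coding sequences, and the accessions come in the ACCESSION:START-STOP form, the command above could be wrapped roughly like this. It is only a sketch: fetch_cds is a made-up helper name, it assumes EDirect is installed and on your PATH, and it assumes every region is written with the start before the stop. The fasta_cds_na format is the nucleotide counterpart of fasta_cds_aa.

```shell
# Hypothetical helper (not part of EDirect): split "ACCESSION:START-STOP"
# and pass the pieces to efetch.
# $1 = region string, $2 = efetch format (fasta_cds_aa or fasta_cds_na)
fetch_cds() {
    region="$1"
    format="$2"
    acc="${region%%:*}"      # text before the colon, e.g. NZ_CP006694.1
    range="${region#*:}"     # text after the colon, e.g. 1104181-1105143
    start="${range%-*}"      # text before the dash
    stop="${range#*-}"       # text after the dash
    efetch -db nuccore -id "$acc" \
        -seq_start "$start" -seq_stop "$stop" \
        -format "$format"
}

# Usage (needs network access):
#   fetch_cds 'NZ_CP006694.1:1104181-1105143' fasta_cds_aa > protein.faa
#   fetch_cds 'NZ_CP006694.1:1104181-1105143' fasta_cds_na > cds.fna
```

I would check the output headers against the coordinates you expect, since what comes back depends on the CDS features annotated in that region of the record.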