Retrieve RefSeq protein accession from transcript accession
3.9 years ago
speycast • 0

Hi,

Does anyone know how to retrieve a RefSeq protein accession from its mRNA transcript accession using the NCBI E-utilities?

For example, using NM_001382556.2 to get NP_001369485.1.

Thanks very much!

NCBI RefSeq Eutils • 1.2k views
3.9 years ago
GenoMax 141k

Using Entrez Direct (EDirect):

$ esearch -db nuccore -query "NM_001382556" | elink -target protein | efetch -format acc
NP_001369485.1
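
For readers without EDirect installed, the same nuccore-to-protein lookup can be sketched against the E-utilities HTTP endpoints (`elink.fcgi` and `efetch.fcgi`) using only the Python standard library. The helper names below are hypothetical. The sketch assumes ELink accepts an accession.version in its `id` parameter for sequence databases, so check that against the current E-utilities documentation before relying on it. Nothing here touches the network: the functions only build URLs and parse an ELink XML response you have already fetched.

```python
import urllib.parse
import xml.etree.ElementTree as ET

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def elink_url(accession, dbfrom="nuccore", db="protein"):
    """URL asking ELink for `db` records linked to a `dbfrom` record.

    Assumption: the `id` parameter accepts accession.version for
    sequence databases such as nuccore.
    """
    query = urllib.parse.urlencode({"dbfrom": dbfrom, "db": db, "id": accession})
    return f"{EUTILS}/elink.fcgi?{query}"


def linked_ids(elink_xml):
    """Return the linked IDs found in an ELink XML response (str or bytes)."""
    root = ET.fromstring(elink_xml)
    # Only IDs under LinkSetDb/Link are link targets; IdList holds the input ID.
    return [el.text for el in root.findall("./LinkSet/LinkSetDb/Link/Id")]


def efetch_acc_url(ids, db="protein"):
    """URL asking EFetch for the accession.version of each ID, as plain text."""
    query = urllib.parse.urlencode(
        {"db": db, "id": ",".join(ids), "rettype": "acc", "retmode": "text"}
    )
    return f"{EUTILS}/efetch.fcgi?{query}"
```

To use it, fetch `elink_url("NM_001382556.2")` with `urllib.request.urlopen(...).read()`, pass the bytes to `linked_ids`, then fetch `efetch_acc_url(...)` for the returned IDs. This mirrors the `elink -target protein | efetch -format acc` steps of the pipeline above.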

Thanks so much genomax!!!


genomax, would you happen to know how to retrieve the reference sequence GenBank file for this gene (RETL1) containing the above transcript, NM_001382556.2? I want to use -format gbwithparts to get the mRNA and CDS regions. The GenBank file also contains other isoform transcripts; how can I get the gbwithparts output with a FEATURES section containing only NM_001382556.2 and its CDS?

Thanks again in advance...


You should only need:

$ efetch -db nuccore -id "NM_001382556.2" -format gb

Or by reference GenBank file, do you mean the chromosome/genome record?


Yes, as in chromosome/genome. efetch -db nuccore -id "NM_001382556.2" -format gb returns the mRNA record. I would like the genomic GenBank record for this gene, but its FEATURES section contains two isoform transcripts. I want the full GenBank record with a truncated FEATURES section that keeps only the transcript of interest, NM_001382556.2 (its mRNA feature), along with its CDS.


Can you tell me which specific genbank record you are looking at?


For example, the full GenBank record (gbwithparts) for DMD on the GRCh37 assembly: https://www.ncbi.nlm.nih.gov/nuccore/NC_000023.10?report=genbank&from=31137345&to=33357726&strand=true

The FEATURES section lists 30 mRNA transcript_ids (the first begins with mRNA join(1..351...), each followed by its protein and CDS section. I'd like the FEATURES section to contain only mRNA transcript_id NM_004006.2, its protein_id NP_003997.1, and its CDS region, with all other sections unchanged.


The best I can think of is this, but it will give you all transcripts in that range:

$ efetch -db nuccore -id NC_000023.10 -seq_start 31137345 -seq_stop 33357726 -format gb -style withparts

Yup, thanks very much genomax! This is super helpful already, greatly appreciate it. I guess in this case I will just use the Python Bio package (Biopython) to parse out the transcript of interest. :)
