Transcript and protein
4.5 years ago
Vladislav ▴ 20

Hi, Biostars community.

First of all, sorry for my English.

I have a set of mRNA transcript IDs, e.g. 'NM_007300.4', 'NM_007297.4', 'NM_007294.3', ... I can find which of them is canonical by using knownCanonical.txt and kgXref.txt from UCSC: 'NM_007300' in this case.

But I also have a set of their protein IDs, e.g. 'NP_009231.2', 'NP_009228.2', 'NP_009225.1', ...

So, could you please tell me how to find which of them corresponds to the canonical mRNA transcript?

Thanks.

RNA-Seq Transcript Protein

The IDs you have above are basically cross-references to each other. You can verify that with EntrezDirect:

$ esearch -db nuccore -query "NM_007297.4" | elink -target protein | efetch -format acc
NP_009228.2
$ esearch -db protein -query "NP_009228" | elink -target nuccore | efetch -format acc
NC_000017.11
NM_007297.4
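
To do this for a whole list of transcripts and then pick out the protein of the canonical one, something like the following Python sketch might work. It simply wraps the esearch | elink | efetch pipeline shown above; the function names (`strip_version`, `linked_proteins`, `canonical_proteins`) are my own and not part of EntrezDirect, and it assumes EDirect is on your PATH and that the canonical ID comes without a version suffix, as in the question.

```python
import shlex
import subprocess


def strip_version(accession: str) -> str:
    """Drop the '.N' version suffix, e.g. 'NM_007300.4' -> 'NM_007300'."""
    return accession.split(".", 1)[0]


def linked_proteins(transcript: str) -> list[str]:
    """Run the EDirect pipeline from above for one transcript accession.

    Assumes esearch/elink/efetch are installed and the network is reachable.
    Returns whatever accessions efetch prints, one per line.
    """
    cmd = (
        f"esearch -db nuccore -query {shlex.quote(transcript)} "
        "| elink -target protein | efetch -format acc"
    )
    result = subprocess.run(
        cmd, shell=True, check=True, capture_output=True, text=True
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def canonical_proteins(links: dict[str, list[str]], canonical: str) -> list[str]:
    """Return the proteins linked to the transcript matching `canonical`.

    Versions are ignored on both sides, so 'NM_007300' matches 'NM_007300.4'.
    """
    target = strip_version(canonical)
    hits = []
    for transcript, proteins in links.items():
        if strip_version(transcript) == target:
            hits.extend(proteins)
    return hits
```

With your IDs you would build `links = {nm: linked_proteins(nm) for nm in ["NM_007300.4", "NM_007297.4", "NM_007294.3"]}` and then call `canonical_proteins(links, "NM_007300")`. Comparing without the version is a deliberate choice here, since the canonical ID you get from the UCSC tables is unversioned.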

If you want to convert Ensembl identifiers from knownCanonical.txt, you could use the Ensembl REST API (random example from the canonical file) or BioMart.
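
For the REST route, a rough sketch in Python could look like this. Treat the details as assumptions to check against the Ensembl REST documentation: I am writing the `xrefs/id` endpoint, the `all_levels` parameter and the `dbname` / `display_id` response fields from memory, and the helper names are made up for this example.

```python
import json
import urllib.parse
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"


def xrefs_url(ensembl_id: str, all_levels: bool = True) -> str:
    """Build the cross-reference lookup URL for an Ensembl stable ID.

    all_levels is assumed to also return xrefs attached to the translation,
    which is where protein-level references would normally live.
    """
    query = {"content-type": "application/json"}
    if all_levels:
        query["all_levels"] = "1"
    path = urllib.parse.quote(ensembl_id)
    return f"{ENSEMBL_REST}/xrefs/id/{path}?{urllib.parse.urlencode(query)}"


def fetch_xrefs(ensembl_id: str) -> list[dict]:
    """Download the xref records for one ID (needs network access)."""
    with urllib.request.urlopen(xrefs_url(ensembl_id), timeout=30) as resp:
        return json.load(resp)


def refseq_ids(xrefs: list[dict]) -> dict[str, list[str]]:
    """Group RefSeq accessions by database name.

    Keeps every record whose (assumed) 'dbname' field starts with 'RefSeq'.
    """
    grouped: dict[str, list[str]] = {}
    for entry in xrefs:
        dbname = entry.get("dbname", "")
        if dbname.startswith("RefSeq"):
            grouped.setdefault(dbname, []).append(entry.get("display_id", ""))
    return grouped
```

You would call `refseq_ids(fetch_xrefs(transcript_id))` with a transcript ID taken from knownCanonical.txt and look for the mRNA and peptide entries in the result.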
