Functional domain data from NM_ transcript id
4 months ago
sofie • 0

Hello!

I would like to download functional-domain annotations for hg19 RefSeq transcripts. For a given NM_* or NR_* transcript ID, I need the domain name and the start and stop positions of each functional domain.

I am having trouble finding this information in the right format. I tried fetching the biological_region data from the NCBI RefSeq Functional Elements resource (https://www.ncbi.nlm.nih.gov/refseq/functionalelements/#Gene_FTP).

It seems the functional domains there are annotated on NC_* accessions, which are not compatible with NM_* IDs.

Does anyone know how the functional domain annotations can be fetched for NM_* transcript IDs?

Thanks in advance

transcript functional-domain • 562 views

Are you sure you want the NM_* accessions? Those are nucleotide records and are not likely to carry any domain information.

With NP_* accessions you will get the following with EntrezDirect (output truncated to save space):

$ efetch -db protein -id NP_000050.3 -format ft
>Feature ref|NP_000050.3|
1       3418    Protein
                        product breast cancer type 2 susceptibility protein isoform 1
                        product BRCA1/BRCA2-containing complex, subunit 2
                        product breast cancer type 2 susceptibility protein
                        product DNA repair-associated BRCA2
                        product breast cancer 2 tumor suppressor
                        product breast and ovarian cancer susceptibility gene, early onset
                        product Fanconi anemia group D1 protein
                        product breast and ovarian cancer susceptibility protein 2
                        product breast cancer 2, early onset
                        product mutant BRCA2
                        product mutant DNA repair-associated protein 2
1       40      Region
                        region  Interaction with PALB2
                        note    propagated from UniProtKB/Swiss-Prot (P51587.4)
37      68      Region
                        region  Disordered. /evidence=ECO:0000256|SAM:MobiDB-lite
                        note    propagated from UniProtKB/Swiss-Prot (P51587.4)
70      70      Site
                        site_type       phosphorylation
                        note    Phosphoserine. /evidence=ECO:0007744|PubMed:23186163; propagated from UniProtKB/Swiss-Prot (P51587.4)
358     381     Region
                        region  Disordered. /evidence=ECO:0000256|SAM:MobiDB-lite
                        note    propagated from UniProtKB/Swiss-Prot (P51587.4)
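
If the goal is a flat table of domain name, start, and stop, the feature-table output above can be reduced with a small awk filter. The following is a minimal sketch, not an official EDirect feature: `ft_regions` is a hypothetical helper name, and it assumes the layout shown above, where each feature line starts with two coordinates and a feature key, and qualifier lines are indented.

```shell
# ft_regions (hypothetical helper): read a feature table on stdin and print
# one line per Region feature as start<TAB>stop<TAB>region name.
ft_regions() {
  awk '
    # Feature line, e.g. "67  183  Region": remember coordinates and key.
    /^[<>]?[0-9]+[ \t]+[<>]?[0-9]+[ \t]+/ {
      start = $1; stop = $2; key = $3
      next
    }
    # Indented "region" qualifier that belongs to a Region feature.
    key == "Region" && /^[ \t]+region[ \t]+/ {
      name = $0
      sub(/^[ \t]+region[ \t]+/, "", name)
      print start "\t" stop "\t" name
    }
  '
}
```

It could then be appended to the command above, for example `efetch -db protein -id NP_000050.3 -format ft | ft_regions`. Site features and the note and db_xref qualifiers are dropped in this sketch.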

Thank you for the response!

Unfortunately, the software only reports NM_ and NR_ accessions, not NP_ accessions. Would it be possible to translate the NM_ accessions? The goal is to map the reported protein positions to the functional domains.

Here is some documentation from the Ion Reporter Software:

Example data: transcript : NM_015215.2 , gene : CAMTA1, protein : p.Cys147Trp.

Documentation on the transcript value: "NM_ or NR_ NCBI versioned transcript identifiers (as specified by the gene-model files provided by UCSC RefSeq v63."
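
To compare a reported change with domain coordinates, the residue number has to be pulled out of the protein field first. Below is a rough sketch that assumes simple HGVS-style substitutions such as the `p.Cys147Trp` example above; `hgvs_pos` is a hypothetical helper name, and more complex notations (ranges, frameshifts written differently) may need extra handling.

```shell
# hgvs_pos (hypothetical helper): print the first residue number found in a
# simple HGVS protein change such as p.Cys147Trp or p.(Cys147Trp).
# Prints nothing if the string does not match that simple pattern.
hgvs_pos() {
  printf '%s\n' "$1" | sed -nE 's/^p\.\(?[A-Za-z*]+([0-9]+).*/\1/p'
}
```

For the example data, `hgvs_pos p.Cys147Trp` yields the position 147.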

4 months ago
GenoMax 142k

You could do something like this to "translate" the NM_ ID to the corresponding NP_ accession:

$ esearch -db nuccore -query NM_015215 | elink -target protein | efetch -format ft
>Feature ref|NP_056030.1|
1       1673    Protein
                        product calmodulin-binding transcription activator 1 isoform a
67      183     Region
                        region  CG-1
                        note    CG-1 domains are highly conserved domains of about 130 amino-acid residues
                        db_xref CDD:198144
112     119     Region
                        region  Nuclear localization signal. /evidence=ECO:0000255|PROSITE-ProRule:PRU00767
                        note    propagated from UniProtKB/Swiss-Prot (Q9Y6Y1.4)
283     375     Region
                        region  Disordered. /evidence=ECO:0000256|SAM:MobiDB-lite
                        note    propagated from UniProtKB/Swiss-Prot (Q9Y6Y1.4)
873     952     Region
                        region  TIG
                        note    IPT/TIG domain
                        db_xref CDD:426462
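
Once the regions are available as start, stop, and name, checking which of them contain a given residue is a simple interval test. The sketch below assumes a three-column, tab-separated input in that order, which is an intermediate format chosen here for illustration; `regions_at` is a hypothetical helper name.

```shell
# regions_at POS (hypothetical helper): read start<TAB>stop<TAB>name rows on
# stdin and print the rows whose interval contains POS (1-based, inclusive).
regions_at() {
  awk -F'\t' -v pos="$1" '($1 + 0) <= (pos + 0) && (pos + 0) <= ($2 + 0)'
}
```

With the coordinates listed above, position 147 falls inside the CG-1 region (67 to 183) and outside the nuclear localization signal (112 to 119). Note that the query above omits the version suffix, so the returned protein may not correspond exactly to the NM_015215.2 model used by the software; it is worth confirming that the coordinates refer to the same protein sequence before relying on the result.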

That worked perfectly. Thank you, much appreciated!


A small educational note: if an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one answer if they work. This will help future users that might find this post find the right answer.
