Question: How To Convert RefSeq ID To Gene Symbol For Non-Coding RNAs
7.5 years ago by
Hamilton260 wrote:


I'm trying to generate a full gene annotation table that maps RefSeq IDs to gene symbols, gene descriptions and other gene IDs (UCSC known gene ID, Entrez ID, Ensembl ID) for ncRNAs as well as protein-coding genes. It seems the kgXref table provides such annotations only for protein-coding genes with the NM_* prefix, not for NR_*.

I would like to get something like this:

e.g. UCSC known ID, Entrez ID, Ensembl ID, NR_045294 (RefSeq ID), Gm4285 (gene symbol), Mus musculus predicted gene 4285, non-coding RNA (gene description)

Any thoughts?

7.5 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum119k wrote:

You could use the following XSLT stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<!-- entry point: visit every Seq-entry in the efetch result -->
<xsl:template match="/">
<xsl:apply-templates select="Bioseq-set/Bioseq-set_seq-set/Seq-entry"/>
</xsl:template>

<xsl:template match="Seq-entry">
<!-- RefSeq accession, then a tab -->
<xsl:value-of select="Seq-entry_seq/Bioseq/Bioseq_id/Seq-id/Seq-id_other/Textseq-id/Textseq-id_accession"/>
<xsl:text>&#9;</xsl:text>
<xsl:for-each select="Seq-entry_seq/Bioseq/Bioseq_annot/Seq-annot/Seq-annot_data/Seq-annot_data_ftable/Seq-feat/Seq-feat_dbxref/Dbtag[Dbtag_db='GeneID']">

<xsl:variable name="geneid" select="Dbtag_tag/Object-id/Object-id_id"/>

<!-- fetch the Entrez Gene record for this GeneID -->
<xsl:variable name="url" select="concat('http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=gene&amp;retmode=xml&amp;id=',$geneid)"/>

<!-- gene symbol -->
<xsl:value-of select="document($url)/Entrezgene-Set/Entrezgene/Entrezgene_gene/Gene-ref/Gene-ref_locus"/>

</xsl:for-each>
<!-- one line per entry -->
<xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>


Run it on the XML returned by NCBI efetch for the nucleotide database:

xsltproc --novalid stylesheet.xsl "" 
NR_045294   Gm4285
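
The accession/symbol pairs printed above could then be merged with the other IDs the question asks for. A minimal sketch, assuming both files are tab-delimited with the RefSeq accession in the first column; the function name `merge_by_accession` and the file names are hypothetical, not part of the original answer:

```shell
# merge_by_accession SYMBOLS XREF
# Join two tab-delimited files on column 1 (the RefSeq accession).
merge_by_accession() {
  tab=$(printf '\t')
  # join needs both inputs sorted on the join field, in the same collation
  LC_ALL=C sort -t "$tab" -k1,1 "$1" > "$1.sorted"
  LC_ALL=C sort -t "$tab" -k1,1 "$2" > "$2.sorted"
  LC_ALL=C join -t "$tab" "$1.sorted" "$2.sorted"
  rm -f "$1.sorted" "$2.sorted"
}

# Example (hypothetical file names):
# merge_by_accession symbols.tsv xref.tsv > annotation.tsv
```

Only accessions present in both files are kept, so anything missing from the cross-reference file drops out of the result.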
7.5 years ago by
Curiosity120 wrote:

You can download all these annotations from Ensembl.

Then use the following command to keep the lines that contain NR_ accessions:

grep 'NR_' filename
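
If the downloaded file is tab-delimited, matching only the RefSeq column avoids hits elsewhere on the line. A minimal sketch, assuming you know which column holds the accession; the function name `filter_refseq_nr` and the file name are hypothetical:

```shell
# filter_refseq_nr FILE [COLUMN]
# Print rows of a tab-delimited FILE whose COLUMN (default 1) starts with "NR_".
filter_refseq_nr() {
  awk -F '\t' -v col="${2:-1}" '$col ~ /^NR_/' "$1"
}

# Example (hypothetical file name, accession in column 4):
# filter_refseq_nr annotations.tsv 4
```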
