From RefSeq to GO Terms
9.2 years ago
aaredav • 0

My aim is to get the GO terms for a large list of genes (~250,000).

The genes belong to bacterial genomes downloaded from RefSeq, so they have identifiers of this kind: NP_953938.1.

I guess the solution should be using BioMart. However, the database of RefSeq is not included in the ones you can parse with it.

I then tried to retrieve the Entrez IDs from this file: ftp://ftp.ncbi.nih.gov/gene/DATA/gene2refseq

But some of the identifiers (I think most of them) do not appear in that file, e.g. WP_013258072.1.
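
For reference, the gene2refseq lookup can be sketched roughly as below. This is only a minimal sketch, not a tested pipeline: it assumes the file is tab-delimited with a header line that names the columns, and that the relevant columns are called `GeneID` and `protein_accession.version` (check the header of your own copy). The function name `map_refseq_to_gene_ids` is made up for illustration. It also reports which accessions were not found, which is how the missing WP_ identifiers show up.

```python
import csv
from typing import Dict, Iterable, Set, Tuple


def map_refseq_to_gene_ids(
    lines: Iterable[str],
    wanted: Set[str],
    protein_col: str = "protein_accession.version",  # assumed column name
    gene_col: str = "GeneID",                        # assumed column name
) -> Tuple[Dict[str, str], Set[str]]:
    """Return ({accession: GeneID}, accessions that were not found)."""
    reader = csv.reader(lines, delimiter="\t")
    # The header line may start with '#', so strip it before matching names.
    header = [name.lstrip("#") for name in next(reader)]
    protein_idx = header.index(protein_col)
    gene_idx = header.index(gene_col)

    found: Dict[str, str] = {}
    for row in reader:
        # Skip truncated rows that do not reach the columns we need.
        if len(row) <= max(protein_idx, gene_idx):
            continue
        accession = row[protein_idx]
        if accession in wanted:
            found[accession] = row[gene_idx]

    missing = {acc for acc in wanted if acc not in found}
    return found, missing
```

With a decompressed local copy of the file (the path here is hypothetical), this would be called as `with open("gene2refseq") as fh: found, missing = map_refseq_to_gene_ids(fh, my_ids)`, where `my_ids` is the set of RefSeq protein accessions. The file is streamed line by line, so only the matches are held in memory.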

Any other ideas? How can I get the GO terms from the RefSeq IDs?

GO RefSeq BioMart • 3.4k views

What bacteria are we talking about?

biomartian --list-datasets | grep -i bac

This returns no results, so I am guessing it is not part of Ensembl.

I guess the solution should be using BioMart. However, the database of RefSeq is not included in the ones you can parse with it.

Can you reformulate this sentence?
