Hi everyone.

I carried out a bacterial sRNA identification analysis with APERO. As a result, I obtained a table with protein names taken from the annotation file, for instance: "NAD(P)-dependent oxidoreductase", "thiol reductant ABC exporter subunit CydD" or "DNA translocase FtsK".

I tried using PANTHER, but the gene list I obtained is not in any of the formats recognized by the platform, which are:

Ensembl: Ensembl gene identifier. Example: "ENSG00000126243"

Ensembl_PRO: Ensembl protein identifier. Example: "ENSP00000337383"

Ensembl_TRS: Ensembl transcript identifier. Example: "ENST00000391828"

Gene ID: EntrezGene IDs. Examples: "GeneID:10203", "10203" (for Entrez gene GeneID:10203)

Gene symbol: for example, "CALCA"

GI: NCBI GI numbers. Example: "16033597"

HGNC: HUGO Gene Nomenclature ids. Example: "HGNC:16673"

IPI: International Protein Index ids. Example: "IPI00740702"

UniGene: NCBI UniGene ids. Examples: "Hs.654587", "At.36040"

UniProtKB: UniProt accession. Example: "O80536"

UniProtKB-ID: UniProt ID. Example: "AGAP3_HUMAN"

So my question is:

How can I perform a GO enrichment analysis, either by converting my protein names into a recognizable ID or with a different strategy?

Thanks in advance, Jose.

The key is to replace the common name of each protein with the ID given in the GenBank annotation file in tabular format, under the column "Protein.product", which is an ID that PANTHER recognizes. I got a lot of help from the folks at StackOverflow with the coding. If you have any questions, just hit me up!
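
For anyone who wants a starting point, here is a minimal Python sketch of that renaming step. It is not the original code, just one way it could look. It assumes the annotation table is tab-delimited and that the protein names sit in a column called "Protein.name" (an assumption; only "Protein.product" is confirmed above, so adjust the column name to your file). The function names and the IDs in the example table are made-up placeholders. Note that several proteins can share one name, so the sketch keeps every matching ID.

```python
import csv
import io
from collections import defaultdict


def build_name_to_ids(table_file, name_col="Protein.name", id_col="Protein.product"):
    """Map each protein name to every protein ID annotated with that name.

    name_col is an assumed column name; id_col is the column mentioned above.
    """
    name_to_ids = defaultdict(list)
    for row in csv.DictReader(table_file, delimiter="\t"):
        name = (row.get(name_col) or "").strip()
        prot_id = (row.get(id_col) or "").strip()
        # Skip rows without a name or ID, and avoid duplicate IDs per name.
        if name and prot_id and prot_id not in name_to_ids[name]:
            name_to_ids[name].append(prot_id)
    return dict(name_to_ids)


def names_to_ids(names, name_to_ids):
    """Return (mapped IDs, names that had no match in the annotation table)."""
    mapped, unmapped = [], []
    for name in names:
        ids = name_to_ids.get(name.strip())
        if ids:
            mapped.extend(ids)
        else:
            unmapped.append(name)
    return mapped, unmapped


# Tiny in-memory stand-in for the annotation table (IDs are placeholders).
rows = [
    ["Protein.product", "Protein.name"],
    ["WP_000000001.1", "DNA translocase FtsK"],
    ["WP_000000002.1", "NAD(P)-dependent oxidoreductase"],
    ["WP_000000003.1", "NAD(P)-dependent oxidoreductase"],
]
table_text = "\n".join("\t".join(r) for r in rows) + "\n"

lookup = build_name_to_ids(io.StringIO(table_text))
mapped, unmapped = names_to_ids(
    ["DNA translocase FtsK", "NAD(P)-dependent oxidoreductase", "unknown protein"],
    lookup,
)
print("\n".join(mapped))             # one ID per line, ready to paste into PANTHER
print("Not mapped:", unmapped)
```

With a real file, replace the `io.StringIO(...)` argument with an open file handle, e.g. `open("annotation_table.txt", newline="")` (hypothetical file name). Names that end up in `unmapped` are worth checking by hand, since small spelling differences between the APERO output and the annotation table will break an exact match.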


