I have a set of genes with their direction of regulation (up (1) or down (-1)), p-values, and adjusted p-values. I want to run limma's goana (or the KEGG equivalent) on them.
By looking at the code, I tried to construct my de object so that I could run goana. As far as I understand, the main columns are:
de$p.value # p-values
de$coef # <0 or >0 so that goana identifies whether the gene is up or down regulated.
However, it gives me some strange results, so I guess I did something wrong.
My genes have Ensembl IDs, so I converted them to Entrez Gene IDs:
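library(biomaRt)
# Connect to the May 2012 Ensembl archive for mouse
mart = useMart('ENSEMBL_MART_ENSEMBL', dataset='mmusculus_gene_ensembl',
               host="may2012.archive.ensembl.org")
ensembl_ids = c(up.reg.all$Geneid, down.reg.all$Geneid)
# Map Ensembl gene IDs to Entrez Gene IDs
entrez_gene = getBM(
  filters = "ensembl_gene_id",
  attributes = c("ensembl_gene_id", "entrezgene"),
  values = ensembl_ids,
  mart = mart)
# Reorder the mapping to match ensembl_ids (unmapped genes become NA)
entrez_gene = entrez_gene[match(ensembl_ids, entrez_gene$ensembl_gene_id),]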
So my data frame is:
> head(df_genes)
p.value coef
1 0.66741542 1
2 0.02399576 1
3 0.04609667 1
4 0.99554092 1
5 0.91228805 1
6 0.24023346 1
And I run goana as follows:
go = goana(df_genes, geneid=entrez_gene$entrezgene, species ="Mm")
However, the results look strange, so it seems I constructed df_genes incorrectly.
> head(go)
                                           Term Ont     N p.value coef P.p.value P.coef
GO:0006508                          proteolysis  BP  1466       0    0         1      1
GO:0007275 multicellular organismal development  BP  4580       0    0         1      1
GO:0007565                     female pregnancy  BP   112       0    0         1      1
GO:0007566                  embryo implantation  BP    53       0    0         1      1
GO:0008150                   biological_process  BP 23313       0    0         1      1
GO:0008152                    metabolic process  BP 10070       0    0         1      1
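
One thing I could try instead, as a rough sketch: as far as I can tell, goana's default method expects a character vector of Entrez Gene IDs (or a named list of such vectors) rather than a data frame of statistics, which might explain why my column names p.value and coef show up as gene sets in the output above. The toy values, the placeholder Entrez IDs, the 0.05 cutoff, and the names keep, de_list and universe are all my own assumptions for illustration, not real results.

```r
# Hypothetical toy data standing in for df_genes and the Entrez mapping;
# the IDs are placeholders for illustration only.
df_genes <- data.frame(
  p.value = c(0.667, 0.024, 0.046, 0.995, 0.010),
  coef    = c(1, 1, -1, 1, -1)
)
entrez <- c("11303", "11304", NA, "11305", "11306")

# Keep genes that mapped to an Entrez ID and pass a p-value cutoff
# (0.05 is an assumed threshold).
keep <- !is.na(entrez) & df_genes$p.value < 0.05

# Split the significant IDs by direction of regulation.
de_list <- list(
  Up   = unique(entrez[keep & df_genes$coef > 0]),
  Down = unique(entrez[keep & df_genes$coef < 0])
)

# All tested genes with an Entrez ID serve as the background universe.
universe <- unique(entrez[!is.na(entrez)])

# With limma and org.Mm.eg.db installed, the call would then be:
# library(limma)
# go <- goana(de_list, universe = universe, species = "Mm")
# topGO(go, ontology = "BP")
```

With the real data, entrez would be entrez_gene$entrezgene converted to character, and the universe would be every gene that was tested, not only the significant ones.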