Problems Converting Gene Names To Numeric Values In R
10.2 years ago

Hi, I am carrying out differential gene expression analysis using limma. Next I need to do gene set enrichment analysis using GOstats, but there's a problem. This is my set of differentially expressed genes:

 [1] "1557994_at"       "205933_at"        "1559688_at"      
 [4] "232837_at"        "212253_x_at"      "212845_at"       
 [7] "233520_s_at"      "236931_at"        "205054_at"       
[10] "237981_at"        "209896_s_at"      "221718_s_at"     
[13] "226648_at"        "208195_at"        "211928_at"

But when I convert the character vector to numeric, I get a warning that NAs were introduced by coercion, and the result looks like this:

 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[26] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[51] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[76] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

How do I solve this problem? Also, when I run the analysis it takes hours and produces no output.

Are you literally just calling as.numeric(d) on a character vector d (just as an example)? That will always produce NA, since there's no obvious conversion between probe IDs like these and numbers. as.numeric(c("1","2","100")) works because those are just character representations of numbers, but you have probe IDs.
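
A quick way to see the difference in a plain R session (a minimal sketch; the variable names are only for illustration, and the warning is suppressed here so the snippet runs quietly):

```r
# Probe IDs are labels, not numbers, so coercion gives NA.
# Without suppressWarnings(), R warns "NAs introduced by coercion".
probe_ids <- c("1557994_at", "205933_at", "212253_x_at")
as_num <- suppressWarnings(as.numeric(probe_ids))
print(as_num)   # NA NA NA

# Strings that spell out numbers convert cleanly.
digits <- as.numeric(c("1", "2", "100"))
print(digits)   # 1 2 100
```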

Is it necessary to convert them into a numeric vector?

Have you read the GOstats documentation (PDF)? Nowhere does it mention converting probeset IDs to numeric values. Perhaps what you want is to convert them to Entrez Gene IDs?

How am I supposed to move ahead? I have been trying this for the past 10 days but couldn't get a result.

I have generated the top 500 genes and saved their row names in the vector rn:

rn <- rownames(toptable(fit, coef=2, n=500))
rn
rn <- as.numeric(rn)
dat.s <- eset.new[rn,]

I created an object dat.s to store the differentially expressed genes, but I am getting only NAs.
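
For subsetting like this, the row names can be used as the index directly, so the as.numeric() step is not needed. A minimal sketch, assuming eset.new behaves like a matrix with probe IDs as row names (the small matrix expr below is a made-up stand-in, not the real data):

```r
# Stand-in for eset.new: 4 probes x 3 samples, with made-up values.
expr <- matrix(1:12, nrow = 4,
               dimnames = list(c("1557994_at", "205933_at",
                                 "1559688_at", "232837_at"),
                               c("s1", "s2", "s3")))

# Probe IDs of interest, kept as character strings.
rn <- c("205933_at", "232837_at")

# Index rows by name; no numeric conversion involved.
dat.s <- expr[rn, ]
print(dat.s)
```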

10.2 years ago

There are annotation packages for most arrays you'll ever use in R. You'll find that easier than trying to roll your own solution.

>library("hgu133plus2.db")
>d
 [1] "1557994_at"  "205933_at"   "1559688_at"  "232837_at"   "212253_x_at"
 [6] "212845_at"   "233520_s_at" "236931_at"   "205054_at"   "237981_at"  
[11] "209896_s_at" "221718_s_at" "226648_at"   "208195_at"   "211928_at"  
>select(hgu133plus2.db, d, "SYMBOL", "PROBEID")
       PROBEID  SYMBOL
1   1557994_at     TTN
2    205933_at  SETBP1
3   1559688_at   GRAPL
4    232837_at  KIF13A
5  212253_x_at     DST
6    212845_at  SAMD4A
7  233520_s_at   CMYA5
8    236931_at    <NA>
9    205054_at     NEB
10   237981_at   CMYA5
11 209896_s_at  PTPN11
12 221718_s_at  AKAP13
13   226648_at  HIF1AN
14   208195_at     TTN
15   211928_at DYNC1H1

Thanks Ryan, but this will only give me the symbols. I have to run the hypergeometric test using GOstats. Please help if you can.

That's just an example. It looks like GOstats is expecting Entrez Gene IDs, so just use ENTREZID instead of SYMBOL. You could even get the associated GO terms directly if you wanted, by using GO instead (you'd most likely have to roll your own test function then).
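
Roughly, the whole thing could look like the sketch below. It assumes hgu133plus2.db and GOstats are installed from Bioconductor; the function name go_enrichment, its arguments, and the choices of ontology, cutoff and test direction are illustrative assumptions, not something the GOstats documentation prescribes for your data.

```r
# Sketch: map probe IDs to Entrez Gene IDs, then run a GO hypergeometric test.
# go_enrichment() is a hypothetical helper name used only for this example.
go_enrichment <- function(selected_probes, universe_probes,
                          ontology = "BP", p_cutoff = 0.05) {
  # Packages are loaded here so that defining the function has no side effects.
  library(hgu133plus2.db)
  library(GOstats)

  # Probe ID -> Entrez Gene ID, dropping unmapped probes and duplicates.
  to_entrez <- function(probes) {
    map <- AnnotationDbi::select(hgu133plus2.db, keys = probes,
                                 columns = "ENTREZID", keytype = "PROBEID")
    ids <- unique(map$ENTREZID)
    ids[!is.na(ids)]
  }

  params <- new("GOHyperGParams",
                geneIds         = to_entrez(selected_probes),
                universeGeneIds = to_entrez(universe_probes),
                annotation      = "hgu133plus2.db",
                ontology        = ontology,
                pvalueCutoff    = p_cutoff,
                conditional     = FALSE,
                testDirection   = "over")

  # Returns a data frame of GO terms with p-values and counts.
  summary(hyperGTest(params))
}

# Hypothetical usage: rn holds the selected probe IDs (as character strings),
# and the universe is every probe on the array that was tested.
# res <- go_enrichment(rn, rownames(eset.new))
```

The gene universe matters for the p-values, so it should be the set of genes that could have been selected, not the whole genome by default.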

As an aside, you have a full keyboard on your computer. There's no need to use things like "Plz" or "u" or "dis".
