Question: KEGGprofile convertId() returns coercion warnings, but I am having trouble tracking down the NA values
12 months ago by
cf5560 wrote:

I attempted to use KEGGprofile's convertId() to convert the gene identifiers for my RNA-seq counts into NCBI gene IDs. I think the genes were given to me in mgi_symbol format. After running convertId() I received the following warning message, repeated until warnings() maxed out:

1: In FUN(newX[, i], ...) : NAs introduced by coercion

This concerned me because I was afraid a substantial portion of my data was now NA. However, I have had trouble tracking the NAs down. (The genes that appear in the header all seem to be converted properly.) Still, I want to find the NAs.
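For context, a warning of the form "In FUN(newX[, i], ...)" is what apply() reports when the function it calls on each column raises a warning. The sketch below reproduces that pattern on a made-up character matrix. Whether convertId() coerces columns this way internally is an assumption, not something confirmed here.

```r
# Toy character matrix: one gene-symbol column plus one numeric-looking column.
# The values are invented for illustration.
m <- cbind(X = c("Zglp1", "Oog3"), USC_1 = c("1.17", "0.00"))

# Coerce every column to numeric and collect any warnings that are raised.
msgs <- character(0)
num <- withCallingHandlers(
  apply(m, 2, as.numeric),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# The symbol column cannot be parsed as numbers, so it becomes all NA,
# while the numeric-looking column converts cleanly.
print(msgs)
print(num)
```

If this is what happens, the NAs would be confined to non-numeric columns such as the symbol column, and the count columns would be unaffected.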

I tried:

> with_id[is.na(with_id[, "X"])]

which returns:


I also tried:

>matmain <-
>which(matmain, arr.ind = TRUE)

but that just returned:

     row col

Any advice for figuring out where they have disappeared to?
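A minimal sketch of one way to search for the missing values, assuming the converted object is a character matrix with a symbol column named "X". The matrix below is a made-up stand-in for with_id. It looks for true NA entries and, separately, for cells in the count columns that are not NA but cannot be read as numbers.

```r
# Toy stand-in for the converted matrix (values are invented).
with_id <- matrix(
  c("Zglp1", "Vmn2r65", "Gm10024",
    "1.167246e+00", "0.000000e+00", NA,
    "1.314248e+00", "n/a", "0.000000e+00"),
  nrow = 3,
  dimnames = list(c("100009600", "100009609", "100009614"),
                  c("X", "USC_1", "USC_2"))
)

# True missing values anywhere in the matrix, as row/column indices.
na_pos <- which(is.na(with_id), arr.ind = TRUE)

# Cells in the count columns that are present but fail numeric conversion.
counts <- with_id[, colnames(with_id) != "X", drop = FALSE]
as_num <- suppressWarnings(apply(counts, 2, as.numeric))
bad_pos <- which(is.na(as_num) & !is.na(counts), arr.ind = TRUE)

print(na_pos)
print(bad_pos)
```

An empty result from both checks would suggest that the warning came from a column that was never meant to be numeric, rather than from lost counts.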

Here are the first few rows of my converted matrix:

              X                USC_1          USC_2          USC_3          USC_NDT_1      USC_3DT_1     
100009600     "Zglp1"          "1.167246e+00" "1.314248e+00" "1.414709e-01" "2.111822e-01" "3.060517e-01"
100009609     "Vmn2r65"        "0.000000e+00" "0.000000e+00" "0.000000e+00" "0.000000e+00" "0.000000e+00"
100009614     "Gm10024"        "0.000000e+00" "0.000000e+00" "0.000000e+00" "0.000000e+00" "6.987166e-02"
100012        "Oog3"           "0.000000e+00" "0.000000e+00" "0.000000e+00" "0.000000e+00" "4.143595e-02"
100017        "Ldlrap1"        "1.153021e+01" "1.048102e+01" "1.175961e+01" "6.312788e+00" "1.315837e+01"
100019        "Mdn1"           "2.269278e+00" "1.780199e+00" "3.189900e+00" "1.276779e+00" "2.080692e+00"
100034251     "Wfdc17"         "0.000000e+00" "0.000000e+00" "0.000000e+00" "0.000000e+00" "0.000000e+00"
100034361     "Mfap1b"         "2.980608e+01" "2.990117e+01" "2.839012e+01" "2.375291e+01" "3.852410e+01"

(Oddly, it also gained 275 rows according to dim().)
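One possible explanation for extra rows after an ID conversion is a one-to-many mapping, where a single symbol matches more than one target ID. This is only a guess and is not confirmed for convertId(). The sketch below uses a hypothetical lookup table with invented IDs to show how such a mapping inflates the row count, and how duplicated and unmapped symbols could be listed.

```r
# Hypothetical symbol-to-ID lookup; "Aars" is given two IDs purely for illustration.
id_map <- data.frame(
  symbol = c("Aaas", "Aars", "Aars"),
  new_id = c("id_1", "id_2", "id_3"),
  stringsAsFactors = FALSE
)

# Toy expression table keyed by symbol (values are invented).
expr <- data.frame(
  symbol = c("Aaas", "Aars", "Abat"),
  USC_1  = c(1.2, 0.0, 3.4),
  stringsAsFactors = FALSE
)

# A left join keeps every input row and repeats rows that match several IDs.
merged <- merge(expr, id_map, by = "symbol", all.x = TRUE)

extra_rows  <- nrow(merged) - nrow(expr)
dup_symbols <- unique(id_map$symbol[duplicated(id_map$symbol)])
unmapped    <- merged$symbol[is.na(merged$new_id)]

print(extra_rows)
print(dup_symbols)
print(unmapped)
```

Comparing the number of duplicated symbols in the real mapping against the 275 extra rows would show whether this explanation fits.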

On a vaguely related note, I was hoping to get another pair of eyes on my choice of filter. I thought that these identifiers fell under the "mgi_symbol" mart filter (for "mmusculus_gene_ensembl"), but I am not sure. Some of my gene identifiers:

 [571] "A830031A19Rik"  "A830080D01Rik"  "A930002H24Rik"  "A930003A15Rik"  "A930004D18Rik"  "A930009A15Rik" 
 [577] "A930017K11Rik"  "A930018M24Rik"  "A930018P22Rik"  "A930033H14Rik"  "AA467197"       "AA792892"      
 [583] "AA986860"       "Aaas"           "Aacs"           "Aadac"          "Aadacl2"        "Aadacl3"       
 [589] "Aadacl4"        "Aadat"          "Aaed1"          "Aagab"          "Aak1"           "Aamdc"         
 [595] "Aamp"           "Aanat"          "Aar2"           "Aard"           "Aars"           "Aars2"         
 [601] "Aarsd1"         "Aasdh"          "Aasdhppt"       "Aass"           "Aatf"           "Aatk"          
 [607] "AB124611"       "Abat"           "Abca1"          "Abca12"         "Abca13"         "Abca14"
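As a rough sanity check on the identifier type, the sketch below sorts identifiers into three coarse groups by pattern alone: mouse Ensembl gene IDs (ENSMUSG followed by 11 digits), purely numeric IDs that look like NCBI gene IDs, and everything else, labelled symbol-like. The function name classify_id is hypothetical, and a pattern match cannot confirm that a symbol-like string is a valid mgi_symbol; that still needs a lookup against the annotation source.

```r
# A few identifiers taken from the list above.
ids <- c("A830031A19Rik", "AA467197", "Aaas", "Abca1", "AB124611")

# Heuristic classifier based only on the shape of the identifier.
classify_id <- function(x) {
  ifelse(grepl("^ENSMUSG[0-9]{11}$", x), "ensembl_gene_id",
         ifelse(grepl("^[0-9]+$", x), "entrez_like", "symbol_like"))
}

print(table(classify_id(ids)))
```

If every identifier falls in the symbol-like group, a symbol-based filter is at least plausible, though outdated or alias symbols could still fail to map.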

Thanks for the help!

