KEGGprofile convertId() returns coercion warnings, but I cannot track down the NA values
5.9 years ago
cf556 ▴ 10

I attempted to use KEGGprofile's convertId() to convert the gene identifiers of my RNA-seq counts to NCBI gene IDs. I think the genes were given to me in mgi_symbol format. After running convertId() I received this warning message, repeated until warnings() maxed out:

1: In FUN(newX[, i], ...) : NAs introduced by coercion
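
For context, `FUN(newX[, i], ...)` is how `apply()` labels the function it calls, so the warning presumably comes from a numeric conversion applied column by column. The base-R sketch below produces the same kind of warning on a made-up matrix. `toy` is a hypothetical stand-in, and it is only an assumption that convertId() does something similar internally:

```r
# Toy character matrix: one column of gene symbols, one of counts stored as text.
# 'toy' is a hypothetical stand-in, not the real data.
toy <- matrix(c("Zglp1", "Mdn1", "1.17", "2.27"),
              nrow = 2,
              dimnames = list(NULL, c("X", "USC_1")))

# Coercing every column to numeric turns the symbol column into NA.
# Without suppressWarnings() this prints "NAs introduced by coercion".
num <- suppressWarnings(apply(toy, 2, as.numeric))

# Count NAs per column: all of column "X", none of "USC_1".
na_per_column <- colSums(is.na(num))
print(na_per_column)
```

If that is what happens, the warnings would come from the symbol column alone and the count columns would be unaffected.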

I was very concerned by this because I was afraid a substantial portion of my data was now NA. However, I have had trouble tracking the NAs down. (The genes that appear in the header all seem to be converted properly.) Still, I wanted to find the NAs.

I tried:

> with_id[is.na(with_id[, "X"])]

which returns:

character(0)

I also tried:

> matmain <- is.na(with_id)
> which(matmain, arr.ind = TRUE)

but that just returned:

     row col
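
Because the converted matrix is character, `is.na()` only finds true missing values and not text such as "NA" or entries that fail to parse as numbers. A minimal sketch of a stricter check, assuming the symbols sit in column "X" and every other column should be numeric (`toy` and its values are made up for illustration):

```r
# Hypothetical stand-in for the converted matrix (character, like with_id).
toy <- matrix(c("Zglp1", "Mdn1", "Oog3",
                "1.17",  "NA",   "abc"),
              nrow = 3,
              dimnames = list(c("100009600", "100019", "100012"),
                              c("X", "USC_1")))

# is.na() sees no missing values: the "NA" above is ordinary text.
true_na <- which(is.na(toy), arr.ind = TRUE)

# Convert only the count columns and flag entries that do not parse as numbers.
counts <- toy[, colnames(toy) != "X", drop = FALSE]
as_num <- suppressWarnings(matrix(as.numeric(counts),
                                  nrow = nrow(counts),
                                  dimnames = dimnames(counts)))
bad <- which(is.na(as_num), arr.ind = TRUE)
print(bad)
```

An empty `bad` on the real matrix would suggest the count columns survived the conversion intact.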

Any advice for figuring out where they have disappeared to?

Here are the first few rows of my converted matrix:

              X                USC_1          USC_2          USC_3          USC_NDT_1      USC_3DT_1     
100009600     "Zglp1"          "1.167246e+00" "1.314248e+00" "1.414709e-01" "2.111822e-01" "3.060517e-01"
100009609     "Vmn2r65"        "0.000000e+00" "0.000000e+00" "0.000000e+00" "0.000000e+00" "0.000000e+00"
100009614     "Gm10024"        "0.000000e+00" "0.000000e+00" "0.000000e+00" "0.000000e+00" "6.987166e-02"
100012        "Oog3"           "0.000000e+00" "0.000000e+00" "0.000000e+00" "0.000000e+00" "4.143595e-02"
100017        "Ldlrap1"        "1.153021e+01" "1.048102e+01" "1.175961e+01" "6.312788e+00" "1.315837e+01"
100019        "Mdn1"           "2.269278e+00" "1.780199e+00" "3.189900e+00" "1.276779e+00" "2.080692e+00"
100034251     "Wfdc17"         "0.000000e+00" "0.000000e+00" "0.000000e+00" "0.000000e+00" "0.000000e+00"
100034361     "Mfap1b"         "2.980608e+01" "2.990117e+01" "2.839012e+01" "2.375291e+01" "3.852410e+01"

(Oddly, it also gained 275 rows according to dim().)
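
One way to investigate the extra rows is to look for symbols that occur on more than one row after conversion, since a one-to-many mapping from symbols to NCBI gene IDs would add rows. That cause is only a guess and has not been confirmed for convertId(). The sketch below uses a made-up matrix, and the gene ID "999999" is invented for illustration:

```r
# Hypothetical stand-in for the converted matrix; row names are gene IDs.
toy <- matrix(c("Zglp1", "Mdn1", "Mdn1",
                "1.17",  "2.27", "2.27"),
              nrow = 3,
              dimnames = list(c("100009600", "100019", "999999"),
                              c("X", "USC_1")))

# Symbols that appear on more than one row.
symbols <- unname(toy[, "X"])
dup_symbols <- unique(symbols[duplicated(symbols)])

# Every row carrying a duplicated symbol, with its gene ID as the row name.
dup_rows <- toy[symbols %in% dup_symbols, , drop = FALSE]
print(dup_rows)
```

Comparing `nrow(dup_rows)` with the number of gained rows would show whether duplicated symbols account for the difference.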

On a vaguely related note, I was hoping to get another pair of eyes on my choice of filter. I thought that these identifiers fell under the "mgi_symbol" mart filter (for "mmusculus_gene_ensembl"), but I am not sure. Some of my gene identifiers:

[571] "A830031A19Rik"  "A830080D01Rik"  "A930002H24Rik"  "A930003A15Rik"  "A930004D18Rik"  "A930009A15Rik" 
 [577] "A930017K11Rik"  "A930018M24Rik"  "A930018P22Rik"  "A930033H14Rik"  "AA467197"       "AA792892"      
 [583] "AA986860"       "Aaas"           "Aacs"           "Aadac"          "Aadacl2"        "Aadacl3"       
 [589] "Aadacl4"        "Aadat"          "Aaed1"          "Aagab"          "Aak1"           "Aamdc"         
 [595] "Aamp"           "Aanat"          "Aar2"           "Aard"           "Aars"           "Aars2"         
 [601] "Aarsd1"         "Aasdh"          "Aasdhppt"       "Aass"           "Aatf"           "Aatk"          
 [607] "AB124611"       "Abat"           "Abca1"          "Abca12"         "Abca13"         "Abca14"

Thanks for the help!

R bioconductor • 979 views