It's always difficult to "guess the identifier" without additional context, but what I think you have there are Affymetrix transcript cluster IDs.
You should first download a probeset annotation file from Affymetrix (account required). In your case, I think this is the appropriate page. Scroll down to "Archived NetAffx Annotation Files".
I downloaded the zip file at the link HuEx-1_0-st-v2 Probeset Annotations, CSV Format, Release 32 (40 MB, 6/23/11) and unzipped it. Here's part of a grep for one of your IDs:
grep 2315633 HuEx-1_0-st-v2.na32.hg19.probeset.csv
"2315637","chr1","+","1167620","1167657","4","2315633","297","407","NM_080605 // B3GALT6 /// ENST00000379198 // B3GALT6","NM_080605 // chr1 // 100 // 1 // 1 // 0 /// ENST00000379198 // chr1 // 100 // 1 // 1 // 0","3","2","4","1","extended","0","0","0","0","0","1","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","main"
"2315638","chr1","+","1167689","1167804","4","2315633","297","408","NM_080605 // B3GALT6 /// ENST00000379198 // B3GALT6","NM_080605 // chr1 // 100 // 4 // 4 // 0 /// ENST00000379198 // chr1 // 100 // 4 // 4 // 0","1","2","0","2","core","0","0","1","2","0","2","0","0","1","0","0","0","0","0","0","0","0","0","0","0","0","0","main"
"2315639","chr1","+","1167873","1167951","4","2315633","297","409","NM_080605 // B3GALT6 /// ENST00000379198 // B3GALT6","NM_080605 // chr1 // 100 // 4 // 4 // 0 /// ENST00000379198 // chr1 // 100 // 4 // 4 // 0","1","2","0","2","core","0","0","1","2","1","4","0","0","1","0","1","1","0","1","0","0","0","0","0","0","0","0","main"
Column 1 is the probeset ID and column 7 is the transcript cluster ID, which is where your ID appears. Now, the problem is that few ID conversion systems use transcript cluster IDs, but many use probeset IDs. So you could use, for example, the R biomaRt package as follows:
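library(biomaRt)
mart.hs <- useMart(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")
# Read the annotation CSV; comment.char = "#" skips any leading "#" header lines
huex <- read.csv("~/Downloads/HuEx-1_0-st-v2.na32.hg19.probeset.csv", comment.char = "#", stringsAsFactors = FALSE)
# Collect every probeset ID that belongs to the transcript cluster of interest
probes <- subset(huex, transcript_cluster_id == "2315633")$probeset_id
# Map probeset IDs to HGNC symbols; listAttributes(mart.hs) shows the valid attribute names
genes <- getBM(attributes = c("affy_huex_1_0_st_v2", "hgnc_symbol"), filters = "affy_huex_1_0_st_v2", values = probes, mart = mart.hs)
genes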
For more information, search this site for "biomart".
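If you only need the probeset IDs and would rather not load the whole file into R, the lookup can also be done from the shell. A plain grep matches the number anywhere on the line, so the sketch below compares column 7 exactly instead. It is only a sketch: the function name probesets_for_cluster is made up for illustration, and it assumes every field is double-quoted, that the first seven columns contain no embedded commas (true of the rows shown above), and that any header comment lines start with "#".

```shell
# Hypothetical helper: print the probeset IDs (column 1) of every row whose
# transcript cluster ID (column 7) is exactly the ID given as $1.
# Usage: probesets_for_cluster 2315633 HuEx-1_0-st-v2.na32.hg19.probeset.csv
probesets_for_cluster() {
    cluster_id="$1"
    csv_file="$2"
    # Splitting on the three characters "," leaves an opening quote on field 1,
    # which is stripped before printing.
    awk -F'","' -v id="$cluster_id" '
        /^#/ { next }             # skip comment lines (assumed to start with "#")
        $7 == id {
            gsub(/"/, "", $1)     # remove the leftover opening quote
            print $1
        }
    ' "$csv_file"
}
```

The output is one probeset ID per line, which can be pasted into any conversion tool that accepts probeset IDs.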
Neilfws: If you have an established pipeline for analysing Exon 1.0 ST arrays (with oligo or any other package), could you please share it, or point me towards a tutorial? I tried to follow the oligo user guide, but I found it confusing. Thanks.
Hi Neilfws, I am trying to map HuGene-2_0-st (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL16686) using the script above, but I get the following error:
Error in getBM(attributes = c("HuGene-2_0-st", "hgnc_symbol"), filters = "HuGene-2_0-st", : Invalid attribute(s): HuGene-2_0-st
I also tried appending _v1 or _v2, but without success. How can I find the actual attribute name? Any suggestions? Thanks.