Question: Mapping Between Affymetrix Features To Genes
chengzhao41 (Canada) wrote, 5.0 years ago:

I'm working with gene expression data. The platform is [HuEx-1_0-st] Affymetrix Human Exon 1.0 ST Array [transcript (gene) version].

My questions are:

1) What do the features represent? "2315554" "2315633" "2315674" "2315739" "2315894" "2315918" "2315951" "2316218" "2316245" "2316379"

2) How do I map them to genes?

I tried using DAVID to convert the identifiers to gene symbols, but it is unable to convert some of them.

I then tried using NetAffx, but I do not know what to select for the "Query For" option: Transcript Clusters, Exon Probe set, or probe set.

Is there an automated way of getting the gene symbols in R?

Neilfws (Sydney, Australia) wrote, 5.0 years ago:
  1. It's always difficult to "guess the identifier" without additional context, but what I think you have there are Affymetrix transcript cluster IDs.

  2. You should first download a probeset annotation file from Affymetrix (account required). In your case, I think this is the appropriate page. Scroll down to "Archived NetAffx Annotation Files".

I downloaded the zip file at the link HuEx-1_0-st-v2 Probeset Annotations, CSV Format, Release 32 (40 MB, 6/23/11) and unzipped it. Here's part of a grep for one of your IDs:

grep 2315633 HuEx-1_0-st-v2.na32.hg19.probeset.csv
"2315637","chr1","+","1167620","1167657","4","2315633","297","407","NM_080605 // B3GALT6 /// ENST00000379198 // B3GALT6","NM_080605 // chr1 // 100 // 1 // 1 // 0 /// ENST00000379198 // chr1 // 100 // 1 // 1 // 0","3","2","4","1","extended","0","0","0","0","0","1","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","main"
"2315638","chr1","+","1167689","1167804","4","2315633","297","408","NM_080605 // B3GALT6 /// ENST00000379198 // B3GALT6","NM_080605 // chr1 // 100 // 4 // 4 // 0 /// ENST00000379198 // chr1 // 100 // 4 // 4 // 0","1","2","0","2","core","0","0","1","2","0","2","0","0","1","0","0","0","0","0","0","0","0","0","0","0","0","0","main"
"2315639","chr1","+","1167873","1167951","4","2315633","297","409","NM_080605 // B3GALT6 /// ENST00000379198 // B3GALT6","NM_080605 // chr1 // 100 // 4 // 4 // 0 /// ENST00000379198 // chr1 // 100 // 4 // 4 // 0","1","2","0","2","core","0","0","1","2","1","4","0","0","1","0","1","1","0","1","0","0","0","0","0","0","0","0","main"

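Column 1 is the probeset ID and column 7 is the transcript cluster ID that the probeset belongs to, so each transcript cluster groups several probesets. Now, the problem is that few ID conversion systems use transcript cluster IDs, but many use probeset IDs. So you could use, for example, the R biomaRt package as follows: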

library(biomaRt)
# connect to the Ensembl human gene dataset
mart.hs <- useMart(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")
# read the Affymetrix probeset annotation file
huex <- read.table("~/Downloads/HuEx-1_0-st-v2.na32.hg19.probeset.csv", sep = ",", stringsAsFactors = FALSE, header = TRUE)
# get probeset IDs for transcript cluster 2315633
probes <- subset(huex, transcript_cluster_id == "2315633")$probeset_id
# get gene symbols for those probesets
genes <- getBM(attributes = c("affy_huex_1_0_st_v2", "hgnc_symbol"), filters = "affy_huex_1_0_st_v2", values = probes, mart = mart.hs)
genes
#  affy_huex_1_0_st_v2 hgnc_symbol
#1             2315638     B3GALT6
#2             2315642     B3GALT6
#3             2315639     B3GALT6
#4             2315643     B3GALT6
#5             2315644     B3GALT6
#6             2315640     B3GALT6
#7             2315637     B3GALT6
#8             2315645     B3GALT6
#9             2315641     B3GALT6
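
The result above is per probeset, while the IDs in the question are transcript clusters. One way to get back to the cluster level is to join the biomaRt result to the annotation table on the probeset ID. The following is only a sketch: the two data frames are small stand-ins built from values shown above, not the real objects, and the name `cluster_genes` is made up for illustration.

```r
# Toy stand-ins for the real `huex` (annotation file) and `genes` (getBM result),
# using three of the probesets shown above for transcript cluster 2315633.
huex <- data.frame(
  probeset_id = c("2315637", "2315638", "2315639"),
  transcript_cluster_id = c("2315633", "2315633", "2315633"),
  stringsAsFactors = FALSE
)
genes <- data.frame(
  affy_huex_1_0_st_v2 = c("2315638", "2315639", "2315637"),
  hgnc_symbol = c("B3GALT6", "B3GALT6", "B3GALT6"),
  stringsAsFactors = FALSE
)

# With real data the key columns may have different types (integer vs character),
# so coerce both to character before joining.
huex$probeset_id <- as.character(huex$probeset_id)
genes$affy_huex_1_0_st_v2 <- as.character(genes$affy_huex_1_0_st_v2)

# Join the gene symbols back to the annotation table on probeset ID
merged <- merge(
  huex[, c("probeset_id", "transcript_cluster_id")], genes,
  by.x = "probeset_id", by.y = "affy_huex_1_0_st_v2"
)

# Keep one row per (transcript cluster, gene symbol) pair
cluster_genes <- unique(merged[, c("transcript_cluster_id", "hgnc_symbol")])
cluster_genes
```

With the full annotation table, the same join should work for many transcript clusters at once if `probes` is built with `transcript_cluster_id %in% ids` rather than a single `==` comparison.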

For more information, search this site for "biomart".
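
As an offline alternative, the annotation file itself already carries a gene assignment (the tenth field in the grep output above, e.g. "NM_080605 // B3GALT6 /// ENST00000379198 // B3GALT6"). A rough sketch of parsing it in base R follows. It assumes that entries are separated by " /// " and fields within an entry by " // ", with the symbol as the second field, and that the column is named `gene_assignment`. The helper `extract_symbols` is hypothetical, and the data frame is a toy copy of the three rows shown above.

```r
# Toy stand-in for the annotation table: three rows copied from the grep output.
# The column name gene_assignment is an assumption about the file header.
huex <- data.frame(
  probeset_id = c("2315637", "2315638", "2315639"),
  transcript_cluster_id = c("2315633", "2315633", "2315633"),
  gene_assignment = rep("NM_080605 // B3GALT6 /// ENST00000379198 // B3GALT6", 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pull the unique gene symbols out of one assignment string.
# Entries without a second field (e.g. an unassigned placeholder) are dropped.
extract_symbols <- function(assignment) {
  entries <- strsplit(assignment, " /// ", fixed = TRUE)[[1]]
  fields <- strsplit(entries, " // ", fixed = TRUE)
  symbols <- vapply(
    fields,
    function(f) if (length(f) >= 2) f[2] else NA_character_,
    character(1)
  )
  unique(symbols[!is.na(symbols)])
}

# Collapse the symbols of all probesets in each transcript cluster into one string
symbols_by_cluster <- tapply(
  huex$gene_assignment, huex$transcript_cluster_id,
  function(x) paste(unique(unlist(lapply(x, extract_symbols))), collapse = ";")
)
symbols_by_cluster
```

This avoids a network query, but the symbols are only as current as the annotation release, so the biomaRt route may give more up-to-date names.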


Bioinformatist Newbie commented, 2.9 years ago:

Neilfws: If you have an established pipeline for analysing Exon 1.0 ST arrays (with oligo or any other package), could you please share it, or point me towards a tutorial? I tried to follow the user guide of the oligo package, but I found it confusing. Thanks.