Question: Annotating exon array probeset IDs and transcript cluster IDs using biomaRt
6.2 years ago, cqtnljy (China) wrote:

Hello everybody!

This is my first time working with the Affymetrix Human Exon 1.0 ST array. I processed the data with Affymetrix Power Tools (APT) and R, which gave me two ExpressionSets, one at the exon level and one at the gene level. I have a question about annotating them with biomaRt.

I can annotate the probeset IDs in the exon-level ExpressionSet with biomaRt, e.g.:

library(biomaRt)
ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
getBM(attributes = c("affy_huex_1_0_st_v2", "hgnc_symbol"), filters = "affy_huex_1_0_st_v2", values = "2315588", mart = ensembl)  # 2315588 is a probeset ID

But I cannot annotate the transcript cluster IDs in the gene-level ExpressionSet with biomaRt alone, e.g.:

getBM(attributes = c("affy_huex_1_0_st_v2", "hgnc_symbol"), filters = "affy_huex_1_0_st_v2", values = "2316379", mart = ensembl)  # 2316379 is a transcript cluster ID

Is there no direct annotation of transcript cluster IDs in biomaRt, or is there an error in my code?

Thank you very much in advance.
6.2 years ago, komal.rathi (Children's Hospital of Philadelphia, Philadelphia, PA) wrote:

cqtnljy,

Your code is correct. The attribute you are using, 'affy_huex_1_0_st_v2', contains only the probeset IDs for the Affymetrix Human Exon 1.0 ST array, which is why you were able to retrieve data by probeset ID. It is the only ID type available for this array in biomaRt. You cannot search by transcript cluster ID because biomaRt has no attribute associated with it.
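To check which array-related fields a mart exposes, one option is to search the tables returned by listAttributes() and listFilters(). The sketch below is only illustrative: find_mart_fields is a hypothetical helper name, not part of biomaRt, and the live query is guarded because it needs the biomaRt package and network access.

```r
# Hypothetical helper: keep the rows of an attribute/filter table whose
# 'name' column matches a pattern. listAttributes() and listFilters()
# both return data frames with a 'name' column.
find_mart_fields <- function(tbl, pattern) {
  tbl[grepl(pattern, tbl$name, ignore.case = TRUE), , drop = FALSE]
}

# Live query against Ensembl; only runs in an interactive session
# where biomaRt is installed.
if (interactive() && requireNamespace("biomaRt", quietly = TRUE)) {
  ensembl <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
  print(find_mart_fields(biomaRt::listAttributes(ensembl), "huex"))
  print(find_mart_fields(biomaRt::listFilters(ensembl), "huex"))
}
```

If the only rows that come back are probeset-level fields, that is consistent with the explanation above.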

UPDATE:

Alternative Method using Bioconductor AnnotationData Packages:

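# biocLite() has been retired; current Bioconductor releases install packages via BiocManager
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("huex10sttranscriptcluster.db")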
library(huex10sttranscriptcluster.db)
Annot <- data.frame(SYMBOL=sapply(contents(huex10sttranscriptclusterSYMBOL), paste, collapse=","),
                    DESC=sapply(contents(huex10sttranscriptclusterGENENAME), paste, collapse=","),
                    ENSEMBLID=sapply(contents(huex10sttranscriptclusterENSEMBL), paste, collapse=","))

# The rownames are transcript cluster IDs here, so you can access your ID like:

Annot[grep('2316379',rownames(Annot)),]
        SYMBOL               DESC       ENSEMBLID
2316379    SKI SKI proto-oncogene ENSG00000157933
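
Once such a table exists, it can be attached to gene-level expression values by exact ID lookup. The following is a minimal sketch under stated assumptions: the one-row Annot stands in for the real table (only the 2316379 row comes from the output above), and expr is a hypothetical matrix with placeholder values and a made-up second ID. With a real ExpressionSet, the matrix would typically come from Biobase's exprs().

```r
# Stand-in for the Annot table; only this row is taken from the thread.
Annot <- data.frame(SYMBOL = "SKI",
                    DESC = "SKI proto-oncogene",
                    ENSEMBLID = "ENSG00000157933",
                    row.names = "2316379",
                    stringsAsFactors = FALSE)

# Hypothetical gene-level expression matrix: rows are transcript cluster IDs,
# columns are samples. Values and the ID "9999999" are placeholders.
expr <- matrix(c(7.1, 7.3, 5.0, 5.2), nrow = 2, byrow = TRUE,
               dimnames = list(c("2316379", "9999999"),
                               c("sample1", "sample2")))

# match() does an exact lookup; grep() would also return IDs that merely
# contain the search string. Unannotated IDs get NA.
idx <- match(rownames(expr), rownames(Annot))
annotated <- data.frame(ID = rownames(expr),
                        SYMBOL = Annot$SYMBOL[idx],
                        expr,
                        stringsAsFactors = FALSE)
print(annotated)
```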
6.2 years ago, cqtnljy (China) wrote:

komal.rathi,

Thanks very much! Is there a simple way to convert transcript cluster IDs to gene symbols? In the 'HuGene-1_0-st-v1.na33.2.hg19.transcript.csv' file, many symbols map to a single transcript cluster ID in the gene_assignment column, and I don't know how to deal with that.
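
One possible way to handle a multi-symbol gene_assignment entry is to collapse it to its unique symbols. The sketch below assumes the usual Affymetrix layout, in which each assignment reads "accession // symbol // description // cytoband // entrez_id", multiple assignments are joined by " /// ", and unassigned clusters hold "---". extract_symbols is a hypothetical helper, and the example string is a made-up placeholder, not a row from the file.

```r
# Hypothetical helper: pull the unique gene symbols out of one
# gene_assignment string, under the layout assumed above.
extract_symbols <- function(gene_assignment) {
  if (is.na(gene_assignment) || gene_assignment == "---") return(NA_character_)
  # Split into individual assignments first, then into their fields.
  assignments <- strsplit(gene_assignment, " /// ", fixed = TRUE)[[1]]
  symbols <- vapply(strsplit(assignments, " // ", fixed = TRUE),
                    function(fields) {
                      if (length(fields) >= 2) trimws(fields[2]) else NA_character_
                    },
                    character(1))
  symbols <- unique(symbols[!is.na(symbols) & symbols != "---"])
  if (length(symbols) == 0) NA_character_ else paste(symbols, collapse = ",")
}

# Placeholder string in the assumed layout: three assignments, two distinct symbols.
example <- paste("ACC_1 // GENE_A // description A // 1p36 // 111",
                 "ACC_2 // GENE_A // description A // 1p36 // 111",
                 "ACC_3 // GENE_B // description B // 1p36 // 222",
                 sep = " /// ")
print(extract_symbols(example))
```

Whether to keep all symbols, only the first, or to drop clusters that map to several genes is a judgment call that depends on the downstream analysis; this sketch simply keeps every distinct symbol.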


In your question you are referring to HuGene 1.0 st v2 and here you are looking at v1. Which one are you working on exactly? v1 and v2 have different probe/transcript IDs.

Reply by komal.rathi, 6.2 years ago

Can you show some Transcript Cluster IDs that you are working on? That might tell us what annotation you are really working on.

Reply by komal.rathi, 6.2 years ago

Anyway, I have updated my answer. Please check.

Reply by komal.rathi, 6.2 years ago

I am working on v2; that file name was a mistake. I used your updated code to convert the transcript cluster IDs to gene symbols successfully. Thank you so much!

Reply by cqtnljy, 6.2 years ago