How To Filter Affymetrix Probes Related To Murine Genes Encoding Membrane Proteins
11.8 years ago

Hello,

I have a set of murine Affymetrix probes of interest, and I am wondering which ones are related to genes encoding membrane proteins.

My plan is to first map the probes of interest to their genes and then filter these genes by GO terms such as "plasma membrane".

I was wondering if someone has a better solution that I haven't thought of.

Thanks in advance. Fred

affymetrix gene annotation mouse gene • 3.9k views
11.8 years ago
Ian Simpson ▴ 960

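As Tony suggested, you can do this easily in R using biomaRt:

library(biomaRt)
ensembl <- useMart('ensembl', dataset = 'mmusculus_gene_ensembl')
# get all probe-set IDs for one array (MG_U74A, just for this example)
affys <- getBM(mart = ensembl, attributes = 'affy_mg_u74a')
# just take the first 100 for this example (a one-column result becomes a plain vector)
affys <- affys[1:100, ]
# get the genes for those probe sets that carry GO:0005886 (plasma membrane)
# note: attribute names change between Ensembl releases; the original post used
# 'mgi_automatic_gene_symbol' and 'mgi_curated_gene_symbol', now replaced by 'mgi_symbol'
affy_genes <- getBM(
    mart = ensembl,
    attributes = c('mgi_symbol', 'affy_mg_u74a'),
    filters = c('affy_mg_u74a', 'go'),
    values = list(affys, 'GO:0005886')
)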

This is just an example. Use the listAttributes() and listFilters() functions to check which attributes, filters and array set you want to use. Hope this is useful.

I should probably add that you need to watch out for promiscuous probe sets, i.e. ones that map to more than one gene. This is discussed in another answer on this site: "Is It Possible For Two Different Affymetrix Probe Set Id To Have Common Annotations To Same Gene ?"
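One way to spot such probe sets once you have a probe-set-to-gene table is to count the distinct genes per probe set. This is only a minimal sketch in base R: the probe-set IDs, the gene symbols and the column names `probeset` and `symbol` are made up for illustration, and in practice the table would come from something like the getBM() call above.

```r
# Toy probe-set-to-gene mapping; IDs and symbols are hypothetical.
mapping <- data.frame(
  probeset = c("100001_at", "100002_at", "100002_at", "100003_at", "100003_at"),
  symbol   = c("GeneA", "GeneB", "GeneC", "GeneD", "GeneD"),
  stringsAsFactors = FALSE
)

# Drop duplicated rows so a repeated probe-set/gene pair is counted once.
mapping <- unique(mapping)

# Count the distinct genes attached to each probe set.
genes_per_probeset <- tapply(mapping$symbol, mapping$probeset,
                             function(x) length(unique(x)))

# Probe sets that map to more than one gene.
promiscuous <- names(genes_per_probeset)[genes_per_probeset > 1]

# Keep only the unambiguous probe sets for downstream filtering.
unambiguous <- mapping[!mapping$probeset %in% promiscuous, ]
```

Whether to drop the ambiguous probe sets or keep them and flag them depends on the analysis; dropping is simply the more conservative choice.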

Ian, thanks a lot for the detailed response, but I was looking more for filtering suggestions like the one Daniel gave. Maybe my question was not well formulated. Anyway, thanks for the code sample.

You're welcome. As you probably noticed, BioMart offers a very large number of filtering options, which you could certainly combine into detailed analyses. The benefit of using Ensembl is that it essentially syndicates a lot of data that you would otherwise have to trawl multiple databases (each with a different access method) to find. The pipelines also include the output of many predictive tools that you might not have realised was hidden in there!

11.8 years ago
toni ★ 2.2k

Hello,

If you or someone around you knows R programming, I would consider using the biomaRt R package. It could be a straightforward way to do such a task.

Cheers

tony

This is not of much use unless you provide an example. Obviously it can be done in many ways, one of which happens to be R.

True. I will take the time to give an example next time. In this case, I just wanted to point out the biomaRt package, which is useful for this task.

11.8 years ago
User 59 13k

You could also, once you have found a method to translate your identifiers, look for matches with genes that exist within existing membrane protein databases.

A quick Google search suggests that a couple exist and are up to date, and I'm sure the NAR database issue would supply half a dozen more.

PDBTM: http://pdbtm.enzim.hu/

MPDB: http://www.mpdb.tcd.ie/

The former at least offers downloads (the latter does not appear to) which would facilitate bringing the data into whatever package you're using for your array analysis.

I'd be interested to see how the GO term approach intersects with this one.
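Once both gene lists are available, the comparison itself is a plain set operation. The sketch below uses base R with hypothetical gene symbols; `go_genes` stands in for the genes passing the GO filter and `db_genes` for those found in a downloaded membrane protein database, neither of which is real data.

```r
# Hypothetical gene symbols from the two approaches.
go_genes <- c("GeneA", "GeneB", "GeneC", "GeneD")  # passed the GO term filter
db_genes <- c("GeneC", "GeneD", "GeneE")           # listed in a membrane protein database

# Genes supported by both approaches, and those found by only one of them.
both    <- intersect(go_genes, db_genes)
go_only <- setdiff(go_genes, db_genes)
db_only <- setdiff(db_genes, go_genes)

# Compact summary of the overlap.
overlap <- c(both = length(both),
             go_only = length(go_only),
             db_only = length(db_only))
```

The two lists need to use the same identifier type (for example MGI symbols) before being compared, otherwise the intersection will be misleadingly small.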

11.6 years ago
Andrew Su 4.9k

Others have covered the use of BioMart and GO annotations -- seems like a perfectly reasonable route. But I'm not sure how complete the GO annotations for localization are. My understanding is that there are membrane-bound proteins that are not annotated with the corresponding GO term (but I don't have a concrete example unfortunately).

Depending on your tolerance for false negatives, you might also consider downloading Phobius and running it locally against a protein sequence file. For example, you might download the RefSeq protein FASTA file from NCBI's FTP site.

11.6 years ago
Satish Gupta ▴ 40

Hi Fred,

If you have a set of probe sets, just go to the DAVID tool, where there is an option to paste your probe set IDs. You will get the annotation there according to your interest. Hope you will find it useful.

Cheers Satish

11.4 years ago
Shigeta ▴ 160

Affymetrix annotations also have TMHMM predictions for all the genes associated with the probe sets. They are in the annotation.csv file, which is updated regularly. You can just do a quick search on that column.
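A rough sketch of that search in base R is shown below. It runs on a three-row toy table rather than a real file, and it assumes that the column is named "Trans Membrane", that "---" marks a missing prediction, and that header comment lines start with "#". Those are assumptions about the annotation file layout that you should check against your own copy.

```r
# Toy stand-in for the annotation CSV; IDs, symbols and predictions are made up.
csv_text <- '"Probe Set ID","Gene Symbol","Trans Membrane"
"100001_at","GeneA","---"
"100002_at","GeneB","NP_000001 // span:21-43 // numtm:1"
"100003_at","GeneC","NP_000002 // span:10-32,45-67 // numtm:2"'

# For a real file, replace text = csv_text with the path to the annotation CSV.
annot <- read.csv(text = csv_text, check.names = FALSE,
                  stringsAsFactors = FALSE, comment.char = "#")

# A probe set counts as transmembrane if the column holds a prediction
# rather than the assumed "---" placeholder or an empty string.
tm_col <- annot[["Trans Membrane"]]
has_tm <- nzchar(tm_col) & tm_col != "---"

# Probe sets whose gene products have at least one predicted TM segment.
tm_probesets <- annot[has_tm, "Probe Set ID"]
```

Intersecting `tm_probesets` with your probes of interest (for example with intersect()) then gives the subset linked to predicted transmembrane proteins.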
