Question: annotating Agilent Drosophila microarray probes by BLAST
yotiao (United Kingdom) wrote, 3.9 years ago:

I am analysing old Agilent Drosophila microarray data and trying to update the probe annotations, as there have probably been two new Drosophila genome assemblies since the data were acquired. Typically I would do this via biomaRt, but that option no longer exists (i.e. I cannot convert Agilent probe IDs to anything). I would also rather not use the annotation Agilent supplied with the microarray, as it may be far too old and inaccurate.

So the best option (?) is to use the probe sequences themselves and derive gene IDs from them. I figured I could BLAST each probe sequence against the Drosophila genome/transcriptome and take the gene/transcript IDs from the hits (I have >25k probe sequences to run). My question: is there an easier way to do this (that's a lot of blasting and parsing...)? I have been trying to find out whether anyone maintains a conversion between these probes and other identifiers, but haven't found anything (not at UCSC, not at FlyBase). The annotate package is not helpful either.
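
The BLAST route described above could be sketched roughly as follows in Python. This is a minimal sketch under stated assumptions, not a tested pipeline: it assumes the probes are available as (ProbeName, Sequence) pairs, that BLAST+ `blastn` is run separately with `-outfmt 6` (the default 12-column tabular output), and that only full-length, perfect matches of 60 nt probes are wanted. The function names, the thresholds and the identifiers used in any demo are hypothetical.

```python
import csv
from io import StringIO


def write_fasta(probes, handle):
    """Write (probe_name, sequence) pairs as FASTA records."""
    for name, seq in probes:
        handle.write(f">{name}\n{seq}\n")


def best_hits(blast_tab, min_identity=100.0, min_length=60):
    """Return {probe: subject_id} using the highest-bitscore hit per probe.

    Assumes the default 12-column `-outfmt 6` layout:
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    The identity and length thresholds are illustrative assumptions.
    """
    best = {}
    for row in csv.reader(blast_tab, delimiter="\t"):
        # Skip blank lines and comment lines (e.g. from -outfmt 7).
        if not row or row[0].startswith("#"):
            continue
        probe, subject = row[0], row[1]
        identity, length, bitscore = float(row[2]), int(row[3]), float(row[11])
        if identity < min_identity or length < min_length:
            continue
        # Keep the first hit seen at the highest bitscore for each probe.
        if probe not in best or bitscore > best[probe][1]:
            best[probe] = (subject, bitscore)
    return {probe: subject for probe, (subject, _) in best.items()}
```

The FASTA file would be passed to `blastn` via `-query`, and the resulting table read back with `best_hits`. Transcript IDs can then be mapped to gene IDs using whatever transcript annotation was used to build the BLAST database. Probes that hit several transcripts equally well would need extra handling that this sketch does not attempt.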

Antonio R. Franco (Universidad de Córdoba, Spain) wrote, 3.9 years ago:

A question and an answer.

Agilent text data can be read by R packages such as limma, after which you can get the ProbeName used on the array (which is not useful for you) and the SystematicName of each probe. Both are accessible through the arrayname$genes component of the data object (which also contains the row, column, control type, etc.).
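
For anyone working outside R, the same two columns could be pulled out with a few lines of Python. This is only a rough sketch: it assumes a tab-delimited Agilent Feature Extraction text export in which a row beginning with `FEATURES` names the columns and the following rows beginning with `DATA` hold the per-spot values. The function name is hypothetical, and the column names should be checked against the actual file.

```python
import csv
from io import StringIO


def read_agilent_features(handle, columns=("ProbeName", "SystematicName")):
    """Return one dict per feature row, restricted to the requested columns.

    Assumes the FEATURES row carries the column names and that each
    subsequent DATA row carries the values in the same order.
    """
    header = None
    records = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        if row[0] == "FEATURES":
            header = row
        elif row[0] == "DATA" and header is not None:
            # DATA rows before the FEATURES header belong to other blocks
            # of the file and are ignored.
            values = dict(zip(header, row))
            records.append({col: values.get(col, "") for col in columns})
    return records
```

Passing `columns=("ProbeName", "SystematicName", "Sequence")` would also collect the probe sequence, provided the export contains a column of that name.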

The SystematicName usually holds accession names that may be useful to you. Have you had a look at it?

If no SystematicName with useful information is present, then you can use Blast2GO. There are two versions. The PRO version can be used free for a week or so, and will let you search local databases.

Blast2GO will provide you with the best BLAST hit (from any kind of BLAST), along with InterPro domains, EC enzyme codes, KEGG and extensive GO data.


Yes. I have the SystematicName table (it contains a transcript ID in Ensembl/FlyBase format for each probe), but my worry is that it's outdated (these arrays were probably designed 10 years ago). So I am trying to make a new annotation, and the only other information I have is ProbeName and Sequence. Currently none of the packages/websites I tried supports conversion from Agilent ProbeName to Ensembl/FlyBase gene ID. So I am trying it the hard way: aligning the probe sequences and extracting gene/transcript names from the alignments. I just hope there is an easier way.
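
One way to gauge how outdated the SystematicName column really is would be to compare it against the IDs recovered by alignment. The sketch below assumes both annotations are already held as plain dicts mapping ProbeName to transcript ID. The function name and category labels are hypothetical, and it does not account for transcript IDs that were merely renamed between releases.

```python
def compare_annotations(old, new):
    """Group probes by how their old and new transcript IDs relate.

    `old` and `new` are assumed to be dicts of ProbeName -> transcript ID.
    """
    report = {"unchanged": [], "changed": [], "lost": []}
    for probe, old_id in old.items():
        new_id = new.get(probe)
        if new_id is None:
            # The probe no longer has an acceptable alignment.
            report["lost"].append(probe)
        elif new_id == old_id:
            report["unchanged"].append(probe)
        else:
            report["changed"].append(probe)
    return report
```

If most probes land in "unchanged", the original annotation may be usable after all, and only the "changed" and "lost" probes would need manual attention.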

(Reply written 3.8 years ago by yotiao)