Question

annotating Agilent Drosophila microarray probes by BLAST

8.8 years ago

yotiao • 0

Hello,

I am analysing old Agilent Drosophila microarray data and trying to update the probe annotations, as there have probably been two new Drosophila genome assemblies since the data were acquired. Typically I would do this via biomaRt, but that option no longer exists (i.e. I cannot convert Agilent probe IDs to anything). I'd rather not use the annotation Agilent provided with the microarray, as it may be far too old and inaccurate.

So the best option (?) is to use the probe sequences themselves and derive gene IDs from them. I figured I could BLAST each probe sequence against the Drosophila genome/transcriptome and take gene/transcript IDs from the BLAST hits (I have >25k probe sequences to run). My question: is there an easier way to do this (that's a lot of blasting and parsing...)? I have looked for anyone who maintains a conversion between probes and other identifiers, but haven't found anything (not at UCSC, not at FlyBase). The annotate package is not helpful either.
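For illustration, the blasting-and-parsing step could look roughly like the sketch below. It assumes the probes have been run with something like `blastn -outfmt 6` (the standard 12-column tabular output) against a transcript database. The probe names, transcript IDs and the identity/length thresholds are hypothetical placeholders, not values taken from the array.

```python
import csv
import io


def write_fasta(probes, handle):
    """Write (probe_name, sequence) pairs as FASTA records for use as a BLAST query."""
    for name, seq in probes:
        handle.write(f">{name}\n{seq}\n")


def best_hits(blast_tab, min_identity=100.0, min_length=60):
    """Return {probe: (subject_id, bitscore)} from BLAST tabular output.

    Assumes the default -outfmt 6 column order:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    The thresholds are assumptions (full-length perfect match of a 60-mer probe).
    """
    best = {}
    for row in csv.reader(blast_tab, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        identity, length, bitscore = float(row[2]), int(row[3]), float(row[11])
        if identity < min_identity or length < min_length:
            continue  # discard partial or mismatched alignments
        # keep the highest-scoring hit; ties keep the first one seen
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


# Hypothetical probe names and transcript IDs, for demonstration only.
fasta = io.StringIO()
write_fasta([("probe_A", "ACGT" * 15), ("probe_B", "TTGCA" * 12)], fasta)

demo = (
    "probe_A\tFBtr0000001\t100.000\t60\t0\t0\t1\t60\t120\t179\t1e-26\t111\n"
    "probe_A\tFBtr0000002\t95.000\t60\t3\t0\t1\t60\t300\t359\t1e-20\t93.5\n"
    "probe_B\tFBtr0000003\t100.000\t45\t0\t0\t1\t45\t10\t54\t1e-18\t84.2\n"
)
hits = best_hits(io.StringIO(demo))
print(hits)
```

A probe can match several transcripts of the same gene equally well, so in practice it may be better to keep all perfect hits and collapse them to gene level instead of keeping a single best hit.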

Thanks!

blast microarrays drosophila agilent R • 2.3k views

Ram · Answer 1 · 2015-07-17

8.8 years ago

Antonio R. Franco ★ 5.1k

A question and an answer.

Agilent text data can be read by R packages such as limma. From there you can get the ProbeName used on the array (which is not useful for you) and the SystematicName of each probe. Both are accessible via the arrayname$genes part of the data (which also contains the row, column, type of control, etc.).

The SystematicName column usually contains accession identifiers that may be useful to you. Have you taken a look at it?

If SystematicName is absent or holds no useful information, you can use Blast2GO. There are two versions. The PRO version can be used free for a week or so, and lets you search local databases.

Blast2GO will provide you with the best BLAST hit (for any kind of BLAST), and also with InterPro domains, EC enzyme codes, KEGG and extensive GO data.

Yes. I have the SystematicName table, and it contains a transcript ID (in Ensembl/FlyBase format) for each probe, but my worry is that it's outdated (these arrays were probably designed 10 years ago). So I am trying to build a new annotation, and the only other information I have is ProbeName and Sequence. None of the packages/websites I have tried supports conversion from Agilent ProbeName to Ensembl/FlyBase gene ID. So I am doing it the hard way: aligning the probe sequences and extracting the gene/transcript name from the hits. I just hope there is an easier way.
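To see how outdated the SystematicName column actually is, the old transcript IDs could be compared against the new alignment-derived ones, roughly as sketched below. This assumes the transcript FASTA headers carry a `parent=FBgn...` field giving the gene ID, and that a probe-to-transcript mapping from the alignment step is already available. All IDs shown are made up for the example.

```python
import re


def transcript_to_gene(fasta_lines):
    """Build {transcript_id: gene_id} from FASTA header lines.

    Assumes each header looks like '>FBtr... key=value; parent=FBgn...; ...'.
    """
    mapping = {}
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence lines carry no annotation
        transcript_id = line[1:].split()[0]
        match = re.search(r"parent=(FBgn\d+)", line)
        if match:
            mapping[transcript_id] = match.group(1)
    return mapping


def compare_annotations(old, new_hits, tx2gene):
    """Classify each probe by how its old annotation relates to the new hit.

    old:      {probe: transcript ID from the array's SystematicName column}
    new_hits: {probe: transcript ID from the new alignment}
    Returns {probe: (status, new_gene_id_or_None)}.
    """
    report = {}
    for probe, old_tx in old.items():
        new_tx = new_hits.get(probe)
        new_gene = tx2gene.get(new_tx) if new_tx else None
        if new_tx is None:
            status = "no_hit"
        elif new_tx == old_tx:
            status = "unchanged"
        elif new_gene is not None and new_gene == tx2gene.get(old_tx):
            status = "same_gene"  # different isoform, same gene
        else:
            status = "changed"
        report[probe] = (status, new_gene)
    return report


# Hypothetical headers and probe annotations, for demonstration only.
headers = [
    ">FBtr0000001 type=mRNA; parent=FBgn0000010; species=Dmel;",
    ">FBtr0000002 type=mRNA; parent=FBgn0000010; species=Dmel;",
    ">FBtr0000003 type=mRNA; parent=FBgn0000020; species=Dmel;",
]
tx2gene = transcript_to_gene(headers)
old = {"probe_A": "FBtr0000001", "probe_B": "FBtr0000001",
       "probe_C": "FBtr0000009", "probe_D": "FBtr0000003"}
new_hits = {"probe_A": "FBtr0000001", "probe_B": "FBtr0000002",
            "probe_C": "FBtr0000003"}
report = compare_annotations(old, new_hits, tx2gene)
for probe, result in sorted(report.items()):
    print(probe, *result)
```

If most probes come out as "unchanged" or "same_gene", the vendor annotation may be usable at gene level after all, and only the "changed" and "no_hit" probes need a closer look.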

written 8.8 years ago by yotiao • 0