Visualization Of Custom Agilent In Situ Oligonucleotide Array
13.4 years ago
User 5463 ▴ 10

Hello,

I'm trying to analyze a custom Agilent oligonucleotide array. Via GEO I have access to three files. The first is the original .gpr file. The second is an annotation file that links probe IDs to GB_RANGE, which GEO describes as follows:

GenBank accession range - specifies a particular sequence position within a GenBank accession number. Use format ACCESSION.VERSION[start..end]. Useful for tiling arrays.

The third is an already processed file that links probe IDs to values.

My task is to visualize the chip data in the UCSC Genome Browser (value vs. GB_RANGE), but at the moment I can't get the data into a usable form. The array is a custom type, and the provided annotation file comes in the following format:

ProbeSet
ID    Name    CONTROL_TYPE    SEQUENCE    GB_RANGE    SPOT_ID
5    Hs05041539310905-60    GTTCCCACCCCCAACCCGAACTCACAGCCGGTCTCCTTCTTGATCTCCTCGAGCTCTTCG    NC_000015.8[39310905..39310845]

Sadly, the GB_RANGE is the only way to locate a probe, short of remapping every probe from its sequence. To visualize the data, I would need it in .wig format or, better, in .bed format, something like: chromosome start end value

I could copy and paste the GB_RANGE together with the value from the .gpr into one Excel file, but that file would still not be readable by the UCSC browser. The problem is that I can't extract the chromosome, start and end values from the annotation file. Is there perhaps a standard method for dealing with custom arrays?

If there is not, how can I extract the needed information from the GB_RANGE format, e.g. NC_000015.8[39310905..39310845]?

Thanks,

David

microarray agilent ucsc visualization • 2.6k views

Since you have a custom array, it is unlikely that there is a standard method to transform the data.

13.4 years ago
User 59 13k

Isn't NC_000015 a reference to chromosome 15? If so, I assume the probe maps to bases 39310905-39310845 on that chromosome.
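That reading of the accession can be sketched in a few lines of Python. This is only a minimal sketch under some assumptions that are not confirmed anywhere in the thread: the function name `parse_gb_range` is made up for illustration, only the human RefSeq chromosome accessions NC_000001 to NC_000024 are handled (23 and 24 taken as X and Y), and a descending range is interpreted as a minus-strand probe. Note also that the example range covers 61 positions if both ends are inclusive, while the probe is a 60-mer, so the exact coordinate convention should be checked before converting to BED's 0-based, half-open intervals.

```python
import re

# ACCESSION.VERSION[start..end], e.g. NC_000015.8[39310905..39310845]
GB_RANGE_RE = re.compile(r"^NC_(\d{6})\.\d+\[(\d+)\.\.(\d+)\]$")


def parse_gb_range(gb_range):
    """Split a GB_RANGE string into (chrom, low, high, strand).

    Assumption: NC_000001..NC_000022 are chr1..chr22, NC_000023 is chrX
    and NC_000024 is chrY. Anything else is rejected rather than guessed.
    """
    match = GB_RANGE_RE.match(gb_range.strip())
    if match is None:
        raise ValueError(f"unrecognized GB_RANGE: {gb_range!r}")

    number = int(match.group(1))
    if 1 <= number <= 22:
        chrom = f"chr{number}"
    elif number == 23:
        chrom = "chrX"
    elif number == 24:
        chrom = "chrY"
    else:
        raise ValueError(f"no chromosome known for accession in {gb_range!r}")

    first, second = int(match.group(2)), int(match.group(3))
    # Assumption: a descending range means the probe lies on the minus strand.
    strand = "-" if first > second else "+"
    # Coordinates are returned as given, only reordered so that low <= high.
    return chrom, min(first, second), max(first, second), strand
```

For the probe in the question, `parse_gb_range("NC_000015.8[39310905..39310845]")` gives `("chr15", 39310845, 39310905, "-")`.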

You will need to be careful about which release you're mapping back to: a quick BLAT against the latest release does not place the probe at those positions. It looks like the coordinates refer to hg17 or hg18 rather than the latest release. LiftOver will let you convert between assemblies, assuming you wish to do so.
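If you do go the LiftOver route, the UCSC command-line tool is called as `liftOver oldFile map.chain newFile unMapped`. Below is a rough sketch of wrapping that call from Python. The helper names and all file names are hypothetical, and the chain file has to be whichever one matches your source and target assemblies (downloaded from UCSC). The input is assumed to be a BED file without a track line.

```python
import subprocess


def liftover_command(bed_in, chain_file, bed_out, unmapped_out):
    """Build the argument list: liftOver oldFile map.chain newFile unMapped."""
    return ["liftOver", bed_in, chain_file, bed_out, unmapped_out]


def run_liftover(bed_in, chain_file, bed_out, unmapped_out):
    """Run liftOver; assumes the binary is installed and on the PATH."""
    cmd = liftover_command(bed_in, chain_file, bed_out, unmapped_out)
    # check=True raises CalledProcessError if liftOver exits with an error.
    subprocess.run(cmd, check=True)
```

Probes that cannot be converted end up in the unmapped file, so it is worth looking at how many were dropped before trusting the resulting track.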


Yes, it is hg17, so I would have to do a liftover. But if I try to load a file of the form

GB_RANGE  VALUE
XXXXXXX    YYYYYYY

into UCSC, it is not recognized. I would need it in BED or WIG format. Is there a script that converts NC_000015.8[39310905..39310845] to chr15 39310905 39310845?


No, you'd pretty much have to write that yourself or get someone to do it for you, if you can't find someone who has done something similar already. The formats aren't complicated, but obviously you need to draw the co-ordinate and value data together in one place for display.
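As a starting point for drawing the two together, here is a minimal Python sketch that joins coordinates and values on the probe ID and emits bedGraph lines (chromosome, start, end, value) that can be uploaded as a UCSC custom track. The function name `to_bedgraph_lines` and the two input dictionaries are hypothetical. It assumes the coordinates have already been extracted from GB_RANGE and converted to BED-style 0-based, half-open intervals, and that the values have been read from the processed file.

```python
def to_bedgraph_lines(coords, values, track_name="custom array"):
    """Join probe coordinates and probe values into bedGraph lines.

    coords: dict of probe ID -> (chrom, start, end), BED-style coordinates
    values: dict of probe ID -> numeric value
    Probes present in only one of the two dictionaries are skipped.
    """
    rows = []
    for probe_id, (chrom, start, end) in coords.items():
        if probe_id in values:
            rows.append((chrom, start, end, values[probe_id]))

    # Sort by chromosome name (as text) and then by start position.
    rows.sort(key=lambda row: (row[0], row[1]))

    lines = [f'track type=bedGraph name="{track_name}"']
    for chrom, start, end, value in rows:
        lines.append(f"{chrom}\t{start}\t{end}\t{value}")
    return lines


if __name__ == "__main__":
    # Made-up example data, not taken from the array in question.
    example_coords = {"5": ("chr15", 39310845, 39310905)}
    example_values = {"5": 1.25}
    print("\n".join(to_bedgraph_lines(example_coords, example_values)))
```

bedGraph is used here instead of plain BED because it carries one numeric value per interval, which is what a "value vs position" plot needs. Overlapping probes may have to be resolved first, since bedGraph intervals are not supposed to overlap.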
