Question: Problems Identifying The Right Version Of Chip: HT-HGU133A vs. HT-HGU133A V2
5.8 years ago by
Bethesda, MD
Kenneth Daily50 wrote:

I'm trying to use some cancer cell line data from the paper "Systematic identification of genomic markers of drug sensitivity in cancer cells":

The title says that it is HT-HGU133a V2, but the array accession linked to it says it's the HT-HGU133a. I thought that some probes or the layout had been changed between the U133a and U133a2 chips. I want to be able to add in our cell line data, which was run on the HGU133a V2 platform.

I then found this BioC thread that seemed to say that the annotation to use would be hgu133a2, but that a separate annotation was made for hthgu133a to 'reduce' confusion:

But I'm still confused! When I load the raw CEL files and then normalize them with the different CDFs (hthgu133a vs. hgu133a2), I get completely different results: the expression values for a single sample normalized with the two CDFs have absolutely no correlation. That makes me think the layout is somehow different, even though the probe sets in the resulting ExpressionSets are in the same order and match exactly. Can anyone offer a suggestion, or is there a way that I can compare the layout of the chips?
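One way to make the "no correlation" check explicit is to match the two normalized vectors by probe set ID rather than by row position before correlating them. The following is only a minimal Python sketch under that assumption; the function name `pearson_by_probe` and the probe set IDs in the usage lines are hypothetical and are not taken from either CDF.

```python
from math import sqrt


def pearson_by_probe(a: dict, b: dict) -> float:
    """Pearson correlation between two expression vectors.

    Each argument maps probe set ID -> normalized expression value.
    Values are paired by ID, so row order in the inputs does not matter.
    """
    shared = sorted(set(a) & set(b))
    if len(shared) < 2:
        raise ValueError("need at least two shared probe set IDs")
    xs = [a[p] for p in shared]
    ys = [b[p] for p in shared]
    n = len(shared)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("one of the vectors is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical values for one sample normalized with two different CDFs.
with_cdf_one = {"probeset_1": 7.1, "probeset_2": 9.4, "probeset_3": 5.0}
with_cdf_two = {"probeset_3": 5.2, "probeset_1": 7.0, "probeset_2": 9.1}
r = pearson_by_probe(with_cdf_one, with_cdf_two)
```

If the correlation stays near zero even with IDs matched this way, the probe set names agree but the probe-to-cell mapping behind them probably does not, which would point to a wrong CDF for these CEL files.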

I also tried this in Partek; it failed when I manually changed the annotation to HGU133a V2 (it said that the dimensions were incorrect).
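A dimension error like this can be checked directly, because a CEL file records its own grid size. The sketch below assumes the files are in the version 4 binary CEL format, which starts with six little-endian 32-bit integers (magic number 64, version 4, columns, rows, cell count, header length) followed by a text header that typically names the chip type. Text (version 3) and Command Console CEL files are laid out differently and are not handled here. The function name `read_cel_v4_header` is hypothetical.

```python
import struct


def read_cel_v4_header(data: bytes) -> dict:
    """Parse the leading fields of a version 4 (binary) CEL file.

    `data` is the raw file content (or at least its first few kilobytes).
    """
    if len(data) < 24:
        raise ValueError("too short to be a version 4 CEL file")
    magic, version, cols, rows, cells, header_len = struct.unpack_from("<6i", data, 0)
    if magic != 64 or version != 4:
        raise ValueError("not a version 4 binary CEL file")
    header_text = data[24:24 + header_len].decode("latin-1")
    return {"cols": cols, "rows": rows, "cells": cells, "header": header_text}


def same_grid(first: dict, second: dict) -> bool:
    """True if two parsed headers describe the same number of rows and columns."""
    return (first["cols"], first["rows"]) == (second["cols"], second["rows"])
```

Usage would be along the lines of `read_cel_v4_header(open(path, "rb").read())` for one CEL file from each data set. If the two grids differ, the arrays cannot share a CDF, whatever the annotation package is called.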

microarray affymetrix • 1.6k views
5.8 years ago by
Duarte, CA
Charles Warden6.4k wrote:

If you are using Partek, it should automatically recognize the type of .CEL file (and, most likely, download the appropriate annotation file if it is not already in your library folder). I would not recommend manually changing the annotation type if Partek is able to recognize it automatically.

Just to be clear, these are HT-HGU133A arrays (not the regular HGU133A arrays). These are often used for large genome projects (like TCGA, before they switched over to RNA-Seq). I think the reason is that you can process more arrays at the same time. I think the probes are essentially the same, but the layout may be different. I'm not sure what the differences are (if any) for the V2 HT-arrays. For example, you can't specifically order a V2 HT-array:

So, I would probably just use Partek (with the automatically recognized annotation file). If you are really worried, there should also be processed expression values in the ArrayExpress project, so you don't have to do your own mapping; you just have to trust that proper normalization was performed.

