I am working on the NCI-60 database for data-mining purposes. In NCBI I found miRNA expression profiles (link). For the mRNA of each cell line, however, I found two kinds of expression profile arrays in NCBI (link), categorized as Affymetrix HG-U133A and HG-U133B. The two also share many mRNA probes and their expression values. I can't work out which one to use, or how. What is the difference?
This PDF from Affymetrix clarifies the difference between the HG-U133A, HG-U133B and HG-U133 Plus arrays. As for using the datasets, look at this question: Combining Two Platforms Affy Hgu133A And Hgu133B. It seems the 60 samples were analyzed on both chips.
Copied from the Affymetrix website:
The HG-U133A Array includes representation of the RefSeq database sequences and probe sets related to sequences previously represented on the Human Genome U95Av2 Array. The HG-U133B Array contains primarily probe sets representing EST clusters.
One big difference (for studies about cancer) is that the HG-U133B array appears to include the gene for PD-L1 (annotated as CD274), whereas this gene appears to be absent from the annotation for the HG-U133A array. I may simply not have found the right annotation, since this gene can go by multiple names. When the older data sets were produced, PD-L1 was not yet recognised as important. The related gene PD-1 appears to be on both arrays.
On the other hand, the HG-U133B annotation file seems to contain far fewer lines, so the array may cover fewer genes. I suggest you check for known proteins of interest, to make sure that no important ones are missing, in case you want to compare known actors with newly found ones.
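A check like this can be scripted. Below is a minimal Python sketch, not a tested pipeline: it assumes each annotation file is a CSV with a "Gene Symbol" column in which several symbols may be separated by "///" and missing values are written "---" (verify this against the files you actually download, and strip any leading comment lines first). The function names `load_symbols` and `coverage` are made up for illustration, and the two tiny tables are invented toy data, not real Affymetrix probe sets. Searching under several aliases (for PD-L1: CD274, PDL1, B7-H1, PDCD1LG1) guards against missing a gene because it is annotated under a different name.

```python
import csv
import io


def load_symbols(handle, symbol_column="Gene Symbol"):
    """Collect the set of gene symbols listed in an annotation table.

    Assumption: the table is CSV with a header row, and a cell may hold
    several symbols separated by '///' or the placeholder '---'.
    """
    symbols = set()
    for row in csv.DictReader(handle):
        for sym in (row.get(symbol_column) or "").split("///"):
            sym = sym.strip()
            if sym and sym != "---":
                symbols.add(sym.upper())
    return symbols


def coverage(genes, arrays):
    """Map each gene of interest to the arrays whose annotation mentions
    at least one of its aliases."""
    report = {}
    for name, aliases in genes.items():
        wanted = {alias.upper() for alias in aliases}
        report[name] = sorted(
            array for array, symbols in arrays.items() if wanted & symbols
        )
    return report


# Invented toy tables standing in for the real annotation files.
toy_a = io.StringIO(
    "Probe Set ID,Gene Symbol\n"
    "toy_a_1,PDCD1\n"
    "toy_a_2,TP53\n"
    "toy_a_3,---\n"
)
toy_b = io.StringIO(
    "Probe Set ID,Gene Symbol\n"
    "toy_b_1,CD274\n"
    "toy_b_2,PDCD1 /// TP53\n"
)

arrays = {
    "HG-U133A": load_symbols(toy_a),
    "HG-U133B": load_symbols(toy_b),
}

# Genes of interest, each with the aliases it might be annotated under.
genes = {
    "PD-L1": ["CD274", "PDL1", "B7-H1", "PDCD1LG1"],
    "PD-1": ["PDCD1", "PD1"],
}

report = coverage(genes, arrays)
for gene, found_on in report.items():
    print(gene, "->", ", ".join(found_on) if found_on else "not found")
```

With the real files, replace the `io.StringIO` objects with `open(path, newline="")` handles. A gene reported as "not found" on both arrays is worth a manual search before you conclude that it is really absent.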