Hi all,
I'm trying to use R/Bioconductor (and Bioconductor from Python) to analyze raw data (CEL.gz files) from GEO, because the sort.txt.data gene/count matrix doesn't include the genes I'm interested in. I figured I need to process the raw data myself in case the txt file is missing those genes. However, the code I followed only got me as far as quality control of the samples. Can anyone suggest where to download the gene annotation for my samples, and recommend some good tools, pipelines, or papers to follow for analyzing GEO raw data? I'm pretty new to GEO analysis. Thank you so much!
Best, Amanda
Can you post the GEO accession here? Whatever microarray platform was used, you can look up the probe annotations to see whether any probes target your gene of interest. If it's an old platform, your gene may not be covered (or may be listed under an old name).
The platform is Affymetrix GeneChip HT-HG_U133A Early Access Array. I went to the Affymetrix site at the following link: http://www.affymetrix.com/support/technical/byproduct.affx?product=huexon-st and looked under "Archived NetAffx Annotation Files", but there are a lot of options. Which one should I download as the probe annotation? And in the file, which column corresponds to the gene ID or gene name? Is there a tutorial I can follow? Thank you!
You can access the annotation file directly through GEO using the platform accession (the GPL number listed on the series page). Download the file there and search for your gene of interest to see if it is included.
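If it helps, here is a minimal Python sketch of the "search for your gene" step. It assumes you have downloaded the platform table as a tab-delimited text file, that metadata lines start with `#`, `!` or `^`, and that the columns are named `ID` and `Gene Symbol` with multiple symbols separated by `///`. Those column names are assumptions on my part, so check the header of your own file and adjust. The function name `find_probes` and the tiny table below are made up for illustration only.

```python
import csv
import io


def find_probes(handle, gene, id_column="ID", symbol_column="Gene Symbol"):
    """Return probe IDs whose symbol column lists `gene` (case-insensitive).

    Column names are assumptions; check the header of your annotation file.
    """
    # Skip blank lines and metadata lines starting with '#', '!' or '^'.
    rows = (
        line for line in handle
        if line.strip() and not line.startswith(("#", "!", "^"))
    )
    reader = csv.DictReader(rows, delimiter="\t")
    wanted = gene.upper()
    hits = []
    for row in reader:
        # A probe set may map to several genes, assumed to be '///'-separated.
        symbols = (row.get(symbol_column) or "").split("///")
        if wanted in (s.strip().upper() for s in symbols):
            hits.append(row[id_column])
    return hits


# Illustrative stand-in for a downloaded annotation table (not real data).
sample = (
    "#ID = probe set identifier\n"
    "ID\tGene Symbol\n"
    "1007_s_at\tDDR1 /// MIR4640\n"
    "1053_at\tRFC2\n"
)
print(find_probes(io.StringIO(sample), "ddr1"))
```

For a real file you would pass an open file handle instead of the `io.StringIO` object, for example `with open("platform_table.txt") as fh: find_probes(fh, "YOUR_GENE")`, where the file name is a placeholder. If this returns an empty list, also try older aliases of the gene name before concluding the array does not cover it.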