Question: Getting a Count Matrix from GEO for RNA-seq
4.2 years ago by
ahnje7700 wrote:


Could anyone help me get a count matrix from processed RNA-seq data downloaded from GEO? I have some experience using microarray data from GEO.

For example, for GSE78220, the data processing is described as: "FASTQ files were mapped by Tophat2. Tophat BAMs were quantified and normalized using Cuffnorm (for gene analysis) and by htseq-count followed by edgeR's log CPM (for gene-set analysis). Genome_build: hg19. Supplementary_files_format_and_content: The normalized expression levels by cuffnorm."

After getting the GSE with:

library(GEOquery)
g <- getGEO("GSE78220")

What would be the next step?

4.2 years ago by
GZ1995380 wrote:

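I've recently been working on reanalyzing this data set. The expression data for this series are provided only as a supplementary file, so they cannot be accessed directly with exprs(). Download the supplementary file "GSE78220_PatientFPKM.xlsx" into your working directory, save it as CSV ("GSE78220_PatientFPKM.csv"), and read it in. Note that these are Cuffnorm-normalized FPKM values, not raw counts.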

# Use the first column of the CSV as the row names
assay <- read.csv("GSE78220_PatientFPKM.csv", row.names = 1)
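
If you would rather skip the manual download and CSV export, here is a rough sketch of doing it from R. It assumes the GEOquery and readxl packages are installed and that the first column of the spreadsheet holds the gene identifiers (I have not verified the layout here, so check it after reading). The helper names fetch_supp_files and table_to_matrix are made up for illustration.

```r
# Sketch only: assumes GEOquery and readxl are installed, and that the first
# column of the table holds gene identifiers (an assumption, check your file).

# Download the supplementary files of a series and return their local paths.
fetch_supp_files <- function(gse = "GSE78220", dir = getwd()) {
  # Files are saved under <dir>/<gse>/; row names of the result are the paths.
  info <- GEOquery::getGEOSuppFiles(gse, baseDir = dir)
  rownames(info)
}

# Turn a table whose first column is a gene identifier into a numeric matrix.
table_to_matrix <- function(tab) {
  tab <- as.data.frame(tab)
  mat <- as.matrix(tab[, -1, drop = FALSE])
  storage.mode(mat) <- "double"
  rownames(mat) <- as.character(tab[[1]])
  mat
}

# Hypothetical usage (needs network access):
# paths <- fetch_supp_files("GSE78220")
# xlsx  <- grep("PatientFPKM\\.xlsx$", paths, value = TRUE)
# assay <- table_to_matrix(readxl::read_excel(xlsx))
```

Keeping the values in a plain numeric matrix with genes as row names makes it easier to pass them on to downstream tools, but remember that the result is still FPKM rather than counts.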

For phenotypic data,

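library(GEOquery)  # also attaches Biobase, which provides pData()
# Download the series matrix into the working directory
e <- getGEO("GSE78220", destdir = ".")
e <- e[[1]]        # take the first (here the only) ExpressionSet in the list
pheno <- pData(e)  # sample-level metadata, one row per GSM sample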
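
Before any analysis, the sample columns of the expression table have to be lined up with the rows of pheno. A minimal sketch of that step follows. It assumes some pheno column (I use "title" as a guess; look at colnames(pheno) to find the right one) carries the same sample labels as the column names of the expression table. The function name align_samples is hypothetical.

```r
# Sketch only: `label_col` is an assumed pheno column whose values match the
# column names of the expression table; inspect colnames(pheno) to find it.
align_samples <- function(assay, pheno, label_col = "title") {
  labels <- as.character(pheno[[label_col]])
  idx  <- match(colnames(assay), labels)  # pheno row for each assay column
  keep <- !is.na(idx)                     # drop samples without metadata
  list(
    assay = assay[, keep, drop = FALSE],
    pheno = pheno[idx[keep], , drop = FALSE]
  )
}

# Hypothetical usage with the assay and pheno objects from this answer:
# aligned <- align_samples(assay, pheno, label_col = "title")
# stopifnot(ncol(aligned$assay) == nrow(aligned$pheno))
```

Note that read.csv() may alter column names (check.names = TRUE by default), so compare colnames(assay) with the pheno labels by eye before trusting the match.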
