Question: Retrieving data from NCBI GEO and RNA-Seq Data Analysis
hkarakurt80 wrote:

Hello, I am new to RNA-Seq data analysis and want to run some analyses, such as finding differentially expressed genes. My data set is from NCBI GEO, accession GSE80336. The link is here:

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE80336

I see that the data is not normalized. How can I download the data from GEO with R, normalize it, and analyze it? Is there a good pipeline for that?

Also, there is a file called "Counts.txt" among the supplementary files. What is this file, and can I use it?

Thank you.

count rna-seq normalization R geo • 4.1k views
modified 2.1 years ago by theobroma221.1k • written 2.1 years ago by hkarakurt80

Not sure about downloading data with R, but you can download the raw sequence reads with the fastq-dump command from the SRA Toolkit. Have a read of the following workflow for analysing RNA-seq data with R and Bioconductor: https://f1000research.com/articles/4-1070/v2

written 2.1 years ago by James Ashmore2.7k
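
If you would rather drive fastq-dump from inside R, one option is to shell out with system2(). This is only a rough sketch: it assumes the SRA Toolkit is installed and on your PATH, the helper names are made up for illustration, and the run accession is a placeholder you would replace with the real SRR IDs listed under the study.

```r
# Hypothetical helpers (not part of any package) that wrap the fastq-dump CLI.

# Build the argument vector for one SRA run.
fastq_dump_args <- function(run, outdir = "fastq") {
  c("--split-files",     # write each read of a spot to its own file
    "--gzip",            # compress the FASTQ output
    "--outdir", outdir,  # where the FASTQ files go
    run)
}

# Download one run; stops with an error if fastq-dump exits non-zero.
download_run <- function(run, outdir = "fastq") {
  dir.create(outdir, showWarnings = FALSE, recursive = TRUE)
  status <- system2("fastq-dump", args = fastq_dump_args(run, outdir))
  if (status != 0) stop("fastq-dump failed for ", run)
  invisible(outdir)
}

# Usage (placeholder accession, not a real run from this study):
# download_run("SRRXXXXXXX")
```

Raw reads still need alignment and counting before any differential expression analysis, so this route is much heavier than starting from a ready-made count table.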

This is RNA-Seq data and is quite big. You need to download the data manually from ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByStudy/sra/SRP/SRP073/SRP073382

written 2.1 years ago by Karma210

Count file: this contains the FPKM or RPKM values. You can use it for the analysis.

written 2.1 years ago by Karma210

Hi snijesh,

Thanks for contributing! Note that you can also edit your post to add additional information.

Cheers,
Wouter

written 2.1 years ago by WouterDeCoster41k
theobroma221.1k wrote:

You can use the Bioconductor GEOquery package to retrieve and download datasets and platforms in R. RNA-seq data can be normalized in a few different ways, so check out the Bioconductor limma package. The counts file is presumably just that: the number of reads counted for each gene. Of course you can use it, but whether you should use it for your analysis is a different question.

written 2.1 years ago by theobroma221.1k
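
To make the GEOquery suggestion concrete, here is a minimal sketch of fetching the supplementary files and normalizing a count table. It assumes GEOquery and edgeR are installed, that the counts file is tab-delimited with gene IDs in the first column and only numeric sample columns after it, and that its name contains "Counts". None of that has been checked against the actual GSE80336 files, and the three helper names are made up for illustration.

```r
# Download all supplementary files of a GEO series and return their local paths.
fetch_supp_files <- function(gse_id, dest = tempdir()) {
  # getGEOSuppFiles() returns a data frame whose row names are the file paths.
  files <- GEOquery::getGEOSuppFiles(gse_id, baseDir = dest)
  rownames(files)
}

# Read a count table into a matrix (genes in rows, samples in columns).
# Assumption: tab-delimited, header row, unique gene IDs in the first column.
read_counts <- function(path) {
  tab <- read.delim(path, row.names = 1, check.names = FALSE)
  as.matrix(tab)
}

# TMM-normalize raw counts with edgeR and return log2 counts per million.
normalize_counts <- function(counts) {
  dge <- edgeR::DGEList(counts = counts)
  dge <- edgeR::calcNormFactors(dge, method = "TMM")
  edgeR::cpm(dge, log = TRUE)
}

# Usage (needs network access; the "Counts" pattern is a guess at the file name):
# paths  <- fetch_supp_files("GSE80336")
# counts <- read_counts(grep("Counts", paths, value = TRUE)[1])
# logcpm <- normalize_counts(counts)
```

TMM only makes sense if the table really holds raw counts; if it turns out to hold FPKM or RPKM values, this step should be skipped. For differential expression, limma's voom() is another common choice for raw counts once you have a design matrix for your groups.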

I tried to download it with the getGEO() function, but the expression matrix is empty (I checked it with exprs()). I am not sure how I can download the non-normalized data.

written 2.1 years ago by hkarakurt80

Can you post all of your code, not just the functions you used? This will tell me which files you are trying to get from GSE80336 and why the matrix is empty. Also, post any errors you get.

written 2.1 years ago by theobroma221.1k
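
One likely explanation for the empty matrix, offered as a guess rather than a diagnosis: for sequencing submissions, the GEO series matrix usually carries only the sample annotations, while the expression values are deposited as supplementary files. A quick way to check what getGEO() actually returned might look like the sketch below. It assumes GEOquery and Biobase are installed, and inspect_series is a made-up helper name.

```r
# Report how many expression rows a GEO series matrix has and return its
# sample annotation table.
inspect_series <- function(gse_id) {
  # With GSEMatrix = TRUE, getGEO() returns a list of ExpressionSet objects,
  # one per platform in the series.
  esets <- GEOquery::getGEO(gse_id, GSEMatrix = TRUE)
  eset <- esets[[1]]
  list(
    n_features = nrow(Biobase::exprs(eset)),  # 0 means no expression values
    samples    = Biobase::pData(eset)         # per-sample metadata
  )
}

# Usage (needs network access):
# info <- inspect_series("GSE80336")
# info$n_features
# head(info$samples)
```

Even when the expression matrix is empty, the sample table is still useful: it can supply the group labels you need to build a design matrix for the counts.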