Question: Retrieving data from NCBI GEO and RNA-Seq Data Analysis
20 months ago, hkarakurt (50) wrote:

Hello, I am new to RNA-Seq data analysis and I want to analyze a data set, for example to find differentially expressed genes. The data set is from NCBI GEO, under accession GSE80336. The link is here:

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE80336

I see that the data is not normalized. How can I download the data from GEO with R, normalize it, and analyze it? Is there a good pipeline for that?

There is also a file called "Counts.txt" among the supplementary files. What exactly is this file, and can I use it?

Thank you.

Tags: count, rna-seq, normalization, R, geo
modified 20 months ago by theobroma22 (1.1k) • written 20 months ago by hkarakurt (50)

Not sure about downloading data with R, but you can download the raw sequence reads with the fastq-dump command from the SRA Toolkit. Have a read of the following workflow for analysing RNA-seq data with R and Bioconductor: https://f1000research.com/articles/4-1070/v2

written 20 months ago by James Ashmore (2.6k)

This is RNA-Seq data, so the raw files are quite big. You need to download them manually from the SRA FTP site: ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByStudy/sra/SRP/SRP073/SRP073382

written 20 months ago by Karma (200)

The count file contains the FPKM or RPKM values. You can use it for the analysis.
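
Whether a supplementary table holds raw counts or FPKM/RPKM values is easy to check before deciding how to use it. Below is a minimal sketch of such a check in base R, using toy matrices rather than the real file; the helper name `looks_like_raw_counts` and the toy values are illustrative assumptions, not anything taken from GSE80336.

```r
# Heuristic: raw read counts are non-negative whole numbers,
# whereas FPKM/RPKM values are usually fractional.
looks_like_raw_counts <- function(mat) {
  vals <- as.matrix(mat)
  vals <- vals[!is.na(vals)]
  all(vals >= 0) && all(vals == round(vals))
}

# Toy matrices standing in for a real table (illustrative values only)
raw_toy <- matrix(c(0, 12, 305, 7, 0, 48), nrow = 3,
                  dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))
fpkm_toy <- raw_toy / c(1.5, 2.3, 0.7)   # made fractional on purpose

looks_like_raw_counts(raw_toy)    # TRUE
looks_like_raw_counts(fpkm_toy)   # FALSE

# With the real file (ASSUMPTION: tab-delimited, gene IDs in the first column):
# counts <- read.delim("Counts.txt", row.names = 1, check.names = FALSE)
# looks_like_raw_counts(counts)
```

The distinction matters because count-based tools such as edgeR and limma-voom expect raw counts as input, not values that were already length-normalized.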

written 20 months ago by Karma (200)

Hi snijesh,

Thanks for contributing! Note that you can also edit your post to add additional information.

Cheers,
Wouter

written 20 months ago by WouterDeCoster (38k)

20 months ago, theobroma22 (1.1k) wrote:

You can use the Bioconductor GEOquery package to retrieve and download datasets and platforms in R. RNA-seq data can be normalized in a few different ways, so check out the Bioconductor limma package. The counts file is presumably just that: the number of reads counted for each gene. Of course you can use it, but whether you should use it for your analysis is a different question.
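
The suggestion above can be sketched roughly as follows. This is a hedged outline, not a definitive pipeline: the helper names `fetch_geo_supp` and `voom_de` are hypothetical, the layout of the supplementary file is an assumption, and the sample grouping has to come from the real sample annotation. Package functions are called with `::`, so GEOquery, edgeR and limma are only needed when the helpers are actually run.

```r
# Download the supplementary files of a GEO series and return their local paths.
# Uses GEOquery::getGEOSuppFiles(); files land in <dest>/<accession>/.
fetch_geo_supp <- function(accession, dest = tempdir()) {
  supp <- GEOquery::getGEOSuppFiles(accession, baseDir = dest)
  rownames(supp)   # row names of the returned data frame are the file paths
}

# Two-group differential expression on a raw count matrix (genes x samples)
# with TMM normalization (edgeR) followed by limma-voom.
voom_de <- function(counts, group) {
  group <- factor(group)
  dge <- edgeR::DGEList(counts = counts, group = group)
  keep <- edgeR::filterByExpr(dge)               # drop genes with too few reads
  dge <- dge[keep, , keep.lib.sizes = FALSE]
  dge <- edgeR::calcNormFactors(dge)             # TMM normalization factors
  design <- stats::model.matrix(~ group)
  v <- limma::voom(dge, design)                  # log2-CPM plus precision weights
  fit <- limma::eBayes(limma::lmFit(v, design))
  limma::topTable(fit, coef = 2, number = Inf)   # second group level vs the first
}

# Possible usage (needs network access; file layout is an ASSUMPTION):
# files  <- fetch_geo_supp("GSE80336")
# counts <- as.matrix(read.delim(files[1], row.names = 1, check.names = FALSE))
# res    <- voom_de(counts, group = my_groups)  # my_groups: hypothetical vector, one label per sample
```

TMM plus voom is only one of several reasonable normalization routes; the limma and edgeR user guides describe the alternatives and the design matrices needed for more than two groups.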

written 20 months ago by theobroma22 (1.1k)

I tried to download it with the getGEO() function, but the expression matrix is empty (I extracted it with exprs()). I am not sure how to download the non-normalized data.
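
A likely explanation, offered as an assumption rather than something confirmed in this thread: for sequencing-based series, the series matrix that getGEO() parses usually carries only the sample annotation, so exprs() returns a matrix with zero rows, and the count data has to be taken from the supplementary files instead. The sketch below shows one way to check this; the helper name `inspect_series` is hypothetical, and it needs the GEOquery and Biobase packages plus network access when called.

```r
# Fetch the series matrix and report what it actually contains.
inspect_series <- function(accession) {
  gse  <- GEOquery::getGEO(accession, GSEMatrix = TRUE)  # list of ExpressionSets
  eset <- gse[[1]]
  list(
    n_features = nrow(Biobase::exprs(eset)),  # 0 if the matrix holds no expression values
    samples    = Biobase::pData(eset)         # sample annotation, useful for the design
  )
}

# Possible usage:
# info <- inspect_series("GSE80336")
# info$n_features       # 0 would match the empty exprs() result described above
# head(info$samples)    # per-sample metadata for building the group factor
```

Even when the expression matrix is empty, the sample table is still worth keeping, because it is the natural source for the group labels used in a differential expression design.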

written 20 months ago by hkarakurt (50)

Can you post all of your code, not just the functions you used? That will tell me which files you are trying to get from GSE80336 and why the result is empty. Please also post any error messages you get.

written 20 months ago by theobroma22 (1.1k)