Question: RNA-Seq Data analysis
sujitsilas (0) wrote 6 weeks ago:

Hello everyone, I tried comparing two gene expression datasets in .tsv format (reads quantified with Salmon) that I obtained from the refine.bio website. I was able to annotate the data in Galaxy with a GTF file from Ensembl, but when I tried to run edgeR or limma, an error occurred. Is there another way to compare the datasets and build a volcano plot showing the differentially expressed genes between two samples?

RefineBio link to data: https://www.refine.bio/experiments/SRP117268/transcriptome-analysis-reveals-determinant-stages-controlling-human-embryonic-stem-cell-commitment-to-neuronal-cells

The expression files for each day (day0-22) are available. My goal is to compare day 0 with days 2, 4, and 6. I want to make volcano plots for each comparison and find the differentially expressed genes.

Thank you in advance!

modified 6 weeks ago by dsull (1.8k) • written 6 weeks ago by sujitsilas (0)

You can't use edgeR directly on data quantified with Salmon, because Salmon generates abundance estimates (usually real-valued), whereas edgeR requires integer read counts (not the same thing). However, I see there's now a catchSalmon function in edgeR for reading Salmon output... did you try that?
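To make the distinction concrete, here is a minimal Python sketch (not edgeR itself) that checks whether an expression column holds whole-number counts or fractional abundance estimates. The table contents, the gene labels, and the column names `Gene` and `day0` are all invented for illustration; real refine.bio or Salmon files may be laid out differently.

```python
import csv
import io

# Hypothetical two-column table; gene labels and values are invented.
EXAMPLE_TSV = "Gene\tday0\nENSG_A\t12.37\nENSG_B\t0.0\nENSG_C\t845.902\n"


def column_is_integer_counts(tsv_text, column):
    """Return True if every value in `column` is a non-negative whole number."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    values = [float(row[column]) for row in reader]
    return all(v >= 0 and v.is_integer() for v in values)


if __name__ == "__main__":
    # Fractional values suggest abundance estimates rather than raw counts.
    print(column_is_integer_counts(EXAMPLE_TSV, "day0"))
```

A check like this only tells you what kind of numbers are in the file; it does not convert abundance estimates into counts.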

written 6 weeks ago by seidel (7.4k)

Thank you! I'll look into that. I also have a .csv file with FPKM data from the same experiment. Can the FPKM data be used in any way to find differentially expressed genes between the two datasets?

written 6 weeks ago by sujitsilas (0)

Unfortunately, no. The problem is that there are no replicates, so there is really nothing that will let you run a differential gene expression analysis and obtain p-values.

The most you can do is some exploratory analysis: e.g. you can make a heatmap of some genes you're interested in across the different days, you can look at the FPKM values and see which genes are more highly expressed than others within a given sample, etc. But you can't do a differential gene expression analysis between two conditions (i.e. you won't be able to claim "these are the genes that are statistically significantly differentially expressed between day 2 and day 0").
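One way to sketch the kind of exploratory comparison described above is to compute a log2 fold change per gene between two days and rank genes by its magnitude. This is a minimal Python sketch under stated assumptions: the gene names and FPKM values are invented, and the pseudocount of 1 is just a common convention for avoiding division by zero, not something taken from this dataset. The result is purely descriptive and carries no p-values.

```python
import math

# Hypothetical FPKM values for a few placeholder genes (not real data).
fpkm_day0 = {"geneA": 10.0, "geneB": 0.0, "geneC": 50.0}
fpkm_day2 = {"geneA": 43.0, "geneB": 7.0, "geneC": 50.0}


def log2_fold_changes(baseline, other, pseudocount=1.0):
    """log2((other + pseudocount) / (baseline + pseudocount)) for shared genes."""
    shared = baseline.keys() & other.keys()
    return {
        gene: math.log2((other[gene] + pseudocount) / (baseline[gene] + pseudocount))
        for gene in shared
    }


def rank_by_abs_change(lfc):
    """Sort genes by absolute log2 fold change, largest first."""
    return sorted(lfc.items(), key=lambda item: abs(item[1]), reverse=True)


if __name__ == "__main__":
    changes = log2_fold_changes(fpkm_day0, fpkm_day2)
    for gene, value in rank_by_abs_change(changes):
        print(f"{gene}\t{value:+.2f}")
```

A ranking like this can suggest genes to look at more closely, but with one sample per day it cannot say which differences are statistically significant.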

written 6 weeks ago by dsull (1.8k)

Thank you for your insight!

written 6 weeks ago by sujitsilas (0)
dsull (1.8k, UCLA) wrote 6 weeks ago:

You can't make volcano plots, nor can you perform a statistical differential gene expression analysis, because there are no replicates! There is only one day 0 sample, only one day 2 sample, and so on.
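As a rough illustration of why replicates matter: significance tests weigh the difference between groups against the variability within each group, and that variability cannot be estimated from a single sample. The Python sketch below uses invented expression values and a hypothetical helper name; tools such as edgeR use more elaborate models, but they face the same need for a variability estimate.

```python
import statistics

# Hypothetical expression values for one gene (invented numbers).
day0_samples = [12.4]                   # one sample per day, as described above
replicated_samples = [12.4, 9.8, 11.1]  # what replicates would look like


def sample_variance_or_none(values):
    """Return the sample variance, or None when it cannot be estimated."""
    try:
        return statistics.variance(values)
    except statistics.StatisticsError:
        # Raised when fewer than two data points are supplied.
        return None


if __name__ == "__main__":
    print(sample_variance_or_none(day0_samples))
    print(sample_variance_or_none(replicated_samples))
```

With a single value per day the helper returns `None`, which is the situation here: there is no within-group spread to test against.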

written 6 weeks ago by dsull (1.8k)

But is there a way to look at the differences in gene expression between the two days? I also have a .csv file with FPKM data from the same experiment. Can the FPKM data be used in any way?

written 6 weeks ago by sujitsilas (0)