RNA-Seq Data analysis
3.2 years ago

Hello everyone, I tried comparing two gene expression datasets in .tsv format (reads quantified using Salmon) that I obtained from the RefineBio website. I was able to annotate the data in Galaxy using a GTF file from Ensembl, but when I tried using edgeR or limma, an error occurred. Is there another way to compare the datasets and build a volcano plot showing the differentially expressed genes between two samples?

RefineBio link to data: https://www.refine.bio/experiments/SRP117268/transcriptome-analysis-reveals-determinant-stages-controlling-human-embryonic-stem-cell-commitment-to-neuronal-cells

Expression files are available for each day (day 0 to day 22). My goal is to compare day 0 with days 2, 4, and 6. I want to make a volcano plot for each comparison and find the differentially expressed genes.

Thank you in advance!

RNA-Seq FPKM .tsv Salmon volcano plots • 1.2k views

You can't use edgeR directly on the Salmon output, because Salmon produces abundance estimates (usually non-integer values), whereas edgeR expects integer read counts, which are not the same thing. However, I see edgeR now has a catchSalmon function for reading Salmon output. Did you try that?

Thank you! I'll look into that. I also have a .csv file of the same experiment with FPKM data. Can the FPKM data be used in any way to find differentially expressed genes between the two datasets?

Unfortunately, no. The problem is that there are no replicates, and without them there is no way to run a differential gene expression analysis and obtain p-values.

The most you can do is some exploratory analysis. For example, you can make a heatmap of some genes you're interested in across the different days, or look at the FPKM values and see which genes are more highly expressed than others within a given sample. But you can't do a differential gene expression analysis between two conditions (i.e. you won't be able to claim "these are the genes that are statistically significantly differentially expressed between day 2 and day 0").
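As a rough illustration of that kind of exploratory look, here is a minimal Python sketch that ranks genes by log2 fold change between two single samples. The gene values, the `log2_fold_changes` helper, the pseudocount of 1, and the minimum-expression cutoff are all assumptions made up for this example, not values from the RefineBio files. The result is purely descriptive: it gives effect sizes only, with no p-values and no claim of significance.

```python
import math


def log2_fold_changes(baseline, other, pseudocount=1.0, min_expr=1.0):
    """Return {gene: log2((other + pc) / (baseline + pc))} for shared genes.

    Descriptive only: with one sample per day there is no variance
    estimate, so no p-values can be attached to these numbers.
    """
    result = {}
    for gene, base_value in baseline.items():
        if gene not in other:
            continue  # only compare genes quantified in both samples
        other_value = other[gene]
        # Skip genes that are essentially unexpressed in both samples,
        # since their ratios are dominated by noise.
        if max(base_value, other_value) < min_expr:
            continue
        ratio = (other_value + pseudocount) / (base_value + pseudocount)
        result[gene] = math.log2(ratio)
    return result


# Hypothetical FPKM values for illustration only.
day0 = {"POU5F1": 127.0, "PAX6": 0.0, "GAPDH": 511.0, "LOWGENE": 0.2}
day2 = {"POU5F1": 63.0, "PAX6": 7.0, "GAPDH": 511.0, "LOWGENE": 0.1}

lfc = log2_fold_changes(day0, day2)

# Rank genes by the size of the change, largest first.
ranked = sorted(lfc.items(), key=lambda item: abs(item[1]), reverse=True)
for gene, value in ranked:
    print(f"{gene}\t{value:+.2f}")
```

The pseudocount keeps genes with zero expression in one sample from producing infinite ratios, at the cost of shrinking fold changes for lowly expressed genes.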

Thank you for your insight!

3.2 years ago
dsull ★ 5.8k

You can't make volcano plots, nor can you perform a statistical differential gene expression analysis, because there are no replicates! There is only one day 0 sample, only one day 2 sample, etc.
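To make the replicate issue concrete, here is a small Python sketch using only the standard library. A volcano plot puts a significance measure on the y-axis, and any test behind it needs a within-group variance, which is undefined for a single observation. The expression values and the `sample_variance_or_none` helper are hypothetical, chosen only to illustrate the point.

```python
import statistics


def sample_variance_or_none(values):
    """Return the sample variance, or None when it cannot be estimated."""
    try:
        return statistics.variance(values)
    except statistics.StatisticsError:
        # statistics.variance needs at least two data points.
        return None


# Hypothetical expression values for one gene.
single_sample = [10.2]                # one day 0 sample, as in this dataset
with_replicates = [10.2, 11.0, 9.6]   # what three replicates would look like

print(sample_variance_or_none(single_sample))    # no estimate possible
print(sample_variance_or_none(with_replicates))  # a usable variance estimate
```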

But is there a way to look at the differences in gene expression between the two days? I also have a .csv file of the same experiment with FPKM data. Can the FPKM data be used in any way?
