Question: Heatmap: edgeR counts and DEG
RiNG10 wrote:

I have used edgeR in Galaxy to perform differential expression analysis. As output I have 3 files with a list of differentially expressed genes (comparison between 3 different groups of samples) with log(FC), log(CPM), FDR, etc. I also have a separate file containing the normalized counts, but for all the different samples within each group.

To make a heatmap of the differentially expressed genes, how can I cross-reference the counts file with the DEG files so that only the genes of interest are selected from the counts file?

Thanks in advance.

Tags: heatmap, edger, rna-seq
modified 10 months ago • written 10 months ago by RiNG10

Take your list of differentially expressed genes and extract the count data for those genes. Convert the counts to logCPM and use that for the heatmap.
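As a rough illustration of the logCPM step, here is a minimal base-R sketch. The count matrix is made up, and `logCPM()` is a hypothetical helper that only approximates what `edgeR::cpm(counts, log = TRUE)` does (it uses a fixed pseudocount instead of edgeR's library-size-scaled prior count). If the Galaxy file already holds normalized or log-transformed values, the conversion may not be needed at all.

```r
# Toy count matrix (made-up numbers): genes as rows, samples as columns.
# matrix() fills column by column, so each group of four values is one sample.
counts <- matrix(
  c(10, 200, 35, 0,
    12, 180, 40, 1,
    50,  20, 38, 9,
    55,  25, 33, 7),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:4))
)

# Hypothetical helper: log2 counts-per-million with a small pseudocount.
# A simplified stand-in for edgeR::cpm(counts, log = TRUE), not identical to it.
logCPM <- function(counts, pseudocount = 1) {
  libSizes <- colSums(counts)                      # total counts per sample
  cpm <- sweep(counts, 2, libSizes, "/") * 1e6     # scale each column to one million
  log2(cpm + pseudocount)
}

lcpm <- logCPM(counts)

# Row-scaled heatmap, so each gene is shown relative to its own mean.
heatmap(lcpm, scale = "row")
```

Scaling by row is a common choice for expression heatmaps because it highlights the pattern across samples instead of the absolute expression level.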

written 10 months ago by Vasu330

It would really help us if you pasted samples of the data that you have. Otherwise, we can only speculate as to their formatting / structure.

modified 10 months ago • written 10 months ago by Kevin Blighe42k

How can I get the count data for specific genes in an Excel file with >20,000 genes? If I use the "Find" function, it would take ages.

written 10 months ago by RiNG10

How did your data end up in Excel? It is better to export the data from Excel in TSV or CSV format and then read that file into R.

written 10 months ago by Kevin Blighe42k

It was a TSV file I got as output in Galaxy after running edgeR.

If I open this file in RStudio, how can I then find the genes I am interested in?

written 10 months ago by RiNG10

Can you paste the top-left corner of the data? I assume that it is genes as rows and samples as columns?

written 10 months ago by Kevin Blighe42k

GeneID 19_c2.trimmed.fastq.sorted.bam 24_c3.trimmed.fastq.sorted.bam .....

ENSSSCG000000...

ENSSSCG0000...

This is it; so yes, rows are genes and columns are samples (8 samples).
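Given that layout, reading the TSV into R could be sketched roughly as follows. The temporary file, the gene IDs (`geneA`, `geneB`) and the values are placeholders written only so the example runs on its own; the two sample column names are taken from the header above. `check.names = FALSE` matters here because the sample names start with a digit, and R would otherwise prefix them with `X`.

```r
# Write a tiny stand-in file so the example is self-contained.
# The gene IDs and values are placeholders, not real data.
tsvPath <- tempfile(fileext = ".tsv")
writeLines(c(
  "GeneID\t19_c2.trimmed.fastq.sorted.bam\t24_c3.trimmed.fastq.sorted.bam",
  "geneA\t10.5\t12.1",
  "geneB\t0\t3.2"
), tsvPath)

# Use the first column (GeneID) as row names and keep the sample names untouched.
MyData <- read.delim(tsvPath, row.names = 1, check.names = FALSE)

# A numeric matrix is usually more convenient for subsetting and heatmaps.
MyData <- as.matrix(MyData)

dim(MyData)             # number of genes, number of samples
head(rownames(MyData))  # the gene IDs
```

With the real file, `tsvPath` would simply be the path to the counts file downloaded from Galaxy.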

modified 10 months ago • written 10 months ago by RiNG10

These are Ensembl gene IDs for Sus scrofa (pig). You can likely convert these using the biomaRt package in R. Galaxy should also have a gene conversion tool, no?

written 10 months ago by Kevin Blighe42k

That is not a problem. My problem is that I have more than 20,000 genes in a counts file and I only want to select a few.

Is there a function in R that can find and return only the rows I am interested in? And reduce the counts matrix from 20000 to 100 genes of interest?

written 10 months ago by RiNG10

Yes, if these ENSEMBL IDs are the rownames of your object, then just do this:

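# Illustrative IDs only; replace them with the IDs from your DEG tables
genesOfInterest <- c("ENSSSCG0000001", "ENSSSCG0000056", "ENSSSCG005555", "ENSSSCG000009", "ENSSSCG003332")

# Keep only the rows whose row name is one of the genes of interest
MyData[which(rownames(MyData) %in% genesOfInterest), ]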

There are a few ways of doing it, though.
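One of those other ways is to pull the IDs straight from the DEG table instead of typing them out. The sketch below uses made-up data throughout; the column names `GeneID` and `FDR` and the 0.05 cutoff are assumptions, so check them against the actual header of the Galaxy edgeR output.

```r
# Toy counts matrix and DEG table; all IDs and values are made up.
MyData <- matrix(
  1:12, nrow = 4,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"), c("s1", "s2", "s3"))
)
deg <- data.frame(
  GeneID = c("geneC", "geneA", "geneX", "geneB"),
  logFC  = c(2.1, -1.8, 3.0, 0.2),
  FDR    = c(0.001, 0.03, 0.01, 0.60),
  stringsAsFactors = FALSE
)

# Genes passing an assumed FDR threshold of 0.05.
sigGenes <- deg$GeneID[deg$FDR < 0.05]

# IDs that are significant but absent from the counts matrix (worth inspecting).
missingGenes <- setdiff(sigGenes, rownames(MyData))

# Subset by name; drop = FALSE keeps a matrix even if only one gene remains.
keep <- intersect(sigGenes, rownames(MyData))
degCounts <- MyData[keep, , drop = FALSE]
degCounts
```

With three DEG files, the same idea applies after combining the significant IDs from each comparison, for example with `unique(c(sig1, sig2, sig3))`, where `sig1` to `sig3` are hypothetical vectors built as above.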

written 10 months ago by Kevin Blighe42k

It works! Thank you for your time.

written 10 months ago by RiNG10