ESTIMATE (Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data)
17 months ago

Hi there. Can anyone explain how to use the ESTIMATE package for RNA-seq analysis? I want to calculate immune and stromal scores with the ESTIMATE algorithm, then analyze the relationship of those scores with subtype classification and cytogenetic risk by one-way analysis of variance, but I don't know how to do this.

RNA-Seq R ESTIMATE
0

Can you please elaborate on what you have already tried and on which part (or parts) you are having trouble? Thank you.

0

How can I use this package? Should I normalize the data before use? And finally, how can I read the .gct output file?

0

How do I prepare the .gct file for use in DESeq2?

4
17 months ago

A vignette PDF comes installed with the package. The exact location depends on your R library path; on my system it is at:

• R/x86_64-pc-linux-gnu-library/4.0/estimate/doc/ESTIMATE_Vignette.pdf

In this vignette, they use data that comes bundled with the package (R/x86_64-pc-linux-gnu-library/4.0/estimate/extdata/sample_input.txt), which is Affymetrix U133 microarray data that appears to be normalised and log2-transformed.

So, if you have RNA-seq data, I would normalise the data in the usual way, and then transform via rlog or vst. Then, with ESTIMATE, use the rlog or vst expression levels.
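
As a rough sketch of that transformation step, assuming the Bioconductor package DESeq2 is installed: the block below uses DESeq2's simulated example dataset in place of real counts (so the gene names are placeholders, not gene symbols) and applies `varianceStabilizingTransformation()`, which `vst()` wraps as a faster approximation. The object names are only for illustration.

```r
# Sketch only: requires the Bioconductor package DESeq2; skipped if it is missing.
has_deseq2 <- requireNamespace("DESeq2", quietly = TRUE)
expr <- NULL

if (has_deseq2) {
  suppressPackageStartupMessages(library(DESeq2))

  # Simulated counts standing in for a real count matrix (200 genes, 6 samples).
  set.seed(1)
  dds <- makeExampleDESeqDataSet(n = 200, m = 6)

  # Accounts for library size and variance-stabilises onto a log2-like scale.
  vsd <- varianceStabilizingTransformation(dds, blind = TRUE)

  # Genes x samples numeric matrix to hand on to ESTIMATE.
  expr <- assay(vsd)
  print(dim(expr))
}
```

With real data, the rows would need to be named by gene symbol before the matrix is written out for ESTIMATE.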

Kevin

0

Hi,

I am trying the ESTIMATE tool on RNA-Seq data as well.

I have normalized the data and transformed it via rlog, but even after checking the man pages and help, the steps are unclear to me.

Should I first use "filterCommonGenes" to get the "genes.gct" file? And with the code given below, will I obtain "OV_estimate_score.gct" and get the raw estimates from these files with the "estimateScore" command?

out.file <- tempfile(pattern="estimate", fileext=".gct")
outputGCT(in.file, out.file)


Sorry, I got quite confused by all the information.

0

Hi, I think that function (outputGCT()) just changes the format of the data. You have read through the vignette, right?
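
For context, a GCT file is a tab-delimited table with two extra header lines (a version line and a dimensions line) above the column header, so it can be read with base R by skipping those two lines. The sketch below writes a toy file with invented values and reads it back; it assumes the score file follows this standard GCT layout with one row per score, and the file and object names are hypothetical.

```r
# Toy GCT file with invented values, written only to demonstrate the layout.
gct_path <- tempfile(fileext = ".gct")
writeLines(c(
  "#1.2",                                   # GCT version line
  "3\t2",                                   # number of rows, number of samples
  "NAME\tDescription\tSample1\tSample2",    # column header
  "StromalScore\tStromalScore\t-120.5\t840.2",
  "ImmuneScore\tImmuneScore\t310.7\t1502.9",
  "ESTIMATEScore\tESTIMATEScore\t190.2\t2343.1"
), gct_path)

# Skip the two GCT header lines; the NAME column becomes the row names.
gct <- read.delim(gct_path, skip = 2, row.names = 1, check.names = FALSE)

# Drop the Description column and transpose so that each sample is one row.
scores <- as.data.frame(t(gct[, -1]))
print(scores)
```

A data frame shaped like `scores` (one row per sample) can then be joined to sample annotations such as subtype and passed to `aov()` for a one-way analysis of variance.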

0

Yes, I read it and used the code below. By the way, I don't have repeated gene symbols.

in.file <- read.table("Normalized and rlog transformed DE lncRNAs.txt", sep = "\t", header=T)
lncRNAgct <- tempfile(pattern="estimate", fileext=".gct")
outputGCT(in.file, lncRNAgct)
estimateScore(lncRNAgct, "estimate_score.gct", platform="Illumina")


But it returned:

[1] "1 gene set: StromalSignature  overlap= 0"
[1] "2 gene set: ImmuneSignature  overlap= 0"


Could it be because I am using lncRNA genes, or am I doing something wrong?

0

You'll be surprised to hear that I have not actually used this package.

It seems that your first argument, lncRNAgct, should actually be the input GCT filename, i.e., its absolute or relative file location.

input.ds, character string specifying name of input GCT file containing stromal, immune, and estimate scores for each sample

output.ds, character string specifying name of output file

platform, character string indicating platform type. Defaults to "affymetrix"

I think that you have your parameters in the wrong places. See the example at the bottom of the page accessed via the link above.

0

I tried it, but it still returns an overlap of 0.

When I checked the example file that ships with the package, I noticed that the expression data does not contain any negative values, but my data does. Could that be the problem?

0

It says filterCommonGenes() takes the path to your file or your data frame as input. I tried using the data frame as input, but it didn't work and I got this error:

(is.character(input.f) && length(input.f) == 1 && nzchar(input.f)) || .... is not TRUE)

which indicates that it only accepts a file path in the form of a character string. I tried that and got this:

[1] "Merged dataset includes 0 genes (10412 mismatched)."
Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent

Is there a specific format the input has to be in? You seemed to get it to work as a data frame.

1

I got the same error (is.character(input.f) && length(input.f) == 1 && nzchar(input.f)) || .... is not TRUE) when I tried providing it a data.frame. I changed the input to a string with complete file path to my data.frame and it worked fine for me. Make sure your data.frame is vst/rlog transformed with row names as gene symbols and columns as samples.
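
A minimal sketch of that call sequence, assuming the `estimate` package is installed and using the sample file bundled with it rather than real data. The point it illustrates is that `filterCommonGenes()` receives a path string, not a data.frame; the temporary output file names are chosen only for illustration.

```r
# Sketch only: runs the ESTIMATE steps if the 'estimate' package is installed.
ran <- FALSE
genes_gct <- tempfile(pattern = "genes", fileext = ".gct")
scores_gct <- tempfile(pattern = "scores", fileext = ".gct")

if (requireNamespace("estimate", quietly = TRUE)) {
  library(estimate)

  # Path (a single character string) to the bundled example expression file.
  in_path <- system.file("extdata", "sample_input.txt", package = "estimate")

  # Step 1: keep the genes shared with the ESTIMATE gene list and write a GCT file.
  filterCommonGenes(input.f = in_path, output.f = genes_gct, id = "GeneSymbol")

  # Step 2: compute the scores from that GCT file and write them to a second GCT file.
  estimateScore(input.ds = genes_gct, output.ds = scores_gct, platform = "affymetrix")

  ran <- TRUE
}
```

For your own data, `in_path` would instead be the path to your tab-separated expression file, and `platform` should be whichever of the documented options suits your data.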

0

Are you starting with raw counts or FPKMs (or another form) before log transforming? I read in their paper that they used RPKM, so I'm going to try FPKM first since the two should be similar.

0

Also, what format is your data frame in (.txt?) and what separator are you using?

0

Ah, I got it to work by looking at the sample data file. The .txt file holding the data frame has to be tab-separated, and there can be no quotes around the character values.
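
In case it helps anyone else, here is a small base R sketch of writing a data frame in that form (tab-separated, no quotes, gene symbols as row names). The expression values and object names are made up for illustration.

```r
# Toy expression table (made-up values): rows are gene symbols, columns are samples.
expr <- data.frame(
  Sample1 = c(7.1, 5.3, 9.8),
  Sample2 = c(6.9, 5.9, 10.2),
  row.names = c("CD3E", "COL1A1", "ACTB")
)

# Tab-separated, unquoted; col.names = NA adds an empty header cell above the row names.
input_txt <- tempfile(fileext = ".txt")
write.table(expr, file = input_txt, sep = "\t", quote = FALSE, col.names = NA)

# Read it back to confirm the layout: gene symbols as row names, samples as columns.
check <- read.table(input_txt, header = TRUE, row.names = 1, sep = "\t")
print(check)
```

The path in `input_txt` is then what gets passed as the input file, rather than the data frame itself.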