Filter by Fold change in DESeq2
2
5
Entering edit mode
6.7 years ago
EpiExplorer ▴ 90

Hi,

I am using DESeq2 for RNA-Seq data. I got 557 genes deferentially expressed with 5% FDR and Log Fold change >= +/- 0.

I would like to choose the top genes that have a fold change >=2. How do I do that in DESeq2? Can I just export the list of 557 genes and filter them by fold change >=2 or do I have to use any argument in the results function to statistically calculate that?

The functions that I used in DESeq2 to get the 557 deferentially expressed genes is below:

res <- results(dds, alpha=0.05)

Thanks.

RNA-Seq • 8.7k views
ADD COMMENT
0
Entering edit mode

Hi, I need urgent help in this please. I have read the vignette but I am not clear how to get the genes that are significant with 5% FDR and also greater than or equal to fold change of -2/+2.

I have the list of genes which was extracted by results function with alpha value as 0.05. Now if I want to further filter this list based on the fold change, can I use the logfold change column in the file to filter the significant genes with -1/+1( Log(2 fold up or down) or do I need to necessarily rerun the statistics on fold change by specifying the lfsthreshold argument in the results function along with the alpha function set to 0.05? When I run results with only alpha, I get 557 genes significant res <- results(dds, alpha=0.05)

But when I run results with foldchange and alpha, i get 91 genes: res1 <- results(dds,alpha=0.05, lfcThreshold=1, altHypothesis="greaterAbs") By comparing the results with 557 genes and 91 genes, I found that the in list of 557 genes about only 180 genes are below +1/-1 logfold change. So I am wandering why I get only 91 genes when filtered by alpha and fold change both. I am sure DESeq2 is doing some internal statistics for p-value adjustment and fold change significance but not sure how and what it is doing.

Can somebody please explain me if I am running a correct function with right arguments for what I want as output?

ADD REPLY
1
Entering edit mode

Just a comment: in my opinion, the right approach is to run res <- results(dds, alpha=0.05) and then filter based on log2FoldChange (eventually remove the rows with NAs, which can appear if the number of reads in one condition equals zero). Thus I would assume that 180 genes are what you are looking for.

Now by specifying the altHypothesis parameter in res1 <- results(dds,alpha=0.05, lfcThreshold=1, altHypothesis="greaterAbs"), you calculate DEGs based on a different null hypothesis:

|beta| <= lfcThreshold

Per default, the null hypothesis is

beta = 0.

This explains, in my opinion, the difference in the number of obtained genes (91 vs 180), since the hypothesis

|beta| <= lfcThreshold

is more conservative.

ADD REPLY
1
Entering edit mode
6.7 years ago
e.rempel ★ 1.1k

Hi,

you could filter the res object directly in R

res <- res[res$log2FoldChange >= 1, ]
ADD COMMENT
0
Entering edit mode

Thanks for the reply. I tried this but I get an error below:

Error in NSBS(i, x, exact = exact, upperBoundIsStrict = !allow.append) : subscript contains NAs

ADD REPLY
0
Entering edit mode
6.7 years ago
firatuyulur ▴ 320

There is a function, complete.cases(), in R that helps you to get rid of NAs.

mydata[complete.cases(mydata),]

will give you your dataframe with no NAs. Then you can apply any numeric filtering you like.

ADD COMMENT
0
Entering edit mode

Thank you for your reply. But this is not I want.

ADD REPLY
0
Entering edit mode

Explain how this is not what you're looking for - a previous answer had you showing an error owing to NAs, so how is removing NAs not the way forward?

ADD REPLY

Login before adding your answer.

Traffic: 2095 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6