Question: LogFC calculation in multiple comparisons
5 weeks ago by
elb160 wrote:

Hi guys, suppose you are in the following situation:

    SampleA1  SampleA2  SampleA3  Ctrl1  Ctrl2  SampleB1  SampleB2  SampleB3
         234         1        32      5      2         0        21     12344
        2434       134         0      2      0         0         0         0
           1         0         0      1      1      1234       456       345

Rows are genes and columns are samples. The data are counts from an RNA-seq experiment.

Suppose you want to run a differential gene expression analysis comparing each Sample* condition against Ctrl*. You first filter the raw count matrix, for example with edgeR, keeping genes with cpm > 1 in more than n samples (n being a number of samples you choose). After filtering you have the data matrix shown above. You then set up the design, apply glmQLFTest, and obtain a logFC for each gene.

Now my point: suppose your boss does not want you to apply a more stringent filter on (cpm > 1) > n. How can you avoid high logFC values for genes that are poorly expressed, as in row 3 for SampleA* vs Ctrl*? That logFC will be comparable in magnitude to the logFC of a highly expressed gene compared against 0 (row 1, for example).

Moreover, suppose that gene is highly expressed in SampleB*, so you cannot remove it without losing that information when you compare SampleB* vs Ctrl*. The logFC of SampleA* vs Ctrl* will be as high as the logFC of SampleB* vs Ctrl*, but the two refer to genes expressed at very different levels. How should I deal with this situation? I thought of treating the comparisons independently, i.e. considering a different set of genes for SampleA* vs Ctrl* than for SampleB* vs Ctrl*, but I am not sure that is correct.

Can anyone help me please?

deseq edger rna-seq
5 weeks ago by
United States
swbarnes25.6k wrote:

Well, don't just look at the fold changes, look at the p-values too! Also, see the lfcShrink function in DESeq2, which shrinks the logFC estimates of low-count genes toward zero.
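
To make the idea of moderating fold changes concrete, here is a minimal sketch in Python. It is not the algorithm behind lfcShrink, which uses a proper statistical shrinkage estimator. It only adds a pseudocount to each group mean before taking the log ratio, which is the simplest way to see why low-count fold changes get damped. The function name `moderated_log2fc` and the `prior_count` values are assumptions chosen for illustration, and the counts are row 3 of the table in the question, used raw with no library-size normalization.

```python
import math


def moderated_log2fc(treated, control, prior_count):
    """log2 ratio of group mean counts after adding prior_count to each mean.

    Hypothetical helper for illustration only; not an edgeR or DESeq2 function.
    """
    if prior_count <= 0:
        raise ValueError("prior_count must be positive to avoid log2(0)")
    mean_treated = sum(treated) / len(treated) + prior_count
    mean_control = sum(control) / len(control) + prior_count
    return math.log2(mean_treated / mean_control)


# Row 3 of the table in the question (raw counts, not normalized).
sample_a = [1, 0, 0]
ctrl = [1, 1]
sample_b = [1234, 456, 345]

# Compare a small and a larger pseudocount for both contrasts.
for prior in (0.125, 2.0):
    a_vs_ctrl = moderated_log2fc(sample_a, ctrl, prior)
    b_vs_ctrl = moderated_log2fc(sample_b, ctrl, prior)
    print(f"prior={prior}: A vs Ctrl = {a_vs_ctrl:.2f}, B vs Ctrl = {b_vs_ctrl:.2f}")
```

With the larger pseudocount, the SampleA vs Ctrl value moves most of the way to zero, while the SampleB vs Ctrl value stays large because it is backed by hundreds of reads. The gene stays in the analysis for both comparisons, so no per-comparison gene sets are needed.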
