Question: How to remove spike-ins from Illumina BeadChip data
maria2019 wrote, 13 months ago:

Hi,

I am very new to microarray analysis. I have some cancer and control samples (IDAT files) from an Illumina BeadChip to analyze. I was following the limma tutorial (https://www.bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf), page 107.

When I check the EListRaw object, I also see many ERCC probes, which I did not expect (code and results are below).

  1. Why do I have spike ins?
  2. How can I remove them? (should I actually remove them?)

I ignored this and continued with the downstream analysis, but the FDR values I get are very high (0.5 and above), which I think might be the result of not removing the ERCC probes.

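    library(limma)
    # Read the raw IDAT files together with the BGX manifest
    idatfiles <- dir("path", pattern = "idat", full.names = TRUE)
    bgxfile <- "my.bgx"
    x <- read.idat(idatfiles, bgxfile)
    # Compute detection p-values from the negative control probes
    x$other$Detection <- detectionPValues(x)
    # Tabulate probes by control type
    table(x$genes$Status)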

                    biotin            cy3_hyb      ERCC-00002-02      ERCC-00003-01 
                         2                  6                  1                  1 
             ERCC-00004-01      ERCC-00009-01      ERCC-00012-01      ERCC-00013-01 
                         1                  1                  1                  1 
             ERCC-00014-02      ERCC-00016-01      ERCC-00017-02      ERCC-00019-01 
                         1                  1                  1                  1 
             ERCC-00022-02      ERCC-00024-02      ERCC-00025-01      ERCC-00028-02 
                         1                  1                  1                  1 
             ERCC-00031-02      ERCC-00033-01      ERCC-00034-02      ERCC-00035-02 
                         1                  1                  1                  1 
             ERCC-00039-01      ERCC-00040-01      ERCC-00041-01      ERCC-00042-01 
                         1                  1                  1                  1 
             ERCC-00043-01      ERCC-00044-02      ERCC-00046-01      ERCC-00048-01 
                         1                  1                  1                  1 
             ERCC-00051-01      ERCC-00053-01      ERCC-00054-01      ERCC-00057-01 
                         1                  1                  1                  1 
             ERCC-00058-02      ERCC-00059-01      ERCC-00060-01      ERCC-00061-02 
                         1                  1                  1                  1 
             ERCC-00062-01      ERCC-00067-02      ERCC-00069-02      ERCC-00071-01 
                         1                  1                  1                  1 
             ERCC-00073-01      ERCC-00074-01      ERCC-00075-01      ERCC-00076-02 
                         1                  1                  1                  1 
             ERCC-00077-01      ERCC-00078-01      ERCC-00079-01      ERCC-00081-02 
                         1                  1                  1                  1 
             ERCC-00083-01      ERCC-00084-01      ERCC-00085-01      ERCC-00086-01 
                         1                  1                  1                  1 
             ERCC-00092-02      ERCC-00095-01      ERCC-00096-02      ERCC-00097-01 
                         1                  1                  1                  1 
             ERCC-00098-02      ERCC-00099-01      ERCC-00104-01      ERCC-00108-02 
                         1                  1                  1                  1 
             ERCC-00109-02      ERCC-00111-01      ERCC-00112-02      ERCC-00113-01 
                         1                  1                  1                  1 
             ERCC-00116-02      ERCC-00117-02      ERCC-00120-01      ERCC-00123-01 
                         1                  1                  1                  1 
             ERCC-00126-02      ERCC-00130-01      ERCC-00131-02      ERCC-00134-01 
                         1                  1                  1                  1 
             ERCC-00136-01      ERCC-00137-02      ERCC-00138-01      ERCC-00142-02 
                         1                  1                  1                  1 
             ERCC-00143-01      ERCC-00144-02      ERCC-00145-01      ERCC-00147-01 
                         1                  1                  1                  1 
             ERCC-00148-01      ERCC-00150-01      ERCC-00154-02      ERCC-00156-01 
                         1                  1                  1                  1 
             ERCC-00157-02      ERCC-00158-01      ERCC-00160-02      ERCC-00162-01 
                         1                  1                  1                  1 
             ERCC-00163-01      ERCC-00164-01      ERCC-00165-01      ERCC-00168-01 
                         1                  1                  1                  1 
             ERCC-00170-01      ERCC-00171-01       housekeeping           labeling 
                         1                  1                  7                  2 
        low_stringency_hyb           negative            regular 
                         8                770              47231
Kevin Blighe (Republic of Ireland) wrote, 13 months ago:

You need to leave the control probes in the data for the purpose of background correction and normalisation. Then, if you perform background correction and normalisation via neqc(), these control probes should be automatically removed.
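As a rough illustration of that step, the sketch below builds a small simulated EListRaw (all intensities, probe names, and array names are made up, not taken from the data in the question) and runs neqc() on it. It assumes the limma package is installed and that control probes are labelled in the genes$Status column, as read.idat() does.

```r
library(limma)

set.seed(1)
n_reg <- 200     # simulated "regular" probes (hypothetical count)
n_neg <- 50      # simulated "negative" control probes (hypothetical count)
n_arrays <- 4

# Made-up raw intensities: regular probes carry signal above background,
# negative controls sit at background level
E <- rbind(
  matrix(100 + rexp(n_reg * n_arrays, rate = 1 / 500), n_reg, n_arrays),
  matrix(rnorm(n_neg * n_arrays, mean = 100, sd = 10), n_neg, n_arrays)
)
colnames(E) <- paste0("array", seq_len(n_arrays))

genes <- data.frame(
  Probe_Id = paste0("probe", seq_len(n_reg + n_neg)),
  Status   = c(rep("regular", n_reg), rep("negative", n_neg)),
  stringsAsFactors = FALSE
)

x <- new("EListRaw", list(E = E, genes = genes))

# normexp background correction using the negative controls,
# quantile normalisation and log2 transformation
y <- neqc(x)

dim(x)  # regular + control probes
dim(y)  # only the regular probes should remain
```

On real data, the same comparison of dim(x) and dim(y) is a quick way to confirm that the control probes were dropped.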

After normalisation, you can do further filtering based on the detection p-values. Any other control probes that still remain in the data may be identified via x$genes$Source. Other probes that can be filtered out include those with no gene symbol (x$genes$Symbol == "").

Thus, filtering that I perform post-normalisation is like this:

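    # project.bgcorrect.norm is the background-corrected, normalised object (e.g. the output of neqc())
    # Flag any remaining Illumina control probes
    Control <- project.bgcorrect.norm$genes$Source == "ILMN_Controls"
    # Flag probes without a gene symbol
    NoSymbol <- project.bgcorrect.norm$genes$Symbol == ""
    # Keep probes detected (p <= 0.05) in at least 3 arrays
    isexpr <- rowSums(project.bgcorrect.norm$other$Detection <= 0.05) >= 3

    project.bgcorrect.norm.filt <- project.bgcorrect.norm[!Control & !NoSymbol & isexpr, ]

    # Compare dimensions before and after filtering
    dim(project.bgcorrect.norm)
    dim(project.bgcorrect.norm.filt)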
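
To make the logic of the three masks concrete, here is a minimal base-R sketch on four made-up probes. The annotation values and detection p-values are purely illustrative, and `genes` and `detection` are hypothetical stand-ins for the `$genes` and `$other$Detection` slots of the normalised object.

```r
# Toy probe annotation (illustrative values only)
genes <- data.frame(
  Source = c("RefSeq", "ILMN_Controls", "RefSeq", "RefSeq"),
  Symbol = c("TP53", "", "", "GAPDH"),
  stringsAsFactors = FALSE
)

# Toy detection p-values: one row per probe, one column per array
detection <- rbind(
  c(0.01, 0.02, 0.03, 0.20),  # detected in 3 arrays
  c(0.01, 0.01, 0.01, 0.01),  # control probe
  c(0.01, 0.01, 0.01, 0.01),  # probe without a gene symbol
  c(0.50, 0.02, 0.60, 0.70)   # detected in only 1 array
)

Control  <- genes$Source == "ILMN_Controls"
NoSymbol <- genes$Symbol == ""
isexpr   <- rowSums(detection <= 0.05) >= 3

# A probe is kept only if it passes all three checks
keep <- !Control & !NoSymbol & isexpr
genes$Symbol[keep]  # "TP53"
```

Only the first probe survives: the second is a control, the third has no symbol, and the fourth is detected in too few arrays.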

Kevin

maria2019 replied, 12 months ago:

Hi Kevin,

Thank you very much for your answer. The code above worked for me, and no control probes remain after normalization.

Maryam