Question

Help with Minfi for methylation analysis.

7.5 years ago

halo22 ▴ 300

Hello All,

I am trying to analyze some methylation data using the minfi package. I have raw IDAT files for all my patient samples, but I am stuck at the very first step. As the tutorial suggests, I created a sample targets file that includes the sample information and the location of the IDAT files. I have pasted a few lines of my targets file below, along with the output I get from printing the result of read.metharray.sheet. For some reason, the output shows character(0) as the Basename, even though list.files returns the correct output. I would really appreciate your input on this one:

This is my code:

library(minfi)
baseDir <-"/home/idats"
list.files(baseDir)
targets <- read.metharray.sheet(baseDir)
print (head(targets))

Input targets file:

Sample_Name,Sample_Well,Sample_Plate,Sample_Group,Pool_ID,person,age,sex,status,Array,Slide,Basename  
908224,NA,NA,Cancer,NA,908224,62.9,F,Cancer,420015,R01C01,/home/idats/420015_R01C01
908224,NA,NA,Cancer,NA,908224,62.9,F,Cancer,420015,R01C01,/home/idats/420015_R01C01
836160,NA,NA,Normal,NA,836160,50.1,M,Normal,7420015,R05C01,/home/idats/7420015_R05C01
836160,NA,NA,Normal,NA,836160,50.1,M,Normal,7420015,R05C01,/home/idats/7420015_R05C01

Output:

[read.metharray.sheet] Found the following CSV files:

"/home/idats/test.csv"
      Sample_Name Sample_Well Sample_Plate Sample_Group Pool_ID person  age sex
1     908224        <NA>         <NA>       Cancer    <NA> 908224 62.9   F
2     908224        <NA>         <NA>       Cancer    <NA> 908224 62.9   F
3      836160        <NA>         <NA>       Normal    <NA> 836160 50.1   M
4      836160        <NA>         <NA>       Normal    <NA> 836160 50.1   M
       status   Array  Slide     Basename
1    Cancer  420015 R01C01 character(0)
2    Cancer  420015 R01C01 character(0)
3    Normal 7420015 R05C01 character(0)
4    Normal 7420015 R05C01 character(0)

https://github.com/stephaniehicks/bioconductorNotes/blob/master/minfi.Rmd
https://www.bioconductor.org/help/course-materials/2015/BioC2015/methylation450k.html#introduction

Thanks

snp sequencing methylation 450k • 5.7k views

ADD COMMENT • link updated 7.5 years ago by igor 13k • written 7.5 years ago by halo22 ▴ 300

I have run into this problem before. Sometimes it is caused by the format of the sample sheet. Sometimes I debug it by downloading the source code of the functions involved, such as read.metharray.exp(). It would help to share your data and script in a more convenient way so that others can help you. I prefer keeping data and scripts on GitHub, so that others can easily check the pipeline with git clone.

ADD REPLY • link 7.5 years ago by Shicheng Guo ★ 9.4k

I saw your post over here: https://support.bioconductor.org/p/71585/. Can you please specify which sample sheet format is applicable? Posted above is the only code that I have written for this analysis. Thanks

ADD REPLY • link 7.5 years ago by halo22 ▴ 300
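
One possible explanation for the character(0) values, offered as an assumption rather than a confirmed diagnosis: the sheet reader may rebuild Basename by searching the directory for files named <Slide>_<Array>_Grn.idat, where Slide is the chip barcode and Array is the position on the chip (for example R01C01). In the sheet above those two columns look swapped (Array holds 420015 and Slide holds R01C01), so no file name would match. The sketch below imitates that kind of lookup in base R only; the helper find_basenames and the placeholder files are hypothetical and are not minfi's actual code.

```r
# Hypothetical helper imitating a <Slide>_<Array>_Grn.idat lookup (assumption, not minfi's code).
find_basenames <- function(sheet, dir) {
  all_files <- list.files(dir, recursive = TRUE, full.names = TRUE)
  patterns <- sprintf("%s_%s_Grn.idat", sheet$Slide, sheet$Array)
  # One result per sample; a sample with no matching file yields character(0).
  hits <- lapply(patterns, function(p) grep(p, all_files, value = TRUE, fixed = TRUE))
  lapply(hits, function(h) sub("_Grn\\.idat$", "", h))
}

# Create empty placeholder files named like the IDATs in the question.
demo_dir <- file.path(tempdir(), "idats_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir,
  c("420015_R01C01_Grn.idat", "420015_R01C01_Red.idat"))))

# Columns as in the question (swapped) versus the assumed convention.
swapped   <- data.frame(Array = "420015", Slide = "R01C01", stringsAsFactors = FALSE)
corrected <- data.frame(Array = "R01C01", Slide = "420015", stringsAsFactors = FALSE)

swapped_hit   <- find_basenames(swapped, demo_dir)[[1]]
corrected_hit <- find_basenames(corrected, demo_dir)[[1]]
print(swapped_hit)
print(basename(corrected_hit))
```

If this assumption holds for your minfi version, swapping the Array and Slide values in test.csv may be enough for read.metharray.sheet to fill in Basename.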

score 0 · Answer 1 · 2016-11-08

7.5 years ago

igor 13k

Try using read.metharray.exp() (formerly read.450k.exp()) instead.

ADD COMMENT • link 7.5 years ago by igor 13k

I tried using read.metharray.exp() in two different forms:

1) RGset <- read.metharray.exp(base=baseDir)

This creates the "RGChannel" data structure but fails to capture the sample information; it does not read the test.csv file.

2) RGset <- read.metharray.exp(base=baseDir,targets="/home/idats/test.csv")

Here I tried to specify the path to test.csv separately, but I get the following error:

Error in read.metharray.exp(base = baseDir, targets = "/homedats/test.csv") : Need 'Basename' amongst the column names of 'targets'

ADD REPLY • link 7.5 years ago by halo22 ▴ 300
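
The wording of that error suggests that targets is expected to be a table with a Basename column rather than a path to a file. A minimal base R sketch of the distinction, assuming the check is based on the column names of targets; the inline CSV text is a shortened, illustrative version of the sheet above.

```r
# A file path is a plain character vector, so it has no column names.
targets_path <- "/home/idats/test.csv"
path_has_basename <- "Basename" %in% names(targets_path)

# A data frame read from the sheet does carry a Basename column.
csv_text <- "Sample_Name,Array,Slide,Basename
908224,420015,R01C01,/home/idats/420015_R01C01"
targets_df <- read.csv(text = csv_text, strip.white = TRUE, stringsAsFactors = FALSE)
df_has_basename <- "Basename" %in% names(targets_df)

print(path_has_basename)
print(df_has_basename)
```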

You should import your CSV first. Something like:

targets = read.csv("test.csv", strip.white = TRUE, stringsAsFactors = FALSE)
raw_set = read.metharray.exp(targets = targets, recursive = TRUE, verbose = TRUE)

ADD REPLY • link 7.5 years ago by igor 13k
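
Before passing targets to read.metharray.exp(), it may help to confirm that every Basename points at an existing pair of IDAT files. The following is a sketch in base R, assuming the usual _Grn.idat and _Red.idat suffixes; the helper name check_idats and the placeholder files are hypothetical.

```r
# Hypothetical helper: report whether both channel files exist for each Basename.
check_idats <- function(targets) {
  stopifnot("Basename" %in% names(targets))
  data.frame(
    Basename = targets$Basename,
    has_grn = file.exists(paste0(targets$Basename, "_Grn.idat")),
    has_red = file.exists(paste0(targets$Basename, "_Red.idat")),
    stringsAsFactors = FALSE
  )
}

# Demo with empty placeholder files in a temporary directory.
demo_dir <- file.path(tempdir(), "idats_check")
dir.create(demo_dir, showWarnings = FALSE)
present <- file.path(demo_dir, "420015_R01C01")
absent  <- file.path(demo_dir, "7420015_R05C01")
invisible(file.create(paste0(present, c("_Grn.idat", "_Red.idat"))))

report <- check_idats(data.frame(Basename = c(present, absent), stringsAsFactors = FALSE))
print(report[, c("has_grn", "has_red")])
```

Any row with FALSE in either column points at a Basename that would need correcting in the CSV before the data can be read.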