Question: Run Methylkit Analysis Sequentially On Several Comparisons And Contexts
1
gravatar for essell22
7.5 years ago by
essell2210
United States
essell2210 wrote:

I'm realatively new to R (having only used it interactively in R studio before) and have never written an R script before. I'm using the MethylKit package to analyze many files and would like to write a script that would run through my comparisons sequentially. My current script looks like:

library("methylKit")
getFilenames <- function(FileNames) {
  File.list <- list()
  x <- readline("Filename: ")
  while (x != "") {
    File.list <- append(File.list, x)
    x <- readline("Filename: ")
  }
  File.list
}
getFilenames(FileNames)
Context <- readline("Context: ")
SampleIDs <- readline("List Sample IDs separated by a space: ")
SampleIDs <- lapply(SampleIDs, list, sep = " ")
Treatment <- readline("enter treatment types (0 or 1) separated by commas: ")

myobj <- read(File.list, sample.id=SampleIDs, assembly="mm9", treatment = c(Treatment), context= Context)
filt <- filterByCoverage(myobj, lo.count=20, lo.perc=NULL, hi.count=NULL, hi.perc=99.9)
meth <- unite(filt)
rm(Files)
rm(File.list)
rm(myobj)
rm(filt)
gc(TRUE)
diff <- calculateDiffMeth(meth, num.cores = 8)
diff20.01 <- get.methylDiff(diff, difference =20, qvalue=0.01)
MainFileName <- readline("MainFileName: ")
write.table(data.frame(meth), file=MainFileName+"_"+Context+".methylbase.txt", sep="\t", quote=FALSE)
write.table(data.frame(diff), file=MainFileName+"_"+Context+".methyldiff.txt", sep="\t", quote=FALSE)
write.table(data.frame(diff20.01), file=MainFileName+"_"+Context+"_20.01.methyldiff.txt", sep="\t", quote=FALSE)

This would (hopefully) run through one comparison using a single context. For each comparison I have 3 different contexts (CpG, CHG and CHH) coming from different input files. However, I would like to enter all the information for multiple comparisons and contexts that will then run one after another without my having to enter all the information for the next run after the previous run has finished.

R • 3.0k views
ADD COMMENTlink modified 7.2 years ago by Charles Warden7.9k • written 7.5 years ago by essell2210
0
gravatar for Charles Warden
7.2 years ago by
Charles Warden7.9k
Duarte, CA
Charles Warden7.9k wrote:

I'm not sure how many files you have to read, but the analysis structure is defined by "myobj". In other words, you only have to create your file list once, and then you set up "myobj" for the relevant contrasts.

You may have already seen the relevant part of the code, but this is all it takes to set up your files:

file.list=list(  "test1.myCpG.txt",  "test2.myCpG.txt", "control1.myCpG.txt",  "control2.myCpG.txt" )

If you have different sets of treatments and controls among the same set of samples, you only have to do this once. If you need to look at different subsets, you should do this once for each subset.

It looks like your code above is problematic for at least a couple reasons:

1) It looks like you only have one value in the treatment array (you must have at least one control and one treatment)

2) I think you have to represent treatments and controls numerically (1=treatment, 0=control)

FYI, I think the example code for running methylKit are pretty good. I remember getting feedback pretty quickly when e-mailing the methylKit contact. That is probably a better source for specific technically questions.

http://code.google.com/p/methylkit/

ADD COMMENTlink written 7.2 years ago by Charles Warden7.9k

Oh, I see that you are trying to defined the treatments above myobj. However, I can't tell how this is supposed to create the desired result (and, if you already have an array defined, you don't need the additional c()).

Maybe more details about the early part of the script will be helpful?

ADD REPLYlink written 7.2 years ago by Charles Warden7.9k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1019 users visited in the last hour