Question

Process multiple files

7.4 years ago

Lila M ★ 1.2k

Hi everyone, I'm new to R and I have a question. I have this code to annotate peaks from a BED file (narrowPeak):

peak <- readPeakFile("file", header=FALSE)
peakAnno <- annotatePeak(peak, tssRegion=c(-3000, 3000), TxDb=txdb, annoDb="org.Hs.eg.db")

write.table(peakAnno, "new_name", sep="\t", col.names=TRUE, row.names=FALSE)

The code works, but I would like to know how I can create a loop that processes more than one bed file.

Thank you!!

ChIP-Seq R • 2.4k views

ADD COMMENT • link updated 7.4 years ago by Biostar 20 • written 7.4 years ago by Lila M ★ 1.2k

Applying a task to several files in R

ADD REPLY • link 7.4 years ago by GenoMax 141k

Thank you, but when I process two files, the code only writes one file:

files = Sys.glob("*.txt")
files
# [1] "1.txt"  "2.txt"
for(i in files){
    peak <- readPeakFile(i, header=F)
    peak
    peakAnno <- annotatePeak(peak, tssRegion=c(-3000, 3000), TxDb=txdb, annoDb="org.Hs.eg.db")
    write.table(peakAnno,"proof", sep="\t", col.names=T, row.names = F)
}

How can I get the two new tables?

Thanks!

ADD REPLY • link updated 7.4 years ago by WouterDeCoster 47k • written 7.4 years ago by Lila M ★ 1.2k

Please use ADD REPLY to respond to earlier comments or posts, so that the thread remains logically structured and easy to follow. I have moved your answer now, but as you can see that's not optimal.

You have write.table inside the loop with a fixed file name ("proof"), so each iteration overwrites the previous result. Either write to a separate file for each value of i, or keep the results in memory, rbind() them together (depending on the size of your dataset this may or may not be possible), and write a single output file once the for loop has finished.
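A minimal sketch of the first option (one output file per input), using only base R so it runs without Bioconductor. Note that process_peaks(), the demo files, and the "annotated_" prefix are hypothetical stand-ins, not part of the original code; in practice the body of process_peaks() would be the readPeakFile() and annotatePeak() calls from the question.

```r
# Hypothetical stand-in for the readPeakFile() + annotatePeak() step:
# reads a tab-delimited peak file and adds a peak width column.
process_peaks <- function(path) {
  peaks <- read.table(path, header = FALSE, sep = "\t")
  names(peaks)[1:3] <- c("chr", "start", "end")
  peaks$width <- peaks$end - peaks$start
  peaks
}

# Two tiny demo inputs in a temporary directory (assumed data, for illustration)
demo_dir <- file.path(tempdir(), "peak_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("chr1\t100\t200", "chr1\t500\t650"), file.path(demo_dir, "1.txt"))
writeLines("chr2\t300\t450", file.path(demo_dir, "2.txt"))

# Match only the numbered inputs so earlier outputs are not picked up again
files <- list.files(demo_dir, pattern = "^[0-9]+\\.txt$", full.names = TRUE)

out_files <- character(0)
for (f in files) {
  result <- process_peaks(f)
  # Derive the output name from the input name so nothing is overwritten
  out <- file.path(demo_dir, paste0("annotated_", basename(f)))
  write.table(result, out, sep = "\t", col.names = TRUE,
              row.names = FALSE, quote = FALSE)
  out_files <- c(out_files, out)
}
```

The key point is only the paste0() call: because the output name changes with f, each iteration writes its own file.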

ADD REPLY • link 7.4 years ago by WouterDeCoster 47k

Can you show an example of what the files look like?

ADD REPLY • link 7.4 years ago by Ron ★ 1.2k

They are tab-delimited files. I can't fix the problem :( Can anybody write an example, please? Thanks

ADD REPLY • link 7.4 years ago by Lila M ★ 1.2k

Try this. This code can be used to rbind the tab-delimited files (concatenating them row-wise). You can swap rbind for another function.

fileList <- list.files(pattern = "\\.txt$")

# Read each file, tag its rows with the file name, and stack the results
new_df <- do.call(rbind, lapply(fileList, function(X) {
  data.frame(id = basename(X), tryCatch(read.table(X), error = function(e) NULL))
}))

ADD REPLY • link 7.4 years ago by Ron ★ 1.2k

I think my problem is simpler. This is my code, which works:

files <- list("1", "2")
peakAnno <- lapply(files, annotatePeak, TxDb=txdb, tssRegion=c(-3000, 3000), annoDb="org.Hs.eg.db")
print (peakAnno)
for (i in peakAnno){
    write.table(i, xxxx , sep="\t", col.names=T, row.names = F)
}

I only need "xxxx" to be different on each iteration of the loop. Is that possible?

Thanks!

ADD REPLY • link updated 7.4 years ago by GenoMax 141k • written 7.4 years ago by Lila M ★ 1.2k

If I understood correctly, you just want the output file name to depend on i while looping, right? I don't understand why you use files <- list("1", "2"); Ron's solution is far better. I modified it a bit:

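fileList <- list.files(pattern = "\\.txt$")
peakAnno <- lapply(fileList, annotatePeak, TxDb=txdb, tssRegion=c(-3000, 3000), annoDb="org.Hs.eg.db")
# Name each result after its input file so it can be looked up in the loop
names(peakAnno) <- fileList
for (i in fileList){
    # Write the annotation for file i (not the file name itself) to "peaks_<i>"
    write.table(as.data.frame(peakAnno[[i]]), paste("peaks_", i, sep=""), sep="\t", col.names=TRUE, row.names=FALSE)
}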

But you could also write peakAnno to one output file, I guess. As you can see, I'm reusing the names from the fileList object and adding a prefix to them, which you can of course freely modify.
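For the single-output-file variant, a rough sketch in base R follows. The small data frames in results are made-up placeholders for what as.data.frame() would return for each annotated file, and the source_file column and the "all_peaks.txt" name are assumptions chosen for illustration.

```r
# Hypothetical per-file tables, standing in for the annotated peaks
# of each input file after conversion to a data frame.
results <- list(
  "1.txt" = data.frame(chr = c("chr1", "chr1"), start = c(100, 500), end = c(200, 650)),
  "2.txt" = data.frame(chr = "chr2", start = 300, end = 450)
)

# Tag every row with the file it came from, then stack the tables row-wise
tagged <- Map(function(df, name) data.frame(source_file = name, df),
              results, names(results))
combined <- do.call(rbind, tagged)
rownames(combined) <- NULL

# One write, after all files have been processed
out <- file.path(tempdir(), "all_peaks.txt")
write.table(combined, out, sep = "\t", col.names = TRUE,
            row.names = FALSE, quote = FALSE)
```

Keeping the source file as a column means the combined table can still be split back per sample later. This only works if every per-file table has the same columns.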

ADD REPLY • link 7.4 years ago by WouterDeCoster 47k