Process multiple files
4.4 years ago
Lila M ▴ 910

Hi everyone, I'm new to R and I have a question. I have this code to annotate peaks from a BED file (narrowPeak):

peak <- readPeakFile("file", header=F)
peakAnno <- annotatePeak(peak, tssRegion=c(-3000, 3000), TxDb=txdb, annoDb="org.Hs.eg.db")

write.table(peakAnno,"new_name", sep="\t", col.names=T, row.names = F)


The code works, but I would like to know how I can create a loop that processes more than one bed file.

Thank you!!

ChIP-Seq R • 1.7k views

Thank you, but when I process two files, the code only writes one file:

files = Sys.glob("*.txt")
files
[1] "1.txt"  "2.txt"
for(i in files){
  peak <- readPeakFile(i, header=F)
  peakAnno <- annotatePeak(peak, tssRegion=c(-3000, 3000), TxDb=txdb, annoDb="org.Hs.eg.db")
  write.table(peakAnno,"proof", sep="\t", col.names=T, row.names = F)
}


How can I get the two new tables?

Thanks!

Please use ADD REPLY to respond to earlier comments or posts, so that the thread remains logically structured and easy to follow. I have moved your answer now, but as you can see, that's not optimal.

You have write.table inside the loop with a fixed output file name, so each iteration overwrites the previous result. Either write to a separate file for each value of i, or keep the results in memory and rbind() them together (whether this is feasible depends on the size of your dataset), then write the combined output to one file after the for loop has finished.
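A minimal sketch of both options, using base R only so that it runs without ChIPseeker. Note the assumptions: annotate_stub() is a hypothetical stand-in for the readPeakFile()/annotatePeak() step, and the toy files and output names are made up for illustration.

```r
# Work in a temporary directory so the example does not touch real data
out_dir <- tempfile("peaks_")
dir.create(out_dir)

# Two toy tab-delimited "peak" files (made-up content)
files <- file.path(out_dir, c("1.txt", "2.txt"))
write.table(data.frame(chr = "chr1", start = c(100, 500), end = c(200, 600)),
            files[1], sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)
write.table(data.frame(chr = "chr2", start = 50, end = 150),
            files[2], sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)

# Hypothetical stand-in for readPeakFile() + annotatePeak():
# reads a file and adds a peak-width column
annotate_stub <- function(path) {
  peaks <- read.table(path, sep = "\t", col.names = c("chr", "start", "end"))
  peaks$width <- peaks$end - peaks$start
  peaks
}

# Option 1: one output file per input, with the name derived from i
for (i in files) {
  out_file <- file.path(out_dir, paste0("annotated_", basename(i)))
  write.table(annotate_stub(i), out_file, sep = "\t",
              col.names = TRUE, row.names = FALSE)
}

# Option 2: collect results in memory, rbind() them, write once after the loop
results <- list()
for (i in files) {
  results[[basename(i)]] <- data.frame(source = basename(i), annotate_stub(i))
}
combined <- do.call(rbind, results)
write.table(combined, file.path(out_dir, "annotated_all.txt"), sep = "\t",
            col.names = TRUE, row.names = FALSE)
```

Option 1 keeps memory use low because only one result is held at a time; option 2 gives a single table but needs all results to fit in memory, and the added source column records which input each row came from.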

Can you show an example of what the files look like?

They are tab-delimited files. I can't fix the problem :( Can anybody write an example, please? Thanks

Try this. This code can be used to rbind the tab-delimited files (concatenating them row-wise). You can replace rbind with a different function.

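fileList <- list.files(pattern = "\\.txt$")

# Read each file, tag its rows with the file name, and stack them row-wise;
# unreadable files yield NULL, which rbind() drops
new_df <- do.call(rbind, lapply(fileList, function(X) {
  tbl <- tryCatch(read.table(X), error = function(e) NULL)
  if (!is.null(tbl)) data.frame(id = basename(X), tbl)
}))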

I think my problem is simpler. This is my code, which works:

files <- list("1", "2")
peakAnno <- lapply(files, annotatePeak, TxDb=txdb, tssRegion=c(-3000, 3000), annoDb="org.Hs.eg.db")
print (peakAnno)
for (i in peakAnno){
write.table(i, xxxx , sep="\t", col.names=T, row.names = F)
}


I only need "xxxx" to be different on each iteration of the loop. Is that possible?

Thanks!

If I understood correctly, you just want the output file name to depend on i while looping, right? I don't understand why you use files <- list("1", "2"); Ron's solution is far better. I modified it a bit:

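fileList <- list.files(pattern = "\\.txt$")
# Annotate each file and name the results after their input files
peakAnno <- lapply(fileList, annotatePeak, TxDb=txdb, tssRegion=c(-3000, 3000), annoDb="org.Hs.eg.db")
names(peakAnno) <- fileList
for (i in fileList){
  # Write the annotation for file i (not the file name itself) to peaks_<i>
  write.table(peakAnno[[i]], paste("peaks_", i, sep=""), sep="\t", col.names=T, row.names = F)
}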


But you could also write peakAnno to a single output file, I guess. As you can see, I'm reusing the names from the fileList object and adding a prefix to them, which you can of course modify freely.
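To make the naming step concrete, here is a small sketch of deriving one output name per input and writing each element of a results list to its own file. It uses base R only: the fileList values and the data frames standing in for the annotatePeak() results are made up, and the "peaks_" prefix is just the one used above.

```r
# Hypothetical input file names (stand-ins for the list.files() result)
fileList <- c("1.txt", "2.txt")

# Stand-in for the list returned by lapply(fileList, annotatePeak, ...)
peakAnno <- lapply(fileList, function(f) data.frame(file = f, peak = 1:2))
names(peakAnno) <- fileList

# Write into a temporary directory so nothing real is overwritten
out_dir <- tempfile("out_")
dir.create(out_dir)

# paste0() is vectorised, so this builds one output name per input file
out_names <- paste0("peaks_", fileList)

# Loop over positions so each result is paired with its own output name
for (k in seq_along(peakAnno)) {
  write.table(peakAnno[[k]], file.path(out_dir, out_names[k]),
              sep = "\t", col.names = TRUE, row.names = FALSE)
}

written <- sort(list.files(out_dir))
```

Looping over seq_along() rather than over the list elements themselves is what makes the index available for building the file name, which was the missing piece in the "xxxx" version of the loop.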