Question: Using a for loop to add headers to multiple files
2.0 years ago, jordi.planells (330) wrote:

Hi guys! I need help setting up a for loop in R (I'm quite new to programming in R). I would like to add the same header to all the files in a folder whose names match a specific pattern.

To get the list of files I'm using the following code:

filelist <- list.files(pattern = "DESeq2_result*")

And this is the for loop I am trying to implement:

for (i in seq_along(filelist)) {
   names[[i]] <- a
   out [i]

}

where a is a vector that I defined with the names of the different columns:

a <- c("gene_id", "baseMean", "log2FC",
       "SD", "WaldStatistic", "pval", "padj")

If you know of any tutorial or page that could help me learn and practise writing functions and loops in R, it would be much appreciated.

Thank you so much in advance!

Jordi

R • 1.2k views

What error do you get? What would you like to get out of the loop? The same files, but now with a changed header?

modified 2.0 years ago • written 2.0 years ago by Benn (8.1k)
2.0 years ago, Gautier Richard (340), MPI IE, Freiburg, Germany, wrote:

As I understand it, you want to read the files whose names match the pattern "DESeq2_result" into a list of data.frames, rename the columns of each data.frame using the a vector, and maybe name the list elements after the file names? You can do it this way:

filelist <- list.files(pattern = "DESeq2_result")

a <- c("gene_id", "baseMean", "log2FC",
       "SD", "WaldStatistic", "pval", "padj")

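# Read each file into a list of data.frames and set the column names
data <- list()

for (i in seq_along(filelist)) {   # seq_along() is safe when no file matches
  data[[i]] <- read.table(filelist[i])
  colnames(data[[i]]) <- a
}

# Name the list elements after the files they came from
names(data) <- filelist

# Write each data.frame to a new file prefixed with "colnames_"
for (i in seq_along(data)) {
  output_name <- paste0("colnames_", names(data)[i])
  write.table(data[[i]], output_name, quote = FALSE, row.names = FALSE,
              col.names = TRUE, sep = "\t")
}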

EDIT: added the write.table part to take your comment into account in order to export the imported files with the column names.
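
For practising functions as well as loops, the same read/rename/write steps can be wrapped in a small function and applied with lapply. This is only a sketch: the helper name add_header, the "colnames_" prefix default, and the assumption of tab-separated input without a header line are illustrative choices, not part of the code above.

```r
# Hypothetical helper: read one headerless, tab-separated file, set its
# column names, and write a copy prefixed with `prefix` next to the original.
add_header <- function(path, col_names, prefix = "colnames_") {
  df <- read.table(path, sep = "\t", header = FALSE, col.names = col_names)
  out <- file.path(dirname(path), paste0(prefix, basename(path)))
  write.table(df, out, sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = TRUE)
  out  # return the path of the new file
}

a <- c("gene_id", "baseMean", "log2FC",
       "SD", "WaldStatistic", "pval", "padj")

# The "^" anchor keeps the prefixed output files out of the list on a re-run
filelist <- list.files(pattern = "^DESeq2_result")
new_files <- lapply(filelist, add_header, col_names = a)
```

Writing to a prefixed copy leaves the original files untouched, so the loop can be re-run safely while experimenting.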


Basically, what I want to do is add column names to my files in one piece of code. For a single file, the code would be:

data <- read.table("file", sep = "\t", header = FALSE)
names(data)
colnames(data) <- a
write.table(data, "file", sep = "\t", col.names = TRUE, row.names = FALSE)

I'm trying to optimise my code so I can process all of them without typing the same code eight times (once per file) in my R Markdown document.

written 2.0 years ago by jordi.planells (330)

OK, I modified the code so it does what you want.

I think a much better way would be to simply add the header as the first line of every file with a shell script or a one-liner; you don't need R for that. The code above is really heavy for what you need. Pasting a header line stored in a file at the beginning of every file matching the pattern would be much more efficient, but not R-based.
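
If staying in R is still preferable, the idea above of pasting a header line on top of each file, without parsing the table at all, can be sketched with readLines and writeLines. The helper name prepend_header and the "header_" prefix are made up for illustration, and the sketch assumes tab-separated files small enough to read into memory.

```r
# Hypothetical helper: prepend one header line to a text file without
# parsing its contents; writes a prefixed copy so the input is left untouched.
prepend_header <- function(path, header_line, prefix = "header_") {
  out <- file.path(dirname(path), paste0(prefix, basename(path)))
  writeLines(c(header_line, readLines(path)), out)
  out  # return the path of the new file
}

a <- c("gene_id", "baseMean", "log2FC",
       "SD", "WaldStatistic", "pval", "padj")
header_line <- paste(a, collapse = "\t")  # assumes tab-separated files

# The "^" anchor keeps the prefixed copies out of the list on a re-run
for (f in list.files(pattern = "^DESeq2_result")) {
  prepend_header(f, header_line)
}
```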

modified 2.0 years ago • written 2.0 years ago by Gautier Richard (340)

OK, thank you so much! So basically you are suggesting that I do this with a shell for loop and a sed/awk command instead. Thank you so much again for your help! Cheers!

written 2.0 years ago by jordi.planells (330)
2.0 years ago, zx8754 (9.9k), London, wrote:

Use the col.names argument when reading the files, then write them out, something like this:

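# pattern is a regular expression, not a shell glob, so the trailing "*" is dropped
for (i in list.files(pattern = "DESeq2_result"))
  write.table(read.table(i, col.names = c("gene_id", "baseMean", "log2FC",
                                          "SD", "WaldStatistic", "pval", "padj")),
              i, sep = "\t", quote = FALSE, row.names = FALSE)  # no row names or quotes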

Note: this overwrites the existing files. To create new files instead:

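# Same loop, but each result goes to a new file named <original>.fixed.txt
for (i in list.files(pattern = "DESeq2_result"))
  write.table(read.table(i, col.names = c("gene_id", "baseMean", "log2FC",
                                          "SD", "WaldStatistic", "pval", "padj")),
              paste0(i, ".fixed.txt"),
              sep = "\t", quote = FALSE, row.names = FALSE)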
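
A note on the pattern argument: list.files() interprets it as a regular expression, not a shell wildcard, so a trailing * means "zero or more of the preceding character". The sketch below uses made-up file names to show the difference; glob2rx() from the utils package converts a wildcard into a regular expression.

```r
# Made-up file names, for illustration only
files <- c("DESeq2_result_A.txt", "DESeq2_resul.txt", "old_DESeq2_result_B.txt")

# As a regex, "DESeq2_result*" means "DESeq2_resul" followed by zero or more
# "t", found anywhere in the name, so all three names match
regex_hits <- grepl("DESeq2_result*", files)

# glob2rx() turns the shell-style wildcard into an anchored regex
glob_regex <- utils::glob2rx("DESeq2_result*")
glob_hits <- grepl(glob_regex, files)  # only the first name matches

print(data.frame(files, regex_hits, glob_hits))
```

The converted pattern can be passed straight to list.files(pattern = glob_regex) when wildcard behaviour is what is wanted.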

Thank you a lot!! It is really appreciated

written 2.0 years ago by jordi.planells (330)
2.0 years ago, Arup Ghosh (2.7k), India, wrote:

This creates a new file with headers for each input file. Change the gsub pattern (here it strips a .txt extension) before using it.

col <- c("gene_id", "baseMean", "log2FC", "SD", "WaldStatistic", "pval", "padj")
for (file in list.files(pattern = "^DESeq2_result")) {  # regex, so no trailing "*"
  # Strip the .txt extension and write a tab-separated copy as updated_<name>.tsv
  out <- paste0("updated_", gsub(pattern = "\\.txt$", "", file), ".tsv")
  write.table(read.table(file, col.names = col), out,
              col.names = TRUE, row.names = FALSE, sep = "\t")
}

Thank you too, arup!

written 2.0 years ago by jordi.planells (330)