Question: Using a for loop to add headers to multiple files
6 months ago by
jordi.planells80 wrote:

Hi guys! I need help setting up a for loop in R (I'm quite new to programming in R). I would like to add the same header to all the files in a folder whose names match a specific pattern.

To get the list of files I'm using the following code:

filelist <- list.files(pattern = "DESeq2_result*")

And this is the for loop I am trying to implement:

for (i in seq_along(filelist)) {
   names[[i]] <- a
   out [i]

}

where a is a vector that I defined with the names of the different columns:

a <- c("gene_id", "baseMean", "log2FC",
       "SD", "WaldStatistic", "pval", "padj")

If you know of any tutorial or page that would help me learn and practise writing functions and loops in R, it would be much appreciated.

Thank you so much in advance!

Jordi

R • 370 views

What error do you get? What would you like to get out of the loop? The same files, but now with a changed header?

written 6 months ago by Benn
6 months ago by
MPI IE, Freiburg, Germany
Gautier Richard280 wrote:

As I understand it, you want to read the files whose names contain the pattern "DESeq2_result" into a list of data frames, assign new column names to these data frames from the vector a, and perhaps name the list elements after the file names. You can do it this way:

filelist <- list.files(pattern = "DESeq2_result")

a <- c("gene_id", "baseMean", "log2FC",
       "SD", "WaldStatistic", "pval", "padj")

data<-list()

for (i in seq_along(filelist)) {
  # The files are tab-separated and have no header row yet
  data[[i]] <- read.table(filelist[i], sep = "\t", header = FALSE)
  colnames(data[[i]]) <- a
}

names(data) <- filelist

for (i in seq_along(data)) {
  output_name <- paste0("colnames_", names(data)[i])
  write.table(data[[i]], output_name, quote = FALSE, row.names = FALSE, col.names = TRUE, sep = "\t")
}

EDIT: added the write.table part to take your comment into account in order to export the imported files with the column names.


Basically, what I want to do is add column names to all my files with one piece of code. For a single file, the code would be:

data <- read.table("file", sep = "\t", header = FALSE)
names(data)
colnames(data) <- a
write.table(data, "file", sep = "\t", col.names = TRUE, row.names = FALSE)

I'm trying to optimise my code so that I can process all of them without typing the same code 8 times in my R Markdown document (once for each file).

written 6 months ago by jordi.planells
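One way to avoid repeating the single-file code is to wrap it in a function and call it once per file. The following is only a minimal sketch: the helper name add_header and the "with_header_" output prefix are made up for illustration, and it assumes the input files are tab-separated with no header row, as in the snippet above.

```r
# Column names from the question.
a <- c("gene_id", "baseMean", "log2FC",
       "SD", "WaldStatistic", "pval", "padj")

# Hypothetical helper: read one headerless, tab-separated file,
# attach the column names, and write the result to out_path.
add_header <- function(in_path, out_path, col_names) {
  data <- read.table(in_path, sep = "\t", header = FALSE,
                     stringsAsFactors = FALSE)
  # Stop early if the file does not have the expected number of columns.
  stopifnot(ncol(data) == length(col_names))
  colnames(data) <- col_names
  write.table(data, out_path, sep = "\t", quote = FALSE,
              col.names = TRUE, row.names = FALSE)
  invisible(out_path)
}

# Apply the helper to every matching file in the working directory.
# The "with_header_" prefix is an assumption; it keeps the originals intact.
for (f in list.files(pattern = "^DESeq2_result")) {
  add_header(f, paste0("with_header_", f), a)
}
```

Writing to a new file name rather than overwriting means the loop can be checked on one file before the originals are replaced.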

Ok, I modified the code so it does what you want.

I think a much better way would be to simply prepend the header line to every file with a shell script or a one-liner. You don't need R for that. The code above is really heavy for what you need. Storing the header line in a file and pasting it at the beginning of every file that matches the pattern would be much more efficient, but it is not R-based.

written 6 months ago by Gautier Richard
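For readers who want to stay in R, the same "paste one line at the top" idea can be sketched without parsing the tables at all, by treating each file as plain text. This is an illustration only: prepend_header and the "with_header_" prefix are hypothetical names, and it assumes the files are tab-separated and do not already start with a header.

```r
# Column names from the question, joined into one tab-separated header line.
a <- c("gene_id", "baseMean", "log2FC",
       "SD", "WaldStatistic", "pval", "padj")
header_line <- paste(a, collapse = "\t")

# Hypothetical helper: write the header line followed by the
# untouched original lines of the file.
prepend_header <- function(in_path, out_path, header_line) {
  original_lines <- readLines(in_path)
  writeLines(c(header_line, original_lines), out_path)
  invisible(out_path)
}

# Apply to every matching file; the output prefix is an assumption.
for (f in list.files(pattern = "^DESeq2_result")) {
  prepend_header(f, paste0("with_header_", f), header_line)
}
```

Because the data lines are copied as text, the numbers are never re-parsed or re-formatted, which is the main practical difference from the read.table/write.table approach.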

Ok, thank you so much! So basically you are suggesting that I do this with a shell for loop and a sed/awk command instead. Thank you again for your help! Cheers!

written 6 months ago by jordi.planells
6 months ago by
London
zx87547.9k wrote:

Use the col.names argument when reading the files, then write them out, something like this:

for (i in list.files(pattern = "^DESeq2_result_"))
  write.table(read.table(i, col.names = c("gene_id", "baseMean", "log2FC",
                                          "SD", "WaldStatistic", "pval", "padj")),
              i, sep = "\t", quote = FALSE, row.names = FALSE)

Note: this overwrites the existing files. To create new files instead:

for (i in list.files(pattern = "^DESeq2_result_"))
  write.table(read.table(i, col.names = c("gene_id", "baseMean", "log2FC",
                                          "SD", "WaldStatistic", "pval", "padj")),
              paste0(i, ".fixed.txt"), sep = "\t", quote = FALSE, row.names = FALSE)
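A note on the pattern argument used throughout this thread: list.files() treats it as a regular expression, not a shell wildcard, so a trailing * means "zero or more of the preceding character" and the match can start anywhere in the file name. The sketch below uses made-up file names purely to show the difference; glob2rx() from the utils package converts a wildcard pattern into the equivalent regular expression.

```r
# Hypothetical file names, for illustration only.
files <- c("DESeq2_result_A.txt",
           "DESeq2_resultB.txt",
           "old_DESeq2_result_C.txt")

# As a regex, "_*" means "zero or more underscores" and nothing anchors
# the match to the start of the name, so all three names match.
glob_like <- grepl("DESeq2_result_*", files)

# Anchoring with ^ and dropping the * keeps only names that
# start with "DESeq2_result_".
anchored <- grepl("^DESeq2_result_", files)

# glob2rx() translates the shell-style wildcard into a regex.
from_glob <- grepl(utils::glob2rx("DESeq2_result_*"), files)

print(data.frame(files, glob_like, anchored, from_glob))
```

The same anchored pattern, or the glob2rx() result, can be passed straight to list.files(pattern = ...).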

Thank you a lot!! It is really appreciated

written 6 months ago by jordi.planells
6 months ago by
India
arup1.5k wrote:

This creates a new file with headers for each input file. Adjust the gsub pattern to match your file extension before using it.

col <- c("gene_id", "baseMean", "log2FC", "SD", "WaldStatistic", "pval", "padj")
for (file in list.files(pattern = "^DESeq2_result")) {
  # Strip the .txt extension so the output can be written as .tsv
  out_name <- paste0("updated_", gsub(pattern = "\\.txt$", "", file), ".tsv")
  write.table(read.table(file, col.names = col), out_name,
              col.names = TRUE, row.names = FALSE, sep = "\t")
}
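If the input files do not all end in .txt, the gsub pattern above has to be edited by hand. A possible alternative, sketched here with a hypothetical helper called make_output_name, is tools::file_path_sans_ext(), which drops whatever extension the file has. The "updated_" prefix and ".tsv" suffix follow the answer above.

```r
# Hypothetical helper: build "updated_<name without extension>.tsv"
# for any input file name, regardless of its original extension.
make_output_name <- function(file) {
  base <- tools::file_path_sans_ext(basename(file))
  paste0("updated_", base, ".tsv")
}

# Example file names are made up for illustration.
print(make_output_name("DESeq2_result_A.txt"))
print(make_output_name("DESeq2_result_B.csv"))
```

The helper only builds the name; it would be used in place of the paste0/gsub expression inside the loop.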

Thank you too, arup!

written 6 months ago by jordi.planells