Question: Multiple files, grep value, create new column with value and bind data
3.1 years ago by
odayel20 wrote:

Hello! Thank you for your help in advance. I'm still getting my bearings with R and I am rubbish with loops.

I have 1200 .txt files in which the top 15 lines contain metadata and the remaining lines contain the data readout (a varying number of rows with 7 variables). I want to pull the "mac value" (row 9, column 2) from the top 15 lines and repeat that value in a new column alongside the rest of the data. I want to repeat this process for all 1200 sample files and then bind all the rows together into a master list. Here's an example for 2 of the .txt files. Any suggestions on how to set this up for all 1200 files would be greatly appreciated! Thank you!

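library(readr)  # provides read_table()

# sample 1: read the 15 metadata lines and pull the mac value
df <- read.csv("sample1.txt", nrows = 15, sep = "\t")
patb <- "mac"  # regex; "mac*" would match "ma" followed by any number of "c"
b <- grep(patb, df[9, 2], value = TRUE)

# read the data readout and add the mac value as a new column
df1 <- read_table("sample1.txt", skip = 15)
df1$macid <- rep(b, nrow(df1))

# sample 2: same steps
df2 <- read.csv("sample2.txt", nrows = 15, sep = "\t")
b2 <- grep(patb, df2[9, 2], value = TRUE)

df3 <- read_table("sample2.txt", skip = 15)
df3$macid <- rep(b2, nrow(df3))

Master <- rbind(df1, df3)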
3.1 years ago by
ddiez1.8k wrote:

Something like this should work (WARNING: untested; might need some adjustments):

# put all your file names into a character vector.
# adjust `path` and `pattern` to ensure you pick the right ones.
ff <- list.files(path = ".", pattern = "\\.txt$")

# iterate over the file names. `f` contains the name of one
# file in each iteration. `lapply` returns a list.
master <- lapply(ff, function(f) {
  # read the 9th row, 2nd column.
  mac_id <- read.table(f, skip = 8, nrows = 1)[, 2]
  # read the rest.
  tmp <- read.table(f, skip = 15)
  # join columns.
  cbind(tmp, macid = rep(mac_id, nrow(tmp)))
})
# master is now a list of data.frames with the same columns.
# this puts them together:
master <- do.call(rbind, master)

If your data files have inconsistencies in the position of the mac value, you can get around that using grep and the like.
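
A minimal sketch of that grep-based variant (untested against the real files). It assumes the metadata lines are tab-separated, that the line of interest starts with the label `mac`, and that the data block begins with a header row; the helper names `read_with_mac` and `read_all` and the `label` argument are made up for illustration, so adjust them to your files:

```r
# Hypothetical helper: read one file, locate the mac line with grep,
# and attach its value to every data row as a new column.
read_with_mac <- function(f, n_meta = 15, label = "^mac") {
  # read only the metadata lines as plain text
  meta <- readLines(f, n = n_meta)
  # indices of the metadata lines that start with the label
  hit <- grep(label, meta, ignore.case = TRUE)
  if (length(hit) == 0) stop("no mac line found in ", f)
  # assumption: the value is the 2nd tab-separated field of that line
  mac_id <- strsplit(meta[hit[1]], "\t", fixed = TRUE)[[1]][2]
  # assumption: the data block starts with a header row
  tmp <- read.table(f, skip = n_meta, header = TRUE)
  tmp$macid <- mac_id  # single value is recycled over all rows
  tmp
}

# Apply the helper to every .txt file in `dir` and bind the results.
read_all <- function(dir = ".") {
  ff <- list.files(path = dir, pattern = "\\.txt$", full.names = TRUE)
  do.call(rbind, lapply(ff, read_with_mac))
}
```

Calling `master <- read_all(".")` would then build the combined table. Searching by label rather than by line number means the mac line can sit anywhere within the first `n_meta` lines.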


and @ddiez you are a rockstar yet again! Thanks this is working now and I was able to implement grep for inconsistencies. Thank you!!

written 3.1 years ago by odayel20