Hi there,
I have some coverage stats and I want to merge them together. However, I want to merge them regardless of how many samples I have. Let's say in one batch I have 5 samples (A, B, C, D and E). Their corresponding files are (A.cov, B.cov, C.cov, D.cov and E.cov).
I first load these into R
library(tidyverse)
library(dplyr)
#load in coverage.stats files
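# pattern is a regular expression, not a shell glob: match names ending in ".cov"
coverage_stats = list.files(pattern="\\.cov$")
# optional: also create one object per file (A.cov, B.cov, ...) for inspection
for (i in seq_along(coverage_stats)) assign(coverage_stats[i], read.table(coverage_stats[i]))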
cov_stats_files = lapply(coverage_stats, read.table)
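The loading step can also keep track of which table came from which file by naming the list elements. This is only a sketch: the temporary directory, the two tiny example files, and the object names cov_paths and cov_tables are made up for illustration and are not part of the original workflow.

```r
# Write two tiny example files into a temporary directory (illustration only)
tmp <- tempfile("cov_demo")
dir.create(tmp)
writeLines(c("sample A", "total_reads 56402158"), file.path(tmp, "A.cov"))
writeLines(c("sample B", "total_reads 56402458"), file.path(tmp, "B.cov"))

# "\\.cov$" is a regular expression matching file names that end in .cov
cov_paths <- list.files(tmp, pattern = "\\.cov$", full.names = TRUE)

# Read every file into one list and name each element after its file
cov_tables <- lapply(cov_paths, read.table, stringsAsFactors = FALSE)
names(cov_tables) <- basename(cov_paths)

names(cov_tables)
```

Because everything lives in one list, the number of samples in a batch never has to be written into the code.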
All is good here. This is the structure of the new dataframes (they are all the same). This is A.cov:
structure(list(V1 = c("sample", "total_reads", "mapped_to_target_reads",
"percentage", "mapped_to_target_reads_plus_150bp", "percentage"
), V2 = c("A", "56402158", "45018562", "79.82", "56664165",
"100.46")), row.names = c(NA, 6L), class = "data.frame")
This is B.cov, so you have another for good measure:
structure(list(V1 = c("sample", "total_reads", "mapped_to_target_reads",
"percentage", "mapped_to_target_reads_plus_150bp", "percentage"
), V2 = c("B", "56402458", "45018555", "80.82", "5666416",
"98")), row.names = c(NA, 6L), class = "data.frame")
I want to transform the tables into a nicer format:
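transform_tables <- function(x) {
  # transpose so each sample becomes one row (t() returns a character matrix)
  x <- as.data.frame(t(x))
  # use the first row (the field names) as column names, then drop that row
  x <- setNames(x, as.character(unlist(x[1, ])))
  x[-1, , drop = FALSE]
}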
cov_stats_files <- lapply(cov_stats_files, transform_tables)
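As a quick check of what this transformation produces, here is a minimal sketch that applies the same steps to the A.cov structure shown above. The helper name to_wide and the object names are hypothetical, and the helper is written in base R so it does not rely on %<>% being available.

```r
# Example input copied from the A.cov structure in the question
a_cov <- data.frame(
  V1 = c("sample", "total_reads", "mapped_to_target_reads",
         "percentage", "mapped_to_target_reads_plus_150bp", "percentage"),
  V2 = c("A", "56402158", "45018562", "79.82", "56664165", "100.46"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: transpose, promote the first row to header, drop it
to_wide <- function(x) {
  x <- as.data.frame(t(x), stringsAsFactors = FALSE)
  names(x) <- unlist(x[1, ], use.names = FALSE)
  x[-1, , drop = FALSE]
}

a_wide <- to_wide(a_cov)
dim(a_wide)    # one row, six columns
a_wide[[1]]    # the sample name
```

Note that every value is still a character string after the transpose, so numeric columns would need converting (for example with as.numeric) before doing any arithmetic on them.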
Now I have my tables in the format I want. I now want to bind all the tables vertically (like an rbind), but without explicitly giving the objects. I want to use the list cov_stats_files to do this, since each batch will have a different number of samples. This is where I'm stuck! I don't know how to iterate through the list and bind each dataframe together...
Would appreciate any help pls! E
So easy! Thank you so much! Works beautifully! :-)
Courtesy rpolicastro - rbind is vectorized, so you can use do.call instead of Reduce, as rbind doesn't need to be supplied arguments 2 at a time. do.call will be faster than Reduce.
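To make the suggestion above concrete, here is a minimal sketch using two small data frames modelled on the transformed A and B tables. The values are copied from the question; the object names cov_list, merged and merged_reduce are placeholders. do.call passes every element of the list to a single rbind call, so it works for any number of samples.

```r
# Stand-ins for the transformed per-sample tables (one row each)
cov_list <- list(
  data.frame(sample = "A", total_reads = "56402158",
             mapped_to_target_reads = "45018562", stringsAsFactors = FALSE),
  data.frame(sample = "B", total_reads = "56402458",
             mapped_to_target_reads = "45018555", stringsAsFactors = FALSE)
)

# One rbind call over the whole list, however long it is
merged <- do.call(rbind, cov_list)

# Same result, but rbind is called once per pair of tables
merged_reduce <- Reduce(rbind, cov_list)

nrow(merged)
identical(merged, merged_reduce)
```

With the list from the question, the call would be do.call(rbind, cov_stats_files).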