I am trying to generate a series of FASTA files in which each sequence is replicated according to its estimated copy number. I have a number of populations, and loops would achieve this; my code currently has a loop at its core. But I imagine there is a better and faster way.
The current loop-centric code is:
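library(magrittr)  # provides %>%

# comb_in:   "name count" string, e.g. "Seq2345 19"
# isl_index: population identifier, e.g. "BV"
# Sequences: existing object in the calling environment, indexed by sequence name
Write_out <- function(comb_in, isl_index){
  split_in <- comb_in %>% strsplit(" ") %>% unlist()
  # Only write sequences with an estimated copy number of at least 1
  if(as.numeric(split_in[2]) >= 1){
    for(rep_index in 1:as.numeric(split_in[2])){
      # One FASTA record per replicate: header line, then the sequence
      named_seqs <- paste(">", split_in[1], "_", isl_index, "_", rep_index, "\n", as.character(Sequences[split_in[1]]), "\n", sep = "")
      cat(named_seqs, file = paste("./Island_Sequences/", isl_index, "_OLF_Seqs.fasta", sep = ""), append = TRUE)
    }
  }
}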
This function is then vectorised for each population. It is the final function in the pipeline and takes two inputs. The first is a single string holding the sequence name and the number of replicates to generate, e.g. "Seq2345 19" (I couldn't figure out how to pass the two values through the pipe otherwise, but that's for another day). The second is the population identifier, e.g. BV.
What I'm after is a way to generate n sequence headers that are iteratively named, e.g. >Seq2345_BV_1, >Seq2345_BV_2, ..., >Seq2345_BV_19, in a function I can vectorise.
Thanks.
if(as.numeric(split_in[2] >= 1)){ should be if(as.numeric(split_in[2]) >= 1){ - you should be converting split_in[2], not the result of the comparison, to numeric type.
Edit: I misunderstood what you wrote. That is a mistake I hadn't caught. Thanks.
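One possible approach to the iteratively named headers asked about in the question is to let paste0() recycle the fixed parts of the header against seq_len(n), which avoids the inner loop entirely. The sketch below is only an illustration under assumptions: make_fasta_records() and its argument names are hypothetical, Sequences is replaced by a toy named character vector, and the output goes to a temporary file rather than ./Island_Sequences/. If Sequences is really a list or a Biostrings object, the lookup would need as.character() as in the original code.

```r
# Hypothetical helper: build every FASTA line for one sequence in one call.
# Argument names are illustrative, not taken from the original pipeline.
make_fasta_records <- function(seq_name, n_copies, isl_index, sequence) {
  n_copies <- as.integer(n_copies)
  # Skip sequences with a missing or zero copy number
  if (is.na(n_copies) || n_copies < 1) return(character(0))
  # paste0() recycles the scalar parts against 1..n_copies,
  # so all headers are produced without an explicit loop
  headers <- paste0(">", seq_name, "_", isl_index, "_", seq_len(n_copies))
  # Interleave: header 1, sequence, header 2, sequence, ...
  as.vector(rbind(headers, rep(sequence, n_copies)))
}

# Toy stand-in for the Sequences object (assumed indexable by name)
Sequences <- c(Seq2345 = "ACGTACGT", Seq0001 = "TTGACA")

# Split all "name count" strings for a population in one go
comb_in   <- c("Seq2345 3", "Seq0001 0")
parts     <- strsplit(comb_in, " ", fixed = TRUE)
seq_names <- vapply(parts, `[`, character(1), 1)
n_copies  <- as.integer(vapply(parts, `[`, character(1), 2))

# Map() walks the names, counts and sequences in parallel; "BV" is recycled
records <- unlist(
  Map(make_fasta_records, seq_names, n_copies, "BV", Sequences[seq_names]),
  use.names = FALSE
)

# Write the whole population with a single call instead of repeated appends
out_file <- file.path(tempdir(), "BV_OLF_Seqs.fasta")
writeLines(records, out_file)
```

Building the full character vector first and writing it once should also cut down on file I/O compared with calling cat(..., append = TRUE) for every replicate, although whether that matters depends on how many sequences and populations are involved.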