4.2 years ago
regue.hadrien
Hi there!
I'm trying to build my first Snakemake pipeline. I want to create a rule that splits a CSV file into several files. Each time I run it, I get a "Nothing to be done." message, so it seems my rule is simply not executed, and I can't solve the problem.
Here is my first rule which is working properly:
rule xls_to_fasta_csv:
    input:
        xls_file = "data/last_gisaid_xls.xls"
    output:
        metadata_raw = "temp_data/metadata_raw.csv",
        fasta_seq = "temp_data/sequences.fasta"
    shell:
        "script/process_xls.py {input} {output.fasta_seq} {output.metadata_raw}"
And here is the rule that does not execute:
rule make_metadata:
    input:
        csv_file = rules.xls_to_fasta_csv.output.metadata_raw,
        script = "script/make_metadata.R"
    output:
        H1N1_S4 = "temp_data/H1N1_S4.tsv",
        H1N1_S6 = "temp_data/H1N1_S6.tsv",
        H3N2_S4 = "temp_data/H3N2_S4.tsv",
        H3N2_S6 = "temp_data/H3N2_S6.tsv",
        B_S4 = "temp_data/B_S4.tsv",
        B_S6 = "temp_data/B_S6.tsv"
    script:
        "{input.script} {input.csv_file}"
        "{output.H1N1_S4} {output.H1N1_S6}"
        "{output.H3N2_S4} {output.H3N2_S6}"
        "{output.B_S4} {output.B_S6}"
Finally, here are the first and last lines of my R script:
raw_metadata<-snakemake@input[[1]]
H1N1_S4_output<-snakemake@output[[1]]
H1N1_S6_output<-snakemake@output[[2]]
H3N2_S4_output<-snakemake@output[[3]]
H3N2_S6_output<-snakemake@output[[4]]
B_S4_output<-snakemake@output[[5]]
B_S6_output<-snakemake@output[[6]]
#some code
write.table(H1N1_S4,file=H1N1_S4_output,col.names = TRUE,row.names = FALSE,quote=FALSE,sep="\t")
write.table(H1N1_S6,file=H1N1_S6_output,col.names = TRUE,row.names = FALSE,quote=FALSE,sep="\t")
write.table(H3N2_S4,file=H3N2_S4_output,col.names = TRUE,row.names = FALSE,quote=FALSE,sep="\t")
write.table(H3N2_S6,file=H3N2_S6_output,col.names = TRUE,row.names = FALSE,quote=FALSE,sep="\t")
write.table(B_S4,file=B_S4_output,col.names = TRUE,row.names = FALSE,quote=FALSE,sep="\t")
write.table(B_S6,file=B_S6_output,col.names = TRUE,row.names = FALSE,quote=FALSE,sep="\t")
It's probably not the regular way to use R with Snakemake, so I'll take any advice. Thanks :)
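On the R side, and separately from the "Nothing to be done." message, here is a rough sketch of the more conventional layout. It assumes the Snakefile sits in the project root next to the script/ directory. The script directive takes only the path to the script (resolved relative to the Snakefile), and Snakemake passes the input and output paths to R through the snakemake object, so they are not given as arguments:

```snakemake
rule make_metadata:
    input:
        csv_file = rules.xls_to_fasta_csv.output.metadata_raw
    output:
        H1N1_S4 = "temp_data/H1N1_S4.tsv",
        H1N1_S6 = "temp_data/H1N1_S6.tsv",
        H3N2_S4 = "temp_data/H3N2_S4.tsv",
        H3N2_S6 = "temp_data/H3N2_S6.tsv",
        B_S4 = "temp_data/B_S4.tsv",
        B_S6 = "temp_data/B_S6.tsv"
    # Path only: no arguments and no {input}/{output} placeholders.
    script:
        "script/make_metadata.R"
```

Inside the R script, named entries can then be read by name, for example snakemake@input[["csv_file"]] and snakemake@output[["H1N1_S4"]], which avoids depending on the positional order of the outputs.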
Do you have an all rule or something like that at the beginning? You would normally put all of your final output files there. The message from Snakemake has nothing to do with your R script, since the script is not even being run.
I've added a rule:
Now the first rule, xls_to_fasta_csv, is not executed either; only the all rule is, and of course it fails. Any clues?
Remove the output section; those files should all be in the input.
Well, thank you, it works perfectly now :)
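A minimal sketch of what such an all rule could look like, assuming it is placed first in the Snakefile and that the six TSV files from make_metadata are the final outputs wanted. When no target is given on the command line, Snakemake builds the first rule in the Snakefile, so a leading all rule with only an input section is what pulls in the other rules:

```snakemake
# Must come before the other rules so that it is the default target.
rule all:
    input:
        "temp_data/H1N1_S4.tsv",
        "temp_data/H1N1_S6.tsv",
        "temp_data/H3N2_S4.tsv",
        "temp_data/H3N2_S6.tsv",
        "temp_data/B_S4.tsv",
        "temp_data/B_S6.tsv"
    # No output and no shell/script section: this rule only requests files.
```

The paths are written out as plain strings here because the all rule comes first, and a reference such as rules.make_metadata.output only works after make_metadata has been defined.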
Last question: if I want to add more rules, do I have to update the all rule with the final files, or simply add them in the input section? Thanks again!
The all rule should list all the final files that you want from all of your rules. Generally you'll add various rules with intermediate files, which then don't need to be added to the all rule.
BTW, if you ever need to use make and makefiles, you'll see that Snakemake is modeled after them, since this is also how they work.
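To illustrate the point about final versus intermediate files, here is a hedged sketch with a hypothetical extra rule, count_lines, and a hypothetical output path, results/line_counts.txt (neither is from the thread). It assumes xls_to_fasta_csv and make_metadata are defined in the same Snakefile and that the six TSV files are only needed as inputs to the new rule:

```snakemake
rule all:
    input:
        # Only the final file is listed; Snakemake works backwards from it
        # and runs make_metadata and xls_to_fasta_csv as needed.
        "results/line_counts.txt"

# Hypothetical downstream rule, for illustration only.
rule count_lines:
    input:
        expand("temp_data/{subtype}_{segment}.tsv",
               subtype=["H1N1", "H3N2", "B"], segment=["S4", "S6"])
    output:
        "results/line_counts.txt"
    shell:
        "wc -l {input} > {output}"
```

If the TSV files are also wanted as deliverables in their own right, they can simply stay in the input section of the all rule alongside the new file.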