Snakemake workflow to download GEO datasets in R fails with MissingOutputException
16 months ago

I am creating a Snakemake workflow to download GEO datasets. However, I get the following error:

```
RuleException:
CalledProcessError in file /home/sara/sara/Snakefile, line 8:
Command 'set -euo pipefail; Rscript --vanilla /home/sara/sara/.snakemake/scripts/tmps2lv_wlb.meta.R' returned non-zero exit status 1.
  File "/home/sara/sara/Snakefile", line 8, in __rule_download
  File "/home/sara/mambaforge/envs/snakemake/lib/python3.11/concurrent/futures/thread.py", line 58, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
```

Here is my Snakefile:

configfile: "config.yml"
rule download:
    output: 
        dir= directory("/home/sara/sara/downloads/")
    params:
        sample=config["Samples"]
    script: 
        "scripts/meta.R"

Here is my config.yml

env:
            - bioconductor-oligo
            - bioconducor-GEOquery
channels:
            - conda-forge
            - bioconda
Samples:
            - "GSE6955"
            - "GSE67311"

Here is the R script:

library(GEOquery)
getGEOSuppFiles(snakemakege@params[sample],baseDir = snakemake@output[dir])

There is a syntax error in the R script. Index the snakemake object with double brackets and quoted names: getGEOSuppFiles(snakemake@params[['sample']], baseDir = snakemake@output[['dir']])


Thank you. After fixing it, I got a new error message:

**MissingOutputException in rule download** in file /home/sara/sara/Snakefile, line 7:
Job 1  completed successfully, but some output files are missing. Missing files after 5 seconds. This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait:
GSE6955.tar
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

The R code downloads datasets from the GEO database. I edited the Snakefile, but I still get the same error:

configfile: "config.yml"
sample=config["Samples"]
rule all:
    input:
        expand("{sample}.tar",sample=config["Samples"])

rule download:
    output: 
        "{sample}.tar"
    params:
        sample=config["Samples"]
    script: 
        "scripts/meta.R"

R code

library(GEOquery)
getGEOSuppFiles(snakemake@params[['sample']])
15 months ago
zorbax ▴ 610

Try this: `snakemake --use-conda -s Snakefile` (recent Snakemake versions also require a core count, e.g. `--cores 1`), where the Snakefile looks like:

configfile: "config.yml"

rule all:
    input:
        expand("{sample}/{sample}_RAW.tar", sample=config["Samples"])

rule download:
    output: 
        "{sample}/{sample}_RAW.tar"
    params:
        sample="{sample}"
    conda:
        "geo.yml"
    script: 
        "scripts/meta.R"
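
The key change in this rule is `sample="{sample}"`: instead of passing the whole `config["Samples"]` list to every job, each job receives the single accession that Snakemake infers from the requested file name. As a rough illustration only (this is not Snakemake's actual implementation, and `match_wildcards` is a hypothetical helper), the matching can be sketched in plain Python, assuming each wildcard matches `.+` and a repeated wildcard must take the same value everywhere:

```python
import re


def match_wildcards(pattern, target):
    """Return the wildcard values that make `pattern` equal `target`, or None.

    Hypothetical helper that only mimics how an output pattern is matched
    against a requested file; Snakemake's own code is more involved.
    """
    seen = set()
    regex = ""
    pos = 0
    for m in re.finditer(r"\{(\w+)\}", pattern):
        # Literal text between wildcards must match exactly.
        regex += re.escape(pattern[pos:m.start()])
        name = m.group(1)
        if name in seen:
            # A repeated wildcard must have the same value as its first use.
            regex += f"(?P={name})"
        else:
            regex += f"(?P<{name}>.+)"
            seen.add(name)
        pos = m.end()
    regex += re.escape(pattern[pos:])
    found = re.fullmatch(regex, target)
    return found.groupdict() if found else None


print(match_wildcards("{sample}/{sample}_RAW.tar", "GSE6955/GSE6955_RAW.tar"))
```

With the value of `sample` resolved per job, the R script can stay as `getGEOSuppFiles(snakemake@params[['sample']])` and will be called once per accession.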

Your config.yml:

Samples:
    - "GSE6955"
    - "GSE67311"

Your conda environment file, geo.yml (the name must match the one given in the rule's conda directive):

channels:
    - conda-forge
    - bioconda

dependencies:
   - bioconductor-oligo
   - bioconductor-geoquery

Please don't cross-post your questions, or at least make it clear at the beginning that you have.

When you write the input of rule all, work backward from the files the script actually produces to check that the output filenames are correct.
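
To make the "work backward" advice concrete, here is a small Python sketch that compares the targets a pattern would request with the paths the download is expected to create. It rests on assumptions: `expand_targets` only imitates Snakemake's `expand` for a single wildcard, and `geoquery_raw_path` is a hypothetical helper encoding the assumption that `getGEOSuppFiles()` with default arguments writes into a directory named after the accession and that the archive is called `<accession>_RAW.tar`.

```python
from pathlib import PurePosixPath

samples = ["GSE6955", "GSE67311"]  # same accessions as in config.yml


def expand_targets(pattern, samples):
    """Imitate expand(pattern, sample=samples) for one wildcard."""
    return [pattern.format(sample=s) for s in samples]


def geoquery_raw_path(accession):
    """Assumed location of the downloaded archive (see the note above)."""
    return str(PurePosixPath(accession) / f"{accession}_RAW.tar")


produced = [geoquery_raw_path(s) for s in samples]

for pattern in ("{sample}.tar", "{sample}/{sample}_RAW.tar"):
    declared = expand_targets(pattern, samples)
    status = "matches" if declared == produced else "does not match"
    print(f"{pattern}: {status} the files written by the script")
```

Under these assumptions, the earlier pattern `{sample}.tar` asks for files that are never created, which is what MissingOutputException reports, while `{sample}/{sample}_RAW.tar` lines up with what the script writes.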
