Snakemake input and wildcards
4.6 years ago
Rox ★ 1.4k

Hi everyone,

I'm trying to switch from bash scripting to Snakemake. I struggle to understand the basic logic of a few things, and the most important one is how to deal with input names. I don't want to hard-code a particular input name inside the Snakefile; I want it to work on any fastq file, and I know that's what Snakemake is for. So I thought the correct way to do this was to use wildcards, so that it knows where to look and what for (a fastq file, for example).

I have made the following script :

from Bio import SeqIO
import sys
import pandas

rule split_fastq:
    input:
        "data/{reads}.fastq",
        "data/{seqsum}.txt"
    output:
        "split_fastq/low.fastq",
        "split_fastq/high.fastq"
    run:
        cols = ["read_id","mean_qscore_template"]
        dataInter = pandas.read_csv(filepath_or_buffer=args.seqsum_path,sep="\t",usecols=cols)
        data = dataInter.rename(mapper={"mean_qscore_template": "quals"}, axis="columns").set_index("read_id").to_dict()["quals"]
        with open("split_fastq/low.fastq",'w+') as a, open("split_fastq/high.fastq",'w+') as b:
            try:
                for rec in SeqIO.parse("data/{reads}.fastq","fastq"):
                    if(data[rec.id] <= 6):
                        SeqIO.write(rec,a,"fastq")
                    else:
                        SeqIO.write(rec,b,"fastq")
            except KeyError:
                sys.exit('\nERROR: Mismatch between sequencing_summary and fastq file: {} was not found in the summary file.\nByeBye.'.format(rec.id))

And I keep getting this error:

(snakemake-4.8.0_venv) |sbsuser@genologin2 /work/sbsuser/test/roxane/alignement-ont|$snakemake
Building DAG of jobs...
WildcardError in line 5 of /work/sbsuser/test/roxane/alignement-ont/Snakefile:
Wildcards in input files cannot be determined from output files:
'reads'

So then I added the wildcard to the output as well, like this: "split_fastq/{reads}-low.fastq","split_fastq/{reads}-high.fastq", and I end up with:

(snakemake-4.8.0_venv) |sbsuser@genologin2 /work/sbsuser/test/roxane/alignement-ont|$snakemake
Building DAG of jobs...
WorkflowError:
Target rules may not contain wildcards. Please specify concrete files or a rule without wildcards.

Can someone explain what I am doing wrong ? I really don't get it...

Thanks,

Roxane

snakemake python • 8.4k views

Hello Roxane Boyer,

Is this your whole workflow, or do you plan to include more rules? Depending on that, there are techniques you should have a look at:

The latter one I have used in the example of my tutorial.

fin swimmer

Hi finswimmer !

I was planning on adding more rules later, but I wanted to build my first Snakefile slowly and learn step by step. That's why it's frustrating to get stuck at the first rule!

I'll have a look at those links and try to figure out what I am missing. Thanks

4.6 years ago
Thibault D. ▴ 700

Hi,

The error points out your problem: you have only one rule, so this rule is your 'target rule'. A target rule must not contain any wildcards. Conversely, any wildcard used in a rule's input section must also appear in its output section, because Snakemake works out wildcard values from the output files it is asked to produce.
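To illustrate why the wildcard has to appear in the output, here is a minimal plain-Python sketch of the idea: a wildcard value is recovered by matching an output pattern against a concrete requested file. This is not Snakemake's actual code. `match_wildcards` is a hypothetical helper, `sample1` is a made-up sample name, and the sketch assumes each wildcard name occurs only once in a pattern.

```python
import re


def match_wildcards(pattern, target):
    """Return the wildcard values that turn `pattern` into `target`, or None.

    Illustration only: every {name} becomes a named regex group matching
    one or more characters. Repeated wildcard names are not handled.
    """
    regex = ""
    pos = 0
    for m in re.finditer(r"\{(\w+)\}", pattern):
        # Literal text before the wildcard must match exactly.
        regex += re.escape(pattern[pos:m.start()])
        # The wildcard itself may match any non-empty string.
        regex += "(?P<{}>.+)".format(m.group(1))
        pos = m.end()
    regex += re.escape(pattern[pos:])
    found = re.fullmatch(regex, target)
    return found.groupdict() if found else None


# Output pattern with the wildcard: the value of 'reads' can be recovered.
print(match_wildcards("split_fastq/{reads}-low.fastq",
                      "split_fastq/sample1-low.fastq"))  # {'reads': 'sample1'}

# Output pattern without any wildcard: the match succeeds but carries no
# value for 'reads', so "data/{reads}.fastq" cannot be filled in.
print(match_wildcards("split_fastq/low.fastq",
                      "split_fastq/low.fastq"))  # {}
```

The second call mirrors the first error in the question: nothing in `split_fastq/low.fastq` says which file `data/{reads}.fastq` should be.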

See the examples on the front page of the Snakemake documentation: one rule (the first, a.k.a. the target rule) lists the expected output files. The second rule has identical wildcards in both its input and output sections.
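A minimal sketch of what such a two-rule Snakefile could look like for the question above. It is an untested outline under stated assumptions, not a definitive fix: `sample1` is a hypothetical sample name, and the sequencing summary is assumed to share the fastq file's base name (`data/sample1.txt`), so that a single wildcard `reads` covers both inputs. The `run` block also reads its paths from `input` and `output` instead of `args.seqsum_path` and a literal `"data/{reads}.fastq"` string.

```snakemake
from Bio import SeqIO
import pandas

# Assumption: hypothetical sample name, i.e. data/sample1.fastq and data/sample1.txt exist.
SAMPLES = ["sample1"]

# Target rule: lists concrete files only, so it contains no wildcards.
rule all:
    input:
        expand("split_fastq/{reads}-{qual}.fastq", reads=SAMPLES, qual=["low", "high"])

# Worker rule: the same wildcard appears in input and output.
rule split_fastq:
    input:
        reads="data/{reads}.fastq",
        seqsum="data/{reads}.txt"   # assumption: summary named after the fastq file
    output:
        low="split_fastq/{reads}-low.fastq",
        high="split_fastq/{reads}-high.fastq"
    run:
        # Map each read_id to its mean quality score.
        quals = pandas.read_csv(
            input.seqsum, sep="\t", usecols=["read_id", "mean_qscore_template"]
        ).set_index("read_id")["mean_qscore_template"].to_dict()
        with open(output.low, "w") as low, open(output.high, "w") as high:
            for rec in SeqIO.parse(input.reads, "fastq"):
                if rec.id not in quals:
                    raise ValueError(
                        "{} was not found in the summary file.".format(rec.id)
                    )
                # Same threshold as in the question: scores <= 6 go to the low file.
                SeqIO.write(rec, low if quals[rec.id] <= 6 else high, "fastq")
```

Running `snakemake` with no arguments then picks `all` as the target, and the requested file names determine `reads` for `split_fastq`.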

I see other issues that Snakemake will raise as soon as this wildcard issue is solved.

For Snakemake issues like this one, you should try Stack Overflow's Snakemake tag: the community is more active there.

Well, I also tried with a general rule, and it didn't work either. I must be missing something. But you are right, I should ask my question under Stack Overflow's Snakemake tag. Thank you!
