How to write a rule for indexing with bowtie2-build and the expand function in Snakemake?
6 months ago
Ahmed • 0

I am a beginner user of snakemake. I want to index the reference genome by using bowtie2-build.

I found this post, "How to write the output in snakefile (snakemake) for bowtie2-build", where they used this code:

rule bowtie2Build:
    input: "/home/bnextgen/refgenome/infected_consensus.fasta"
    params:
        basename="/home/s1104230/output/reference"
    output:
        output1="output/reference.1.bt2",
        output2="output/reference.2.bt2",
        output3="output/reference.3.bt2",
        output4="output/reference.4.bt2",
        outputrev1="output/reference.rev.1.bt2",
        outputrev2="output/reference.rev.2.bt2"
    shell: "bowtie2-build {input} {params.basename}"

I do not completely understand what params.basename means or what it is for.

I tried to use the rule as it is, except that I used the expand function instead of listing the paths to the six index files.

This is my code:

f_ext = [".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2"]
bt_idx_folder = "bt_idx"


rule bowtie2_index:
    input:
        "data/reference.fa"
    params:
        basename="result/bt_idx_folder/reference"
    output:
        expand("result/" + bt_idx_folder + "/reference{ext}", ext=f_ext)
    shell:
        "bowtie2-build {input} {params.basename}"

It gives me this error:

MissingOutputException in line 13 of Snakefile
Job Missing files after 5 seconds:
/home/result/bt_idx/reference.1.bt2
/home/result/bt_idx/reference.2.bt2
/home/result/bt_idx/reference.3.bt2, etc. (all 6 files)

This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Job id: 0 completed successfully, but some output files are missing. 0
  File "/home/ahmed/anaconda3/envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 584, in handle_job_success
  File "/home/ahmed/anaconda3/envs/snakemake/lib/python3.9/site-packages/snakemake/executors/__init__.py", line 252, in handle_job_success

What is the problem in my code, and what does params.basename mean?

bowtie2 snakemake indexing • 419 views
4 months ago
lily ▴ 70

The error you get comes from how you use your variable bt_idx_folder. In the definition of your basename parameter,

basename= "result/bt_idx_folder/reference"

bt_idx_folder is part of the string literal, so the variable is never substituted. If you want the index to go into "result/bt_idx/reference", it should be

basename= "result/" + bt_idx_folder + "/reference" 

This is exactly what the MissingOutputException tells you: the files are actually written with the prefix result/bt_idx_folder/reference, but Snakemake expects them under result/bt_idx/reference, which causes your workflow to fail.
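
To see the difference outside Snakemake, here is a minimal plain-Python sketch. The variable names come from your Snakefile; the list comprehensions are only a stand-in for expand(), which I am assuming produces the same paths.

```python
# Names mirror the question; the lists below are built by hand
# as a stand-in for Snakemake's expand().
f_ext = [".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2"]
bt_idx_folder = "bt_idx"

# Inside quotes, bt_idx_folder is plain text; the variable is not substituted.
literal_basename = "result/bt_idx_folder/reference"

# Concatenation (or an f-string) inserts the variable's value.
joined_basename = "result/" + bt_idx_folder + "/reference"

# Files Snakemake waits for, as declared under output.
expected = [joined_basename + ext for ext in f_ext]

# Files bowtie2-build would write when given the literal basename.
written = [literal_basename + ext for ext in f_ext]

print(literal_basename)               # result/bt_idx_folder/reference
print(joined_basename)                # result/bt_idx/reference
print(set(expected) & set(written))   # set()
```

None of the written paths matches an expected one, so Snakemake reports all six files as missing.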

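As for params.basename: it is just a named value that you define under params and insert into the shell command as {params.basename}. In general, the basename argument tells bowtie2-build the directory and filename prefix for the index files. So, for this rule to work, the basename parameter must match the output files minus their extensions.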
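
One way to keep the two in sync, as a rough sketch rather than the only approach, is to declare the prefix once and derive everything else from it. The helper names index_outputs and basename_from_output are hypothetical, made up for this illustration; only the six .bt2 extensions come from this thread.

```python
# Hypothetical helpers for illustration; only the extensions come from the thread.
BT2_EXT = [".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2"]


def index_outputs(prefix):
    """List the index files expected for a given bowtie2-build basename."""
    return [prefix + ext for ext in BT2_EXT]


def basename_from_output(path):
    """Strip a known index extension from one output path to recover the basename."""
    # Try longer extensions first so ".rev.1.bt2" is not mistaken for ".1.bt2".
    for ext in sorted(BT2_EXT, key=len, reverse=True):
        if path.endswith(ext):
            return path[: -len(ext)]
    raise ValueError(f"{path!r} does not look like a bowtie2 index file")


prefix = "result/bt_idx/reference"
outputs = index_outputs(prefix)

# Every declared output maps back to the same basename.
print(all(basename_from_output(p) == prefix for p in outputs))  # True
print(basename_from_output(outputs[-1]))  # result/bt_idx/reference
```

In a Snakefile, these helpers could sit at the top of the file, with the rule using output: index_outputs(prefix) and params: basename=lambda wildcards, output: basename_from_output(output[0]), assuming your Snakemake version passes output to params functions. That way the prefix is typed only once and the two cannot drift apart.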
