Question: How to create a snakefile(snakemake) to align paired-end reads with bowtie2
1
1
Entering edit mode
6 months ago
daalsa98 ▴ 10

Im new using snakemake, I'm trying to align paired-end reads using bowtie2, I have already indexed my fasta files, with bowtie-build, and i dont know how to reference the index that i previously made in my snakefile,

the index files generated by bowtie-build are:

  • xylella_fastidiosa.1.bt2
  • xylella_fastidiosa.2.bt2
  • xylella_fastidiosa.3.bt2
  • xylella_fastidiosa.4.bt2
  • xylella_fastidiosa.rev.1.bt2
  • xylella_fastidiosa.rev.2.bt2

my code is :

N_THREADS = 12

SAMPLES = ['M1', 'M2', 'M3', 'M4', 'M5', 'M6', 'M7', 'M8', 'M9', 'M10']

rule all:
    input:
        expand('Xylella_fastidiosa/index_bowtie2', id = SAMPLES)

rule bowtie:
    input:
        'INSECTOS/{id}/trimmed/{id}_val_1.fq.gz',
        'INSECTOS/{id}/trimmed/{id}_val_2.fq.gz'
    output:
        "{id}_bowtie.sam"
    threads: N_THREADS
    shell:
        'bowtie2 -p {threads} '
        '-x xylella fastidiosa '
        '-1 {input[0]} '
        '-2 {input[1]} '
        '-S {output}'

the output is

Building DAG of jobs...
Nothing to be done.

is there any advices ? Thanks.

snakemake bowtie2 snakefile • 407 views
ADD COMMENT
2
Entering edit mode
6 months ago

The index is for the genome not for your samples. You are solving this incorrectly.

In general, don't build snakemake files until you know how to run the commands separately. You'll just end up overcomplicating your job.

Instead of the example above forget about snakemake and do a:

# Set the index location
IDX=foo

# Put the samples ids into a file, one id per line
echo M1 M2 M3 M4 M5 M6 M7 M8 M9 M10 | tr ' ' '\n' > ids

# run the
cat ids | parallel 'echo bowtie2 -x $IDX {}_val_1.fq.gz {}_val_2.fq.gz \> {}.sam'

prints:

bowtie2 -x foo M1_val_1.fq.gz M1_val_2.fq.gz > M1.sam
bowtie2 -x foo M2_val_1.fq.gz M2_val_2.fq.gz > M2.sam
bowtie2 -x foo M3_val_1.fq.gz M3_val_2.fq.gz > M3.sam
bowtie2 -x foo M4_val_1.fq.gz M4_val_2.fq.gz > M4.sam
bowtie2 -x foo M5_val_1.fq.gz M5_val_2.fq.gz > M5.sam

now tweak the IDX and other paths until all parameters are filled in a way that it works, you can then either remove the echo in front to execute the commands or pipe it right into bash like so:

cat ids | parallel 'echo bowtie2 -x $IDX {}_val_1.fq.gz {}_val_2.fq.gz \> {}.sam' | bash
ADD COMMENT
0
Entering edit mode

Thank you so much for the answer, it works perfectly, i will use this from now on

ADD REPLY
0
Entering edit mode

Well done Daniel... good initiative

ADD REPLY

Login before adding your answer.

Traffic: 1953 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6