Question: Question: How to create a snakefile(snakemake) to align paired-end reads with bowtie2
1
gravatar for daalsa98
5 weeks ago by
daalsa9810
daalsa9810 wrote:

Im new using snakemake, I'm trying to align paired-end reads using bowtie2, I have already indexed my fasta files, with bowtie-build, and i dont know how to reference the index that i previously made in my snakefile,

the index files generated by bowtie-build are:

  • xylella_fastidiosa.1.bt2
  • xylella_fastidiosa.2.bt2
  • xylella_fastidiosa.3.bt2
  • xylella_fastidiosa.4.bt2
  • xylella_fastidiosa.rev.1.bt2
  • xylella_fastidiosa.rev.2.bt2

my code is :

N_THREADS = 12

SAMPLES = ['M1', 'M2', 'M3', 'M4', 'M5', 'M6', 'M7', 'M8', 'M9', 'M10']

rule all:
    input:
        expand('Xylella_fastidiosa/index_bowtie2', id = SAMPLES)

rule bowtie:
    input:
        'INSECTOS/{id}/trimmed/{id}_val_1.fq.gz',
        'INSECTOS/{id}/trimmed/{id}_val_2.fq.gz'
    output:
        "{id}_bowtie.sam"
    threads: N_THREADS
    shell:
        'bowtie2 -p {threads} '
        '-x xylella fastidiosa '
        '-1 {input[0]} '
        '-2 {input[1]} '
        '-S {output}'

the output is

Building DAG of jobs...
Nothing to be done.

is there any advices ? Thanks.

snakefile snakemake bowtie2 • 179 views
ADD COMMENTlink modified 5 weeks ago by Istvan Albert ♦♦ 85k • written 5 weeks ago by daalsa9810
2
gravatar for Istvan Albert
5 weeks ago by
Istvan Albert ♦♦ 85k
University Park, USA
Istvan Albert ♦♦ 85k wrote:

The index is for the genome not for your samples. You are solving this incorrectly.

In general, don't build snakemake files until you know how to run the commands separately. You'll just end up overcomplicating your job.

Instead of the example above forget about snakemake and do a:

# Set the index location
IDX=foo

# Put the samples ids into a file, one id per line
echo M1 M2 M3 M4 M5 M6 M7 M8 M9 M10 | tr ' ' '\n' > ids

# run the
cat ids | parallel 'echo bowtie2 -x $IDX {}_val_1.fq.gz {}_val_2.fq.gz \> {}.sam'

prints:

bowtie2 -x foo M1_val_1.fq.gz M1_val_2.fq.gz > M1.sam
bowtie2 -x foo M2_val_1.fq.gz M2_val_2.fq.gz > M2.sam
bowtie2 -x foo M3_val_1.fq.gz M3_val_2.fq.gz > M3.sam
bowtie2 -x foo M4_val_1.fq.gz M4_val_2.fq.gz > M4.sam
bowtie2 -x foo M5_val_1.fq.gz M5_val_2.fq.gz > M5.sam

now tweak the IDX and other paths until all parameters are filled in a way that it works, you can then either remove the echo in front to execute the commands or pipe it right into bash like so:

cat ids | parallel 'echo bowtie2 -x $IDX {}_val_1.fq.gz {}_val_2.fq.gz \> {}.sam' | bash
ADD COMMENTlink modified 5 weeks ago • written 5 weeks ago by Istvan Albert ♦♦ 85k

Thank you so much for the answer, it works perfectly, i will use this from now on

ADD REPLYlink written 5 weeks ago by daalsa9810

Well done Daniel... good initiative

ADD REPLYlink written 5 weeks ago by Antonio R. Franco4.5k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1472 users visited in the last hour