shell script for bowtie/bwa alignment pair end reads
3
3
Entering edit mode
7.4 years ago
Paul ★ 1.4k

Dear all,

I would like to write shell script to automatic alignment my pair end data with bowtie (or bwa) aligners.

Problem is, when I have lot of pair-end fastq files in one directory I have problem to loading data in for cycle.

Here is my code:

for i in $(ls *.fastq | rev | cut -c 22- | rev | uniq) do ./bowtie2 --very-sensitive -p32 --rg-id${i} --rg-sample ${i} --rg-library Illumina --rg-platform Illuimna -x$reference -1 ${i}_L001_R1_001.fatsq.gz -2${i}_L001_R2_001.fatsq.gz -S ${i%.fastq.gz}${i}.sam
done;


This is very nice solution help solved by RamRS. Is there any other way how to treat Bowtie or BWA with many samples (corresponding one sample to each read) in one for cycle?

my directory contains lot of fastq.gz:

name1_R1_001.fatsq.gz
name2_R2_001.fatsq.gz
...
...
...


Each sample name (prefix before R._001.fastq.gz) has different name length so algorithm above doesn't work properly :-(

Thank you for any idea and solutions..

fastq shell bwa bowtie illumina • 15k views
1
Entering edit mode

Hi Paul,

Not sure if you have this sorted but I managed to manipulate RamRS's bash to run for Bowtie2

#!bin/bash
for I in $(ls *.fastq | rev | cut -c 14- | rev | uniq) do bowtie2 --very-sensitive -p16 --rg-id${i} -x cprefseqs -1 ${i}_R1_001.fastq -2${i}_R2_001.fastq  -S i.sam done  Not sure if this is exactly what you want but it gives the output I'm after. I'm by no means an expert in the btw, it just worked for me. The format of my input files are like this: 65_S155_L001_R1_001.fastq Cheers, Todd ADD REPLY 0 Entering edit mode I assume you mean "one sample per file/pair of files" rather than "one sample to each read". What have you tried other than that solution from RamRS and do you understand what it's doing? Once you know how that solution works, you should be able to do this without any help. ADD REPLY 0 Entering edit mode I don't understand very well what is exactly your question. Are you trying to run BWA/Bowtie for each pair of samples (assuming PE) in a folder? So, the problem is that you are not able to get the files two by two? ADD REPLY 0 Entering edit mode Presumably you meant to reply comment on the original post, not my comment :) I've moved it accordingly. ADD REPLY 0 Entering edit mode @airan: This might provide some context: bash loop for alignment RNA-seq data ADD REPLY 0 Entering edit mode Hi, this script depends not on the prefix length but only on the length of the R[12]_001.fastq part, which is constant (13). What is the problem you face, exactly? Also, {MY_VARIABLE_NAME} is the syntax to encapsulate a variable name; you're looking for the $(MY_COMMAND_OR_PIPE) syntax, to execute a command or pipe. ADD REPLY 3 Entering edit mode 7.4 years ago Zaag ▴ 800 for I in ls *.fastq | cut -d "_" -f 1 ; do ./bowtie2 --very-sensitive -p 32 --rg-id$i --rg-sample $i --rg-library Illumina --rg-platform Illuimna -x$reference -1 $i\_L001_R1_001.fastq.gz -2$i_L001_R2_001.fastq.gz -S $i.sam ; done ;  edit: removed useless grep ADD COMMENT 1 Entering edit mode There's no reason to grep the output of ls. Just ls *.fastq. ADD REPLY 1 Entering edit mode OP has made a copy-paste mistake. My original script here has $(COMMAND), not ${COMMAND}. $(COMMAND) and COMMAND are equivalent. (COMMAND) IMO avoids quoting confusion. ADD REPLY 0 Entering edit mode 5.1 years ago Hello, I tried the above-indicated code after modification according to my sequences, I always get an error. bowtie Error! Error: Encountered internal Bowtie 2 exception (#1) (ERR): bowtie2-align exited with value 1  the code is like that: for i inls *.fastq ; do bowtie2 --very-sensitive-local -p 16 --rg-id ${i} --rg sample${i} -x $index/*.bt2 -1${i}_*_1P.fastq -2 ${i}_*_2P.fastq -U${i}_*_U.fastq -S {i}-H2-7.sam ; done  my sequence are already trimmed and have a uniq prefix and ending like NG-9974_*_**1**_1P.fastq NG-9974_*_**1**_2P.fastq NG-9974_*_**1**_1U.fastq # for trimmed unpaired NG-9974_*_**1**_2U.fastq # for trimmed unpaired  it is also like that (5 or 6, 8 instead of 1 as this) NG-9974_*_**5**_1P.fastq NG-9974_*_**5**_2P.fastq NG-9974_*_**5**_1U.fastq # for trimmed unpaired NG-9974_*_**5**_2U.fastq # for trimmed unpaired  I also tried to rev and cut the sequence name as it was suggested, it always gave me the indicated error, could somebody help please to improve or correct my code. Thank you in advance, Emad ADD COMMENT 2 Entering edit mode Try the same loop but add echo in front of the command, to check how it looks like without running the actual alignment. As such you can spot errors. ADD REPLY 0 Entering edit mode Thanks Wouter and Ram for your reply, it looks like the bowtie loop does not take the right paired sequence NG-9974__1_1P.fastq and NG-9974__1_2P.fastq as -1 and -2 paired sequences, and the unpaired sequences NG-9974__1_1U.fastq and NG-9974__1_2U.fastq as unpaired under -U arguments rather than take the individual sequence like NG-9974__1_1P.fastq and add the suffixes __1P.fastq, __2P.fastq, and __1U.fastq after the -1, -2 and -U arguments like this: -1 NG-9974__1_1P.fastq__1P.fastq -2 NG-9974__1_1P.fastq__2P.fastq -U NG-9974__1_1P.fastq__1U.fastq  and -1 NG-9974__1_2P.fastq__1P.fastq -2 NG-9974__1_2P.fastq__2P.fastq -U NG-9974__1_2P.fastq__1U.fastq  and -1 NG-9974__2_1U.fastq__1P.fastq -2 NG-9974__2_1U.fastq__2P.fastq -U NG-9974__2_1U.fastq__1U.fastq  I could solve this problem by cutting the last 10 character of the sequences, but the problem still exists with bowtie error, although the alignment pairs and single unpaired looks good now! ADD REPLY 0 Entering edit mode Is there a verbose option to bowtie? Does it write a log? ADD REPLY 0 Entering edit mode bowtie alignment itself does have a verbose option, it has just in bowtie-build and bowtie-inspector options. Anyway, I could dig out the problem and it looks that the sign in front of the index argument made all the problems. It works now, thank you again! :)

2
Entering edit mode

Your bash loop looks for a ${i}_*_U.fastq file, but you have ${i}_*_1U.fastq and \${i}_*_2U.fastq files.