bowtie2 using loop on bash with paired-ended reads
1
0
Entering edit mode
6 months ago

Hello, I'm trying to align metagenomic read sequences against a reference set of contigs, the samples are each one splitted in two files (these are paired-ended reads) and I'm trying to make a loop using bowtie2 and at the same time making directly a BAM file without making a SAM file on my directory using pipe |

this is my command:

for i in ls /mnt/path/DEC2017/*1.fastq.gz
do dir=/mnt/path/DEC2017/
base=$(basename$i _1.fastq.gz)
bowtie2 -p 4 -x testbuild -1 $dir/${base}_1.fastq.gz -2 $dir/${base}_2.fastq.gz | samtools view -b -o $dir/${base}.bam -
done


and here is the output (which is repeated a couple of amounts of times)

    stat: Bad file descriptor
Error: No input read files were valid
(ERR): bowtie2-align exited with value 1


and the console stills processing this output (as I still see the keyboard pointer symbol, seems frozen.)

So I understand that bowtie2 is not reading any of my files, instead of that is trying to read a file called ".fastq.gz" which clearly doesn't exist.

Here is the format of my files: DEC2017_L3S3D05_1.fastq.gz DEC2017_L3S3D05_2.fastq.gz DEC2017_L3S1D15_1.fastq.gz DEC2017_L3S3D05_2.fastq.gz DEC2017_L1S2D05_1.fastq.gz DEC2017_L1S2D05_2.fastq.gz and so on...

The problem seems to be in the basename command part, but I can't figure out what exactly is.

I know that this issue is pretty similar to this one to run bowtie on multiple fastq file , In fact, I made this loop taking as reference a command described there but given my lack of knowledge of how exactly basename really works, I cannot identify the exact problem.

alignment sequence bowtie2 bash • 503 views
0
Entering edit mode

can you try this:

for i in $(ls /mnt/path/DEC2017/*1.fastq.gz); do echo$i; done


Try to dry-run the final command if above loop prints the correct files.

0
Entering edit mode

Hello, thanks for your answer :) , yes, the above loop gives me the entire path to my files which corresponds to the "fwd" reads which ends in "_1.fastq-gz" , when you say dry-run the final command, you mean this?:

  do dir=/mnt/path/DEC2017/
base=$(basename$i _1.fastq.gz)
bowtie2 -p 4 -x testbuild -1 $dir/${base}_1.fastq.gz -2 $dir/${base}_2.fastq.gz | samtools view -b -o $dir/${base}.bam -
done

0
Entering edit mode

Can you try this? "

for i in $(ls /mnt/path/DEC2017/*1.fastq.gz); do echo$i ${i/1.fastq/2.fastq}${i%1.fastq.gz}.bam; done


This should print R1, R2, destination directory for bam and name of the bam file. Then dry-run with appropriate commands as below:

for i in $(ls /mnt/path/DEC2017/*_1.fastq.gz); do echo "bowtie2 -p 4 -x testbuild -1$i -2 ${i/_1.fastq/_2.fastq} | samtools view -b -o${i%_1.fastq.gz}.bam"; done


This would only dry-run, and doesn't execute any command. Remove echo after verifying the output file name and output directory. Becareful, while copying inverted commas.

1
Entering edit mode
6 months ago
swbarnes2 9.9k

Looping over the output of ls is a bad idea. And it doesn't look like it's even working. It looks like it's taking ls literally instead of running it as a command.

try

for i in *R1.fastq.gz

0
Entering edit mode

Hello thanks for answering!, yes it seems this worked for me, (the command is running and using my threads), now I'm figuring out how to apply nohup over this, this is what happens:

nohup for i in
*1.fastq.gz ; do dir=/mnt/c/Users/eduar/Desktop/Ubuntu/rawdata_Comau/DEC2017/ ; base=$(basename$i _1.fastq.gz
) ; bowtie2 -p 4 -x testbuild -1 $dir/${base}_1.fastq.gz -2 $dir/${base}_2.fastq.gz | samtools view -b -o $dir/${base}.bam -; done > bowtie2output.txt


This is the error: -bash: syntax error near unexpected token do'

at the same, I don't really understand what basename` part is really doing, if you can explain this ill be so grateful.

Thanks for helping.

0
Entering edit mode

Are you absolutely sure that your files are named "_1.fastq.gz", and not "_R1.fastq.gz" or "_R1_001.fastq.gz"