Question: bowtie2 using loop on bash with paired-ended reads
0
gravatar for v.berriosfarias
7 days ago by
v.berriosfarias0 wrote:

Hello, I'm trying to align metagenomic read sequences against a reference set of contigs, the samples are each one splitted in two files (these are paired-ended reads) and I'm trying to make a loop using bowtie2 and at the same time making directly a BAM file without making a SAM file on my directory using pipe |

this is my command:

for i in ls /mnt/path/DEC2017/*1.fastq.gz 
do dir=/mnt/path/DEC2017/ 
base=$(basename $i _1.fastq.gz) 
bowtie2 -p 4 -x testbuild -1 $dir/${base}_1.fastq.gz -2 $dir/${base}_2.fastq.gz | samtools view -b -o $dir/${base}.bam -
done

and here is the output (which is repeated a couple of amounts of times)

    stat: Bad file descriptor
Warning: Could not open read file "/mnt/path/DEC2017/ls_1.fastq.gz" for reading; skipping...
Error: No input read files were valid
(ERR): bowtie2-align exited with value 1
[main_samview] fail to read the header from "-".

and the console stills processing this output (as I still see the keyboard pointer symbol, seems frozen.)

So I understand that bowtie2 is not reading any of my files, instead of that is trying to read a file called ".fastq.gz" which clearly doesn't exist.

Here is the format of my files: DEC2017_L3S3D05_1.fastq.gz DEC2017_L3S3D05_2.fastq.gz DEC2017_L3S1D15_1.fastq.gz DEC2017_L3S3D05_2.fastq.gz DEC2017_L1S2D05_1.fastq.gz DEC2017_L1S2D05_2.fastq.gz and so on...

The problem seems to be in the basename command part, but I can't figure out what exactly is.

I know that this issue is pretty similar to this one to run bowtie on multiple fastq file , In fact, I made this loop taking as reference a command described there but given my lack of knowledge of how exactly basename really works, I cannot identify the exact problem.

ADD COMMENTlink modified 7 days ago by swbarnes28.9k • written 7 days ago by v.berriosfarias0

can you try this:

for i in $(ls /mnt/path/DEC2017/*1.fastq.gz); do echo $i; done

Try to dry-run the final command if above loop prints the correct files.

ADD REPLYlink modified 7 days ago • written 7 days ago by cpad011214k

Hello, thanks for your answer :) , yes, the above loop gives me the entire path to my files which corresponds to the "fwd" reads which ends in "_1.fastq-gz" , when you say dry-run the final command, you mean this?:

  do dir=/mnt/path/DEC2017/ 
    base=$(basename $i _1.fastq.gz) 
    bowtie2 -p 4 -x testbuild -1 $dir/${base}_1.fastq.gz -2 $dir/${base}_2.fastq.gz | samtools view -b -o $dir/${base}.bam -
done
ADD REPLYlink written 6 days ago by v.berriosfarias0

Can you try this? "

for i in $(ls /mnt/path/DEC2017/*1.fastq.gz); do echo $i ${i/1.fastq/2.fastq} ${i%1.fastq.gz}.bam; done

This should print R1, R2, destination directory for bam and name of the bam file. Then dry-run with appropriate commands as below:

for i in $(ls /mnt/path/DEC2017/*_1.fastq.gz); do echo "bowtie2 -p 4 -x testbuild -1 $i -2 ${i/_1.fastq/_2.fastq} | samtools view -b -o ${i%_1.fastq.gz}.bam"; done

This would only dry-run, and doesn't execute any command. Remove echo after verifying the output file name and output directory. Becareful, while copying inverted commas.

ADD REPLYlink modified 5 days ago • written 5 days ago by cpad011214k
1
gravatar for swbarnes2
7 days ago by
swbarnes28.9k
United States
swbarnes28.9k wrote:

Looping over the output of ls is a bad idea. And it doesn't look like it's even working. It looks like it's taking ls literally instead of running it as a command.

try

for i in *R1.fastq.gz
ADD COMMENTlink written 7 days ago by swbarnes28.9k

Hello thanks for answering!, yes it seems this worked for me, (the command is running and using my threads), now I'm figuring out how to apply nohup over this, this is what happens:

nohup for i in
*1.fastq.gz ; do dir=/mnt/c/Users/eduar/Desktop/Ubuntu/rawdata_Comau/DEC2017/ ; base=$(basename $i _1.fastq.gz
) ; bowtie2 -p 4 -x testbuild -1 $dir/${base}_1.fastq.gz -2 $dir/${base}_2.fastq.gz | samtools view -b -o $dir/${base}.bam -; done > bowtie2output.txt

This is the error: -bash: syntax error near unexpected token `do'

at the same, I don't really understand what basename part is really doing, if you can explain this ill be so grateful.

Thanks for helping.

ADD REPLYlink modified 6 days ago • written 6 days ago by v.berriosfarias0

Are you absolutely sure that your files are named "_1.fastq.gz", and not "_R1.fastq.gz" or "_R1_001.fastq.gz"

ADD REPLYlink written 5 days ago by swbarnes28.9k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 993 users visited in the last hour