bowtie2 using loop on bash with paired-ended reads
1
0
Entering edit mode
3.5 years ago

Hello, I'm trying to align metagenomic read sequences against a reference set of contigs, the samples are each one splitted in two files (these are paired-ended reads) and I'm trying to make a loop using bowtie2 and at the same time making directly a BAM file without making a SAM file on my directory using pipe |

this is my command:

for i in ls /mnt/path/DEC2017/*1.fastq.gz 
do dir=/mnt/path/DEC2017/ 
base=$(basename $i _1.fastq.gz) 
bowtie2 -p 4 -x testbuild -1 $dir/${base}_1.fastq.gz -2 $dir/${base}_2.fastq.gz | samtools view -b -o $dir/${base}.bam -
done

and here is the output (which is repeated a couple of amounts of times)

    stat: Bad file descriptor
Warning: Could not open read file "/mnt/path/DEC2017/ls_1.fastq.gz" for reading; skipping...
Error: No input read files were valid
(ERR): bowtie2-align exited with value 1
[main_samview] fail to read the header from "-".

and the console stills processing this output (as I still see the keyboard pointer symbol, seems frozen.)

So I understand that bowtie2 is not reading any of my files, instead of that is trying to read a file called ".fastq.gz" which clearly doesn't exist.

Here is the format of my files: DEC2017_L3S3D05_1.fastq.gz DEC2017_L3S3D05_2.fastq.gz DEC2017_L3S1D15_1.fastq.gz DEC2017_L3S3D05_2.fastq.gz DEC2017_L1S2D05_1.fastq.gz DEC2017_L1S2D05_2.fastq.gz and so on...

The problem seems to be in the basename command part, but I can't figure out what exactly is.

I know that this issue is pretty similar to this one to run bowtie on multiple fastq file , In fact, I made this loop taking as reference a command described there but given my lack of knowledge of how exactly basename really works, I cannot identify the exact problem.

alignment sequence bowtie2 bash • 4.6k views
ADD COMMENT
0
Entering edit mode

can you try this:

for i in $(ls /mnt/path/DEC2017/*1.fastq.gz); do echo $i; done

Try to dry-run the final command if above loop prints the correct files.

ADD REPLY
0
Entering edit mode

Hello, thanks for your answer :) , yes, the above loop gives me the entire path to my files which corresponds to the "fwd" reads which ends in "_1.fastq-gz" , when you say dry-run the final command, you mean this?:

  do dir=/mnt/path/DEC2017/ 
    base=$(basename $i _1.fastq.gz) 
    bowtie2 -p 4 -x testbuild -1 $dir/${base}_1.fastq.gz -2 $dir/${base}_2.fastq.gz | samtools view -b -o $dir/${base}.bam -
done
ADD REPLY
0
Entering edit mode

Can you try this? "

for i in $(ls /mnt/path/DEC2017/*1.fastq.gz); do echo $i ${i/1.fastq/2.fastq} ${i%1.fastq.gz}.bam; done

This should print R1, R2, destination directory for bam and name of the bam file. Then dry-run with appropriate commands as below:

for i in $(ls /mnt/path/DEC2017/*_1.fastq.gz); do echo "bowtie2 -p 4 -x testbuild -1 $i -2 ${i/_1.fastq/_2.fastq} | samtools view -b -o ${i%_1.fastq.gz}.bam"; done

This would only dry-run, and doesn't execute any command. Remove echo after verifying the output file name and output directory. Becareful, while copying inverted commas.

ADD REPLY
1
Entering edit mode
3.5 years ago

Looping over the output of ls is a bad idea. And it doesn't look like it's even working. It looks like it's taking ls literally instead of running it as a command.

try

for i in *R1.fastq.gz
ADD COMMENT
0
Entering edit mode

Hello thanks for answering!, yes it seems this worked for me, (the command is running and using my threads), now I'm figuring out how to apply nohup over this, this is what happens:

nohup for i in
*1.fastq.gz ; do dir=/mnt/c/Users/eduar/Desktop/Ubuntu/rawdata_Comau/DEC2017/ ; base=$(basename $i _1.fastq.gz
) ; bowtie2 -p 4 -x testbuild -1 $dir/${base}_1.fastq.gz -2 $dir/${base}_2.fastq.gz | samtools view -b -o $dir/${base}.bam -; done > bowtie2output.txt

This is the error: -bash: syntax error near unexpected token `do'

at the same, I don't really understand what basename part is really doing, if you can explain this ill be so grateful.

Thanks for helping.

ADD REPLY
0
Entering edit mode

Are you absolutely sure that your files are named "_1.fastq.gz", and not "_R1.fastq.gz" or "_R1_001.fastq.gz"

ADD REPLY

Login before adding your answer.

Traffic: 2906 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6