Question: Bowtie2 alignment of multiple paired-end RAD samples
3.8 years ago by Angel R. (Puerto Rico, UPR-RP)

Greetings,

I’m trying to use Bowtie2 to align multiple paired-end RAD reads to a reference genome. I am building my script based on the information in the Bowtie2 manual. However, I keep running into a problem when I pass comma-separated lists of input files to the -1 and -2 arguments.

I have a total of 30 files, so I made a list:

-1 file_1.1.fq,file_2.1.fq,…,file_30.1.fq -2 file_1.2.fq,file_2.2.fq,…,file_30.2.fq

But when I run my script I get this error:

“File name too long”

As a test, I ran the same script with only 5 of the 30 samples to make sure there weren’t any other errors, but I still get the same problem.

Here’s my script:

module load bowtie2/2.1.0

REFindex=/directory/and/basename/of/ref/index
IN_DIR=/directory/containing/input/.fq/files
OUT_DIR=/directory/of/output/.sam/files

for input in $IN_DIR/*.fq
do

bowtie2 -p 16 -x $REFindex -1 <file_1.1.fq,file2_1.fq,…,file_30.1.fq> -2 <file_1.2.fq,file_2.2.fq,…,file_30.2.fq> $OUT_DIR/$(basename $input .fq).sam --sensitive

done

Other solutions online suggested using a wildcard to pick up all the samples, but I am not sure what the correct procedure would be.

Sorry to ask. I am pretty new to all this computational work.

Thanks for all the help

Angel R.
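
On the wildcard suggestion mentioned above: one way it could work is to let the shell expand a glob and join the matches with commas, so that every entry carries its full path. This is only a minimal sketch. It assumes the mate files end in .1.fq and .2.fq, and join_by_comma is a hypothetical helper written for this example, not part of Bowtie2.

```shell
IN_DIR=/directory/containing/input/.fq/files

# Hypothetical helper: join all of its arguments with commas.
join_by_comma() {
    ( IFS=,; printf '%s\n' "$*" )
}

# The shell expands each glob in sorted order, so the two lists should
# line up pair by pair as long as every sample has both mate files.
MATE1=$(join_by_comma "$IN_DIR"/*.1.fq)
MATE2=$(join_by_comma "$IN_DIR"/*.2.fq)

# Print the arguments that would be handed to bowtie2.
echo "-1 $MATE1"
echo "-2 $MATE2"
```

Note that passing all files to one bowtie2 call in this way pools the reads from every sample into a single output file, which may not be what is wanted when one .sam file per sample is expected.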

3.8 years ago, Angel R. wrote:

Another suggestion I read was to put the list of files in a variable at the top and then use that variable in the script, like this:

module load bowtie2/2.1.0

REFindex=/directory/and/basename/of/ref/index
IN_DIR=/directory/containing/input/.fq/files
OUT_DIR=/directory/of/output/.sam/files
MATE1=file_1.1.fq,file_2.1.fq,…,file_30.1.fq
MATE2=file_1.2.fq,file_2.2.fq,…,file_30.2.fq

for input in $IN_DIR/*.fq
do

bowtie2 -p 16 -x $REFindex -1 $IN_DIR/$MATE1 -2 $IN_DIR/$MATE2 -S $OUT_DIR/$(basename $input .fq).sam --sensitive

done

I get output like this for all the samples:

Warning: Could not open read file "sample_2.1.fq" for reading; skipping...
Warning: Could not open read file "sample_3.1.fq" for reading; skipping...
...
Warning: Could not open read file "sample_29.1.fq" for reading; skipping...
Warning: Could not open read file "sample_30.1.fq" for reading; skipping...
Error: No input read files were valid
bowtie2-align exited with value 1

What I found interesting is that it skips sample_1 and the first warning comes from sample_2. 

I still don't know what's happening. Any help will be appreciated.

Thanks, 

Angel R.


If you need the $IN_DIR prefix for one of the samples, you'll need it for all of them.

Reply written 3.8 years ago by Devon Ryan
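
To illustrate the point about the prefix, it can help to echo what the shell actually builds from $IN_DIR/$MATE1. The sketch below uses a hypothetical directory and a three-file list purely for demonstration. It shows that the prefix lands only on the first entry, which would be consistent with the warnings starting at sample_2, and then one possible way to add the prefix to every entry.

```shell
IN_DIR=/data/reads    # hypothetical directory, for illustration only
MATE1=sample_1.1.fq,sample_2.1.fq,sample_3.1.fq

# The shell simply pastes the two strings together, so only the first
# file name ends up with a directory in front of it.
expanded="$IN_DIR/$MATE1"
echo "$expanded"

# One way to prefix every entry: also insert the directory after each comma.
prefixed="$IN_DIR/$(printf '%s' "$MATE1" | sed "s|,|,$IN_DIR/|g")"
echo "$prefixed"
```

The second form is what would then be passed to -1, and the same treatment would apply to the list given to -2.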

If I understood you correctly, this means that when I use $IN_DIR/$MATE1, the script only finds the very first file, in this case sample_1.1.fq, right?

Do you know a way in which they can all be read?

Thanks

Reply written 3.8 years ago by Angel R.
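
One way all the files could be read, offered as a sketch rather than a definitive answer, is to drop the comma-separated lists and align each read pair in its own bowtie2 call, so that every sample gets its own .sam file. The sketch assumes the mates are named <sample>.1.fq and <sample>.2.fq, as the warning messages suggest. The align_pairs function and the BOWTIE2 variable are hypothetical names introduced here for illustration.

```shell
REFindex=/directory/and/basename/of/ref/index
IN_DIR=/directory/containing/input/.fq/files
OUT_DIR=/directory/of/output/.sam/files
BOWTIE2=${BOWTIE2:-bowtie2}    # set BOWTIE2=echo to print the commands instead

align_pairs() {
    for mate1 in "$IN_DIR"/*.1.fq
    do
        [ -e "$mate1" ] || continue              # the glob matched nothing
        sample=$(basename "$mate1" .1.fq)        # e.g. sample_2
        mate2="$IN_DIR/${sample}.2.fq"
        if [ ! -e "$mate2" ]; then
            echo "No mate 2 file for $sample; skipping" >&2
            continue
        fi
        # One alignment per sample; -S names the SAM output file.
        "$BOWTIE2" -p 16 --sensitive -x "$REFindex" \
            -1 "$mate1" -2 "$mate2" -S "$OUT_DIR/${sample}.sam"
    done
}

align_pairs
```

Looping over only the mate 1 files avoids running the same pair twice, which a loop over every .fq file would do.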