Question

Bowtie2 alignment of multiple paired-end RAD samples

8.8 years ago

Angel R. • 0

Greetings

I'm trying to use Bowtie2 to align multiple paired-end RAD reads to a reference genome. I am building my script from the information in the Bowtie2 manual. However, I keep running into a problem when I pass the comma-separated lists of input files to the -1 and -2 arguments.

I have a total of 30 files, so I make a list:

-1 file_1.1.fq,file2_1.fq,…,file_30.1.fq -2 file_1.2.fq,file_2.2.fq,…,file_30.2.fq

But when I run my script I get this error:

"File name too long"

As a test, I ran the same script with only 5 of the 30 samples to check for other errors, but I still get the same problem.

Here's my script:

module load bowtie2/2.1.0

REFindex=/directory/and/basename/of/ref/index
IN_DIR=/directory/containing/input/.fq/files
OUT_DIR=/directory/of/output/.sam/files

for input in $IN_DIR/*.fq
do

bowtie2 -p 16 -x $REFindex -1 <file_1.1.fq,file2_1.fq,…,file_30.1.fq> -2 <file_1.2.fq,file_2.2.fq,…,file_30.2.fq> $OUT_DIR/$(basename $input .fq).sam --sensitive

done

Other solutions online suggested using a wildcard to pick up all the samples, but I am not sure what the correct procedure would be.

Sorry to ask; I am pretty new to all this computational work.

Thanks for all the help

Angel R.

paired-end alignment RADtags bowtie2 • 4.9k views

ADD COMMENT • link updated 16 months ago by Ram 43k • written 8.8 years ago by Angel R. • 0

Ram · Answer 1 · 2015-06-25

8.8 years ago

Angel R. • 0

Another suggestion I read was to store the lists of files in variables at the top of the script and then use those variables in the bowtie2 command, like this:

module load bowtie2/2.1.0

REFindex=/directory/and/basename/of/ref/index
IN_DIR=/directory/containing/input/.fq/files
OUT_DIR=/directory/of/output/.sam/files
MATE1=file_1.1.fq,file2_1.fq,...,file_30.1.fq
MATE2=file_1.2.fq,file_2.2.fq,...,file_30.2.fq

for input in $IN_DIR/*.fq
do

bowtie2 -p 16 -x $REFindex -1 $IN_DIR/$MATE1 -2 $IN_DIR/$MATE2 -S $OUT_DIR/$(basename $input .fq).sam --sensitive

done

I get an output that says this for all the samples:

Warning: Could not open read file "sample_2.1.fq" for reading; skipping...
Warning: Could not open read file "sample_3.1.fq" for reading; skipping...
...
Warning: Could not open read file "sample_29.1.fq" for reading; skipping...
Warning: Could not open read file "sample_30.1.fq" for reading; skipping...
Error: No input read files were valid
bowtie2-align exited with value 1

What I found interesting is that it skips sample_1 and the first warning comes from sample_2.

I still don't know what's happening. Any help will be appreciated.

Thanks,
Angel R.

ADD COMMENT • link updated 16 months ago by Ram 43k • written 8.8 years ago by Angel R. • 0

If you need the $IN_DIR prefix for one of the samples, you'll need it for all of them.

ADD REPLY • link updated 16 months ago by Ram 43k • written 8.8 years ago by Devon Ryan 104k
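
To make the point above concrete: the shell prepends the directory once to the whole comma-separated string, so only the first entry ends up with a path. That would likely explain why sample_1 produces no warning while sample_2 onward cannot be opened. Below is a minimal sketch of the expansion and one way to prefix every entry. The directory /data/reads and the three file names are placeholders, and prefix_list is a hypothetical helper, not part of Bowtie2.

```shell
#!/bin/sh
# Placeholder directory and file names, for illustration only.
IN_DIR=/data/reads
MATE1=sample_1.1.fq,sample_2.1.fq,sample_3.1.fq

# What the script above passes to -1: the prefix lands on the first entry only.
naive="$IN_DIR/$MATE1"
echo "$naive"

# Hypothetical helper: prepend a directory to every entry of a
# comma-separated list. $1 = directory, $2 = list.
# Assumes the directory name contains no '|' or '&' (they are special to sed here).
prefix_list() {
    # Prefix the first entry, then insert the directory after each comma.
    printf '%s/%s\n' "$1" "$2" | sed "s|,|,$1/|g"
}

fixed=$(prefix_list "$IN_DIR" "$MATE1")
echo "$fixed"
```

With a helper like this, the call could take -1 "$(prefix_list "$IN_DIR" "$MATE1")" and the same for -2. Changing into $IN_DIR before running bowtie2 and passing bare file names should work as well.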

If I understood you correctly, this means that when I write $IN_DIR/$MATE1, the script can only find the very first file (in this case sample_1.1.fq), right?

Do you know a way in which they can all be read?

Thanks

ADD REPLY • link updated 16 months ago by Ram 43k • written 8.8 years ago by Angel R. • 0
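
One possible way to have every sample read is to skip the comma-separated lists and let the loop handle one pair per iteration. The sketch below is only that, a sketch. It assumes the mates are named <sample>.1.fq and <sample>.2.fq (as the warnings in this thread suggest) and that one SAM file per sample is wanted, which the output name in the original loop hints at. align_samples and RUN are hypothetical names. By default the commands are only printed, so the pairing can be checked before anything is aligned.

```shell
#!/bin/sh
# Placeholder paths taken from the question; replace with real ones.
REFindex=/directory/and/basename/of/ref/index
IN_DIR=/directory/containing/input/.fq/files
OUT_DIR=/directory/of/output/.sam/files

# Dry run by default: each command is printed instead of executed.
# Run with RUN= (set but empty) in the environment to execute bowtie2 for real.
RUN=${RUN-echo}

# Hypothetical helper. $1 = input directory, $2 = output directory.
align_samples() {
    for mate1 in "$1"/*.1.fq
    do
        # If nothing matched, the pattern stays literal; skip it.
        [ -e "$mate1" ] || continue

        # sample_7.1.fq -> sample_7, then derive the second mate from it.
        sample=$(basename "$mate1" .1.fq)
        mate2=$1/$sample.2.fq

        if [ ! -e "$mate2" ]; then
            echo "Skipping $sample: $mate2 not found" >&2
            continue
        fi

        # One alignment per sample, one SAM file per sample.
        $RUN bowtie2 -p 16 --sensitive -x "$REFindex" \
            -1 "$mate1" -2 "$mate2" -S "$2/$sample.sam"
    done
}

align_samples "$IN_DIR" "$OUT_DIR"
```

Because each iteration passes full paths for both mates, no entry is left without its directory, and the command line stays short no matter how many samples there are.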