Bowtie2 alignment of multiple paired-end RAD samples
8.8 years ago
Angel R. • 0

Greetings

I'm trying to use Bowtie2 to align multiple paired-end RAD reads to a reference genome. I am building my script based on the information in the Bowtie2 manual. However, I keep running into a problem when using the -1 and -2 arguments to pass the comma-separated lists of input files.

I have a total of 30 files, so I make a list:

-1 file_1.1.fq,file2_1.fq,…,file_30.1.fq -2 file_1.2.fq,file_2.2.fq,…,file_30.2.fq

But when I run my script I get the error:

"File name too long"

As a test, I ran the same script with only 5 of the 30 samples to make sure there weren't any other errors, but I still get the same problem.

Here's my script:

module load bowtie2/2.1.0

REFindex=/directory/and/basename/of/ref/index
IN_DIR=/directory/containing/input/.fq/files
OUT_DIR=/directory/of/output/.sam/files

for input in $IN_DIR/*.fq
do

bowtie2 -p 16 -x $REFindex -1 <file_1.1.fq,file2_1.fq,…,file_30.1.fq> -2 <file_1.2.fq,file_2.2.fq,…,file_30.2.fq> $OUT_DIR/$(basename $input .fq).sam --sensitive

done

Other solutions online suggested using a wildcard to pick up all the samples, but I am not sure what the correct procedure would be.

Sorry to ask. I am pretty new to all this computational work.

Thanks for all the help

Angel R.

paired-end alignment RADtags bowtie2 • 4.9k views
8.8 years ago
Angel R. • 0

Another thing I read was to put the lists of files in variables at the top of the script and then use those variables in the command, like this:

module load bowtie2/2.1.0

REFindex=/directory/and/basename/of/ref/index
IN_DIR=/directory/containing/input/.fq/files
OUT_DIR=/directory/of/output/.sam/files
MATE1=file_1.1.fq,file2_1.fq,...,file_30.1.fq
MATE2=file_1.2.fq,file_2.2.fq,...,file_30.2.fq

for input in $IN_DIR/* .fq
do

bowtie2 -p 16 -x $REFindex -1 $IN_DIR/$MATE1 -2 $IN_DIR/$MATE2 -S $OUT_DIR/$(basename $input .fq).sam --sensitive

done

I get this output for all the samples:

Warning: Could not open read file "sample_2.1.fq" for reading; skipping...
Warning: Could not open read file "sample_3.1.fq" for reading; skipping...
...
Warning: Could not open read file "sample_29.1.fq" for reading; skipping...
Warning: Could not open read file "sample_30.1.fq" for reading; skipping...
Error: No input read files were valid
bowtie2-align exited with value 1

What I found interesting is that it skips sample_1 and the first warning comes from sample_2.

I still don't know what's happening. Any help will be appreciated.

Thanks,
Angel R.


If you need the $IN_DIR prefix for one of the samples, you'll need it for all of them.
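
To see why, it may help to print what the shell actually hands to bowtie2. `$IN_DIR/$MATE1` is plain string concatenation, so the directory ends up in front of the first name only, which would explain why the warnings start at sample_2. Below is a minimal sketch with made-up paths (`/data/reads` and the three file names are placeholders, not the real data, and `MATE1_FULL` is a name chosen here for illustration). It rebuilds the list with the prefix on every entry:

```shell
# Hypothetical directory and file names, for illustration only.
IN_DIR=/data/reads
MATE1=sample_1.1.fq,sample_2.1.fq,sample_3.1.fq

# Plain concatenation: only the first entry gets the directory.
echo "$IN_DIR/$MATE1"

# Split on commas, put the directory in front of each name,
# and re-join the names with commas.
MATE1_FULL=""
old_ifs=$IFS
IFS=,
for name in $MATE1; do
    MATE1_FULL="${MATE1_FULL:+$MATE1_FULL,}$IN_DIR/$name"
done
IFS=$old_ifs

echo "$MATE1_FULL"
```

The same loop would be needed for MATE2, and the two rebuilt variables would then go after -1 and -2 in place of `$IN_DIR/$MATE1` and `$IN_DIR/$MATE2`.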


If I understood you correctly, this means that when I use $IN_DIR/$MATE1, the script only reads the very first file, in this case sample_1.1.fq, right?

Do you know a way in which they can all be read?

Thanks
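
One way to get every pair read is to skip the hand-written lists and let a wildcard find the files, running bowtie2 once per sample so that each sample gets its own SAM file. This is only a sketch under some assumptions: the files are named `<sample>.1.fq` and `<sample>.2.fq`, the function name `align_pairs` is made up for illustration, and the `echo` in front of `bowtie2` makes it a dry run that prints the commands instead of executing them. The paths are the placeholders from the original script.

```shell
# Placeholder paths from the original post; replace with real ones.
REFindex=/directory/and/basename/of/ref/index
IN_DIR=/directory/containing/input/.fq/files
OUT_DIR=/directory/of/output/.sam/files

# Hypothetical helper: loop over mate-1 files, derive the matching
# mate-2 file, and print one bowtie2 command per complete pair.
align_pairs() {
    in_dir=$1
    out_dir=$2
    ref=$3
    for mate1 in "$in_dir"/*.1.fq; do
        # If the wildcard matched nothing, the pattern stays literal.
        [ -e "$mate1" ] || continue
        sample=$(basename "$mate1" .1.fq)
        mate2="$in_dir/${sample}.2.fq"
        if [ ! -e "$mate2" ]; then
            echo "Skipping $sample: $mate2 not found" >&2
            continue
        fi
        # Dry run: remove "echo" to actually run the alignment.
        echo bowtie2 -p 16 --sensitive -x "$ref" \
            -1 "$mate1" -2 "$mate2" -S "$out_dir/${sample}.sam"
    done
}

align_pairs "$IN_DIR" "$OUT_DIR" "$REFindex"
```

Looping over only the mate-1 files avoids launching bowtie2 once for every .fq file with the same full list each time, which is what the loops in the scripts above would do.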
