Question: Aligning hundreds of RADseq samples in Bowtie 2
jt358 wrote:

Hi Everybody,

I wondered if I could ask a question regarding bowtie2.

I'm trying to align around 500 samples of PE data to a reference genome, producing a separate .sam output file for each sample.

I've tried specifying the inputs with wildcards:

bowtie2 -x assembly44indexfile -1 *.1.fq.gz -2 *.2.fq.gz -S (but this aligns all input reads into a single SAM)

I've also tried listing the inputs explicitly:

bowtie2 -x assembly44indexfile -1 ind1.1.fq.gz,ind2.1.fq.gz -2 ind1.2.fq.gz,ind2.2.fq.gz -S (but again all samples get aligned in a single SAM)

I've tried running a script I found on the net and altered to the below, but it didn't run (I'm slightly lost on writing scripts):

for sample in `ls /data/omicsScratch/mpx247/script/rad`
do
dir="/data/omicsScratch/mpx247/script/rad"
base=$(basename $sample ".1.fq.gz")
bowtie2 -x /data/omicsScratch/mpx247/script/rad/assembly44indexfile -1 ${dir}/${base}.1.fq.gz -2 ${dir}/${base}.1.fq.gz -S$
done

I wonder if anybody knows an answer at all or can offer advice as to how to deal with this issue. All my read 1's end in .1.fq.gz and read 2's .2.fq.gz with a prefix of usually something like AU12 (sample 12 from Australia) but sometimes this has additional info in the prefix.

Thanks so much for any help

Best

J

modified 2.4 years ago by letitbitsuren • written 2.5 years ago by jt358

See if this works.

for sample in `ls /data/omicsScratch/mpx247/script/rad/*.1.fq.gz`
    do
    dir="/data/omicsScratch/mpx247/script/rad";
    base=$(basename "$sample" | cut -d. -f1);
    bowtie2 -x /data/omicsScratch/mpx247/script/rad/assembly44indexfile -1 $dir/$base.1.fq.gz -2 $dir/$base.2.fq.gz -S $base.sam;
    done
modified 2.5 years ago • written 2.5 years ago by genomax
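
Since some prefixes carry additional information, a variant that strips the known suffix, rather than cutting at the first dot, may be safer. The following is a minimal sketch and not a tested pipeline: the function names `sample_prefix` and `align_all` and the `BOWTIE2` override are hypothetical, and only the .1.fq.gz / .2.fq.gz naming comes from the question.

```shell
#!/bin/sh
# Hypothetical helpers; only the .1.fq.gz / .2.fq.gz naming comes from the thread.

# Print the sample prefix of a read 1 file, e.g. AU12.1.fq.gz -> AU12.
# Removing the suffix (rather than cutting at the first dot) keeps
# prefixes that themselves contain dots intact.
sample_prefix() {
    file=$(basename "$1")
    printf '%s\n' "${file%.1.fq.gz}"
}

# Align every read pair in directory $1 against index basename $2,
# writing <prefix>.sam into the current directory.
# Set BOWTIE2=echo for a dry run that only prints the arguments.
align_all() {
    dir=$1
    index=$2
    for r1 in "$dir"/*.1.fq.gz; do
        [ -e "$r1" ] || continue          # the glob matched nothing
        prefix=$(sample_prefix "$r1")
        r2="$dir/$prefix.2.fq.gz"
        if [ ! -e "$r2" ]; then
            echo "skipping $prefix: no read 2 file" >&2
            continue
        fi
        "${BOWTIE2:-bowtie2}" -x "$index" -1 "$r1" -2 "$r2" -S "$prefix.sam"
    done
}
```

For example, `BOWTIE2=echo align_all /data/omicsScratch/mpx247/script/rad /data/omicsScratch/mpx247/script/rad/assembly44indexfile` would print the arguments for each pair without running any alignment, which is a cheap way to check the pairing before committing 500 samples.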

Please help me.

I want to learn bowtie2 but I'm having some trouble:

Could not locate a Bowtie index corresponding to basename "/home/otgoo/jishee/amplicons_dk.fa"
Error: Encountered internal Bowtie 2 exception (#1)
Command: /usr/bin/../lib/bowtie2/bin/bowtie2-align-s --wrapper basic-0 -p16 -x /home/otgoo/jishee/amplicons_dk.fa -s /home/otgoo/jishee/result/dun.sam -1 /home/otgoo/jishee/VEGF1_S1_L001_R1_001.fastq -2 /home/otgoo/jishee/VEGF1_S1_L001_R2_001.fastq
(ERR): bowtie2-align exited with value 1

My input is "bowtie2 -p16 -x /home/otgoo/jishee/amplicons_dk.fa -1 /home/otgoo/jishee/VEGF1_S1_L001_R1_001.fastq -2 /home/otgoo/jishee/VEGF1_S1_L001_R2_001.fastq -s /home/otgoo/jishee/result/dun.sam"

written 2.4 years ago by letitbitsuren

Have you created the bowtie2 index with the bowtie2-build command? If not, you should do that first: bowtie2-build -f your_fasta my_index (something along those lines). Then use my_index as the base name for the index.

modified 2.4 years ago • written 2.4 years ago by genomax

Thanks for your reply. Yes, I have created the index files with bowtie2-build (6 files were created).

written 2.4 years ago by letitbitsuren

Can you provide the bowtie2-build command that you used? Is the amplicons_dk.fa file in the same directory as the other six index files?

written 2.4 years ago by genomax

OK. I used the following command:

bowtie2-build /home/otgoo/example/reference/lambda_virus.fa /home/otgoo/example/index/index.fasta

written 2.4 years ago by letitbitsuren

That means you set the "base name" for your index to index.fasta. So you would need to run your command as

bowtie2 -p16 -x /home/otgoo/jishee/index.fasta -1 /home/otgoo/jishee/VEGF1_S1_L001_R1_001.fastq -2 /home/otgoo/jishee/VEGF1_S1_L001_R2_001.fastq -S /home/otgoo/jishee/result/dun.sam
written 2.4 years ago by genomax

OK, that is working. Thank you very much. In future, could I ask you some more questions about this study? Best regards.

written 2.4 years ago by letitbitsuren

Sure. If the new question is unrelated to the original in this thread you can start a new post.

written 2.4 years ago by genomax
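
The exchange above turns on what -x expects: the index base name given as the last argument to bowtie2-build, not the FASTA file name. A small index normally consists of six files, <basename>.1.bt2 to <basename>.4.bt2 plus <basename>.rev.1.bt2 and <basename>.rev.2.bt2 (large indexes use the .bt2l extension instead). A minimal sketch of a pre-flight check follows; the function name `index_exists` is hypothetical, the paths in the usage comment are placeholders, and the check only looks for the first index file.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a Bowtie 2 index with base name $1 appears to exist.
# Only the first index file is checked (.1.bt2, or .1.bt2l for a large index).
index_exists() {
    [ -e "$1.1.bt2" ] || [ -e "$1.1.bt2l" ]
}

# Example usage (placeholder paths):
#   bowtie2-build reference.fa my_index    # writes my_index.1.bt2, my_index.rev.1.bt2, ...
#   index_exists my_index && bowtie2 -x my_index -1 reads_1.fq -2 reads_2.fq -S out.sam
```

Running such a check on the value passed to -x before starting a long batch would catch the "Could not locate a Bowtie index" case early.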
Brice Sarver wrote:

As written, you have a few small typos in your call - maybe intentional or not. This should do what you want. Make sure things run appropriately on your system; maybe try a single sample first.

#don't need to run 'ls'; run in a folder of reads without subdirectories
for i in *.1.fq.gz;
#to check and see if you're working with the correct files
do echo "$i";
name=$(echo "$i" | cut -d '.' -f 1);
echo "$name";
#no -S here: without it bowtie2 writes SAM to stdout, which samtools reads from '-'
bowtie2 -x [path to the index] -1 "$name".1.fq.gz -2 "$name".2.fq.gz | samtools view -bS - > "$name".pe.bam;
done

This will align the sample, convert to a BAM (reduces your file size), and write it to a file.

modified 2.5 years ago • written 2.5 years ago by Brice Sarver

Many thanks indeed genomax2 and Brice Sarver for such fast responses.

I had a play with both scripts. Brice, it may well have been something I did, but each time I ran your script it gave an error about finding inputs for the paired-end options -1 and -2. Genomax, your version seems to have run OK and I'm now running it on the full data set to test it. One query: the line 'for sample in `ls /data/omicsScratch/mpx247/script/rad/*.1.gz`' seemed to run OK, but should I have had that as *.fq.gz?

Once again, I really appreciate the generosity of you both in taking the time to respond.

written 2.5 years ago by jt358

Ah, should have specified: you'll want to run this in a folder of reads, hence the for i in *.fq.gz. I bet you were running it a level up. My code looks for reads with the naming structure you specified, then gets the stem from them and uses that in the bowtie call. I'll clarify my code for future users.

written 2.5 years ago by Brice Sarver
jt358 wrote:

Hey Brice - I was running it in the folder of reads, the exact error was

Error: 0 mate files/sequences were specified with -1, but 1 mate files/sequences were specified with -2. The same number of mate files/sequences must be specified with -1 and -2.
Error: Encountered internal Bowtie 2 exception (#1)
Command: /share/apps/bowtie2/2.2.8/bin/bowtie2-align-s --wrapper basic-0 -x [path -S -1 /tmp/48513.inpipe1 -2 /tmp/48513.inpipe2 to the index]
(ERR): bowtie2-align exited with value 1

Cheers

Jamie

written 2.5 years ago by jt358
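
In the Command line of the error above, -S is immediately followed by -1, and the bracketed index placeholder has been split into separate words. That is consistent with -S having been given without a file name, so that bowtie2 took "-1" as the SAM output name and was left with no read 1 file. One way to avoid this when piping to samtools is to drop -S altogether, since bowtie2 writes SAM to standard output by default. The following is a minimal sketch under that assumption; the function name `align_to_bam` and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: align one read pair and write <prefix>.pe.bam.
# $1 = index base name (a real path, not a bracketed placeholder)
# $2 = sample prefix; expects <prefix>.1.fq.gz and <prefix>.2.fq.gz
align_to_bam() {
    index=$1
    name=$2
    # No -S: bowtie2 then writes SAM to standard output,
    # and samtools reads it from "-" and converts it to BAM.
    bowtie2 -x "$index" -1 "$name.1.fq.gz" -2 "$name.2.fq.gz" \
        | samtools view -b - > "$name.pe.bam"
}

# Example usage (placeholder index path):
#   align_to_bam /path/to/index AU12
```

Very old samtools releases also needed -S on `samtools view` to read SAM input; current releases detect the input format automatically.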

Please use ADD COMMENT/ADD REPLY when responding to existing posts. This helps keep the threads logically organized. This post belongs against @Brice's post above this one.

written 2.5 years ago by genomax