Question: for loop over bwa aln
0
gravatar for phylofun
7 months ago by
phylofun30
phylofun30 wrote:

My for loop around bwa aln and bwa sampe doesn't seem to work. First, I created paths to the read data:

ls -1r  reads/*/cleaned/*READ1.fastq.gz > READ1_list
ls -1r  reads/*/cleaned/*READ2.fastq.gz > READ2_list

Next, I made a for loop using READ1_list and READ2_list around the bwa aln function:

for i in READ1_list 
do
  bwa aln reference/reference.unaligned.fasta $i > ${i%READ1.fastq.gz}read1.sai 
done

for i in READ2_list
do
  bwa aln reference/reference.unaligned.fasta $i > ${i%READ2.fastq.gz}read1.sai 
done

Then, I use a for loop around bwa sampe:

 for i in $(cat READ1_list | sed s'/\-READ1.fastq.gz//' | sort) 
do
  bwa sampe reference/reference.unaligned.fasta $i\-READ1.sai $i\-READ2.sai $i\-READ1.fastq.gz $i\-READ2.fastq.gz \
  | samtools view -bS - > $i\-pe.bam
done

Can anyone spot the issue?

bwa next-gen alignment • 374 views
ADD COMMENTlink modified 7 months ago by jrj.healey13k • written 7 months ago by phylofun30
1

I can't spot the issue. What happens? Are there error messages?

Unless you reads are shorter than 70bp, you should be using bwa mem instead of bwa aln.

ADD REPLYlink written 7 months ago by h.mon26k

There is an error message when I run bwa sampe, but I think the issue begins with bwa aln. No error message occurs when I run bwa aln; however, the .sai output file goes to the current working directory instead of following the path in READ1_list, and the sample name is not written for the .sai file (e.g. READ1_listread1.sai instead of sample_name-read1.sai)

ADD REPLYlink written 7 months ago by phylofun30

Put an echo in your loop to see how it looks like:

 for i in $(cat READ1_list | sed s'/\-READ1.fastq.gz//' | sort) 
do
  echo "bwa sampe reference/reference.unaligned.fasta $i\-READ1.sai $i\-READ2.sai $i\-READ1.fastq.gz $i\-READ2.fastq.gz \
  | samtools view -bS - > $i\-pe.bam"
done

Please note that you could also do the following and pipe directly to samtools sort, without samtools view:

bwa sample ref.fa R1 R2 | samtools sort -o alignment.bam
ADD REPLYlink written 7 months ago by WouterDeCoster40k

I see two problems in your code:

  1. The second alignment outputs a read1.sai file, it should be read2.sai
  2. The final BAM file requires files named as READ but your alignments are read (file names are generally case-sensitive)

Also it is hard to find the errors without any detail, are you running this on Linux/Mac or Windows?, what are the names of the input files?, what is inside the READ1_list and READ2_list files?

ADD REPLYlink written 7 months ago by JC8.2k
for i in READ1_list

That shouldn't even work. You should instead e.g.

for i in $(cat READ1_list); do ...

Just try e.g.

for i in READ1_list; do echo "$i"; done

vs.

for i in $(cat READ1_list); do echo "$i"; done

Only the latter works in bash (as intended)..

But anyway, I wouldn't even use a for loop for this but instead just define a function and use it with find and GNU parallel. Also, why are you aligning the reads separately? Surely it would make more sense to align read pairs properly in the same runs?

ADD REPLYlink modified 7 months ago • written 7 months ago by 5heikki8.4k
0
gravatar for jrj.healey
7 months ago by
jrj.healey13k
United Kingdom
jrj.healey13k wrote:

This is overkill, but you might be interested in my bash code for aligning/remapping: (You'll mainly want to look at Line 95 onward, everything before that is just picking up commandline args etc.)

With a little more bash magic, you can loop your files and pass the relevant paths as arguments:

paste -- "READS1.txt" "READS2.txt" |
while IFS=$'\t' read -r reads1 reads2; do
 bash align_and_remap.sh -r /path/to/reference.fasta -1 "$reads1" -2 "$reads2" -o /path/to/outdir
done

*Disclaimer, this script is not well tested at all. Do some testing of you own before letting it loose on all your data. This is also based on an old setup with bwa aln where you had to provide each read file separately. That doesn't need to happen with bwa mem I believe.

ADD COMMENTlink modified 7 months ago • written 7 months ago by jrj.healey13k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 798 users visited in the last hour