Aligning multiple paired-end files together
2.5 years ago
David_emir

Hi All,

I have 72 paired-end .fastq files that I need to align using BWA. The files are named as follows:

1. sam_001_1.fastq

2. sam_001_2.fastq

3. sam_002_1.fastq

4. sam_002_2.fastq & so on

Since the data is paired-end, I need a single .sam file for each pair, but my shell script generates an individual .sam file for each .fastq file. Please let me know the best strategy for handling this.

EDIT 1: My script

for i in *.fastq;

do

bwa mem -t 10 /data2/Exome/HG19/BWA/hg19 1.$i 2.$i > /data2/validation_samples/fastq2sam/$i.sam; done

the logic to be applied is mentioned here

2.5 years ago
Joe

Your loop isn't working quite how you think it is. Each iteration of the loop operates on only a single fastq file, so 1.$i and 2.$i expand to names such as 1.sam_001_1.fastq, which do not exist.

What you need to do is loop over all the R1 files, strip each filename down to its ‘base’ (without the R1/R2 part), then reconstitute the second filename on the fly (or use file pairing with GNU parallel, but I can’t find the link right now).
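As a rough sketch of that loop, assuming the sam_001_1.fastq / sam_001_2.fastq naming from the question: the function name print_bwa_commands and its three arguments are made up for illustration, and it only prints the bwa commands (a dry run) rather than running them.

```shell
# Dry run: print one bwa command per read pair instead of executing it.
# print_bwa_commands and its three arguments are illustrative names.
print_bwa_commands() {
    fastq_dir=$1   # directory holding the *_1.fastq / *_2.fastq files
    ref=$2         # BWA index prefix
    out_dir=$3     # where the .sam files should go
    for r1 in "$fastq_dir"/*_1.fastq; do
        [ -e "$r1" ] || continue        # glob matched nothing
        base=${r1%_1.fastq}             # strip the mate suffix
        r2=${base}_2.fastq              # rebuild the second filename
        if [ ! -e "$r2" ]; then
            echo "skipping $r1: no mate file $r2" >&2
            continue
        fi
        sample=${base##*/}              # drop the directory part
        echo "bwa mem -t 10 $ref $r1 $r2 > $out_dir/$sample.sam"
    done
}
```

Calling print_bwa_commands . /data2/Exome/HG19/BWA/hg19 /data2/validation_samples/fastq2sam prints the commands for review. If the paths contain no spaces, the output can then be piped to sh to actually run them.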

There are quite a few similar questions about this task on the forum, so there will be a pre-existing loop template you can try.

Thanks a lot for your help. I was able to successfully run BWA on my .fastq files with the following code:

for f in $(ls *.fq_filtered | sed -e 's/_1.fq_filtered//' -e 's/_2.fq_filtered//' | sort -u); do
    bwa mem -t 20 hg19 ${f}_1.fq_filtered ${f}_2.fq_filtered > /path/to/be/saved/${f}.sam
done

Awesome @David_emir! To get my BAM files directly, I simply replaced

> /path/to/be/saved/${f}.sam with | samtools sort -o ${f}.bam
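For anyone wanting that as a reusable step, here is a minimal sketch of the piped version, assuming the _1.fq_filtered / _2.fq_filtered naming used above. The helper name align_to_bam and the BWA / SAMTOOLS override variables are my own invention, not part of either tool. The trailing - tells samtools sort to read from standard input.

```shell
# align_to_bam is an illustrative helper, not part of bwa or samtools.
# BWA and SAMTOOLS are optional overrides for the two executables.
align_to_bam() {
    ref=$1      # BWA index prefix, e.g. hg19
    sample=$2   # filename prefix shared by both mates
    # Stream the alignments straight into the sorter; no .sam file is written.
    "${BWA:-bwa}" mem -t 20 "$ref" \
        "${sample}_1.fq_filtered" "${sample}_2.fq_filtered" |
        "${SAMTOOLS:-samtools}" sort -o "${sample}.bam" -
}
```

Inside the loop above, the bwa line would then become align_to_bam hg19 "$f".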