perl script for BWA-mem on multiple different files
5.0 years ago
brunobsouzaa ▴ 830

Hello everyone,

I have the following fastq files:

  • Sample_1_R1.fastq.gz
  • Sample_1_R2.fastq.gz
  • Sample_2_R1.fastq.gz
  • Sample_2_R2.fastq.gz
  • Sample_3_R1.fastq.gz
  • Sample_3_R2.fastq.gz
  • Sample_4_R1.fastq.gz
  • Sample_4_R2.fastq.gz
  • Sample_5_R1.fastq.gz
  • Sample_5_R2.fastq.gz

I'm wondering if anyone knows of a Perl or Bash script that would let me run BWA-MEM on these samples simultaneously. The output should be:

  • Sample_1.sam
  • Sample_2.sam
  • Sample_3.sam
  • Sample_4.sam
  • Sample_5.sam
BWA Exome • 2.4k views

There is more than one way of doing this, in addition to ATpoint's example below. No Perl needed.

BWA mem on multiple samples
Need Coding To Run Bwa Mem In Batch Mode

5.0 years ago
ATpoint 82k

Here is a basic function that is called once per sample, in parallel, using GNU parallel. The -j option of parallel sets how many jobs run at the same time; see the parallel manual for details. The output is one BAM file per sample (there is no advantage in keeping SAM files, as they are uncompressed and only take up space).

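function BWA {

  # $1: path to the BWA index, $2: sample basename (e.g. Sample_1)
  INDEX=$1
  BASENAME=$2
  # -b requests BAM output explicitly; the trailing "-" reads the SAM stream from stdin
  bwa mem "${INDEX}" "${BASENAME}"_R1.fastq.gz "${BASENAME}"_R2.fastq.gz | samtools view -b -o "${BASENAME}"_aligned.bam -

}; export -f BWA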

ls *_R1.fastq.gz | awk -F "_R1.fastq.gz" '{print $1}' | parallel "BWA /path/to/bwa/index {}"
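Before launching the real jobs, it can help to check which command would be built for each sample. The following is only a rough sketch of that idea in plain Bash, not part of the answer above: the helper names sample_name, bwa_cmd and plan are made up for illustration, and the index path and the _aligned.bam suffix are simply carried over from the example. It prints the commands and runs nothing.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the bwa/samtools command for each sample, run nothing.
# sample_name, bwa_cmd and plan are hypothetical helper names.

# Strip the _R1.fastq.gz suffix to get the sample basename.
sample_name() {
  local r1=$1
  printf '%s\n' "${r1%_R1.fastq.gz}"
}

# Print the alignment command for one sample.
# $1: path to the BWA index, $2: sample basename
bwa_cmd() {
  local index=$1 base=$2
  printf 'bwa mem %s %s_R1.fastq.gz %s_R2.fastq.gz | samtools view -b -o %s_aligned.bam -\n' \
    "$index" "$base" "$base" "$base"
}

# Print one command per R1 file passed after the index path.
plan() {
  local index=$1
  shift
  local r1
  for r1 in "$@"; do
    bwa_cmd "$index" "$(sample_name "$r1")"
  done
}

# Example call (assumes the fastq files are in the current directory):
#   plan /path/to/bwa/index *_R1.fastq.gz
```

Once the printed lines look right, they could be piped into parallel with a job limit, e.g. plan /path/to/bwa/index *_R1.fastq.gz | parallel -j 2, since GNU parallel runs each input line as a command when no command is given. Keep in mind that the total load is roughly the -j value times the number of threads each bwa mem job uses.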