Assigning variables programmatically for bwa-mem
2.5 years ago
Joy P ▴ 10

I have the following script:

bwa mem -t 10 -R "@RG\tID:xxx\tSM:xxxx\tLB:LB-1\tPU:xxx\tPL:ILLUMINA" ref_genome.fa sample_1_1.fastq sample_1_2.fastq | samtools view -@ 10 -b - | samtools sort -@ 10 -o sample_1.bam -

I also have a spreadsheet with one column for the forward reads (sample 1, sample 2, sample 3, etc.), one for the reverse reads, and one for each of the read group variables. Each row contains all the information for one sample.

How can I assign the values of ID, SM and PU, fastq file names and bam file names programmatically from my spreadsheet and run the samples in parallel so that I don't have to input them all manually and can make the most of my computing resources?

I'm using a bash script and I'm fairly new to coding.

Thanks!

2.5 years ago

Use GNU parallel.

Suppose the file ids.csv contains:

A,X,1
B,Y,2
C,Z,3

Then, using parallel, you could write:

cat ids.csv | parallel --colsep=',' echo First={1}, Second={2}, Third={3}

It prints:

First=A, Second=X, Third=1
First=B, Second=Y, Third=2
First=C, Second=Z, Third=3

Look for tutorials like this:

Gnu Parallel - Parallelize Serial Command Line Programs Without Changing Them
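
Applied to the bwa mem command from the question, the same idea could look roughly like the sketch below. It assumes the spreadsheet has been exported as a CSV with no header row and with the columns in this order: ID, SM, PU, forward fastq, reverse fastq, output bam. That column order, the file name samples.csv and the function names build_cmd and emit_cmds are all made up for illustration. Instead of putting the whole pipeline inside a parallel template, the sketch prints one complete command line per sample, which sidesteps quoting problems with the pipes and the \t sequences in the read group string.

```shell
#!/usr/bin/env bash
# Assumed (hypothetical) CSV layout, one sample per row, no header:
#   ID,SM,PU,forward_fastq,reverse_fastq,output_bam

# Print the alignment pipeline for one sample as a single command line.
build_cmd() {
    local id=$1 sm=$2 pu=$3 fq1=$4 fq2=$5 bam=$6
    # "\\t" in the printf format prints a literal backslash-t, so bwa
    # (not the shell) turns it into a tab inside the read group string.
    printf 'bwa mem -t 10 -R "@RG\\tID:%s\\tSM:%s\\tLB:LB-1\\tPU:%s\\tPL:ILLUMINA" ref_genome.fa %s %s | samtools view -@ 10 -b - | samtools sort -@ 10 -o %s -\n' \
        "$id" "$sm" "$pu" "$fq1" "$fq2" "$bam"
}

# Read the CSV and print one command line per row.
emit_cmds() {
    local csv=$1 id sm pu fq1 fq2 bam
    # "|| [ -n "$id" ]" keeps a last row that has no trailing newline.
    while IFS=, read -r id sm pu fq1 fq2 bam || [ -n "$id" ]; do
        [ -z "$id" ] && continue      # skip blank lines
        bam=${bam%$'\r'}              # drop a Windows line ending, if any
        build_cmd "$id" "$sm" "$pu" "$fq1" "$fq2" "$bam"
    done < "$csv"
}
```

Running `emit_cmds samples.csv` on its own only prints the commands, so you can check them first. To execute them, pipe the output into GNU parallel, for example `emit_cmds samples.csv | parallel -j 2`: when parallel is given no command of its own, it treats each input line as a command to run. Each job already uses 10 threads for bwa and samtools, so the `-j 2` here is only a placeholder; pick a value so that jobs times threads does not exceed your available cores.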
