Question: Parallel for bwa mem - problem with -R argument for ID and SM
0
gravatar for Korsocius
12 months ago by
Korsocius100
Korsocius100 wrote:

Dear all,

I need help with the Read group ID and SM. I have parallel syntax to alignment via bwa mem.

ls *R1_001.fastq | parallel 'bwa mem -k 19 -A 1 -B 4 -O 6 -L 5 -R '@RG\tID:'{.}'\tSM:'{.}'\tLB:'Trusight_custom_amplicon_CARR'\tPL:'ILLUMINA'\tPI:150' $REFERENCE {} {= s/_R1_001/_R2_001/ =} > {= s/_R1_001.fastq/.sam/ =}'

For the -R argument I would like to get name of sample.. I tried save the name like variable and echo it and so on, but still time doesn't work. Could you help me please to get name of sample to ID and SM ?

Thank you.

bwa alignment parallel • 615 views
ADD COMMENTlink modified 3 days ago by ATpoint14k • written 12 months ago by Korsocius100
3
gravatar for Pierre Lindenbaum
12 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum118k wrote:

second answer, use a Makefile: the same way than in my previous answer C: Is it possible to run variant calling software in parallel, are there any Shell

to run 16 parallel jobs invoke with

make -j 16

ADD COMMENTlink written 12 months ago by Pierre Lindenbaum118k

Make file looks like the best option for all these process :-), i tried it without quotes, but I received error log : No input for ID.....

ADD REPLYlink modified 12 months ago • written 12 months ago by Korsocius100
0
gravatar for Pierre Lindenbaum
12 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum118k wrote:

(not tested) try to remove those single quotes ?

'@RG\tID:'{.}'\tSM:'{.}'\tLB:'Trusight_custom_amplicon_CARR'\tPL:'ILLUMINA'\tPI:150'

to

'@RG\tID:{.}\tSM:{.}\tLB:Trusight_custom_amplicon_CARR\tPL:ILLUMINA\tPI:150'
ADD COMMENTlink written 12 months ago by Pierre Lindenbaum118k
0
gravatar for ATpoint
3 days ago by
ATpoint14k
Germany
ATpoint14k wrote:

For parallel jobs it is convenient to wrap the actual job script into a function and use parallel to parallelize this function:

## Want to do parallel alignments:
function BWAMEM {

  BASENAME=$1
  IDX=$2
  bwa mem (options...) -R '@RG\tID:'${BASENAME}'_ID\tSM:'${BASENAME}'_SM\tPL:Illumina' $IDX ${BASENAME}.R1_001.fastq | \
  samtools view -o ${BASENAME}_unsorted.bam

}; export -f BWAMEM

## Call the function within parallel using awk to extract the basename of the files without the ".R1_001.fastq":
ls *.R1_001.fastq | awk -F ".R1_001.fastq" '{print $1}' | parallel "BWAMEM {} /path/to/bwa_index 2> {}.log"

Using 2> {}.log will print all stderr messages from within the function to one log file per processed fastq file. Say you have a file test.R1_001.fastq, you'll get as output test_unsorted.bam and test.log. Using this approach you do not have to bother yourself with quotes within the parallel function. You can do basically everything you want inside the BWAMEM function and parallel only executes it. Try squeezing an awk command with all its single-and double quotes into parallel, it will be...fun. Or simply write it into a wrapper function to avoid the issues ;-)

ADD COMMENTlink modified 3 days ago • written 3 days ago by ATpoint14k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1868 users visited in the last hour