Question: Parallel for bwa mem - problem with -R argument for ID and SM
17 months ago by
Korsocius130
Korsocius130 wrote:

Dear all,

I need help with the read group ID and SM fields. I have a GNU parallel command line that runs the alignment with bwa mem:

ls *R1_001.fastq | parallel 'bwa mem -k 19 -A 1 -B 4 -O 6 -L 5 -R '@RG\tID:'{.}'\tSM:'{.}'\tLB:'Trusight_custom_amplicon_CARR'\tPL:'ILLUMINA'\tPI:150' $REFERENCE {} {= s/_R1_001/_R2_001/ =} > {= s/_R1_001.fastq/.sam/ =}'

For the -R argument I would like to use the sample name. I tried saving the name in a variable, echoing it and so on, but it still doesn't work. Could you please help me get the sample name into ID and SM?

Thank you.

bwa alignment parallel • 861 views
17 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum122k wrote:

Second answer: use a Makefile, in the same way as in my previous answer "C: Is it possible to run variant calling software in parallel, are there any Shell".

To run 16 jobs in parallel, invoke it with:

make -j 16


A Makefile looks like the best option for all of these processes :-). I tried it without the quotes, but I got an error in the log: No input for ID.....

17 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum122k wrote:

(Not tested) Try removing those single quotes, i.e. change

'@RG\tID:'{.}'\tSM:'{.}'\tLB:'Trusight_custom_amplicon_CARR'\tPL:'ILLUMINA'\tPI:150'

to

'@RG\tID:{.}\tSM:{.}\tLB:Trusight_custom_amplicon_CARR\tPL:ILLUMINA\tPI:150'
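A minimal sketch of what the shell does with the quotes here, assuming the whole parallel command is itself wrapped in single quotes as in the question. The variable names broken and fixed are only for illustration. Each inner single quote closes the outer one, so the \t sequences end up unquoted and lose their backslash before parallel or bwa ever sees them. Using double quotes on the outside and single quotes on the inside keeps them intact:

```shell
# Nested single quotes, as in the question: every inner ' toggles quoting,
# so @RG\tID: and \tSM: are unquoted and the shell turns \t into a plain t.
broken='-R '@RG\tID:'{.}'\tSM:'{.}'
printf '%s\n' "$broken"   # prints: -R @RGtID:{.}tSM:{.}

# Double quotes outside, single quotes inside: \t is preserved and the
# single quotes survive into the command string that parallel runs.
fixed="-R '@RG\tID:{.}\tSM:{.}'"
printf '%s\n' "$fixed"    # prints: -R '@RG\tID:{.}\tSM:{.}'

# Untested sketch of the full call, assuming $REFERENCE is set in the calling
# shell and file names contain no spaces or shell metacharacters:
#   ls *_R1_001.fastq | parallel "bwa mem -R '@RG\tID:{= s/_R1_001\.fastq// =}\tSM:{= s/_R1_001\.fastq// =}' $REFERENCE {} {= s/_R1_001/_R2_001/ =} > {= s/_R1_001\.fastq/.sam/ =}"
```

Note that {.} only strips the extension, so ID and SM would become sample_R1_001. The {= s/_R1_001\.fastq// =} form in the commented command strips the whole suffix instead.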
5 months ago by
ATpoint21k
Germany
ATpoint21k wrote:

For parallel jobs it is convenient to wrap the actual job script into a function and use parallel to parallelize this function:

## Want to do parallel alignments:
function BWAMEM {

  BASENAME=$1
  IDX=$2
  # Double quotes keep the literal \t for bwa and protect names containing spaces
  bwa mem (options...) -R "@RG\tID:${BASENAME}_ID\tSM:${BASENAME}_SM\tPL:Illumina" "$IDX" "${BASENAME}.R1_001.fastq" | \
  samtools view -b -o "${BASENAME}_unsorted.bam" -

}; export -f BWAMEM

## Call the function within parallel using awk to extract the basename of the files without the ".R1_001.fastq":
ls *.R1_001.fastq | awk -F ".R1_001.fastq" '{print $1}' | parallel "BWAMEM {} /path/to/bwa_index 2> {}.log"

Using 2> {}.log sends all stderr messages from within the function to one log file per processed fastq file. Say you have a file test.R1_001.fastq: you'll get test_unsorted.bam and test.log as output. With this approach you do not have to worry about quotes inside the parallel command. You can do basically anything you want inside the BWAMEM function, and parallel only executes it. Try squeezing an awk command with all its single and double quotes into parallel, it will be...fun. Or simply write it into a wrapper function to avoid the issues ;-)
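The same idea adapted to the paired-end naming from the question could look roughly like this. It is a sketch, not a tested pipeline: align_pair and DRY_RUN are hypothetical names, and it assumes files are called <sample>_R1_001.fastq and <sample>_R2_001.fastq. With DRY_RUN set, the function only prints the command, which is handy for checking the read group string before starting real jobs.

```shell
# Hypothetical wrapper: derive the sample name and mate file from the R1 path.
align_pair() {
  local r1="$1"
  local ref="$2"
  local sample="${r1%_R1_001.fastq}"     # strip the suffix to get the sample name
  local r2="${sample}_R2_001.fastq"
  # \t stays literal inside double quotes; bwa converts it to a tab itself.
  local rg="@RG\tID:${sample}\tSM:${sample}\tLB:Trusight_custom_amplicon_CARR\tPL:ILLUMINA\tPI:150"

  if [ -n "${DRY_RUN:-}" ]; then
    # Print the command instead of running it, to inspect the quoting.
    printf 'bwa mem -R %s %s %s %s > %s\n' "'$rg'" "$ref" "$r1" "$r2" "${sample}.sam"
  else
    bwa mem -k 19 -A 1 -B 4 -O 6 -L 5 -R "$rg" "$ref" "$r1" "$r2" \
      > "${sample}.sam" 2> "${sample}.log"
  fi
}

# From bash, export the function and hand it to GNU parallel:
#   export -f align_pair
#   ls *_R1_001.fastq | parallel "align_pair {} $REFERENCE"
```

Running DRY_RUN=1 align_pair S1_R1_001.fastq ref.fa should print a bwa mem line with ID:S1 and SM:S1 in the read group, reading S1_R1_001.fastq and S1_R2_001.fastq and writing S1.sam.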
