Hello! I am working on a script for a VCF pipeline.
I currently receive the following error in my bwa step....
line 22: 3354610 Killed bwa mem bwaParts my_lovely_file_trimmed.fq > $bwa_sam
...This is a memory issue that I need to speak to my supervisor about. Nonetheless, because of it, the downstream SAMtools and GATK commands don't run.
Even though I am dealing with this error early in the pipeline, I still want to know whether the samtools and GATK syntax is correct and will work once the memory issue is resolved.
Here is the code:
samtools sort $bwa_sam -o $sam_sam
gatk AddOrReplaceReadGroups -I sorted_sam -O GATK_bam -RGLB lib1 --RGPL illumina --RGPU unit1 --RGSM file
If you're well seasoned at these commands, like a steak at one of Gordon Ramsay's restaurants, please take a peek for my sanity.
All the best, Gilgy
P.S. bwaParts are the amb, ann, bwt, pac, and sa files from the bwa index command.
Hey Pierre,
How come using AddOrReplaceReadGroups with BWA is discouraged? Also, thank you for the recommendation on using BAM, I will try with .bam files.
edit: I seem to be getting an error that makes no sense.
returns a 'killed' or 'segmentation fault - core dumped', even at 32 threads & 128G
Meanwhile
returns an indexing error.
Because read groups should already be set at the start, with the '-R' option of bwa mem. Using AddOrReplaceReadGroups is an extra, time-consuming step.
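A minimal sketch of that approach, assuming bwaParts is the index prefix passed to bwa index and reusing the read-group values from the AddOrReplaceReadGroups call above. The ID value, the thread count, and the output file name are placeholders I made up, so swap in your own. The guard only skips the run when the tools or input are missing; it is not part of the method.

```shell
# Read-group values taken from the original GATK command; ID is a placeholder.
sample="file"
rg="@RG\tID:${sample}\tSM:${sample}\tLB:lib1\tPL:illumina\tPU:unit1"

index_prefix="bwaParts"                  # prefix given to 'bwa index'
fastq="my_lovely_file_trimmed.fq"
sorted_bam="${sample}.sorted.bam"        # hypothetical output name

if command -v bwa >/dev/null 2>&1 && command -v samtools >/dev/null 2>&1 && [ -f "$fastq" ]; then
    # -R writes the @RG header and tags every read, so no AddOrReplaceReadGroups step is needed.
    # Piping into samtools sort avoids writing an uncompressed SAM to disk.
    bwa mem -t 4 -R "$rg" "$index_prefix" "$fastq" \
        | samtools sort -@ 4 -O bam -o "$sorted_bam" -
    samtools index "$sorted_bam"
else
    echo "bwa, samtools, or the input FASTQ was not found; skipping alignment" >&2
fi
```

The backslash-t sequences inside the double quotes stay literal, which is what bwa mem expects; it converts them to real tabs when it writes the header.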