2.5 years ago
chansik
Hello
I'm using
ls *.sorted_markduplicates.bam | parallel --progress --eta -j 3 'gatk BaseRecalibrator -I {} -R ../0.Reference/CH-PICR.fasta -O {.}.recal.bam'
to process multiple BAM files.
But I get this error:
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
ETA: 0s Left: 16 AVG: 0.00s local:3/0/100%/0.0s
/usr/bin/bash: gatk: command not found
When I run it directly on a single file,
gatk BaseRecalibrator -I input.bam -R reference.fasta -O output.recal.bam
it works.
I think I missed something when setting up the path. After downloading GATK, I added
alias gatk="~/Bio/gatk-4.2.2.0/gatk"
to my .bashrc.
Can you please help me run this on multiple files from the Linux terminal?
Thank you.
parallel is cool, but you should use a workflow manager (snakemake, nextflow, etc.).

I tried to use the nextflow nf-core chip-seq pipeline, but I ran into a problem that I could not solve. The error was:
Nextflow needs to be executed in a shared file system that supports file locks. Alternatively you can run it in a local directory and specify the shared work directory by using by -w command line option.
Thank you
Run echo $PATH, then try which gatk. If gatk is not found, add the gatk folder to your PATH and try which again.
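A check like the one suggested above can be misleading in an interactive terminal, because an alias from .bashrc makes a command work there even when its folder is not on the PATH. The sketch below asks a fresh, non-interactive bash how it resolves a name, which is closer to what a command started by another program sees. The function name resolve_in_child_shell and the stub tool demo_tool_xyz are hypothetical and only serve the demonstration; for the real case you would call the function with gatk.

```shell
#!/usr/bin/env bash
# Ask a fresh, non-interactive bash how it resolves a command name.
# Aliases defined in .bashrc are not visible in such a shell.
resolve_in_child_shell() {
    bash -c 'command -v "$1"' _ "$1"
}

# Hypothetical stub tool in a temporary directory (stands in for gatk).
stub_dir=$(mktemp -d)
printf '#!/usr/bin/env bash\necho stub\n' > "$stub_dir/demo_tool_xyz"
chmod +x "$stub_dir/demo_tool_xyz"

# The folder is not on PATH yet, so the lookup fails.
resolve_in_child_shell demo_tool_xyz
before_status=$?

# After adding the folder to PATH, the lookup prints the full path.
PATH="$stub_dir:$PATH"
after_path=$(resolve_in_child_shell demo_tool_xyz)

echo "before: exit status $before_status"
echo "after:  $after_path"
```

If resolve_in_child_shell gatk prints nothing and returns a non-zero status while gatk still runs in your terminal, the command is probably only available through the alias.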
My gatk is in the PATH; I found it with echo $PATH.
Running gatk on a single BAM file works without a problem, but the parallel command above gives the error.
Thanks,
Not sure what's happening. Can you please try giving parallel the full path to the gatk executable instead of just gatk?
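One likely reason the full path helps: the question defines gatk as an alias in .bashrc, and bash aliases are not inherited by a child shell started with bash -c, which is roughly how a job launched by another program runs. The sketch below reproduces that with a hypothetical stub script (gatk_stub) and alias (gatk_stub_alias) in a temporary directory; it does not call the real gatk or parallel.

```shell
#!/usr/bin/env bash
# Hypothetical stub that stands in for the real gatk launcher.
stub_dir=$(mktemp -d)
cat > "$stub_dir/gatk_stub" <<'EOF'
#!/usr/bin/env bash
echo "stub called with: $*"
EOF
chmod +x "$stub_dir/gatk_stub"

# An alias similar to the one placed in .bashrc in the question.
alias gatk_stub_alias="$stub_dir/gatk_stub"

# A child shell does not inherit the alias, so the name is unknown there.
alias_output=$(bash -c 'gatk_stub_alias BaseRecalibrator' 2>&1)
alias_status=$?

# The full path needs no alias and no PATH entry.
full_output=$(bash -c '"$1" BaseRecalibrator' _ "$stub_dir/gatk_stub" 2>&1)
full_status=$?

echo "alias:     exit status $alias_status ($alias_output)"
echo "full path: exit status $full_status ($full_output)"
```

Applied to the command in the question, and assuming the install location given in the alias, that would mean replacing gatk inside the quoted parallel command with $HOME/Bio/gatk-4.2.2.0/gatk, or adding that folder to PATH in .bashrc instead of defining an alias.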