Hi,

We have all the reference bacterial genomes on our server, organized into subfolders A, B, C, D, ... Z. Each of these contains one subfolder per bacterial genome, and the FASTA file sits inside that subfolder. For example: ~/bacterial_genome/P/Pseudomonas_genome1/genome1.fna

Now I want to determine the resistance status of each genome by running resfinder. I concatenated the whole database into a single input file, but because of its size (~430 GB) our server killed the job, so I installed GNU parallel to handle it. I need to run resfinder 15 times for each genome, because the -a parameter changes each time to a different resistance phenotype.
Generally, the commands for one genome are as follows:

    cd ~/resistance
    mkdir aminoglycoside
    cd aminoglycoside
    export PATH=$PATH:/usr/local/bin/blast-2.2.26/bin
    resfinder.pl -d /Volumes/scratch/databases/resfinderdb/ -i ~/bacterial_genome/P/Pseudomonas_genome1/genome1.fna -a aminoglycoside -k 90.00 -l 0.60
The main problem is that this software creates its results files in the current directory, and if the program is run again in the same directory, it deletes the previous results and saves the new ones.
I am looking for a way to write a script that runs resfinder in parallel for each genome.
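To make the goal concrete, here is a rough, untested sketch of the kind of script I have in mind. It gives every genome/phenotype pair its own output directory so that no run overwrites another. The names outdir_for, run_genome and run_all, the PHENOTYPES list (only aminoglycoside is real, the other 14 still need to be filled in), the --jobs value, and the assumption that genome subfolder names are unique across the A..Z folders are all mine, so please correct anything that looks wrong.

```shell
#!/usr/bin/env bash
# Sketch only: paths come from the post above, everything else is an assumption.

GENOME_ROOT="${GENOME_ROOT:-$HOME/bacterial_genome}"   # <letter>/<genome_dir>/<file>.fna
OUT_ROOT="${OUT_ROOT:-$HOME/resistance}"               # results go under here
DB="${DB:-/Volumes/scratch/databases/resfinderdb/}"

# Space-separated list of values for -a. Only "aminoglycoside" is taken from
# the post; the remaining phenotypes are placeholders to be added by hand.
PHENOTYPES="${PHENOTYPES:-aminoglycoside}"

# Map a FASTA path and a phenotype to a unique output directory, e.g.
# .../P/Pseudomonas_genome1/genome1.fna + aminoglycoside
#   -> $OUT_ROOT/Pseudomonas_genome1/aminoglycoside
# (assumes genome subfolder names are unique across the letter folders)
outdir_for() {
    local fasta=$1 phenotype=$2
    local genome_dir
    genome_dir=$(basename "$(dirname "$fasta")")
    printf '%s/%s/%s\n' "$OUT_ROOT" "$genome_dir" "$phenotype"
}

# Run resfinder for one genome, once per phenotype, each in its own directory.
run_genome() {
    local fasta=$1 phenotype out
    for phenotype in $PHENOTYPES; do
        out=$(outdir_for "$fasta" "$phenotype")
        mkdir -p "$out" || return 1
        # Subshell so the cd does not leak into the next iteration.
        ( cd "$out" &&
          resfinder.pl -d "$DB" -i "$fasta" -a "$phenotype" -k 90.00 -l 0.60 )
    done
}

# Find every .fna file and hand one genome per job to GNU parallel.
run_all() {
    export PATH="$PATH:/usr/local/bin/blast-2.2.26/bin"
    export GENOME_ROOT OUT_ROOT DB PHENOTYPES
    export -f outdir_for run_genome        # bash-only: lets parallel call them
    find "$GENOME_ROOT" -mindepth 3 -maxdepth 3 -name '*.fna' -print0 |
        parallel -0 --jobs 8 run_genome {}
}
```

After sourcing the file, calling run_all would start the jobs. Parallelizing over genomes and looping over the phenotypes inside each job keeps the parallel call simple; since no concatenated 430 GB input is needed any more, each job only reads one genome at a time.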