Hi everybody, I'm still a newbie in bioinformatics and I'm stuck on MaSuRCA 3.3.0, so any help will be more than welcome. I am trying to assemble a bacterial genome from MiSeq paired-end reads in MaSuRCA 3.3.0 without the grid options. My configuration file looks like this:
> PE= aa 519 844
> /home1/cascarano/projects/miseq/pant_bact/DLK2/DLK2_S1_L001_R1_001.fastq
> /home1/cascarano/projects/miseq/pant_bact/DLK2/DLK2_S1_L001_R2_001.fastq
>
>
> #Illumina mate pair reads supplied as <two-character prefix> <fragment mean> <fragment stdev> <forward_reads> <reverse_reads>
> #JUMP= sh 3600 200
> #pacbio OR nanopore reads must be in a single fasta or fastq file with absolute path, can be gzipped
> #if you have both types of reads supply them both as NANOPORE type
> #PACBIO=/FULL_PATH/pacbio.fa
> #NANOPORE=/FULL_PATH/nanopore.fa
> #Other reads (Sanger, 454, etc) one frg file, concatenate your frg files into one if you have many
> #OTHER=/FULL_PATH/file.frg
> END
>
> PARAMETERS
> #set this to 1 if your Illumina jumping library reads are shorter than 100bp
> #EXTEND_JUMP_READS=0
> #this is k-mer size for deBruijn graph values between 25 and 127 are supported, auto will compute the optimal size based on the read data and GC content
> GRAPH_KMER_SIZE = auto
> #set this to 1 for all Illumina-only assemblies
> #set this to 0 if you have more than 15x coverage by long reads (Pacbio or Nanopore) or any other long reads/mate pairs (Illumina MP, Sanger, 454, etc)
> USE_LINKING_MATES = 1
> #specifies whether to run mega-reads correction on the grid
> #USE_GRID=0
> #specifies grid engine to use SGE or SLURM
> #GRID_ENGINE=SLURM
> #specifies queue (for SGE) or partition (for SLURM) to use when running on the grid MANDATORY
> #GRID_QUEUE=all.q
> #batch size in the amount of long read sequence for each batch on the grid
> #GRID_BATCH_SIZE=300000000
> #use at most this much coverage by the longest Pacbio or Nanopore reads, discard the rest of the reads
> #LHE_COVERAGE=25
> #set to 1 to only do one pass of mega-reads, for faster but worse quality assembly
> MEGA_READS_ONE_PASS=0
> #this parameter is useful if you have too many Illumina jumping library mates. Typically set it to 60 for bacteria and 300 for the other organisms
> #LIMIT_JUMP_COVERAGE = 60
> #these are the additional parameters to Celera Assembler. do not worry about performance, number or processors or batch sizes -- these are computed automatically.
> #set cgwErrorRate=0.25 for bacteria and 0.1<=cgwErrorRate<=0.15 for other organisms.
> CA_PARAMETERS = cgwErrorRate=0.25
> #minimum count k-mers used in error correction 1 means all k-mers are used. one can increase to 2 if Illumina coverage >100
> KMER_COUNT_THRESHOLD = 1
> #whether to attempt to close gaps in scaffolds with Illumina data
> CLOSE_GAPS=1
> #auto-detected number of cpus to use
> NUM_THREADS = 20
> #this is mandatory jellyfish hash size -- a safe value is estimated_genome_size*estimated_coverage
> JF_SIZE = 460000000
> #set this to 1 to use SOAPdenovo contigging/scaffolding module. Assembly will be worse but will run faster. Useful for very large (>5Gbp) genomes from Illumina-only data
> SOAP_ASSEMBLY=0
> END
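Since most lines in this file are comments, it can help to look only at the settings that are actually active. The following is a minimal sketch, not part of MaSuRCA itself: `active_settings` is a hypothetical helper name, and the file path passed to it is a placeholder for wherever the config is saved.

```shell
# Print only the active (uncommented, non-blank) lines of a MaSuRCA-style config.
# Usage: active_settings path/to/config.txt   (the path is a placeholder)
active_settings() {
  # Drop lines that are empty or whose first non-space character is '#'.
  grep -vE '^[[:space:]]*(#|$)' "$1"
}
```

Run on the config above, the `#USE_GRID=0` and `#GRID_ENGINE=SLURM` lines would be filtered out, because both are commented out.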
I get this error:
[Mon Mar 4 12:18:14 EET 2019] Overlap/unitig failed, check output under CA/ and runCA1.out
Looking at runCA1.out with less, I get:
----------------------------------------END Mon Mar 4 12:18:14 2019 (0 seconds)
Created 13 overlap jobs. Last batch '001', last job '000013'.
----------------------------------------START Mon Mar 4 12:18:14 2019
sbatch -D `pwd` -J "ovl_genome[1-13]" -a 1-13 \
  -o /home1/cascarano/projects/miseq/pant_bact/DLK2/MASURCAnoGrid/CA/1-overlapper/%A_%a.out \
  /home1/cascarano/projects/miseq/pant_bact/DLK2/MASURCAnoGrid/CA/1-overlapper/overlap.sh
sh: 1: sbatch: not found
----------------------------------------END Mon Mar 4 12:18:14 2019 (0 seconds)
ERROR: Failed with signal 127
================================================================================
runCA failed.
----------------------------------------
Stack trace:
at /mnt/big/Assembly/MaSuRCA-3.3.0/bin/../CA8/Linux-amd64/bin/runCA line 1613.
main::caFailure("Failed to submit batch jobs.") called at /mnt/big/Assembly/MaSuRCA-3.3.0/bin/../CA8/Linux-amd64/bin/runCA line 87
main::submitBatchJobs(" -D `pwd` -J \"ovl_genome[1-13]\" -a 1-13 \\\x{a} -o /home1/casca"..., "ovl_genome[1-13]") called at /mnt/big/Assembly/MaSuRCA-3.3.0/bin/../CA8/Linux-amd64/bin/runCA line 3809
main::createOverlapJobs("normal") called at /mnt/big/Assembly/MaSuRCA-3.3.0/bin/../CA8/Linux-amd64/bin/runCA line 6523
----------------------------------------
Failure message:
Failed to submit batch jobs.
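The `sh: 1: sbatch: not found` line suggests that the Slurm submission command is not available on the machine where the assembly ran. A small sketch for checking this from the same shell is below. `have_cmd` and `no_such_command_xyz` are hypothetical names used only for illustration. The last lines show that a POSIX shell returns exit status 127 for a command it cannot find, which presumably is where the 127 in the log comes from.

```shell
# Report whether a command can be found on the current PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd sbatch; then
  echo "sbatch found at: $(command -v sbatch)"
else
  echo "sbatch is not on PATH"
fi

# Run a deliberately missing command and capture its exit status.
status=0
sh -c 'no_such_command_xyz' 2>/dev/null || status=$?
echo "exit status: $status"
```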