Question: ALLPATHS: "ConvertToFastbQualb.pl failed for group 'paired_ends'" when I run PrepareAllPathsInputs.pl
Written 4.1 years ago by mafireyi (50), South Africa:

I am trying to use ALLPATHS for de novo assembly.

My data summary looks like the following.

Run                      Size      Library type               Insert size
Hiseq_Run12_17122014     25 GB     Mate-pair                  size-selected to 3 kb
Hiscan_Run20_12022012    15 GB     Paired-end (Nextera V1)    180 bp
Hiscan_Run19_17102012    17 GB     Paired-end (Nextera V2)    500 bp
Hiscan_Run15_12042012    6.5 GB    Paired-end (Nextera V1)    180 bp
Hiscan_Run14_22032012    3.94 GB   Paired-end (Nextera V1)    380 bp
Hiscan_Run12_01032012    3.75 GB   Paired-end (Nextera V1)    380 bp
Hiscan_Run5_08092011     3.58 GB   Single-end (Nextera V1)    380 bp
Hiscan_Run4re_26072011   1.3 GB    Single-end (Nextera V1)    180 bp
Hiseq_Run14_150313       XX GB     Paired-end                 250 bp

In my CSV files I used the Hiseq_Run14_150313 paired-end reads as the fragment library and the Hiseq_Run12_17122014 mate pairs as the jumping library. I keep getting the error shown further down when I run PrepareAllPathsInputs.pl.

Here's my PBS script:

#!/bin/bash
#PBS -N PrepareAllpaths
#PBS -q batch
#PBS -l nodes=1:ppn=16

cd $PBS_O_WORKDIR
mkdir -p NewGuava/data

#export PATH=/scratch/sysusers/godwin/allpaths-bin/bin:$PATH

/scratch/sysusers/godwin/allpaths-bin/bin/PrepareAllPathsInputs.pl DATA_DIR=$PBS_O_WORKDIR/NewGuava/data  PLOIDY=2 IN_GROUPS_CSV=in_groups.csv IN_LIBS_CSV=in_libs.csv OVERWRITE=True

exit 0

The error I see:

Call to new failed, memory usage before call = 17169108k.

AND

**** 2015-06-29 13:10:03 (CG): ConvertToFastbQualb.pl failed for group 'paired_ends'.
---- 2015-06-29 13:10:04 (CG): Importing group 'mate_ends'.

Please assist. What might be the problem?

next-gen assembly • 1.6k views

It would help if you gave us the exact command you used.

written 4.1 years ago by RamRS (23k)

Try adding an explicit memory-request directive to the PBS header.

written 4.1 years ago by RamRS (23k)
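
A rough sketch of the memory-request suggestion above. The `#PBS -l mem=...` line uses Torque-style syntax and the 64gb figure is a placeholder, both assumptions; other PBS flavours spell the request differently, so check your cluster's documentation. The conversion below only reads the figure from the error log, assuming the trailing "k" means KiB, to show how much memory was already in use when the allocation failed.

```shell
#!/bin/bash
#PBS -N PrepareAllpaths
#PBS -q batch
#PBS -l nodes=1:ppn=16
#PBS -l mem=64gb   # assumed syntax and placeholder value; adjust for your cluster

# The log reported "memory usage before call = 17169108k".
# Assuming "k" means KiB, convert the figure to GiB.
used_kib=17169108
used_gib=$(awk -v k="$used_kib" 'BEGIN { printf "%.1f", k / 1024 / 1024 }')
echo "In use when the allocation failed: ${used_gib} GiB"
```

Whatever value is requested should sit well above that figure, since the failed call was asking for more memory on top of it.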

Thanks, I have tried that. I will see the results tomorrow.

written 4.1 years ago by mafireyi (50)

Wow, preparing datasets for ALLPATHS shouldn't take that long (unless you have tons of data). Also, FYI, ALLPATHS runs much faster on Intel than on AMD processors (we are talking 20 hrs vs. 120 hrs here).

written 4.1 years ago by arnstrm (1.7k)

I have about 40 GB of fragment-library data and 25 GB of mate-pair data. Is that considered tons of data? I have about 100x coverage.

written 4.1 years ago by mafireyi (50)

I had a total of 86 GB of compressed data (36 GB PE + 50 GB MP), with a little over 35x coverage. Preparing the dataset took 127 mins of wall time (32 CPUs, 512 GB memory requested), whereas the actual assembly needed 565 GB RAM and 32 procs and ran for 6166.73 mins (both steps on an AMD machine). It was a different story on an Intel machine!

modified 4.1 years ago • written 4.1 years ago by arnstrm (1.7k)

Interesting fact to know. Thanks.

written 4.1 years ago by mafireyi (50)

Try adding ulimit -s unlimited to your PBS script. I know the ALLPATHS team recommends it, but I don't know what it does :)

written 4.1 years ago by arnstrm (1.7k)
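
For context on the suggestion above: `ulimit -s` is the shell builtin that reports or sets the maximum stack size for the shell and the processes it starts, and `unlimited` removes that cap. The snippet below is only a sketch of how it might be placed near the top of a job script, before the ALLPATHS command. Whether the limit can actually be raised depends on the hard limit configured on the compute node, which is why the failure case is handled.

```shell
#!/bin/bash
# Report the current soft stack limit (in KiB, or "unlimited").
echo "stack limit before: $(ulimit -s)"

# Try to lift the limit. This fails if the hard limit set by the
# administrator is lower than what is being requested.
if ulimit -s unlimited 2>/dev/null; then
    echo "stack limit after: $(ulimit -s)"
else
    echo "could not raise stack limit; hard limit is $(ulimit -H -s)" >&2
fi

# Read the limit back so later steps can log it.
stack_now=$(ulimit -s)
```

The new limit applies only to the current shell and its children, so the line has to be in the job script itself rather than run on the login node.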

Also, make sure the FASTQ files have an fq or fastq extension (gzipped or uncompressed). There should be no spaces after the last comma in either of the CSV files, and a space for each empty field, e.g.:

2000bp, trialrun, genspp, jump, 1, , , 2000, 500, outward, ,

modified 4.1 years ago • written 4.1 years ago by arnstrm (1.7k)
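
The "no spaces after the last comma" advice above is easy to check mechanically. The sketch below uses a hypothetical helper, `check_trailing_space`, that flags any line ending in a space, tab, or carriage return (the last one catches files saved with Windows line endings, an extra check not mentioned in the thread). The temporary file stands in for in_libs.csv; its first line is the example from the comment and its second line is the same with a stray trailing space.

```shell
#!/bin/bash
# Hypothetical helper: list lines that end in whitespace.
check_trailing_space() {
    # Prints "N: trailing whitespace" for each offending line number N
    # and returns non-zero if any such line was found.
    awk '/[ \t\r]+$/ { printf "%d: trailing whitespace\n", NR; bad = 1 }
         END { exit bad }' "$1"
}

# Demo file standing in for in_libs.csv: line 1 is clean, line 2 is not.
tmp=$(mktemp)
printf '%s\n' '2000bp, trialrun, genspp, jump, 1, , , 2000, 500, outward, ,' > "$tmp"
printf '%s\n' '2000bp, trialrun, genspp, jump, 1, , , 2000, 500, outward, , ' >> "$tmp"

report=$(check_trailing_space "$tmp")
echo "$report"
rm -f "$tmp"
```

To check real files, call the function on in_groups.csv and in_libs.csv before submitting the job.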

Oh. Saw your response late. My fastq files have fastq.gz extensions. Will it fail again?

written 4.1 years ago by mafireyi (50)

Oh, I just realised you said gzipped or uncompressed. I thought that said gunzipped.

written 4.1 years ago by mafireyi (50)

Sorry for the confusion. I meant compressed or uncompressed (fastq.gz or fastq)! I normally write the entry like this:

103, 2000bp, /home/path/to/fastqfiles/2000bp/some_sample_number_R?.fastq.gz

written 4.1 years ago by arnstrm (1.7k)
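
The `R?` in the path above is a single-character wildcard, so one entry covers both the R1 and R2 files of a pair. Assuming ALLPATHS expands it the same way a shell glob does, a quick way to confirm that a pattern picks up exactly two files is to let the shell expand it first. The helper name `count_matches` and the file names are made up for this demonstration.

```shell
#!/bin/bash
# Hypothetical helper: count the files a glob expanded to.
count_matches() {
    # The shell expands the glob before calling this function. An unmatched
    # glob is passed through as the literal pattern, so test that it exists.
    [ -e "$1" ] || { echo 0; return; }
    echo "$#"
}

# Throwaway directory with two made-up read files for the demonstration.
tmpdir=$(mktemp -d)
touch "$tmpdir/sample_R1.fastq.gz" "$tmpdir/sample_R2.fastq.gz"

# "?" matches exactly one character, so R? covers R1 and R2.
n_pairs=$(count_matches "$tmpdir"/sample_R?.fastq.gz)
n_none=$(count_matches "$tmpdir"/other_R?.fastq.gz)
echo "matched: $n_pairs, unmatched pattern: $n_none"
rm -rf "$tmpdir"
```

A count other than 2 for a paired library usually means a typo in the path or a file name that does not follow the R1/R2 pattern.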