PBS scripts for CD-HIT
5.3 years ago
lucia ▴ 10

Has anyone used PBS to run CD-HIT? I tried to run it, but it fails with an error like this:

> my job id is 4967.admin1
> run nodes is following:
> begin time is Fri Dec 28 16:07:39 CST 2018
> --------------------------------------------------------------------------
> Open MPI tried to fork a new process via the "execve" system call but
> failed.  Open MPI checks many things before attempting to launch a
> child process, but nothing is perfect. This error may be indicative of
> another problem on the target host, or even something as silly as
> having specified a directory for your application. Your job will now
> abort.
>
>   Local host:        smp1
>   Application name:  hit
>   Error:             Exec format error
> --------------------------------------------------------------------------
> end time is Fri Dec 28 16:07:39 CST 2018

And my command is:

cd-hit-est -i merge.fna -o unigene.fasta -c 0.95 -n 8

I also tried to run it directly on the command node, but it shows:

 ./hit
================================================================
Program: CD-HIT, V4.7 (+OpenMP), Jul 13 2018, 17:17:44
Command: cd-hit-est -i merge.fna -o unigene.fasta -c 0.95 -n 8

Started: Fri Dec 28 16:18:11 2018
================================================================
                            Output
----------------------------------------------------------------
total seq: 180956

Warning:
Some seqs are too long, please rebuild the program with make parameter MAX_SEQ=new-maximum-length (e.g. make MAX_SEQ=10000000)
Not fatal, but may affect results !!

longest and shortest : 662596 and 500
Total letters: 433057744
Sequences have been sorted

Approximated minimal memory consumption:
Sequence        : 457M
Buffer          : 1 X 155M = 155M
Table           : 1 X 3M = 3M
Miscellaneous   : 2M
Total           : 619M

Table limit with the given memory limit:
Max number of representatives: 40000
Max number of word counting entries: 22619494

comparing sequences from          0  to         82
./hit: line 1:  2160 Segmentation fault      (core dumped) cd-hit-est -i merge.fna -o unigene.fasta -c 0.95 -n 8

I have no experience with PBS, so I wonder whether the problem is in my PBS script. Any suggestions will be appreciated:

#!/bin/sh

#PBS -N hit
#PBS -l nodes=1:ppn=1
#PBS -j oe
#PBS -l walltime=480:00:00
#PBS -q smp

source /public/home/software/mpi/openmpi.sh
echo my job id is $PBS_JOBID | tee  openmpi.log
echo run nodes is following: | tee -a openmpi.log
echo begin time is `date` | tee -a  openmpi.log
mpirun /public/home/wuyue/data_P101SC18100564-01-B1-21/htest/hit | tee -a openmpi.log
echo end time is `date` | tee -a  openmpi.log
CD-HIT PBS • 2.6k views
5.3 years ago

At first glance, it seems that some of your input sequences are longer than your build of CD-HIT can process. That is why it prints the message telling you to recompile CD-HIT with an option for sequences longer than the default:

> Warning:
> Some seqs are too long, please rebuild the program with make parameter MAX_SEQ=new-maximum-length (e.g. make MAX_SEQ=10000000)
> Not fatal, but may affect results !!
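To pick a sensible MAX_SEQ value before rebuilding, it can help to check the longest sequence in the input yourself (the log above reports 662596). The following is a small sketch; longest_seq is a hypothetical helper name, not part of CD-HIT, and it assumes a plain, uncompressed FASTA file.

```shell
# Print the length of the longest sequence in a FASTA file.
# Sequence lines may be wrapped, so lengths are summed per record.
longest_seq() {
    awk '
        /^>/ { if (len > max) max = len; len = 0; next }  # header: close previous record
             { len += length($0) }                        # sequence line: accumulate
        END  { if (len > max) max = len; print max + 0 }  # close the last record
    ' "$1"
}

# Usage: longest_seq merge.fna
# Then rebuild CD-HIT with a limit above that value, for example the
# value suggested in CD-HIT's own warning: make MAX_SEQ=10000000
```

Any MAX_SEQ larger than the reported length should silence the warning; the value in the comment is simply the one CD-HIT itself suggests.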

I'm not really sure why you | tee all your commands in your PBS script; their output will normally already be captured by the stderr/stdout streams of your job (the .e and .o files).

I also think that CD-HIT does not support Open MPI. You can use the -T option to run CD-HIT on multiple cores (but this is something different from MPI), and since you only request a single node to run the software on, MPI is likely not useful.

I suggest changing this line in your PBS script: #PBS -l nodes=1:ppn=1 to #PBS -l nodes=1:flags=allprocs and then adding -T 0 to your cd-hit command line (which will make cd-hit use all available cores on the machine).
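Putting these suggestions together, a revised job script might look roughly like the sketch below. It assumes cd-hit-est is on the PATH of the compute node, that the job is submitted from the directory holding merge.fna, and that 8 cores are available; ppn=8 and -T 8 are illustrative values, not from the original post, and the flags=allprocs plus -T 0 variant is the alternative if your scheduler supports it. The sketch calls cd-hit-est directly instead of going through mpirun. The "Exec format error" most likely arises because mpirun tries to exec the hit wrapper as a program, and a text file without a #!/bin/sh first line cannot be exec'd that way, whereas an interactive shell falls back to interpreting it as a script.

```shell
#!/bin/sh
#PBS -N hit
#PBS -l nodes=1:ppn=8
#PBS -j oe
#PBS -l walltime=480:00:00
#PBS -q smp

# Assumption: 8 cores on one node; keep this equal to ppn above.
THREADS=8

# PBS starts the job in the home directory, so change to the directory
# the job was submitted from (falls back to "." outside PBS).
cd "${PBS_O_WORKDIR:-.}" || exit 1

# The command from the question, plus -T for CD-HIT's OpenMP threads.
CMD="cd-hit-est -i merge.fna -o unigene.fasta -c 0.95 -n 8 -T $THREADS"

# With "#PBS -j oe", everything echoed here lands in the job's .o file,
# so no tee is needed.
echo "my job id is ${PBS_JOBID:-none}"
echo "begin time is $(date)"
if command -v cd-hit-est >/dev/null 2>&1; then
    $CMD
else
    echo "cd-hit-est not found in PATH; would run: $CMD"
fi
echo "end time is $(date)"
```

Submit it with qsub as before. Note that this only addresses the launch error; the segmentation fault seen on the command node is a separate issue and may well be related to the over-long sequences mentioned above.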


For more info on this, have a look at this Stack Overflow post: https://stackoverflow.com/questions/32464084 (thanks to h.mon)
