SURPI is a pipeline for identifying pathogens in clinical metagenomics samples. It is tested only on Ubuntu and assumes many things about your installation. I recently installed it on CentOS. Here are a few key points to take care of before running the pipeline.
The github page of SURPI is https://github.com/chiulab/surpi and it's published in Genome Research.
- The `create_snap_to_nt.sh` program uses `-Ofactor` as 1000 on line 29, which may not work on your machine. You need to figure out the correct value and change it accordingly. Read the SNAP aligner documentation for details.
- The abyss installation requires `mmap`. Make sure you have installed it before compiling abyss. http://hackage.haskell.org/package/mmap-0.5.9/mmap-0.5.9.tar.gz
- Make sure `formatdb` is in your path. It can be downloaded from ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST/
- The `taxonomy_lookup.pl` program has `sort --parallel=$cores` at line 84. You may need to remove the `--parallel=$cores` option if the sort utility on your machine does not support `--parallel`.
- The `abyss_minimus.sh` program tries to use `mpirun` to run in parallel. If mpirun is not configured properly, you need to remove the option 'np=$cores' on line 86 so that it does not run in parallel.
- The `ribo_snap_bac_euk.sh` program is hardcoded to pass 10,75 as arguments to `crop_reads.csh`, which you may need to change on line 43.
- The `coveragePlot.py` program uses `mlab.load()` at line 47, which is deprecated in the latest version of matplotlib. Hence, you may need to change it to `np.loadtxt()`.
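Several of the points above come down to what is actually installed on your machine, so it can help to check before starting a long run. Here is a minimal sketch of such a check in Python. The function names and the `REQUIRED_TOOLS` list are my own for illustration and are not part of SURPI. It only looks for the tools on your PATH and tests whether your `sort` accepts `--parallel`; it does not verify that mpirun is configured correctly.

```python
import shutil
import subprocess

# Tools mentioned in the notes above (hypothetical list; extend it for your setup).
REQUIRED_TOOLS = ["formatdb", "mpirun", "sort"]


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def sort_supports_parallel(sort_cmd="sort"):
    """Return True if `sort --parallel=1` runs cleanly on empty input."""
    try:
        result = subprocess.run(
            [sort_cmd, "--parallel=1"],
            input="",
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The binary is missing, not executable, or hung.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    for tool in missing_tools(REQUIRED_TOOLS):
        print(f"missing from PATH: {tool}")
    if not sort_supports_parallel():
        print("sort does not accept --parallel; remove it in taxonomy_lookup.pl")
```

If the script prints nothing, the tools were found and `sort` accepted the option. The other points (the `-Ofactor` value, the `crop_reads.csh` arguments) still need to be checked by hand.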
I will update this post as and when I find more issues with the pipeline.
I forgot to mention that the configuration file that gets created will contain the wrong path for the reference sequences. For example, in your case the path to the db is dbname=/reference/taxonomy/gi_taxid_nucl.db, but I am sure it should not be /reference/; it should be either reference/ or <full path>/reference/...
Change all the paths carefully in the configuration file.
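To spot the bad paths quickly, something like the following sketch may help. It assumes the configuration file consists of simple `key=value` lines, as in the `dbname=` example above; I have not checked this against every entry in the file. The function name `find_missing_paths` is made up for this post. It treats any value containing a `/` as a path and reports the ones that do not exist on disk.

```python
import os
import re
import sys

# Matches simple KEY=VALUE lines. The exact layout of the config file is an assumption.
ASSIGNMENT = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)=(.*)$")


def find_missing_paths(config_text, exists=os.path.exists):
    """Return (key, path) pairs whose value looks like a path but does not exist."""
    missing = []
    for line in config_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        match = ASSIGNMENT.match(stripped)
        if not match:
            continue
        key = match.group(1)
        value = match.group(2).strip().strip("\"'")  # drop surrounding quotes
        if "/" in value and not exists(value):
            missing.append((key, value))
    return missing


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_paths.py <config file>
    with open(sys.argv[1]) as handle:
        for key, path in find_missing_paths(handle.read()):
            print(f"{key}: {path} does not exist")
```

Relative paths are resolved against the directory you run the script from, so run it from the same place you would run the pipeline.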