Speed up computation time for metasv
3.6 years ago
kmsh410 ▴ 40

Hi Biostars,

Currently, I am using metasv to merge SVs from the outputs of BreakDancer, CNVnator, and Pindel for a human genome. Are there any tricks I could use to reduce the computation time? I also posted the same question (issue #134) to the author on the metasv GitHub, but I am not sure whether I will get a reply from the developer. Any suggestions will be appreciated.

I installed metasv from Anaconda using the command below:

conda install -c bioconda metasv

The version of metasv:

[ksux 18:11:36 ksux_SVE]$ run_metasv.py --version
run_metasv.py 0.5.4

I ran run_metasv.py on the example files without any issue, so I moved on to my own data. The job has now been running on our HPC for over 5 days. My batch script is listed below.

#!/bin/bash
#SBATCH --qos=long
#SBATCH --time=7-00:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=20
#SBATCH --mem=64G

module load anaconda/2.5.0 bedtools/2.27.1
module load gcc/4.8.2
module load cmake/3.0.2 ROOT/5.34.36
export CONDA_ENVS_PATH=/lustre/project/ksux_SVE
unset PYTHONPATH
source activate SVE

metaSV_ref=Homo_sapiens_assembly38.fasta
breakdancer_our=/data/BreakDancer_out/Subject_ID.sv.tbl
cnvnator_call=/data/CNVnator_out/Subject_ID.cnv.xls
pindel_out=/data/pindel_out/Sample_dir/Subject_ID/*
sample_id=Subject_ID_tbl
alignments_bam=/data/Subject_ID.bam
spades_exe=/ksux_SVE/SVE/bin/spades.py
age_align_exe=/ksux_SVE/SVE/bin/age_align
threads=20
work=/data/metaSV_work2
OUTDIR=/data/metaSV_out2
insert_size_mean=260.04
insert_size_sd=56.34
metaSV_svs_to_assemble={'DEL','INS','INV','DUP'}

run_metasv.py --reference $metaSV_ref \
--breakdancer_native $breakdancer_our \
--cnvnator_native $cnvnator_call \
--pindel_native $pindel_out \
--sample $sample_id \
--bam $alignments_bam \
--spades $spades_exe \
--age $age_align_exe \
--num_threads $threads \
--workdir $work \
--outdir $OUTDIR \
--isize_mean $insert_size_mean \
--isize_sd $insert_size_sd

I haven't found any issues in the log file so far, but the running time is much longer than I expected.

Thanks for reading this post.

structural variants metasv ensemble approach • 875 views

Update: The author suggested turning off the assembly step in metasv, which reduced the running time to ~10 minutes per subject. With assembly enabled, the run can take quite some time depending on the data. In my case, the BAM file holds 2x 350bp reads (21x coverage), and the overall computation time would exceed 7 days. I would like to keep it running, but our server allows a maximum of one week per job.
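
For anyone who wants to try the same thing, here is a rough sketch of the command with assembly switched off. I am assuming the flag is called `--disable_assembly`, as in the MetaSV README; please confirm with `run_metasv.py --help` for your installed version. The paths are the ones from my script above, `Subject_ID` is a placeholder sample name, and `--spades` and `--age` are left out on the assumption that they are only needed for the assembly step. The script only prints the command (a dry run) so it can be checked before submitting the job.

```shell
#!/bin/bash
# Sketch only: same inputs as in the original post, with assembly switched off.
# --disable_assembly is assumed from the MetaSV README; verify it with
# `run_metasv.py --help` before relying on it.
metasv_args=(
  --reference Homo_sapiens_assembly38.fasta
  --breakdancer_native /data/BreakDancer_out/Subject_ID.sv.tbl
  --cnvnator_native /data/CNVnator_out/Subject_ID.cnv.xls
  --pindel_native /data/pindel_out/Sample_dir/Subject_ID/*   # glob expands to the Pindel files
  --sample Subject_ID                                        # placeholder sample name
  --bam /data/Subject_ID.bam
  --num_threads 20
  --workdir /data/metaSV_work2
  --outdir /data/metaSV_out2
  --isize_mean 260.04
  --isize_sd 56.34
  --disable_assembly                                         # skip the assembly step
)

# Dry run: print the command for review. Drop "echo" to actually launch it.
echo run_metasv.py "${metasv_args[@]}"
```

Keeping the arguments in a bash array avoids the line-continuation backslashes and makes it easy to add or remove a single option.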


Out of curiosity, which platform generates PE 350bp reads? Even the MiSeq is capped at 2x300.

