Question: GATK GermlineCNVCaller & PostprocessGermlineCNVCalls
22 months ago by
rajitz20 wrote:

Hi, I was wondering if anyone here has experience running GATK GermlineCNVCaller & PostprocessGermlineCNVCalls to call CNVs in germline samples?

The VCF files that I'm getting always have ALT set to "<DEL>,<DUP>". Shouldn't ALT be just one of them, or neither? Both the interval and segment VCF files I'm looking at have every position marked as "<DEL>,<DUP>".

If anyone here has experience with this, I would really appreciate some feedback. Thanks!

cnv gatk software error • 957 views
modified 12 months ago by Z-F20 • written 22 months ago by rajitz20
17 months ago by
Matt Miossec350
UK/Oxford/Wellcome Centre for human genetics
Matt Miossec350 wrote:

Yes, this appears to be normal for the moment; I imagine it will change as the tool is further developed.

The information you're looking for is in the last column. The first element in that column, GT, gives the call: expected ploidy (0), deletion (1) or duplication (2). GT is an index into the ALT list, which is why ALT always lists both <DEL> and <DUP>.
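To make the GT lookup concrete, here is a minimal Python sketch. It assumes a single-sample VCF whose GT value is one integer, as described above; the function name decode_gt and the example record are made up for illustration and are not taken from a real gCNV output.

```python
def decode_gt(vcf_line):
    """Translate the first sample's GT value into a call label.

    Assumes a tab-separated, single-sample VCF record where GT is a
    single integer index (0 = expected ploidy, 1 = first ALT, ...).
    """
    fields = vcf_line.rstrip("\n").split("\t")
    alt = fields[4]                       # e.g. "<DEL>,<DUP>"
    format_keys = fields[8].split(":")    # e.g. ["GT", "CN"]
    sample_values = fields[9].split(":")  # values in the same order
    gt = sample_values[format_keys.index("GT")]
    if gt == ".":
        return "no call"
    labels = ["expected ploidy"] + alt.split(",")
    return labels[int(gt)]


# Hypothetical record, for illustration only.
example = "1\t1000\tCNV_1_1000_2000\tN\t<DEL>,<DUP>\t.\t.\tEND=2000\tGT:CN\t1:1"
print(decode_gt(example))
```

With the made-up record above, GT is 1, so the function returns the first ALT entry.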

The following tutorial ends with a screen grab of what a typical gCNV VCF should look like:

Undoubtedly, this is what your VCF looks like too. Hope this helps!

modified 17 months ago • written 17 months ago by Matt Miossec350
12 months ago by
Z-F20 wrote:

Hi everyone,

I am trying to use the CNV caller. a) GATK version used: gatk-

I used the following command in this step.

../gatk- -L Filtered_annotated_preprocessed_intervals_Twist.interval_list --interval-merging-rule OVERLAPPING_ONLY -I S1071Nr10.counts.hdf5 -I S1071Nr11.counts.hdf5 ( added 200 samples here as input, skipped those lines here to save the space) --contig-ploidy-priors ../contig_ploidy_priors.tsv --output . --output-prefix ploidy --verbosity DEBUG --mapping-error-rate 0.01 --global-psi-scale 0.001 --sample-psi-scale 1.0E-4 --mean-bias-standard-deviation 0.01

I installed the conda environment following

Everything was working until I got the following error, which I do not understand and do not know how to solve.

16:54:47.473 DEBUG ScriptExecutor - --output_model_path=/data/NGS/Reanalysis-Package/CNV/ploidy-model
/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/h5py/ FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
  from ._conv import register_converters as _register_converters
Traceback (most recent call last):
  File "/tmp/", line 79, in <module>
    args.contig_ploidy_prior_table)
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/gcnvkernel/io/", line 182, in get_contig_ploidy_prior_map_from_tsv_file
    delimiter=delimiter)
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/gcnvkernel/io/", line 50, in read_csv
    input_pd = pd.read_csv(fh, delimiter=delimiter, dtype=dtypes_dict) # dtypes_dict keys may not be present
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/pandas/io/", line 705, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/pandas/io/", line 451, in _read
    data =
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/pandas/io/", line 1065, in read
    ret =
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/pandas/io/", line 1828, in read
    data =
  File "pandas/_libs/parsers.pyx", line 894, in
  File "pandas/_libs/parsers.pyx", line 916, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas/_libs/parsers.pyx", line 970, in pandas._libs.parsers.TextReader._read_rows
  File "pandas/_libs/parsers.pyx", line 957, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas/_libs/parsers.pyx", line 2200, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 5 fields in line 58, saw 7

16:54:55.812 DEBUG ScriptExecutor - Result: 1
16:54:55.813 INFO DetermineGermlineContigPloidy - Shutting down engine
[February 3, 2020 4:54:55 PM IRST] done. Elapsed time: 0.78 minutes.
Runtime.totalMemory()=3370123264
org.broadinstitute.hellbender.utils.python.PythonScriptExecutorException: python exited with 1
Command Line: python

So it seems that the error is:

pandas.errors.ParserError: Error tokenizing data. C error: Expected 5 fields in line 58, saw 7

I googled a lot but could not figure out what the problem is (I have no experience working with Python; I am just following the steps in here).

Can anyone help me to solve the issue?

Thanks in advance,


written 12 months ago by Z-F20

The most likely explanation is that the file you're using to define contig ploidy priors (the one passed to --contig-ploidy-priors) has 7 tab-separated fields instead of 5 on line 58, and possibly on other rows. Even if the file looks fine, double-check that there are no extra tabs, for example trailing ones at the end of a line (you can check with vim on the command line or with Notepad++). Hope this helps.
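If you would rather not check by eye, a short Python sketch along these lines can list the offending rows. The function name find_ragged_rows is hypothetical, and the header and values in the sample are made up for illustration; it simply treats the first non-empty line as the header and reports every later line with a different number of tab-separated fields. Its line numbers are plain 1-based file lines, which may not match pandas' count exactly in every case.

```python
def find_ragged_rows(lines, delimiter="\t"):
    """Return (expected_count, [(line_number, field_count), ...]).

    The first non-empty line is taken as the header; any later line
    whose field count differs from the header's is reported.
    """
    expected = None
    ragged = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if not text:
            continue  # ignore completely empty lines
        count = len(text.split(delimiter))
        if expected is None:
            expected = count  # header defines the expected width
        elif count != expected:
            ragged.append((number, count))
    return expected, ragged


# Made-up sample: the last row ends with two stray tabs.
sample = [
    "CONTIG_NAME\tPLOIDY_PRIOR_0\tPLOIDY_PRIOR_1\tPLOIDY_PRIOR_2\tPLOIDY_PRIOR_3\n",
    "1\t0.01\t0.01\t0.97\t0.01\n",
    "X\t0.01\t0.49\t0.49\t0.01\t\t\n",
]
expected, ragged = find_ragged_rows(sample)
for number, count in ragged:
    print(f"line {number}: expected {expected} fields, saw {count}")
```

To run it on a real file, pass an open file handle instead of the sample list, for example find_ragged_rows(open("contig_ploidy_priors.tsv")).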

modified 11 months ago • written 11 months ago by Matt Miossec350

Thanks! That solved the issue.

written 7 months ago by Z-F20