Question: GATK GermlineCNVCaller & PostprocessGermlineCNVCalls
11 months ago by
rajitz20
rajitz20 wrote:

Hi, I was wondering if anyone here has experience running GATK GermlineCNVCaller & PostprocessGermlineCNVCalls to call CNVs in germline samples?

The VCF files that I'm getting always have ALT set to "<DEL>,<DUP>". Shouldn't ALT be just one of them, or neither? Both the interval and segment VCF files I'm looking at have every position marked as "<DEL>,<DUP>".

If anyone here has experience with this, I would really appreciate some feedback. Thanks!

cnv gatk software error • 449 views
modified 7 weeks ago by Z-F10 • written 11 months ago by rajitz20
6 months ago by
Matt Miossec330
Universidad Andrés Bello
Matt Miossec330 wrote:

Yes, this appears to be normal for the moment; I imagine it will change as the tool is developed further.

The information you're looking for is in the last column of each record. The first element in that column, GT, gives the call: 0 for expected ploidy, 1 for deletion and 2 for duplication.
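To make the GT lookup concrete, here is a minimal Python sketch that reads one VCF record and reports the call. It assumes a single-sample, tab-separated record with the FORMAT column in position 9 and the sample column in position 10 (the standard VCF layout). The function name `classify_record` and the example record are hypothetical and used only for illustration; they are not taken from GATK output.

```python
# Mapping described in the answer above: GT 0 = expected ploidy,
# 1 = deletion (<DEL>), 2 = duplication (<DUP>).
GT_LABELS = {"0": "expected ploidy", "1": "deletion", "2": "duplication"}


def classify_record(vcf_line):
    """Return (contig, position, call label) for one single-sample VCF record."""
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos = fields[0], int(fields[1])
    format_keys = fields[8].split(":")      # e.g. GT:CN:...
    sample_values = fields[9].split(":")    # values in the same order
    genotype = dict(zip(format_keys, sample_values))["GT"]
    return chrom, pos, GT_LABELS.get(genotype, "no call or unknown")


# Hypothetical record, made up for illustration only.
example_line = "1\t1000\tCNV_1_1000_2000\tN\t<DEL>,<DUP>\t.\t.\tEND=2000\tGT:CN\t1:1"
print(classify_record(example_line))
```

Looking GT up by name in the FORMAT column, rather than assuming it is always the first element, keeps the sketch working if the order of the sample fields differs between the interval and segment VCFs.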

The following tutorial ends with a screen grab of what a typical gCNV VCF should look like: https://software.broadinstitute.org/gatk/documentation/article?id=11684

Undoubtedly, this is what your VCF looks like too. Hope this helps!

modified 6 months ago • written 6 months ago by Matt Miossec330
7 weeks ago by
Z-F10
Z-F10 wrote:

Hi everyone,

I am trying to use the GATK germline CNV caller. GATK version used: gatk-4.1.4.0.

I used the following command for the DetermineGermlineContigPloidy step:

../gatk-4.1.4.0/gatk DetermineGermlineContigPloidy \
    -L Filtered_annotated_preprocessed_intervals_Twist.interval_list \
    --interval-merging-rule OVERLAPPING_ONLY \
    -I S1071Nr10.counts.hdf5 \
    -I S1071Nr11.counts.hdf5 \
    --contig-ploidy-priors ../contig_ploidy_priors.tsv \
    --output . \
    --output-prefix ploidy \
    --verbosity DEBUG \
    --mapping-error-rate 0.01 \
    --global-psi-scale 0.001 \
    --sample-psi-scale 1.0E-4 \
    --mean-bias-standard-deviation 0.01

(I added 200 samples as input with -I; those lines are skipped here to save space.)

I installed the conda environment following https://gatk.broadinstitute.org/hc/en-us/articles/360035889851?flash_digest=f2aaedc26749c67b8005def080fde44460155fb6#

Everything was working until I got the following error, which I do not understand and do not know how to solve.

16:54:47.473 DEBUG ScriptExecutor - --output_model_path=/data/NGS/Reanalysis-Package/CNV/ploidy-model
/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
  from ._conv import register_converters as _register_converters
Traceback (most recent call last):
  File "/tmp/cohort_determine_ploidy_and_depth.1941148667013278511.py", line 79, in <module>
    args.contig_ploidy_prior_table)
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/gcnvkernel/io/io_ploidy.py", line 182, in get_contig_ploidy_prior_map_from_tsv_file
    delimiter=delimiter)
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/gcnvkernel/io/io_commons.py", line 50, in read_csv
    input_pd = pd.read_csv(fh, delimiter=delimiter, dtype=dtypes_dict)  # dtypes_dict keys may not be present
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/pandas/io/parsers.py", line 705, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/pandas/io/parsers.py", line 451, in _read
    data = parser.read(nrows)
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/pandas/io/parsers.py", line 1065, in read
    ret = self._engine.read(nrows)
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/pandas/io/parsers.py", line 1828, in read
    data = self._reader.read(nrows)
  File "pandas/_libs/parsers.pyx", line 894, in pandas._libs.parsers.TextReader.read
  File "pandas/_libs/parsers.pyx", line 916, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas/_libs/parsers.pyx", line 970, in pandas._libs.parsers.TextReader._read_rows
  File "pandas/_libs/parsers.pyx", line 957, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas/_libs/parsers.pyx", line 2200, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 5 fields in line 58, saw 7

16:54:55.812 DEBUG ScriptExecutor - Result: 1
16:54:55.813 INFO DetermineGermlineContigPloidy - Shutting down engine
[February 3, 2020 4:54:55 PM IRST] org.broadinstitute.hellbender.tools.copynumber.DetermineGermlineContigPloidy done. Elapsed time: 0.78 minutes.
Runtime.totalMemory()=3370123264
org.broadinstitute.hellbender.utils.python.PythonScriptExecutorException: python exited with 1
Command Line: python

So it seems that the error is:

pandas.errors.ParserError: Error tokenizing data. C error: Expected 5 fields in line 58, saw 7

I googled a lot but could not figure out what the problem is. (I have no experience working with Python; I am just following the steps here: https://gatkforums.broadinstitute.org/gatk/discussion/11684)

Can anyone help me solve this issue?

Thanks in advance,

Zohreh

written 7 weeks ago by Z-F10

The most likely explanation is that the file you're using to define contig ploidy priors (the one passed to --contig-ploidy-priors) has 7 fields instead of the expected 5 in one or more of its rows; the error message points at line 58. Even if the file looks fine, double-check that it doesn't contain any extra tabs (you can make them visible in vim on the command line or in Notepad++). Hope this helps.
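If opening the file in an editor is inconvenient, the same check can be scripted. The sketch below is only an illustration: it assumes the priors file is tab-separated and that its first non-empty line is the header, and the function name `find_bad_rows` is made up for this example. It lists every line whose number of fields differs from the header's.

```python
def find_bad_rows(path, delimiter="\t"):
    """Return (expected_field_count, [(line_number, field_count), ...]).

    The expected count is taken from the first non-empty line (assumed to be
    the header). Line numbers are 1-based positions in the file.
    """
    expected = None
    bad_rows = []
    with open(path, newline="") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.rstrip("\r\n")
            if not line:
                continue  # skip empty lines
            n_fields = len(line.split(delimiter))
            if expected is None:
                expected = n_fields  # header defines the expected width
            elif n_fields != expected:
                bad_rows.append((line_number, n_fields))
    return expected, bad_rows
```

Calling `find_bad_rows("../contig_ploidy_priors.tsv")` and printing the result shows which lines to fix. Trailing tabs count as extra (empty) fields here, which is how invisible characters can turn a 5-column row into a 7-column one.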

modified 4 weeks ago • written 4 weeks ago by Matt Miossec330