Error in CNV Calling with Annotation
2.4 years ago
wei.wei ▴ 10

Hi, I was trying to call CNVs in yeast genomes by running the command

batch Sample1.bam Sample2.bam -n Control1.bam Control2.bam -m wgs -f sacCer.fasta --annotate refFlat.txt

but I ran into the error below:

Traceback (most recent call last):
  File "/Users/wwei/anaconda2/bin/", line 13, in <module>
  File "/Users/wwei/anaconda2/lib/python2.7/site-packages/cnvlib/", line 113, in _cmd_batch
    args.count_reads, args.method)
  File "/Users/wwei/anaconda2/lib/python2.7/site-packages/cnvlib/", line 74, in batch_make_reference
    bam_fname, *autobin_args, bp_per_bin=50000.)
  File "/Users/wwei/anaconda2/lib/python2.7/site-packages/cnvlib/", line 96, in do_autobin
    tgt_bin_size = depth2binsize(tgt_depth, target_min_size, target_max_size)
  File "/Users/wwei/anaconda2/lib/python2.7/site-packages/cnvlib/", line 62, in depth2binsize
    bin_size = int(round(bp_per_bin / depth))
ValueError: cannot convert float NaN to integer

When I omit the --annotate option, it works fine and I am able to obtain the .cnr and .cns files. Has anyone encountered a similar issue, and is there anything I can do about it? Thank you.

2.3 years ago
Eric T. ★ 2.7k

Thanks for reporting, I've filed this issue in the project's GitHub repo:

It looks like the autobin step here ran into a NaN when doing some basic arithmetic with bin depths to estimate a reasonable average bin size. It's surprising that --annotate is responsible for the crash, as autobin shouldn't be doing anything with gene names.
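To illustrate the failure mode, here is a minimal sketch built around the one line of arithmetic visible in the last traceback frame. It is not the library's actual code: `naive_bin_size` and `guarded_bin_size` are hypothetical names, and the guard is only one way such a check might look.

```python
import math


def naive_bin_size(depth, bp_per_bin=50000.0):
    # Same arithmetic as the last traceback frame; fails when depth is NaN.
    return int(round(bp_per_bin / depth))


def guarded_bin_size(depth, bp_per_bin=50000.0):
    # Hypothetical guard: report an unusable depth instead of the NaN error.
    if math.isnan(depth) or depth <= 0:
        raise ValueError("unusable average depth: %r" % (depth,))
    return int(round(bp_per_bin / depth))


try:
    naive_bin_size(float("nan"))
except ValueError as exc:
    print("naive:", exc)

print(guarded_bin_size(25.0))  # 50000 / 25 = 2000
```

A NaN depth means the real problem lies upstream of this arithmetic, so an explicit check mainly helps by producing a more informative message.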

Once you've determined a reasonable average bin size, do you still see the crash with batch ... --target-avg-size=<that-bin-size> --annotate?

Hi, thanks for the help. When I used the --target-avg-size option, it told me the real issue was that the chromosome names did not match in my input. I was able to fix that. I don't know why it threw a bin size error in the beginning.
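For anyone hitting the same thing, a quick way to spot a naming mismatch is to compare the sequence names in the reference FASTA with the chromosome column of the annotation. The sketch below assumes the annotation follows the UCSC refFlat layout, where the chromosome is the third tab-separated column. The helper names and the example records are made up for illustration.

```python
def fasta_chroms(lines):
    # Sequence names are the first whitespace-delimited token after '>'.
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    }


def refflat_chroms(lines):
    # In UCSC refFlat, the chromosome is the third tab-separated column.
    names = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 2:
            names.add(fields[2])
    return names


def unmatched_chroms(fasta_lines, refflat_lines):
    # Names used by the annotation that the reference does not contain.
    return sorted(refflat_chroms(refflat_lines) - fasta_chroms(fasta_lines))


# Made-up example data: the FASTA uses "chrI", the annotation uses "I".
fasta = [">chrI\n", "ACGT\n", ">chrII\n", "ACGT\n"]
refflat = ["YAL069W\tYAL069W\tI\t+\t334\t649\t334\t649\t1\t334,\t649,\n"]
print(unmatched_chroms(fasta, refflat))  # ['I']
```

With real files, open file handles can be passed in place of the lists. Checking the BAM headers as well would need a BAM reader, which is not shown here.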

5 weeks ago
linehammer ▴ 10

You can avoid this with a mask method. Note first that in Python, NaN is a float value that is not equal to itself, so the following comparison evaluates to False:

float('nan') == float('nan')
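As a rough sketch of the masking idea using only the standard library (the depth values are made up), the NaN entries can be flagged first and only the remaining values converted:

```python
import math

depths = [12.0, float("nan"), 30.0]  # made-up values

# Mask: True where the value is a real number, False where it is NaN.
mask = [not math.isnan(d) for d in depths]

# Convert only the unmasked values; NaN entries are skipped.
as_ints = [int(round(d)) for d, keep in zip(depths, mask) if keep]

print(mask)     # [True, False, True]
print(as_ints)  # [12, 30]
```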

The "ValueError: cannot convert float NaN to integer" is raised because an integer has no way to represent NaN, and the same is true of the NumPy integer dtypes that pandas uses by default. Since v0.24, pandas has provided nullable integer data types, which allow integers to coexist with missing values. These are pandas extension dtypes rather than NumPy integers, so use a nullable integer data type (e.g. Int64).


NB: For some reason, you have to convert to a NumPy float dtype first and then to the nullable Int32.
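A minimal sketch of that two-step conversion, assuming pandas 0.24 or later and a made-up column with one missing value:

```python
import numpy as np
import pandas as pd

# Made-up column with one missing value.
raw = pd.Series([1.0, np.nan, 3.0])

# Step 1: make sure the column is a NumPy float dtype, where the gap is NaN.
as_float = raw.astype("float64")

# Step 2: convert to the pandas nullable integer dtype (note the capital "I").
as_int = as_float.astype("Int64")

print(as_int.dtype)            # Int64
print(as_int.isna().tolist())  # [False, True, False]
```

The missing entry stays missing after the conversion, while the other values become true integers.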


