Cannot figure out the reason for this error message: GISTIC 2.0 input error detected: All input data were removed after NaN processing.
3.6 years ago
morten.rye • 0

I am running GISTIC on several copy-number segmentation files downloaded or generated from various datasets for comparison. GISTIC works fine for all of the segmentation files except one, for which I keep getting the following message.

(base) mh-ikom01:~/GISTIC$ ./run_gistic_ICGC_corrected2
--- creating output directory ---
--- running GISTIC ---
Setting Matlab MCR root to /home/local/mortenry/GISTIC/MATLAB_Compiler_Runtime
GISTIC version 2.0.23

GISTIC 2.0 input error detected:
All input data were removed after NaN processing.

However, I cannot find any reason why this file won't work, since the structure of the segment file is no different from that of the other datasets. I have pasted the first few lines of each file below:

First segments of file that works fine!

V219    1       604268  842413  10      -0.0421
V219    1       850905  2814971 217     0.078
V219    1       2832087 4094205 136     0.0446
V219    1       4113823 5630812 86      -0.0101

108415 segments in total

First segments of file that creates error message!

CPCG0368-F1     1       10000   10467   0       0.000959037722634586
CPCG0368-F1     1       11447   177416  165     0.000959037722634586
CPCG0368-F1     1       227417  267718  40      0.000959037722634586
CPCG0368-F1     1       317719  471367  153     0.000959037722634586
CPCG0368-F1     1       521368  564448  43      0.000959037722634586

785129 segments in total

I should mention two things which I corrected to try to make the file run:

  1. At first GISTIC could not run because of many overlapping segments. I modified the segmentation file so that the segments no longer overlap, which removed the overlap error, but then I got the NaN error instead.

  2. Since the segments come from WGS data, there was no information on the number of markers for each segment. I first added a column of "NA"s for the markers, and thought that this might be the reason for the error. I then replaced this column with pseudo marker counts scaled according to segment length, but I still got the same error message.
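For anyone trying to reproduce step 2, here is a minimal Python sketch of one way to derive pseudo marker counts from segment length. The scale of one marker per 1,000 bases is only a guess based on the example rows above, and the function names and the `minimum` floor are hypothetical. With that scale the first failing segment (10000 to 10467) gets a count of 0, which matches the row shown; whether a zero-marker segment has anything to do with the error is not established here, so the floor of 1 is just something to try.

```python
def pseudo_marker_count(start, end, bases_per_marker=1000, minimum=1):
    """Return a pseudo marker count proportional to segment length.

    bases_per_marker and minimum are illustrative assumptions,
    not values taken from GISTIC or from the original files.
    """
    length = end - start
    return max(minimum, length // bases_per_marker)


def add_pseudo_markers(rows):
    """Insert a pseudo marker column into (sample, chrom, start, end, seg_mean) rows."""
    out = []
    for sample, chrom, start, end, seg_mean in rows:
        markers = pseudo_marker_count(start, end)
        out.append((sample, chrom, start, end, markers, seg_mean))
    return out


if __name__ == "__main__":
    example = [
        ("CPCG0368-F1", "1", 10000, 10467, 0.000959037722634586),
        ("CPCG0368-F1", "1", 11447, 177416, 0.000959037722634586),
    ]
    for row in add_pseudo_markers(example):
        print("\t".join(str(field) for field in row))
```

Setting `minimum=0` reproduces plain length-based scaling, which makes it easy to compare a run with and without zero-marker segments.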

I have also read a previous post on the topic: GISTIC 2.0.23 error. However, segments seem to be defined for all samples, and I have double-checked that the sample IDs are identical in the _segment and _arraylist files, so this does not seem to be the problem.
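The sample ID comparison can also be done programmatically, which catches differences that are hard to see by eye, such as trailing spaces or Windows line endings. The following is a rough sketch, not part of GISTIC: the function names are hypothetical, and it assumes the sample ID is the first tab-separated column of the segment file and that the array list is a single column with an optional "array" header line.

```python
def segment_sample_ids(lines):
    """Collect sample IDs from the first tab-separated column of segment lines."""
    ids = set()
    for line in lines:
        line = line.rstrip("\n")  # keep any "\r" or spaces so they show up as mismatches
        if line:
            ids.add(line.split("\t")[0])
    return ids


def arraylist_sample_ids(lines, header="array"):
    """Collect sample IDs from a one-column array list, skipping an assumed header."""
    ids = set()
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        if not line or (index == 0 and line == header):
            continue
        ids.add(line)
    return ids


def compare_sample_ids(segment_lines, arraylist_lines):
    """Return (IDs only in the segment file, IDs only in the array list)."""
    seg = segment_sample_ids(segment_lines)
    arr = arraylist_sample_ids(arraylist_lines)
    return seg - arr, arr - seg


if __name__ == "__main__":
    seg_lines = ["S1\t1\t100\t200\t5\t0.1\n", "S2\t1\t100\t200\t5\t0.1\n"]
    arr_lines = ["array\n", "S1\n", "S3\n"]
    only_seg, only_arr = compare_sample_ids(seg_lines, arr_lines)
    print("only in segment file:", sorted(map(repr, only_seg)))
    print("only in array list:", sorted(map(repr, only_arr)))
```

Printing the IDs with `repr` makes invisible characters visible. To run it on real files, pass open file handles in place of the lists.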

I'm stuck at the moment, so any help would be appreciated.

Added to the original message: I have done some more checking, and I have been able to generate two segmentation files from the beginning of chr1. Both are subsets of the segmentation file that would not run in the first place. The first file seems to run fine. The second file includes one additional segment compared to the first (the next segment when sorted by position) and generates the error message. I cannot see anything wrong with this segment, so I still cannot figure out why I get the error. I will be happy to provide these two files to anyone who wishes to figure out the problem.
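Since the failure appears to hinge on a single added segment, a row-by-row sanity check of the two subset files might help narrow it down. The sketch below is not GISTIC's own validation. It only flags properties that seem worth ruling out: wrong column count, non-numeric or non-finite values, end before start, a marker count below 1, and overlap with the previous segment of the same sample and chromosome. It assumes six whitespace-separated columns in the order shown above, rows sorted by position within each sample and chromosome, and no header line (a header would be reported as non-numeric).

```python
import math


def check_segments(lines):
    """Return a list of (line_number, message) for suspicious segment rows."""
    problems = []
    previous_end = {}  # (sample, chrom) -> largest end position seen so far
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 6:
            problems.append((number, f"expected 6 columns, found {len(fields)}"))
            continue
        sample, chrom, start_s, end_s, markers_s, mean_s = fields
        try:
            start, end, markers = int(start_s), int(end_s), int(markers_s)
            seg_mean = float(mean_s)
        except ValueError:
            problems.append((number, "non-numeric position, marker count or segment mean"))
            continue
        if not math.isfinite(seg_mean):
            problems.append((number, f"segment mean is not finite: {mean_s}"))
        if end < start:
            problems.append((number, "end position is before start position"))
        if markers < 1:
            problems.append((number, f"marker count below 1: {markers}"))
        key = (sample, chrom)
        if key in previous_end and start <= previous_end[key]:
            problems.append((number, "overlaps the previous segment"))
        previous_end[key] = max(end, previous_end.get(key, end))
    return problems


if __name__ == "__main__":
    example = [
        "CPCG0368-F1 1 10000 10467 0 0.000959037722634586",
        "CPCG0368-F1 1 11447 177416 165 0.000959037722634586",
    ]
    for number, message in check_segments(example):
        print(f"line {number}: {message}")
```

Running this on the working subset and then on the failing subset, and comparing the reports, would show whether the extra segment differs in any of these respects.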

software error genome GISTIC • 1.7k views