Question: How to perform GISTIC analysis on GenePattern?
12 months ago by
Dmitri Ivanovich0 wrote:

Hi guys, I have CNV data from TCGA and a markers file, as shown below:

Seg file:

Sample Chromosome Start End Num_Probes Segment_Mean
TCGA.05.4249.01A 1 3218610 120527361 67456 -0.1725
TCGA.05.4249.01A 1 149881398 167526508 10663 0.5859
TCGA.05.4249.01A 1 167526675 167526823 2 -1.1518
TCGA.05.4249.01A 1 167526972 247813706 50571 0.5816
TCGA.05.4249.01A 2 484222 242476062 130861 0.0585

Markers file:

Probe.Name Chromosome Start
CN_473963 1 61735
CN_473964 1 61808
CN_473965 1 61823
CN_477984 1 62152
CN_473981 1 62920
CN_473982 1 62937

My goal is to run a GISTIC analysis with the GISTIC 2.0 module in GenePattern, but the run always fails with this error:

"GISTIC version 2.0.23 GISTIC 2.0 input error detected: 76606 segment start or end positions in '/opt/gpcloud/gp_home/users/genye/uploads/tmp/run8835511072592266907.tmp/seg.file/1/biguoshu1.txt' do not match any markers in '/opt/gpcloud/gp_home/users/genye/uploads/tmp/run6032895478607754030.tmp/markers.file/2/markersMatrix.txt'. First bad position is 10:24732567 at line 33."

I uploaded my files in .txt format and chose GISTIC version 6.15.28, with Human_hg19.mat as the refgene file. All other parameters were left at their defaults. Could anyone please tell me what the problem is and how to solve it? Thank you!

cnv rna-seq tcga snp gistic • 858 views
ADD COMMENTlink modified 6 months ago by 295353195510 • written 12 months ago by Dmitri Ivanovich0

Hi, did you fix the problem? I had the same problem.

ADD REPLYlink written 6 months ago by 295353195510

Please show the exact error message, a sample of your input data, and all commands that you have tried. Thank you.

ADD REPLYlink written 6 months ago by Kevin Blighe65k

Thank you for your reply!

error: GISTIC 2.0 input error detected:
198278 segment start or end positions in '/opt/gpcloud/gp_home/users/yuduoduo/uploads/tmp/run1102255665308516451.tmp/seg.file/1/MaskedCopyNumberSegment.txt' do not match any markers in '/opt/gpcloud/gp_home/users/yuduoduo/uploads/tmp/run5183191344025581000.tmp/markers.file/2/genome.info.6.0_hg19.na31_minus_frequent_nan_probes_sorted_2.1(1).txt'.
First bad position is 1:2116145 at line 1.

input file:

TCGA-MQ-A4LJ-01A    1   62920   2116145 358 0.0051
TCGA-MQ-A4LJ-01A    1   2125269 3259074 359 -0.0884
TCGA-MQ-A4LJ-01A    1   3259896 12779433    5732    0.0459
TCGA-MQ-A4LJ-01A    1   12792599    12922922    33  -0.7534

marker file:

genome.info.6.0_hg19.na31_minus_frequent_nan_probes_sorted_2.1.txt
CN_473963   1   61735
CN_473964   1   61808
CN_473965   1   61823
CN_477984   1   62152
CN_473981   1   62920
CN_473982   1   62937
CN_497980   1   72704

All other parameters were left at their defaults. Thank you!

ADD REPLYlink modified 6 months ago by Kevin Blighe65k • written 6 months ago by 295353195510

You could try without the markers file, which is now possible with later versions of GISTIC. Also, just double-check that the formatting of your files is correct.
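If you do keep the markers file, one way to double-check it is to look for segment boundaries that have no marker at the same chromosome and position, which is what the error message complains about. Here is a rough sketch in Python, assuming whitespace-separated columns in the order shown in this thread (name, chromosome, position for markers; sample, chromosome, start, end, probes, mean for segments). The function names `load_markers` and `unmatched_positions` are made up for illustration and are not part of GISTIC or GenePattern.

```python
import io

def load_markers(handle):
    """Collect (chromosome, position) pairs from a markers file (name, chromosome, position)."""
    markers = set()
    for line in handle:
        fields = line.split()
        if len(fields) < 3 or not fields[2].isdigit():
            continue  # header or malformed line
        markers.add((fields[1], int(fields[2])))
    return markers

def unmatched_positions(seg_handle, markers):
    """Return (line_number, chromosome, position) for every segment boundary not in markers."""
    bad = []
    for line_no, line in enumerate(seg_handle, start=1):
        fields = line.split()
        if len(fields) < 6 or not (fields[2].isdigit() and fields[3].isdigit()):
            continue  # header or malformed line
        for pos in (int(fields[2]), int(fields[3])):
            if (fields[1], pos) not in markers:
                bad.append((line_no, fields[1], pos))
    return bad

# Rows copied from this thread; a real check would open the full files instead.
markers_text = "CN_473981\t1\t62920\nCN_473982\t1\t62937\nCN_497980\t1\t72704\n"
seg_text = "TCGA-MQ-A4LJ-01A\t1\t62920\t2116145\t358\t0.0051\n"

markers = load_markers(io.StringIO(markers_text))
for line_no, chrom, pos in unmatched_positions(io.StringIO(seg_text), markers):
    print(f"line {line_no}: {chrom}:{pos} has no marker")
```

With only the excerpt above, the start position 62920 is found and the end position 2116145 is not, which is the same position the error reports. Run against the complete files, the size of the unmatched list should show whether the two files come from the same probe set.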

ADD REPLYlink modified 6 months ago • written 6 months ago by Kevin Blighe65k

Thank you for your help, Kevin! But another problem arose: I got many amplified/deleted regions. The plots are very noisy, and amplification/deletion occurs in almost every gene. Could you please tell me what the problem is and how to solve it? Thank you!

ADD REPLYlink written 5 months ago by 295353195510

Hi, you are not giving me much information with which to begin to help. Please share, in detail, the data that you obtained and the code that you used to process it.

ADD REPLYlink written 5 months ago by Kevin Blighe65k

Sorry! The plot always looks like this: https://ibb.co/swrq1qD

My masked CNV segment data came from TCGA, and I ran the GISTIC analysis with the GISTIC 2.0 module in GenePattern. All other parameters were left at their defaults. The plots are very noisy, and amplification/deletion occurs in almost every gene.

ADD REPLYlink modified 5 months ago • written 5 months ago by 295353195510

You took data from the GDC? That data is segmented copy number data produced by DNAcopy, I believe. You then used that as input to GISTIC?

Could you take a look here to see how this matches up to what you have done? - A: How to extract the list of genes from TCGA CNV data

ADD REPLYlink written 5 months ago by Kevin Blighe65k

Follow up on this part of the error:

First bad position is 10:24732567 at line 33.

As it refers to chromosome 10, perhaps one of your files is sorted lexicographically rather than numerically.
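To illustrate the difference, here is a small Python sketch of a numeric chromosome sort. It assumes the chromosome column holds values such as 1 to 22, X and Y, optionally with a "chr" prefix. The helper names `chrom_key` and `sort_segments` are hypothetical, and whether GISTIC actually requires this ordering is only my guess from the error message.

```python
def chrom_key(chrom):
    """Sort key that places numeric chromosomes in numeric order, then X, Y and others."""
    name = chrom.strip()
    if name.lower().startswith("chr"):
        name = name[3:]
    if name.isdigit():
        return (0, int(name), "")
    return (1, 0, name)

def sort_segments(rows):
    """Sort (sample, chromosome, start, end, num_probes, seg_mean) rows by sample, chromosome, start."""
    return sorted(rows, key=lambda r: (r[0], chrom_key(r[1]), int(r[2])))

chroms = ["1", "10", "2", "X", "21"]
print(sorted(chroms))                 # plain string sort puts "10" before "2"
print(sorted(chroms, key=chrom_key))  # numeric sort puts "2" before "10"

# Two segment rows from this thread, deliberately given out of order.
rows = [
    ("TCGA.05.4249.01A", "2", "484222", "242476062", "130861", "0.0585"),
    ("TCGA.05.4249.01A", "1", "3218610", "120527361", "67456", "-0.1725"),
]
for row in sort_segments(rows):
    print("\t".join(row))
```

The same key can be applied to the markers file so that both inputs follow one ordering.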

ADD REPLYlink written 12 months ago by Kevin Blighe65k

Thank you for your help, Kevin! But after I sorted my files numerically, it still showed a similar result (... do not match any markers ...). Is it possible that the markers file I submitted doesn't fit the TCGA data, or that the online version of GISTIC2 doesn't work at all?

ADD REPLYlink written 12 months ago by Dmitri Ivanovich0

Perhaps you can contact the GISTIC team directly:

ADD REPLYlink modified 5 months ago • written 5 months ago by Kevin Blighe65k

I think that I read somewhere, by the way, that the IDs have to be like this:

TCGA-MQ-A4LJ

So, less the final part. Can you try?
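If you want to test that, something like the following Python sketch would shorten the first column of the seg file to the first three barcode fields. It assumes hyphen-separated barcodes as in TCGA-MQ-A4LJ-01A (the first post in this thread uses dots instead, so the separator is a parameter). `trim_barcode` and `trim_seg_line` are names I made up for this example.

```python
def trim_barcode(barcode, fields=3, sep="-"):
    """Keep only the first `fields` parts of a barcode, e.g. TCGA-MQ-A4LJ-01A -> TCGA-MQ-A4LJ."""
    return sep.join(barcode.split(sep)[:fields])

def trim_seg_line(line, sep="-"):
    """Rewrite one tab-separated seg line with a shortened sample ID in the first column."""
    columns = line.rstrip("\n").split("\t")
    columns[0] = trim_barcode(columns[0], sep=sep)
    return "\t".join(columns)

print(trim_barcode("TCGA-MQ-A4LJ-01A"))
print(trim_barcode("TCGA.05.4249.01A", sep="."))
print(trim_seg_line("TCGA-MQ-A4LJ-01A\t1\t62920\t2116145\t358\t0.0051"))
```

Be aware that the trimmed part encodes the sample type, so if a file contains more than one sample from the same patient, their segments would end up under a single ID after trimming.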

ADD REPLYlink written 5 months ago by Kevin Blighe65k