GISTIC for whole genome sequencing copy number analysis
9.0 years ago
owen ▴ 10

Hello, I am analyzing whole genome sequencing data from cancer samples.

For copy number analysis, I have two questions:

  1. How can I convert chromosome segments to gene level copy numbers?
  2. How can I apply GISTIC to segmented copy number data? I want to identify recurrent copy number changes in a cancer population, but GISTIC was originally designed for array data with probes.
whole-genome-sequencing copy-number gistic • 9.8k views
9.0 years ago
Irsan ★ 7.8k

1) The simplest way is to use the CNTools Bioconductor package. A more flexible but more labor-intensive option is to use genomic overlap tools such as bedtools, BEDOPS or GenomicRanges to map the segments back to your gene/transcript annotation of interest.
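To make the overlap idea concrete, here is a minimal pure-Python sketch. It is not what CNTools or bedtools do internally; it assumes half-open coordinates and summarizes each gene as the overlap-weighted mean of the segment values it touches, which is only one possible convention. The function name, gene names and numbers are made up for illustration.

```python
from collections import defaultdict

def gene_level_copy_number(segments, genes):
    """Return {gene: overlap-weighted mean segment value}, or None if no segment overlaps.

    segments: iterable of (chrom, start, end, value), half-open [start, end)
    genes:    iterable of (name, chrom, start, end), half-open [start, end)
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, value in segments:
        by_chrom[chrom].append((start, end, value))

    result = {}
    for name, chrom, g_start, g_end in genes:
        weighted_sum = 0.0
        covered = 0
        for s_start, s_end, value in by_chrom[chrom]:
            # Length of the intersection between gene and segment (<= 0 means none).
            overlap = min(g_end, s_end) - max(g_start, s_start)
            if overlap > 0:
                weighted_sum += value * overlap
                covered += overlap
        result[name] = weighted_sum / covered if covered else None
    return result

# Hypothetical toy data: one sample's segments and three genes.
segments = [("1", 0, 1000, 0.0), ("1", 1000, 5000, 1.0), ("2", 0, 3000, -0.5)]
genes = [("GENE_A", "1", 500, 1500), ("GENE_B", "2", 100, 200), ("GENE_C", "3", 0, 10)]
gene_cn = gene_level_copy_number(segments, genes)
```

A gene that spans a breakpoint (GENE_A here) gets an averaged value. If you would rather flag such genes or take the most extreme segment, change the summary step accordingly.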

2) If you have segmented copy numbers, a file describing the genomic windows that were used to quantify the reads per sample, and a reference genome, then you have everything that is required as input for GISTIC2. Also see the GISTIC2 documentation.

  1. Thanks for your help. The CNTools results are genomic segments, not gene-centric values. The other tools seem to work.
  2. How can I find the file for genomic windows? Is it in the _ratio.txt file?

Hi owen, were you able to run GISTIC?

Hi Irsan, I have the segmented file, but how do I get the number of markers?

I ran into the same problem. My solution was to create a markers file that included a marker at every segment boundary plus a marker every 2 kb across the whole genome. Did you solve the problem differently?
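The approach described here (segment boundaries plus a regular 2 kb grid) could be sketched roughly as follows. This is only an illustration: the function names, the `chrom:pos` marker naming and the toy chromosome length are assumptions, and the three-column layout (marker name, chromosome, position) follows the markers file layout mentioned elsewhere in this thread. Counting the markers that fall inside each segment then gives a value for the "number of markers" column.

```python
from bisect import bisect_left, bisect_right

def build_markers(segments, chrom_lengths, step=2000):
    """Return {chrom: sorted positions}: a grid every `step` bases plus all segment boundaries.

    segments:      iterable of (sample, chrom, start, end, value), 1-based inclusive
    chrom_lengths: {chrom: length in bases}
    """
    positions = {c: set(range(1, length + 1, step)) for c, length in chrom_lengths.items()}
    for _sample, chrom, start, end, _value in segments:
        positions.setdefault(chrom, set()).update((start, end))
    return {c: sorted(p) for c, p in positions.items()}

def count_markers(markers, chrom, start, end):
    """Number of markers with start <= position <= end on `chrom`."""
    pos = markers[chrom]
    return bisect_right(pos, end) - bisect_left(pos, start)

def markers_lines(markers):
    """Tab-delimited lines: marker name (here chrom:pos, an arbitrary choice), chromosome, position."""
    return [f"{c}:{p}\t{c}\t{p}" for c in sorted(markers) for p in markers[c]]

# Hypothetical toy input: a 10 kb "chromosome" with two segments from one sample.
chrom_lengths = {"1": 10000}
segments = [("S1", "1", 1, 4500, 0.1), ("S1", "1", 4501, 10000, -0.8)]
markers = build_markers(segments, chrom_lengths)
seg_rows = [
    f"{s}\t{c}\t{a}\t{b}\t{count_markers(markers, c, a, b)}\t{v}"
    for s, c, a, b, v in segments
]
```

Because the markers are built from the segments of all samples together, every segment boundary is guaranteed to coincide with a marker position.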

Hi Irsan, I am trying to run GISTIC2.0 in GenePattern to analyse the segmented results generated by CBS. However, I always get the error "Index exceeds matrix dimensions", and the suggested fix on the website says "check your markers file format. Please see the sections on the markers file format in the GISTIC documentation for more details and examples." I carefully compared the format of my files with the example files and could not find any difference between them, so I would be grateful for any advice.

Here is how I prepared the inputs. The segmentation file is the CBS output with the header row and the first (serial number) column removed, saved as tab-delimited text. The markers file, which I made myself, has three columns (SNP marker label, chromosome number and chromosome position); it is a subset of the marker annotation file of the SNP array used before segmentation, and it was saved in the same way as the segmentation file. Could you tell me whether this is correct, or how to produce the input files for GISTIC?
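One way to compare the two files mechanically rather than by eye is a small sanity check like the sketch below. It does not diagnose the "Index exceeds matrix dimensions" error itself. It assumes a six-column tab-delimited segmentation file (sample, chromosome, start, end, number of markers, segment value) and a three-column markers file, and it treats a segment boundary with no matching marker as suspicious, which is an assumption on my part rather than a documented GISTIC rule. The function name and toy lines are hypothetical.

```python
def check_gistic_inputs(seg_lines, marker_lines):
    """Return a list of human-readable problems found in the two inputs."""
    problems = []
    markers = set()
    for i, line in enumerate(marker_lines, 1):
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != 3:
            problems.append(f"markers line {i}: expected 3 columns, got {len(fields)}")
            continue
        _name, chrom, pos = fields
        if not pos.isdigit():
            problems.append(f"markers line {i}: position {pos!r} is not an integer")
            continue
        markers.add((chrom, int(pos)))

    for i, line in enumerate(seg_lines, 1):
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != 6:
            problems.append(f"seg line {i}: expected 6 columns, got {len(fields)}")
            continue
        _sample, chrom, start, end, _n, _cn = fields
        if not (start.isdigit() and end.isdigit()):
            problems.append(f"seg line {i}: start/end are not integers")
            continue
        # Assumption: each segment boundary should coincide with a marker position.
        for pos in (int(start), int(end)):
            if (chrom, pos) not in markers:
                problems.append(f"seg line {i}: {chrom}:{pos} has no matching marker")
    return problems

# Hypothetical toy input; the third markers line uses spaces instead of tabs.
marker_lines = ["m1\t1\t100", "m2\t1\t2000", "m3 1 3000"]
seg_lines = ["T1\t1\t100\t2000\t2\t0.35", "T1\t1\t2001\t3000\t1\t-0.2"]
problems = check_gistic_inputs(seg_lines, marker_lines)
```

Things such a check tends to surface are space-separated lines, leftover header rows, and chromosome labels that differ between the two files (for example `chr1` versus `1`).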

Hi Benche, it seems that I have run into a similar problem. Have you solved yours?

I have called somatic CNVs using GATK4 on canine (dog) data. I have the standard output from GATK, which is a seg file (SEG format).

I am not sure how I should convert this format to the GISTIC input format. Any comments would be helpful.

Also, for the markers file this link suggested a simple format for markers file:
sample_name chr start_pos
sample_name chr stop_pos

Do you think this would be the correct markers file to use?
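Since the seg file itself is not shown in this post, the conversion can only be sketched generically: read a tab-delimited table with a header and write out six columns in the order sample, chromosome, start, end, number of markers, segment value, without a header. The header names in the example (`ID`, `chr`, `begin`, `stop`, `n`, `mean`) are hypothetical placeholders, not GATK's actual column names, so the mapping has to be adapted to the real file.

```python
import csv
import io

# Target column order for the six-column segmentation file.
GISTIC_FIELDS = ("sample", "chrom", "start", "end", "num_markers", "seg_cn")

def to_gistic_rows(seg_text, column_map):
    """Reorder a tab-delimited seg table with a header into six-column rows (no header).

    column_map: {target field in GISTIC_FIELDS: column name in the input header}
    """
    reader = csv.DictReader(io.StringIO(seg_text), delimiter="\t")
    rows = []
    for record in reader:
        rows.append("\t".join(record[column_map[field]] for field in GISTIC_FIELDS))
    return rows

# Hypothetical input; replace the header names with those of the real seg file.
seg_text = "ID\tchr\tbegin\tstop\tn\tmean\nT1\t1\t100\t2000\t12\t0.35\n"
column_map = {
    "sample": "ID", "chrom": "chr", "start": "begin",
    "end": "stop", "num_markers": "n", "seg_cn": "mean",
}
rows = to_gistic_rows(seg_text, column_map)
```

Two things are worth checking before running GISTIC on the result: that the segment value is on a log2 ratio scale rather than an absolute copy number, and that the chromosome labels match whatever reference annotation is supplied for the canine genome.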

Hi Sutturka,

I also called somatic CNVs using GATK4, and it seems that I have run into a similar problem. Have you solved yours? If you have, I would really appreciate any advice. Thank you very much.
