Question: Why does CNV calling with VarScan need two fragment-merging steps?
10 weeks ago by
CY90
United States
CY90 wrote:

I have been using VarScan to call CNVs for a while but have not had a chance to look into it carefully.

The workflow is basically like this:

1) Run copynumber to compare read depth between normal and tumor and get small fragments based on the depth ratio between the two samples.
2) Run circular binary segmentation (CBS), which somehow merges the small fragments into larger ones, again based on that ratio.
3) Run mergeSegments.pl to merge the fragments further; this gives the final CNV result.

I think I understand the purpose of 1). It generates intervals and assigns each one its mean depth. These intervals then act like individual data points for further analysis.
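To make step 1 concrete, here is a minimal Python sketch of the value each interval carries: the log2 ratio of tumor to normal mean depth. The function name and the example regions are made up for illustration, and VarScan's own calculation may also adjust for overall depth differences between the two samples, which this sketch ignores.

```python
import math

def log2_ratio(tumor_depth, normal_depth):
    """Log2 of tumor/normal mean depth for one interval.

    Returns None when either depth is zero, since the ratio is undefined there.
    """
    if tumor_depth <= 0 or normal_depth <= 0:
        return None
    return math.log2(tumor_depth / normal_depth)

# Hypothetical intervals: (chrom, start, end, normal_mean_depth, tumor_mean_depth)
regions = [
    ("chr1", 1000, 1100, 40.0, 40.0),  # no change
    ("chr1", 1100, 1200, 40.0, 80.0),  # tumor depth doubled
    ("chr1", 1200, 1300, 40.0, 20.0),  # tumor depth halved
]

for chrom, start, end, normal, tumor in regions:
    print(chrom, start, end, log2_ratio(tumor, normal))
```

Each printed row is one "data point" in the sense above: a position plus a log ratio that the later steps work on.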

What I don't understand are 2) and 3). Why do we need two merging steps? What is the difference between them? Why can't we merge once and be done? Why can't we set more appropriate criteria / parameters in step 1 and skip the merging entirely?

modified 10 weeks ago by arta200 • written 10 weeks ago by CY90
10 weeks ago by
arta200
Sweden
arta200 wrote:

Circular binary segmentation (CBS) is an external tool that segments the fragments at significant change-points by fitting a Gaussian distribution. It is written in R, and VarScan uses it as an intermediate tool rather than reimplementing it. The aim of step 3, mergeSegments.pl, is to find similar copy-number variants and classify them as large-scale or focal.
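As a rough illustration of what "segmenting at change-points" means, here is a simplified Python sketch. It is not CBS: real CBS looks for a pair of change-points on a circularized sequence and judges significance with a statistical test, whereas this toy version tries single splits and uses a fixed score cutoff. The function names and the `min_score` value are assumptions made up for the example.

```python
from statistics import mean

def best_split(values):
    """Return (index, score) of the split with the largest size-weighted
    squared difference in means; (None, 0.0) if no split scores above zero."""
    n = len(values)
    best_i, best_score = None, 0.0
    for i in range(1, n):
        left, right = values[:i], values[i:]
        diff = mean(left) - mean(right)
        score = diff * diff * len(left) * len(right) / n
        if score > best_score:
            best_i, best_score = i, score
    return best_i, best_score

def segment(values, start=0, min_score=1.0):
    """Recursively split until no split reaches min_score (hypothetical cutoff).

    Returns a list of (start_index, end_index, mean_value), end exclusive.
    """
    i, score = best_split(values)
    if i is None or score < min_score:
        return [(start, start + len(values), mean(values))]
    return (segment(values[:i], start, min_score)
            + segment(values[i:], start + i, min_score))

# Hypothetical per-interval log ratios with one jump in the middle
log_ratios = [0.0, 0.1, -0.1, 0.0, 1.0, 1.1, 0.9, 1.0]
for seg in segment(log_ratios):
    print(seg)
```

The point is only that many noisy intervals collapse into a few segments with one mean each; the number of segments depends on where the data support a break, not on any amplification or deletion label.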

Taken from the paper:

Adjacent segments of similar copy number from the CBS algorithm were merged by an internally developed Perl script (MergeSegments), and classified by size. Events encompassing >25% of a chromosome arm were classified as large-scale; all others were considered focal events.
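The size rule in that quote can be sketched in a few lines of Python. The function name, the half-open coordinates, and the arm boundaries in the example are assumptions for illustration, not taken from the MergeSegments script itself.

```python
def classify_by_size(seg_start, seg_end, arm_start, arm_end, fraction=0.25):
    """Label an event 'large-scale' if it covers more than `fraction`
    of its chromosome arm, otherwise 'focal'. Coordinates are half-open."""
    seg_len = seg_end - seg_start
    arm_len = arm_end - arm_start
    return "large-scale" if seg_len / arm_len > fraction else "focal"

# Hypothetical 100 Mb chromosome arm
print(classify_by_size(0, 30_000_000, 0, 100_000_000))          # 30% of the arm
print(classify_by_size(10_000_000, 15_000_000, 0, 100_000_000))  # 5% of the arm
```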

written 10 weeks ago by arta200

Thanks for explaining. But why can't we just use the result of the first step? The first step has already identified a number of breakpoints.

written 10 weeks ago by CY90

CBS does not classify the segments as amplification, deletion or neutral. The MergeSegments algorithm classifies each segment as amplification (log ratio > 0.25), deletion (log ratio < -0.25) or neutral (between -0.25 and 0.25), and then merges adjacent segments of the same class. Amplifications and deletions are further categorized as large-scale or focal. This helps with interpretation, for example in recognizing whole-chromosome loss or the loss or gain of a chromosome arm.
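Here is a minimal Python sketch of that classify-then-merge idea, using the ±0.25 thresholds given above. The tuple layout, the dictionary keys, and the marker-weighted mean for merged segments are assumptions for illustration; the actual Perl script may combine values differently.

```python
def classify(log_ratio, threshold=0.25):
    """Label a segment by its mean log ratio."""
    if log_ratio > threshold:
        return "amplification"
    if log_ratio < -threshold:
        return "deletion"
    return "neutral"

def merge_adjacent(segments):
    """Merge consecutive same-chromosome segments that share a class.

    segments: list of (chrom, start, end, num_markers, mean_log_ratio),
    sorted by chromosome and position.
    """
    merged = []
    for chrom, start, end, n, ratio in segments:
        label = classify(ratio)
        prev = merged[-1] if merged else None
        if prev and prev["chrom"] == chrom and prev["label"] == label:
            total = prev["n"] + n
            # Assumed rule: weight each segment's mean by its marker count
            prev["ratio"] = (prev["ratio"] * prev["n"] + ratio * n) / total
            prev["n"] = total
            prev["end"] = end
        else:
            merged.append({"chrom": chrom, "start": start, "end": end,
                           "n": n, "ratio": ratio, "label": label})
    return merged

# Hypothetical CBS output: two neighbouring gains, a neutral stretch, a loss
cbs_segments = [
    ("chr1", 0, 100, 10, 0.5),
    ("chr1", 100, 200, 30, 0.3),
    ("chr1", 200, 300, 20, 0.0),
    ("chr1", 300, 400, 20, -0.6),
]
for seg in merge_adjacent(cbs_segments):
    print(seg)
```

In this example CBS kept the two gains apart because their means differ (0.5 vs 0.3), but both exceed 0.25, so the merge step reports them as one amplification. That is the extra thing step 3 does beyond step 2.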

Hope it is clear now. :)

modified 9 weeks ago • written 9 weeks ago by arta200

Yes, it is really helpful. Thanks :)

written 9 weeks ago by CY90