Question: Why does CNV calling with VarScan need two fragment-merging steps?
CY wrote (8 months ago, United States):

I have been using VarScan to call CNVs for a while but have not had a chance to look into it carefully.

The workflow is basically this:

1) Run copynumber to compare read depth between normal and tumor and get small fragments based on the tumor/normal depth ratio.
2) Run circular binary segmentation (CBS), which somehow merges the small fragments into larger ones, again based on that ratio.
3) Run mergeSegments.pl to merge the fragments further; this gives the final CNV result.

I think I understand the purpose of 1). It generates intervals and assigns each one its mean depth. These intervals then serve as individual data points for further analysis.

What I don't understand are 2) and 3). Why do we need two merging steps? What is the difference between them? Why can't we merge once and be done? Why can't we set more appropriate criteria / parameters in the very first step (step 1) and skip the merging altogether?

arta wrote (8 months ago, Sweden):

Circular binary segmentation (CBS) is an external tool that segments the fragments at statistically significant change-points by fitting a Gaussian distribution. It is implemented in R (the DNAcopy package), and VarScan uses it as an intermediate tool rather than reimplementing it. The aim of step 3, mergeSegments.pl, is to find adjacent segments with similar copy number, merge them, and classify the resulting events as large-scale or focal.

Taken from the paper:
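To make the change-point idea concrete, here is a toy sketch in Python. It is not CBS itself: real CBS tests pairs of change-points on a circularized sequence and assesses significance by permutation, while this sketch only finds the single best split of a list of log ratios. The function name `best_split` and the example values are made up for illustration.

```python
import math


def best_split(values):
    """Return (index, score) of the single split that best separates two means.

    Toy illustration only: no significance test, one change-point at most.
    """
    n = len(values)
    total = sum(values)
    best_i, best_score = None, 0.0
    left_sum = 0.0
    for i in range(1, n):
        left_sum += values[i - 1]
        left_mean = left_sum / i
        right_mean = (total - left_sum) / (n - i)
        # Difference in means, scaled by segment sizes (a t-like statistic
        # with the variance term left out).
        score = abs(left_mean - right_mean) * math.sqrt(i * (n - i) / n)
        if score > best_score:
            best_i, best_score = i, score
    return best_i, best_score


# Hypothetical per-region log2 ratios: four neutral regions, then four gained.
log_ratios = [0.0, 0.1, -0.1, 0.0, 0.8, 0.9, 0.7, 0.8]
split, score = best_split(log_ratios)
print(split)  # index where the second segment starts
```

Applying such a split recursively, and keeping only splits that pass a significance test, is roughly how many small regions from step 1 end up grouped into a few larger segments.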

Adjacent segments of similar copy number from the CBS algorithm were merged by an internally developed Perl script (MergeSegments), and classified by size. Events encompassing >25% of a chromosome arm were classified as large-scale; all others were considered focal events.
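The size rule in that quote can be sketched in a few lines of Python. This is only an illustration of the ">25% of a chromosome arm" criterion, not the MergeSegments script itself; the function name `classify_event_size` and the arm length used below are hypothetical.

```python
def classify_event_size(seg_start, seg_end, arm_length, threshold=0.25):
    """Label an event 'large-scale' if it covers more than `threshold`
    of the chromosome arm, otherwise 'focal'."""
    seg_length = seg_end - seg_start + 1
    fraction = seg_length / arm_length
    return "large-scale" if fraction > threshold else "focal"


# Hypothetical arm of 100 Mb (not a real chromosome arm size).
ARM_LENGTH = 100_000_000

big = classify_event_size(1, 30_000_000, ARM_LENGTH)             # covers 30% of the arm
small = classify_event_size(5_000_000, 6_000_000, ARM_LENGTH)    # covers about 1% of the arm
print(big, small)
```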


Thanks for explaining. But why can't we just use the result of the first step? The first step has already identified a number of breakpoints.

(reply by CY, 8 months ago)

CBS does not classify the segments as amplification, deletion or neutral. The MergeSegments step classifies each segment as an amplification (log ratio > 0.25), a deletion (log ratio < -0.25) or neutral (between -0.25 and 0.25), and merges adjacent segments of the same class. Amplifications and deletions are then further categorized as large-scale or focal. This helps interpretation, for example in recognizing a whole-chromosome loss or the loss or gain of a chromosome arm.
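A minimal Python sketch of that classify-then-merge logic, using the ±0.25 thresholds quoted above. It is not the actual MergeSegments code: the `Segment` fields, the function names, and the choice to average log ratios weighted by the number of data points are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    chrom: str
    start: int
    end: int
    num_points: int   # number of step-1 regions inside the segment
    log_ratio: float  # mean log2 ratio of the segment


def call_class(log_ratio, threshold=0.25):
    """Classify a segment by its log ratio."""
    if log_ratio > threshold:
        return "amplification"
    if log_ratio < -threshold:
        return "deletion"
    return "neutral"


def merge_adjacent(segments, threshold=0.25):
    """Merge consecutive segments on the same chromosome with the same class.

    Assumes `segments` is sorted by chromosome and start position.
    Returns a list of (Segment, class) pairs.
    """
    merged = []
    for seg in segments:
        cls = call_class(seg.log_ratio, threshold)
        if merged and merged[-1][0].chrom == seg.chrom and merged[-1][1] == cls:
            prev = merged[-1][0]
            total = prev.num_points + seg.num_points
            # Assumption: combine log ratios as a point-weighted mean.
            mean = (prev.log_ratio * prev.num_points
                    + seg.log_ratio * seg.num_points) / total
            merged[-1] = (Segment(prev.chrom, prev.start, seg.end, total, mean), cls)
        else:
            merged.append((seg, cls))
    return merged


# Hypothetical CBS output for one chromosome.
cbs_segments = [
    Segment("chr1", 1, 100, 10, 0.5),
    Segment("chr1", 101, 200, 10, 0.7),
    Segment("chr1", 201, 300, 5, 0.0),
    Segment("chr1", 301, 400, 5, -0.6),
]
events = merge_adjacent(cbs_segments)
for seg, cls in events:
    print(seg.chrom, seg.start, seg.end, round(seg.log_ratio, 2), cls)
```

In this toy input the first two segments are separate CBS segments (their means differ), yet both are amplifications, so they collapse into one event. That is the sense in which the second merge does something CBS alone does not.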

Hope it is clear now. :)

(reply by arta, 8 months ago)

Yes, it is really helpful. Thanks :)

(reply by CY, 8 months ago)