Question: Library size normalization during CNV calling from genomically doubled tumor tissue
CY (United States) wrote 8 months ago:

I realized library size may be an issue and most CNV tools seem to ignore this.

Say we try to call CNVs in a tumor tissue whose genome is almost doubled. If we directly compare the depth of each bin between tumor and normal control, and the library sizes (fastq sizes) of the tumor and normal samples are equal, the genomically doubled tumor will show the same depth as the normal tissue. If a CNV caller ignores this library size issue, the CNVs called from the genome-doubled tissue will be incorrect (the estimated diploid baseline is actually a doubled genome). Can anyone share some insight on this? Thanks


The library preparation and quantification that determines the library size is normally done based on the amount of DNA, not the number of cells.

Reply written 8 months ago by igor (modified 8 months ago)

Exactly. If tumor and normal tissue both require the same amount of DNA during library prep and yield roughly the same fastq size, the DNA molecules from the genomically doubled tumor tissue are "diluted" by that fixed DNA input. The genomically doubled regions of the tumor will have the same depth as in normal tissue, and no CNV will be called. Am I right?
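
A minimal sketch of this dilution effect, under simplifying assumptions that are not from the thread: a pure tumor sample, equal-sized bins, and reads distributed across bins in proportion to copy number. The function names and the toy copy-number profiles are hypothetical.

```python
def bin_depths(copy_numbers, total_reads):
    """Spread a fixed read budget across equal-sized bins in proportion to copy number."""
    total_copies = sum(copy_numbers)
    return [total_reads * cn / total_copies for cn in copy_numbers]


def depth_ratios(tumor_cn, normal_cn, total_reads=1_000_000):
    """Per-bin tumor/normal depth ratio when both libraries have the same total reads."""
    tumor = bin_depths(tumor_cn, total_reads)
    normal = bin_depths(normal_cn, total_reads)
    return [t / n for t, n in zip(tumor, normal)]


normal = [2] * 10                                  # diploid control
doubled = [4] * 10                                 # perfectly doubled tumor (toy case)
doubled_with_events = [4] * 6 + [3] * 2 + [2] + [6]  # doubled, then some losses and a gain

flat = depth_ratios(doubled, normal)
uneven = depth_ratios(doubled_with_events, normal)

print(flat)                            # every bin has the same ratio
print([round(r, 3) for r in uneven])   # distinct levels appear once gains/losses exist
```

In this toy model the perfectly doubled profile is flat, while the profile with additional gains and losses separates into distinct depth levels even though the total read count is unchanged.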

Reply written 8 months ago by CY

A perfectly doubled genome can't be distinguished from normal, but chromosomal instability by definition results in many gains and losses. In simple terms, algorithms designed for this problem see that the baseline cannot be a copy number of 2 when there are extensive losses that would then correspond to copy numbers 1, 0, -1, -2; negative copy numbers are impossible.
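
The argument can be sketched in a few lines, assuming idealized, evenly spaced depth-ratio levels where each step corresponds to one copy. The helper names and the toy levels are hypothetical, and real callers fit noisy data rather than exact levels.

```python
def assign_copy_numbers(levels, baseline_level, baseline_cn, step):
    """Label each depth-ratio level with an integer copy number.

    The baseline level is assigned `baseline_cn`, and every `step` in depth
    ratio away from it is counted as one copy gained or lost.
    """
    return [baseline_cn + round((lv - baseline_level) / step) for lv in levels]


def smallest_valid_baseline(levels, baseline_level, step, max_cn=8):
    """Return the smallest baseline copy number that produces no negative calls."""
    for cn in range(1, max_cn + 1):
        if min(assign_copy_numbers(levels, baseline_level, cn, step)) >= 0:
            return cn
    return None


# Toy segment levels: the most common level sits at ratio 1.0,
# with four distinct loss levels below it and two gain levels above.
levels = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5]

as_diploid = assign_copy_numbers(levels, 1.0, 2, 0.25)
as_tetraploid = assign_copy_numbers(levels, 1.0, 4, 0.25)

print(as_diploid)      # [-2, -1, 0, 1, 2, 3, 4]  -> negative calls, so baseline 2 is ruled out
print(as_tetraploid)   # [0, 1, 2, 3, 4, 5, 6]
print(smallest_valid_baseline(levels, 1.0, 0.25))  # 4
```

With four loss levels below the baseline, a baseline of 2 forces two negative copy numbers, so only a baseline of 4 or higher explains the data.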

Reply written 8 months ago by markus.riester

I see what you mean, but I think what CY means by library size is simply total sequencing read coverage. But maybe I was interpolating too aggressively.

Reply written 8 months ago by markus.riester

Exactly, by library size I mean the total sequencing depth or fastq size.

Reply written 8 months ago by CY
markus.riester wrote 8 months ago:

Every purity- and ploidy-aware copy number caller takes this into account. Have a look at the ASCAT or ABSOLUTE papers.
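
As a rough illustration of what "purity- and ploidy-aware" means, here is a sketch of the commonly used relationship between tumor copy number, purity, average tumor ploidy, and the expected tumor/normal depth ratio. It is a simplified form, not code or notation taken from the ASCAT or ABSOLUTE papers, and the function name is hypothetical.

```python
def expected_ratio(tumor_cn, purity, ploidy):
    """Expected tumor/normal depth ratio for a segment.

    tumor_cn: total copy number of the segment in tumor cells
    purity:   fraction of cells in the sample that are tumor (0 < purity <= 1)
    ploidy:   average copy number across the tumor genome
    Normal cells are assumed diploid (copy number 2) everywhere.
    """
    segment_signal = purity * tumor_cn + 2 * (1 - purity)
    average_signal = purity * ploidy + 2 * (1 - purity)
    return segment_signal / average_signal


# The same 4-copy segment looks neutral in a tetraploid tumor
# but like a 2-fold gain in a diploid one.
print(expected_ratio(4, purity=1.0, ploidy=4.0))  # 1.0
print(expected_ratio(4, purity=1.0, ploidy=2.0))  # 2.0

# A single-copy loss in a tetraploid tumor, pure and at 50% purity.
print(expected_ratio(3, purity=1.0, ploidy=4.0))  # 0.75
print(round(expected_ratio(3, purity=0.5, ploidy=4.0), 4))
```

Because the ratio is scaled by the sample's average signal, the total read count cancels out; what has to be estimated is the ploidy in the denominator.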


Yes, they estimate ploidy, but it is based on allele frequencies, not library size.

Reply written 8 months ago by igor

? Allele frequencies are used in these algorithms to eliminate wrong purity and ploidy combinations.
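
A toy sketch of how allele frequencies can rule out a purity/ploidy combination that depth alone would allow. It assumes noise-free segments, diploid normal cells, and B-allele fractions measured at germline heterozygous SNPs and reported as the minor-allele fraction. All names and numbers are hypothetical, and real callers score a grid of candidates by goodness of fit rather than using a hard tolerance.

```python
def total_cn(ratio, purity, ploidy):
    """Invert the depth-ratio model to get the implied tumor copy number."""
    average_signal = purity * ploidy + 2 * (1 - purity)
    return (ratio * average_signal - 2 * (1 - purity)) / purity


def expected_bafs(cn, purity):
    """Minor-allele fractions for every split of `cn` into major + minor copies."""
    denom = purity * cn + 2 * (1 - purity)
    if denom == 0:  # pure tumor with a homozygous deletion: no reads at all
        return []
    return [(purity * minor + (1 - purity)) / denom for minor in range(cn // 2 + 1)]


def is_consistent(segments, purity, ploidy, tol=0.05):
    """Check whether every (ratio, baf) segment fits an integer allelic state."""
    for ratio, baf in segments:
        cn = total_cn(ratio, purity, ploidy)
        if cn < -tol or abs(cn - round(cn)) > tol:
            return False  # negative or non-integer copy number
        options = expected_bafs(max(round(cn), 0), purity)
        if options and all(abs(baf - b) > tol for b in options):
            return False  # no allelic split explains the observed BAF
    return True


# (depth ratio, minor-allele BAF) for three toy segments
segments = [(1.0, 0.5), (0.75, 0.333), (0.5, 0.0)]

print(is_consistent(segments, purity=1.0, ploidy=4.0))  # True
print(is_consistent(segments, purity=0.5, ploidy=2.0))  # False
print(is_consistent(segments, purity=1.0, ploidy=2.0))  # False
```

In this example the 50% purity, diploid candidate matches all three depth levels, but it would require the third segment to be a homozygous deletion with a BAF near 0.5, which contradicts the observed BAF of 0.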

Reply written 8 months ago by markus.riester

I just wanted to clarify that library size is not used, which is what the original question was interested in. I was not disagreeing with anything that was stated.

Reply written 8 months ago by igor