Question: Library size normalization during CNV calling from genomically doubled tumor tissue
CY370 wrote:

I realized library size may be an issue and most CNV tools seem to ignore this.

Say we try to call CNVs from a tumor tissue whose genome is almost entirely doubled. If we directly compare the depth of each bin between the tumor and a normal control, and the library sizes (fastq sizes) of the tumor and the normal tissue are equal, the genome-doubled tumor will still show the same depth as the normal tissue. If a CNV caller ignores this library size issue, the CNVs called from the genome-doubled tissue will be incorrect, because the estimated diploid baseline is actually a doubled genome. Can anyone share some insight on this? Thanks
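The arithmetic behind this concern can be sketched in a few lines. This is only a toy illustration, assuming a fixed read budget that is spread across bins in proportion to copy number; the function name `expected_depth` and the bin and read counts are made up for the example.

```python
def expected_depth(copy_numbers, total_reads):
    """Expected reads per bin when a fixed read budget is spread
    across bins in proportion to each bin's copy number."""
    total_copies = sum(copy_numbers)
    return [total_reads * c / total_copies for c in copy_numbers]

# Hypothetical genome of 10 equal-sized bins.
normal = [2] * 10    # diploid control
doubled = [4] * 10   # perfectly doubled tumor, no other changes
total_reads = 1_000_000  # same library size for both samples

normal_depth = expected_depth(normal, total_reads)
tumor_depth = expected_depth(doubled, total_reads)

# Per-bin tumor/normal depth ratio: identical in every bin,
# so the doubling leaves no trace in the ratio.
ratios = [t / n for t, n in zip(tumor_depth, normal_depth)]
print(ratios)
```

With equal library sizes the ratio is the same in every bin, which is why depth alone cannot reveal a uniform doubling.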

modified 7 days ago by markus.riester480 • written 8 days ago by CY370

The library preparation and quantification that determines the library size is normally done based on the amount of DNA, not the number of cells.

modified 8 days ago • written 8 days ago by igor8.1k

Exactly. If both tumor and normal tissue go into library prep with the same amount of DNA and yield roughly the same fastq size, the DNA of a genome-doubled tumor is effectively "diluted", because the same DNA mass comes from about half as many cells. The doubled regions of the tumor will then have the same depth as in the normal tissue, and no CNV will be called. Am I right?

written 7 days ago by CY370

A perfectly doubled genome couldn't be distinguished from normal, but chromosomal instability by definition results in many gains and losses. In simple terms, algorithms designed for this problem see that the baseline cannot be a copy number of 2 when there are extensive losses that would then correspond to copy numbers of 1, 0, -1, and -2, since negative copy numbers are impossible.
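A toy sketch of that argument, not how ASCAT or ABSOLUTE are actually implemented: assume the segment depth levels are evenly spaced, one copy per step, and label the most common level with an assumed baseline copy number. The helper names `label_levels` and `smallest_feasible_baseline` and the segment values are hypothetical.

```python
from collections import Counter

def label_levels(depth_levels, assumed_baseline):
    """Label each segment with a copy number, assuming the most common
    depth level equals `assumed_baseline` and each step is one copy."""
    baseline_level = Counter(depth_levels).most_common(1)[0][0]
    return [assumed_baseline + (lvl - baseline_level) for lvl in depth_levels]

def smallest_feasible_baseline(depth_levels, start=2):
    """Raise the assumed baseline until no segment gets a negative label."""
    baseline = start
    while min(label_levels(depth_levels, baseline)) < 0:
        baseline += 1
    return baseline

# Hypothetical doubled tumor: most segments at level 4, with losses and one gain.
segments = [4, 4, 4, 4, 3, 2, 1, 0, 4, 5]

diploid_labels = label_levels(segments, assumed_baseline=2)
print(diploid_labels)                        # contains negative copy numbers
print(smallest_feasible_baseline(segments))  # lowest baseline without negatives
```

The result is only a lower bound on the baseline under these assumptions; it does not by itself pick the right purity and ploidy.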

written 7 days ago by markus.riester480

I see what you mean, but I think what CY means by library size is simply total sequencing read coverage. But maybe I was interpolating too aggressively.

written 7 days ago by markus.riester480

Exactly, by library size I mean the total sequencing depth or fastq size.

written 6 days ago by CY370
markus.riester480 wrote:

Every purity- and ploidy-aware copy number caller takes this into account. Have a look at the ASCAT or ABSOLUTE papers.

written 8 days ago by markus.riester480

Yes, they estimate ploidy, but it is based on allele frequencies, not library size.

written 7 days ago by igor8.1k

? Allele frequencies are used in these algorithms to eliminate wrong purity and ploidy combinations.
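For readers unfamiliar with the terms, here is a minimal sketch of the kind of forward model such callers fit. It is a commonly used simplification, not the exact code of ASCAT or ABSOLUTE, and it assumes a diploid normal with heterozygous SNPs. The function names and example values are hypothetical.

```python
import math

def expected_log_ratio(copy_number, purity, tumor_ploidy):
    """Expected log2 tumor/normal depth ratio for a segment, given tumor
    purity and average tumor ploidy (normal cells assumed diploid)."""
    segment_signal = purity * copy_number + (1 - purity) * 2
    average_signal = purity * tumor_ploidy + (1 - purity) * 2
    if segment_signal == 0:
        return float("-inf")  # homozygous deletion in a 100% pure sample
    return math.log2(segment_signal / average_signal)

def expected_baf(minor_copies, total_copies, purity):
    """Expected B-allele frequency at a germline-heterozygous SNP."""
    return (purity * minor_copies + (1 - purity)) / (
        purity * total_copies + 2 * (1 - purity)
    )

# Hypothetical pure tumor with average ploidy 4.
print(expected_log_ratio(4, purity=1.0, tumor_ploidy=4.0))  # baseline segment
print(expected_log_ratio(2, purity=1.0, tumor_ploidy=4.0))  # two copies lost
print(expected_baf(1, 4, purity=1.0))                       # 3+1 allelic state
```

A candidate purity and ploidy pair predicts both quantities for every segment, so a pair whose predictions disagree with the observed allele frequencies can be discarded.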

written 7 days ago by markus.riester480

I just wanted to clarify that library size is not used, which is what the original question was interested in. I was not disagreeing with anything that was stated.

written 7 days ago by igor8.1k