Question

Filter out Non-Diploid segments in FACETS data


7.0 years ago

sviatoslav.kendall ▴ 880

I've got some sequencing data (tumor/normal pairs) that has had mutations called and has also been through copy-number analysis with FACETS. The two analyses have been merged so that each mutation is mapped to a FACETS segment and its associated values.

I want to run an analysis that focuses on diploid regions of the genome, but I'm unsure how best to do this. The people whose work I'm trying to reproduce simply filtered out segments whose copy-number log ratio fell outside the range (-0.5, 0.5), but I'm not sure this is appropriate for FACETS data.

As far as I can tell from reading the publication and documentation, FACETS computes the sample's overall ploidy, and the copy-number log ratios are shifted so that this value corresponds to 0. The copy-number log ratio that corresponds to a diploid segment is reported as the dipLogR value.

Since I'm trying to filter out non-diploid regions, I expect I need to base my filter on the dipLogR value, but I'm not sure how wide the window around it should be to best approximate a range of (-0.5, 0.5) on a scale where 0 represents the diploid state.

I know how to convert between copy number and copy-number log ratio:

copy_number = 2 * 2^copy_number_log_ratio 
copy_number_log_ratio = log2(copy_number/2)
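
As a worked illustration of the two formulas above, here is a minimal Python sketch that converts in both directions and shows which copy numbers a log-ratio range of (-0.5, 0.5) corresponds to when 0 is the diploid state. The function names are hypothetical and are not part of FACETS.

```python
import math


def log_ratio_to_copy_number(log_ratio):
    """copy_number = 2 * 2^copy_number_log_ratio"""
    return 2 * 2 ** log_ratio


def copy_number_to_log_ratio(copy_number):
    """copy_number_log_ratio = log2(copy_number / 2)"""
    return math.log2(copy_number / 2)


# Copy numbers at the edges of the (-0.5, 0.5) log-ratio range,
# assuming a log ratio of 0 means exactly two copies.
low = log_ratio_to_copy_number(-0.5)
high = log_ratio_to_copy_number(0.5)
print(round(low, 3), round(high, 3))  # 1.414 2.828

# The two conversions are inverses of each other.
print(copy_number_to_log_ratio(2))  # 0.0
```

So on a scale centered at 0, the (-0.5, 0.5) range spans roughly 1.41 to 2.83 copies.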

But I'm having a hard time figuring out how to use the formulas to come up with an appropriate range of values above and below the dipLogR value.
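
To make the question concrete, one possible reading is to keep the same half-width of 0.5 and simply re-center the window on dipLogR. The Python sketch below illustrates only that assumption; it is not a FACETS recommendation, it ignores tumor purity, and the dictionary keys, the `is_near_diploid` helper, and the example values are all hypothetical.

```python
def is_near_diploid(cnlr, dip_log_r, half_width=0.5):
    """True if a segment's log ratio lies within half_width of dipLogR.

    Assumption: the (-0.5, 0.5) window is shifted unchanged so that it is
    centered on dipLogR instead of 0.
    """
    return abs(cnlr - dip_log_r) < half_width


# Made-up example values, not real FACETS output.
dip_log_r = -0.25
segments = [
    {"segment": "a", "cnlr": -0.30},
    {"segment": "b", "cnlr": 0.20},
    {"segment": "c", "cnlr": 0.40},
    {"segment": "d", "cnlr": -0.90},
]

kept = [s["segment"] for s in segments if is_near_diploid(s["cnlr"], dip_log_r)]
print(kept)  # ['a', 'b']
```

Whether a plain shift like this is appropriate, or whether the width itself should change, is exactly what I'm unsure about.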

sequencing R FACETS CNV
