I'm working with CNVkit .cnr files generated from WGS of matched primary and metastatic HER2-positive breast cancer samples. My aim is to determine whether the level of chromosome 8 amplification differs significantly between primary and metastatic samples.
Each .cnr file contains log2 copy number values for genomic bins, where log2 is calculated as log2(cn/2). What is the most appropriate statistical approach to compare chr8 amplification levels between paired samples? Should I subset only the chr8 bins, keep bins with log2 CN > 0.3 (i.e., gain), compute the mean log2 CN per sample, and then compare primary vs. metastatic using a paired test (e.g., Wilcoxon signed-rank)?
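To make the proposed workflow concrete, here is a minimal sketch of what I have in mind. It assumes the standard tab-separated .cnr columns (`chromosome`, `start`, `end`, `gene`, `log2`, `depth`, `weight`) and that chromosome 8 is named `chr8` (it may be `8` depending on the reference). The function names `chr8_mean_gain` and `paired_wilcoxon` are my own placeholders, not part of CNVkit, and the final step assumes SciPy is installed.

```python
import csv
from statistics import mean


def chr8_mean_gain(cnr_handle, chrom="chr8", threshold=0.3):
    """Mean log2 of bins on `chrom` whose log2 exceeds `threshold`.

    Returns None when no bin passes the filter, so such samples
    can be handled explicitly instead of silently becoming 0.
    """
    reader = csv.DictReader(cnr_handle, delimiter="\t")
    values = [
        float(row["log2"])
        for row in reader
        if row["chromosome"] == chrom and float(row["log2"]) > threshold
    ]
    return mean(values) if values else None


def paired_wilcoxon(primary_means, metastatic_means):
    """Wilcoxon signed-rank test on per-patient paired means.

    Both lists must be in the same patient order. SciPy is imported
    here so the parsing step above works without it.
    """
    from scipy.stats import wilcoxon

    return wilcoxon(primary_means, metastatic_means)
```

Each file would be opened with `open(path, newline="")` and passed to `chr8_mean_gain`, giving one value per sample; the two per-patient lists then go to `paired_wilcoxon`. One thing I am unsure about is how to treat a pair where one sample has no bins above the threshold, since the function returns `None` in that case.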
I would greatly appreciate anyone's help on this.