I am using the R package PureCN to process predominantly tumor-only samples, but I do have 4 tumor-normal pairs.
I understand that having even one process-matched normal sample (ideally from a young, healthy individual, and processed with the same methods, probe kit, sequencing machine, etc. as the tumor samples) is better than having none. I have 4 technically matched normal samples and was able to create a NormalDB from them, but does anyone know if 4 is sufficient?
I have run the "minimal test run" (Section 4.3 of PureCN Best Practices 19 July 2020) and PureCN.R did produce a number of files:
02_amplification_pvalues.csv 02_chromosomes.pdf 02.csv 02_dnacopy.seg 02_genes.csv 02_local_optima.pdf 02.log 02_loh.csv 02.pdf 02.rds 02_segmentation.pdf 02_variants.csv
So I believe that the minimal test ran as expected. However, to proceed with the "Production pipeline run," the code in Section 4.3 indicates that the run is to be executed with a mappingbiasfile.
Assuming that 4 samples are sufficient to create a NormalDB, I have a question about generating the mappingbiasfile that is created alongside the NormalDB file. Specifically, can 4 samples be used to create the mappingbiasfile? If so, can the default values of PureCN's calculateMappingBiasVcf function be used as is, or do some of them have to be changed?
calculateMappingBiasVcf <- function(normal.panel.vcf.file, min.normals = 2, min.normals.betafit = 7, min.median.coverage.betafit = 5, yieldSize = 5000, genome)
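To make the question concrete, here is a minimal sketch of how I read those defaults. It does not call PureCN at all: `check_thresholds` is a hypothetical helper of my own, and it assumes that `min.normals` and `min.normals.betafit` are both minimum counts of normal samples in which a variant must be seen, which may not be exactly how PureCN applies them.

```r
# Defaults copied from the calculateMappingBiasVcf signature quoted above.
defaults <- list(min.normals = 2, min.normals.betafit = 7)

# Hypothetical helper (not part of PureCN): for a panel of n.normals samples,
# report which thresholds could be reached at all, ASSUMING each threshold is
# a minimum number of normals carrying a given variant.
check_thresholds <- function(n.normals, thresholds = defaults) {
  vapply(thresholds, function(minimum) n.normals >= minimum, logical(1))
}

# My situation: 4 technically matched normals.
print(check_thresholds(4))
```

If that reading is right, a panel of 4 normals can meet `min.normals = 2` but can never meet `min.normals.betafit = 7`, which is why I am unsure whether the defaults are usable as is.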