I am currently trying to establish a pipeline to call CNVs from WES data using CNVkit to process the data into cnr format and then segment manually using segment function of DNAcopy package from Bioconductor.
However after I try to use segment.p to get p values and other characteristics what happens is I get mostly almost all NAs in the result file and even if there is some number, the pvalue is really really high...
> ID chrom loc.start loc.end num.mark seg.mean bstat pval lcl ucl > Sample.1 chr17 1964785 28548633 57 0.004 NA NA NA NA > Sample.1 chr18 13387721 53254275 38 -0.0087 NA NA NA NA > Sample.1 chr19 32836513 50370310 29 0.0653 NA NA NA NA > Sample.1 chr2 32288900 200320591 220 0.0049 NA NA NA NA > Sample.1 chr22 51113475 51154096 19 0.0217 5.13427042369266 5.31224265912416e-06 51143391 51158892
and it goes like this... As I already mentioned I am using CNVkit to calculate cnn and cnr files by normal recommended route, which I am analyzing by this sequence in DNAcopy
cn <- read.table(file = args,header = TRUE) CNA.object <- CNA(genomdat = cn[,5], chrom = cn[,1], maploc = cn[,2], data.type = 'logratio') CNA.smoothed <- smooth.CNA(CNA.object) segs <- segment(CNA.smoothed, alpha = 0.01, nperm = 10000, p.method = c("hybrid"), kmax=25, nmin=200, eta=0.05, trim= 0.025, min.width= 2, undo.splits = c("sdundo"),undo.SD=4, verbose=3) segsout = segs$output write.table(segsout[,2:6], file = cns_name, row.names = F, col.names = T, quote = F, sep="\t") seg.pvalue <- segments.p(segs, ngrid=100, tol=1e-6, alpha=0.05, search.range=100, nperm=1000) write.table(seg.pvalue, file= seg_name, row.names=F, col.names=T, quote=F, sep="\t")
What does this indicate? Where should I search for the problem? Any help would be appreciated.
Thank you very much for your answers