Hi,
I am working with whole-exome data and am trying to follow this pipeline to run a CNV analysis. When I reach the segments.pl step I do get an output, but the last segment of every chromosome has NA in the last four columns (bstat, pval, lcl, ucl).
This is part of the output:
   ID chrom loc.start   loc.end num.mark seg.mean    bstat         pval       lcl       ucl
1 X30     1     69445 249211976    20269   0.2667       NA           NA        NA        NA
2 X30     2     41347  21266186      955  -0.0224 7.877737 5.703810e-13  21265217  21266186
3 X30     2  21360274  21364172        6   1.4990 8.077258 1.906060e-13  21364172  21585074
4 X30     2  21585074 140426651     7026  -0.0006 6.685767 5.260426e-09 140426103 140426651
5 X30     2 140990738 141004571        4   1.5195 9.203246 1.189773e-17 141004571 141026628
6 X30     2 141026628 243037012     7369   0.0425       NA           NA        NA        NA
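The pattern in the excerpt can be checked with a short Python sketch. It is not part of the pipeline: it assumes the rows above are available as whitespace-separated text with a leading row number, and the helper names (parse, na_only_on_last_segment) are made up for illustration. It only reports whether the NA values sit exclusively on the last segment of each chromosome.

```python
# Illustrative sketch only; helper names are hypothetical, not pipeline functions.
from collections import defaultdict

# Rows copied from the excerpt above (leading field is the row number).
OUTPUT = """\
1 X30 1 69445 249211976 20269 0.2667 NA NA NA NA
2 X30 2 41347 21266186 955 -0.0224 7.877737 5.703810e-13 21265217 21266186
3 X30 2 21360274 21364172 6 1.4990 8.077258 1.906060e-13 21364172 21585074
4 X30 2 21585074 140426651 7026 -0.0006 6.685767 5.260426e-09 140426103 140426651
5 X30 2 140990738 141004571 4 1.5195 9.203246 1.189773e-17 141004571 141026628
6 X30 2 141026628 243037012 7369 0.0425 NA NA NA NA
"""

COLUMNS = ["row", "ID", "chrom", "loc.start", "loc.end", "num.mark",
           "seg.mean", "bstat", "pval", "lcl", "ucl"]
STAT_COLUMNS = ("bstat", "pval", "lcl", "ucl")


def parse(text):
    """Turn each whitespace-separated line into a dict keyed by column name."""
    return [dict(zip(COLUMNS, line.split())) for line in text.strip().splitlines()]


def na_only_on_last_segment(rows):
    """For each chromosome, report whether NA appears on the last segment and nowhere else."""
    by_chrom = defaultdict(list)
    for row in rows:
        by_chrom[row["chrom"]].append(row)

    result = {}
    for chrom, segs in by_chrom.items():
        segs.sort(key=lambda r: int(r["loc.start"]))
        na_flags = [all(r[c] == "NA" for c in STAT_COLUMNS) for r in segs]
        result[chrom] = na_flags == [False] * (len(segs) - 1) + [True]
    return result


rows = parse(OUTPUT)
print(na_only_on_last_segment(rows))
```

For the six rows shown, both chromosomes come back as True, i.e. the NA values appear only on the final segment of chromosome 1 and chromosome 2.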
What could be the reason for this? I would rather not drop those columns; I would like to get proper values in them. As a test, I changed the loc.end of the last segment of one chromosome to the last base of that chromosome (for example, the last segment of chromosome 2 ends at 243037012, so I set it to 243199373) to see whether the length of the last segment was the problem, but I still got NA values.
I would appreciate any help or advice!