I have TCGA data that needs to be reformatted according to the following:
    # 'x' is a matrix of segmented output from ASCAT, with at least the
    # following columns (column names are not important):
    # 1: sample id
    # 2: chromosome (numeric)
    # 3: segment start
    # 4: segment end
    # 5: number of probes
    # 6: total copy number
    # 7: nA
    # 8: nB
    # 9: ploidy
    # 10: contamination, aberrant cell fraction
However, I'm not sure how to do this. For one, I'm assuming nA and nB refer to allele-specific copy numbers, but TCGA only provides the following data:
A segmentation file:
    Sample                                              Chromosome  Start   End     Num_Probes  Segment_Mean
    DRAMA_p_TCGA_276_278_N_GenomeWideSNP_6_A04_1322446  1           61735   98602   17          0.3913
    DRAMA_p_TCGA_276_278_N_GenomeWideSNP_6_A04_1322446  1           228706  603590  16          -0.2696
A raw copy number data file:
    Composite Element REF  Signal
    CN_473963              2.87
    CN_473964              2.044
An allele-specific copy number file:
    Composite Element REF  Signal_A  Signal_B
    SNP_A-8575125          1.865     0.026
    SNP_A-8497791          1.843     -0.426
I'm not sure how to reformat these files into what is needed. Specifically, I don't see how I can get the allele-specific copy numbers (nA and nB) for each segment.
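To make the gap concrete, here is a rough Python sketch of the part that seems straightforward: mapping the segmentation file onto the first six target columns and leaving columns 7 to 10 empty. The function name `seg_to_partial_rows` is made up for illustration, the input is assumed to be tab-separated, and the conversion of Segment_Mean to total copy number assumes the value is log2(copy number / 2), which I have not confirmed for this file and which ignores purity and ploidy.

```python
import csv
import io

# Two rows copied from the segmentation file above.
# ASSUMPTION: the real file is tab-separated.
SEG_TEXT = (
    "Sample\tChromosome\tStart\tEnd\tNum_Probes\tSegment_Mean\n"
    "DRAMA_p_TCGA_276_278_N_GenomeWideSNP_6_A04_1322446\t1\t61735\t98602\t17\t0.3913\n"
    "DRAMA_p_TCGA_276_278_N_GenomeWideSNP_6_A04_1322446\t1\t228706\t603590\t16\t-0.2696\n"
)


def seg_to_partial_rows(handle):
    """Map segmentation rows onto the first six of the ten target columns.

    Columns 7-10 (nA, nB, ploidy, aberrant cell fraction) are left as None
    because nothing in the segmentation file supplies them.
    """
    rows = []
    for rec in csv.DictReader(handle, delimiter="\t"):
        # ASSUMPTION: Segment_Mean is log2(copy number / 2), so
        # 2 * 2**mean is an approximate, purity-uncorrected total copy number.
        total_cn = 2 * 2 ** float(rec["Segment_Mean"])
        rows.append([
            rec["Sample"],
            int(rec["Chromosome"]),  # raises on X/Y; those need numeric recoding first
            int(rec["Start"]),
            int(rec["End"]),
            int(rec["Num_Probes"]),
            round(total_cn, 3),
            None,  # nA: unknown
            None,  # nB: unknown
            None,  # ploidy: unknown
            None,  # aberrant cell fraction: unknown
        ])
    return rows


rows = seg_to_partial_rows(io.StringIO(SEG_TEXT))
for row in rows:
    print(row)
```

The four `None` placeholders are exactly what I don't know how to fill from the files listed above.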
Does anyone have any suggestions?