This is about the documentation for MutSigCV: http://software.broadinstitute.org/cancer/software/genepattern/modules/docs/MutSigCV
I know you can use the datasets that come with the software, but the results will be better if you provide details from your own data. So I am trying to provide information from my own data.
But this is very confusing. How are these defined?
7 is clear. (1) How is CpG defined? Does a site count as CpG whenever ref_allele is C/G and the adjacent nucleotide is G/C, or does it have to be in a CpG island? (2) What are C:G and A:T?
If I look at the data set that comes with the software:
gene effect categ coverage
A1BG noncoding A(A->C)A 12
A1BG noncoding A(A->C)C 14
A1BG noncoding A(A->C)G 15
A1BG noncoding A(A->C)T 9
A1BG noncoding A(A->G)A 12
A1BG noncoding A(A->G)C 14
A1BG noncoding A(A->G)G 15
A1BG noncoding A(A->G)T 9
The categ column here is not consistent with how categ is defined in the other input datasets.
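To make the question concrete, here is a minimal Python sketch of how I am currently reading those categ strings. It assumes the format is left flank, then (ref->alt), then right flank, and it tests only the dinucleotide reading of CpG from question (1). The function names are my own, and this is not necessarily how MutSigCV defines anything.

```python
import re

# Assumed layout of a categ string such as "A(A->C)G":
# left flanking base, (reference base -> alternate base), right flanking base.
CATEG_RE = re.compile(r"^([ACGT])\(([ACGT])->([ACGT])\)([ACGT])$")


def parse_categ(categ):
    """Split a categ string into (left, ref, alt, right)."""
    match = CATEG_RE.match(categ)
    if match is None:
        raise ValueError(f"unexpected categ format: {categ!r}")
    return match.groups()


def in_cpg_dinucleotide(categ):
    """True if the reference base is part of a 5'-CG-3' dinucleotide.

    This is only the "adjacent nucleotide" hypothesis: a C followed by G,
    or a G preceded by C (the same site seen from the other strand).
    It says nothing about CpG islands.
    """
    left, ref, _alt, right = parse_categ(categ)
    return (ref == "C" and right == "G") or (ref == "G" and left == "C")


if __name__ == "__main__":
    for categ in ["A(A->C)G", "A(C->T)G", "C(G->A)T"]:
        print(categ, parse_categ(categ), in_cpg_dinucleotide(categ))
```

If the dinucleotide reading is the right one, a row like A(C->T)G would be a CpG site and A(A->C)G would not, but that is exactly what I would like someone to confirm.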
What is coverage here? Is this the tumor alternate allele count? The documentation is so confusing.