Hello everyone,
I am trying to use machine learning approaches to predict cancer outcomes, i.e. to classify samples (patients) according to a clinical outcome such as risk, recurrence or survival.
I would like to use genomic data from The Cancer Genome Atlas (TCGA), and my aim is to build datasets in a gene-patient matrix format with the corresponding class label (for example tumor/normal) for each patient.
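To make the target format concrete, here is a minimal sketch in Python of what I mean by a gene-patient matrix with class labels. The patient IDs, gene names, values and the `build_matrix` helper are all made up for illustration; none of them come from TCGA.

```python
# Hypothetical toy example of the target format; all names and values are made up.
from typing import Dict, List, Tuple


def build_matrix(
    per_patient: Dict[str, Dict[str, float]]
) -> Tuple[List[str], List[str], List[List[float]]]:
    """Return (genes, patients, matrix): one row per gene, one column per patient."""
    patients = sorted(per_patient)
    # Keep only genes measured in every patient so the matrix has no gaps.
    genes = sorted(set.intersection(*(set(v) for v in per_patient.values())))
    matrix = [[per_patient[p][g] for p in patients] for g in genes]
    return genes, patients, matrix


# One dictionary of gene -> value per patient (e.g. parsed from one file each).
per_patient = {
    "patient_A": {"GENE1": 0.12, "GENE2": 0.80},
    "patient_B": {"GENE1": 0.55, "GENE2": 0.31},
    "patient_C": {"GENE1": 0.90, "GENE2": 0.05},
}
# Class label for each patient.
labels = {"patient_A": "tumor", "patient_B": "normal", "patient_C": "tumor"}

genes, patients, matrix = build_matrix(per_patient)
class_labels = [labels[p] for p in patients]

# Print the matrix as tab-separated text, with the labels as the last row.
print("gene\t" + "\t".join(patients))
for gene, row in zip(genes, matrix):
    print(gene + "\t" + "\t".join(f"{v:.2f}" for v in row))
print("label\t" + "\t".join(class_labels))
```

So each column is one patient, each row is one gene (or probe), and the labels line up with the columns.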
I tried to download some breast cancer (BRCA) DNA methylation data, but I am confused about how to proceed in order to get the matrix in question. So my question is: which files should I download, and how do I process them in order to obtain such a matrix from TCGA data?
Thank you very much for your help.