I have TCGA-READ RNA-seq data obtained from the GDC Data Portal.
The samples are of two types:
"primary tumor" and
"solid tissue normal", both collected from individuals. The solid tissue normal is a normal tissue sample taken adjacent to the primary tumor. Hence, it may not be truly normal tissue, since the sample still comes from an individual who has a tumor. So it would be incorrect to label such samples as normal.
For the binary classification problem, I need
tumor samples and
normal samples from diseased and healthy individuals, respectively.
I may be wrong, but I think the normal samples in any TCGA dataset, whether of the blood-derived normal or solid tissue normal sample type, are not collected from normal (disease-free) individuals.
Can anyone please suggest where or how I can get normal samples from healthy individuals?
If there is some website with such normal samples, how do I match its genes to the genes in my current tumor data?
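For the matching step, my rough idea is something like the sketch below. It assumes both datasets identify genes by Ensembl gene IDs that may carry different version suffixes (e.g. `ENSG00000141510.17` vs `ENSG00000141510.11`), and that each table is a simple mapping from gene ID to expression values. The function names and the toy values are made up for illustration, and I am not sure this is the right approach.

```python
def strip_version(gene_id):
    """Drop the version suffix from an Ensembl gene ID.

    Assumption: IDs look like 'ENSG00000141510.17'; everything after
    the first '.' is treated as the version and discarded.
    """
    return gene_id.split(".", 1)[0]


def match_genes(tumor, normal):
    """Restrict two {gene_id: [values]} tables to the genes they share.

    Returns (shared_ids, tumor_subset, normal_subset), with both subsets
    keyed by the unversioned gene ID. If two versioned IDs collapse to
    the same unversioned ID, the later one silently wins.
    """
    tumor_by_id = {strip_version(g): v for g, v in tumor.items()}
    normal_by_id = {strip_version(g): v for g, v in normal.items()}
    shared = sorted(tumor_by_id.keys() & normal_by_id.keys())
    tumor_subset = {g: tumor_by_id[g] for g in shared}
    normal_subset = {g: normal_by_id[g] for g in shared}
    return shared, tumor_subset, normal_subset


# Toy, made-up expression values (not real TCGA data).
tumor_counts = {
    "ENSG00000141510.17": [120, 98],
    "ENSG00000157764.13": [40, 55],
    "ENSG00000139618.15": [7, 9],
}
normal_counts = {
    "ENSG00000141510.11": [80],
    "ENSG00000139618.9": [12],
    "ENSG00000012048.20": [30],
}

shared, tumor_subset, normal_subset = match_genes(tumor_counts, normal_counts)
print(shared)
```

Only the genes present in both tables are kept, so both datasets end up with the same features in the same order. I do not know whether this is enough, or whether the two datasets would also need to be processed in the same way before they can be compared.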
Any suggestions are highly appreciated. Thanks