This is the first part of the overall analysis pipeline. It mainly introduces the important resources and documents how to download the data.
- TCGA (The Cancer Genome Atlas): a human cancer database. On one hand, it holds a huge amount of molecular data (at the DNA, RNA and protein levels) collected from cancer tissue samples, tumor-matched normal samples and a few normal tissue samples. On the other hand, it also contains a range of clinical data (such as TNM stage, patient survival time, age, sex, race and so on). So far it has accumulated nearly 2.5 PB of multi-level data from more than 10,000 patients, spanning over 30 tumor types. In 2016 the TCGA data migrated to the GDC (Genomic Data Commons). It is said that TCGA will close in 2017, and come on, there are only 2 weeks left!
The full post is here: Fundamental Analysis of TCGA RNA-seq Data-01
Welcome to my blog, bioinfostar. I would like to record and share my learning experience here, and I look forward to new ideas and thoughts~~~