Dear All, I am new to data mining and would like to do it in R. I have read the documents at http://www.rdatamining.com/. They are quite informative, but I am not sure whether I can use the same packages for mining genes/proteins/mutations from biomedical databases (TCGA, COSMIC, ...). Does anyone have experience with this in R, and which packages can be used? Many thanks in advance, Rahel
If you want to analyze TCGA data, I would recommend TCGAbiolinks, which is a Bioconductor package for R. The newest version is here: https://github.com/BioinformaticsFMRP/TCGAbiolinks. You can refer to its tutorials to learn how to extract the different levels of information for normal tissues and tumors, including mutations, gene expression, DNA methylation and so on.
As for the COSMIC database, you can download the data from http://cancer.sanger.ac.uk/cosmic/download; the majority of the data is mutation-related. The files are plain tab-delimited text, so you can simply import them into R and carry out further analysis.
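
To illustrate the import step, here is a minimal sketch in base R. It does not use a real COSMIC file: it writes a tiny tab-delimited file to a temporary path as a stand-in, and the column names (gene, sample, mutation) are made up for the example, so check the header of the file you actually download. The read.delim call is the part that carries over.

```r
# Hypothetical stand-in for a downloaded COSMIC export. The real files have
# different (and many more) columns; these three are assumptions for the demo.
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c(
  "gene\tsample\tmutation",
  "TP53\tS1\tp.R175H",
  "KRAS\tS2\tp.G12D",
  "TP53\tS3\tp.R248Q"
), tsv_path)

# read.delim() assumes a header row and tab-separated fields by default.
mutations <- read.delim(tsv_path, stringsAsFactors = FALSE)

# Example of "further analysis": count mutation records per gene,
# most frequently mutated gene first.
gene_counts <- sort(table(mutations$gene), decreasing = TRUE)
print(gene_counts)
```

For a real download, replace tsv_path with the path to your file. If the file is very large, data.table::fread() is a commonly used faster alternative to read.delim().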