I have been doing some bioinformatic analysis of TCGA data as an adjunct to my PhD. I would like to take this a little further and also analyse RNA-Seq expression data against clinical parameters such as disease stage, age, sex and histological markers of invasion. Has anyone done this kind of analysis? What statistical methods do you use for it? I am assuming that running ANOVA is not appropriate for such a large dataset, given how many comparisons are being made across it. Is that right? Bioinformatic statistics is a very new area to me.
Also, could anyone recommend packages for this type of analysis? I have started using the amazing RStudio and TCGAbiolinks, but there doesn't appear to be a suitable package in the Bioconductor guide.
Really grateful for any advice, guys :-)