I would like to plot RNASeq data that I have downloaded from TCGA in a PCA plot. I have found some great guides on how to plot the actual data in PCA using r in ggplot2 and such but my main question is what format data should I plot?
I currently have raw counts and RSEM data. Should I input raw counts into something like edgeR or deseq2 and filter for expression by cpm first? Should I normalise it? Should I stabilise variance using rlog2? Or convert to TPM and plot that? Argh I'm so confused. Grateful for any advice you can give me :)