Hi! I am working with TCGA level 3 RNASeqV2 data. My goal is to identify differentially expressed genes (DEGs, tumor vs. normal), and I am currently looking at LUAD.
I am using edgeR at the moment, since the original RSEM paper mentioned that RSEM estimates can be processed by edgeR/DESeq.
I have a few questions and would appreciate any suggestions:
1. Does it make sense to include all the available tumor samples in the DEG analysis, including those without a matching normal sample from the same participant? If so, what normalization method would be recommended? Or can I just use edgeR's default normalization?
2. I started out looking at only the matched TN and NT samples. With the approach above I'm getting 5639 DEGs out of 20531 genes (FDR <= 0.05, FC >= 2), which seems like a lot? (And even more if I don't apply any FC filter.)
3. There seem to be various discussions about which tools to use (http://seqanswers.com/forums/showthread.php?t=28515, https://groups.google.com/forum/#!topic/rsem-users/H1cswrvvmPs). Does anyone with more experience analyzing TCGA datasets have thoughts on whether it is OK to use edgeR, or should I use another tool such as EBSeq for RNASeqV2 data?
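To make the thresholds in question 2 concrete, here is a minimal Python sketch of the kind of filter I mean. It is not the edgeR code itself. It assumes that "FDR" means Benjamini-Hochberg adjusted p-values and that "FC >= 2" means an absolute log2 fold change of at least 1 in either direction; the function names and the toy values are hypothetical, not taken from the LUAD results.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(genes, log2_fc, pvalues, fdr_cutoff=0.05, fc_cutoff=2.0):
    """Keep genes with FDR <= fdr_cutoff and fold change >= fc_cutoff (up or down)."""
    fdr = benjamini_hochberg(pvalues)
    min_abs_log2_fc = math.log2(fc_cutoff)  # FC >= 2 is |log2 FC| >= 1
    return [
        gene
        for gene, lfc, q in zip(genes, log2_fc, fdr)
        if q <= fdr_cutoff and abs(lfc) >= min_abs_log2_fc
    ]


# Hypothetical toy input, only to show the call.
toy_genes = ["A", "B", "C", "D"]
toy_log2_fc = [1.5, -2.0, 0.2, 3.0]
toy_pvalues = [0.01, 0.04, 0.03, 0.5]
toy_degs = select_degs(toy_genes, toy_log2_fc, toy_pvalues)
```

Dropping the fold-change condition (for example `fc_cutoff=1.0`) leaves only the FDR cutoff, which is the case where I get even more DEGs.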
Any suggestions are greatly appreciated. Thanks a lot in advance!