I have a question about analyzing Lung Adenocarcinoma (LUAD) data from the TCGA database, specifically RNAseqV2 Level 3.
To examine batch effects in these data, I used the mbatch website (http://bioinformatics.mdanderson.org/main/TCGABatchEffects:Overview).
Judging by the number of samples, they used the same samples as I did, but I could not reproduce their PCA results. I'm sure the reason is that I'm not using the same data as they did. There are many files belonging to Level 3 in RNAseqV2: gene level, exon level, raw data, normalized data, estimated data.
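To make the comparison concrete, here is a minimal sketch of how the same PCA could be run on each candidate Level 3 matrix so the results can be compared side by side. The function name `pca_scores`, the log2(x + 1) transform, the per-gene centering, and the synthetic matrix are all assumptions for illustration; I do not know whether mbatch preprocesses the data this way.

```python
import numpy as np


def pca_scores(expr, n_components=2):
    """PCA of a genes x samples matrix of non-negative expression values.

    Hypothetical helper: the log2(x + 1) transform and per-gene centering
    are assumptions, not the documented mbatch procedure.
    """
    logged = np.log2(np.asarray(expr, dtype=float) + 1.0)
    # Center each gene (row) across samples.
    centered = logged - logged.mean(axis=1, keepdims=True)
    # SVD of the samples x genes matrix; rows of u correspond to samples.
    u, s, _ = np.linalg.svd(centered.T, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Synthetic stand-in for one candidate file (200 genes x 12 samples).
rng = np.random.default_rng(0)
expr = rng.gamma(shape=2.0, scale=50.0, size=(200, 12))

scores, explained = pca_scores(expr)
print(scores.shape)   # one row of PC coordinates per sample
print(explained)      # fraction of variance for each returned component
```

Running this once per candidate file (for example the gene-level raw counts and the gene-level normalized values) and plotting the first two columns of `scores` should show which input gives a picture closest to the published one.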
My question is: which data did they use for their PCA? I posted this question in the mbatch forum, but I didn't get any answers.
That is why I'm posting the same question here, hoping that someone may know the answer.
Thank you very much!