I couldn't find proper documentation of the software used to generate the read counts in the TCGA level 3 data.
I ran a 21 normal vs. 21 tumor sample analysis on TCGA RNA-Seq level 3 data to find differentially expressed genes using DESeq.
I also took the Illumina Body Map SRA file, aligned it with TopHat, and generated read counts with HTSeq. I then compared these HTSeq counts, generated from the TopHat BAM files, with the 21 tumor samples from the TCGA level 3 data.
I expected the differentially expressed genes from the two DESeq comparisons, "Illumina Body Map vs. 21 tumor samples" and "21 normal vs. 21 tumor samples from TCGA", to overlap well. However, very few genes overlap.
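To make "very few genes overlap" concrete, here is a minimal sketch of how the overlap between the two gene lists could be quantified. It is not part of my actual analysis: the function name `overlap_stats`, the example gene symbols, and the universe size are all hypothetical, and it assumes both lists use the same kind of gene identifier and that the universe is the set of genes tested in both comparisons.

```python
from math import comb  # requires Python 3.8+


def overlap_stats(genes_a, genes_b, universe_size):
    """Return shared genes, Jaccard index, and a hypergeometric p-value.

    genes_a, genes_b: iterables of gene IDs called differentially expressed
        in each comparison (assumed to share one ID namespace).
    universe_size: number of genes tested in both comparisons (assumption:
        must be at least as large as either list).
    """
    a, b = set(genes_a), set(genes_b)
    shared = a & b
    union = a | b
    jaccard = len(shared) / len(union) if union else 0.0

    # Upper tail of the hypergeometric distribution: probability of seeing
    # at least len(shared) common genes if list b were drawn at random.
    k, n_a, n_b = len(shared), len(a), len(b)
    tail = sum(
        comb(n_a, i) * comb(universe_size - n_a, n_b - i)
        for i in range(k, min(n_a, n_b) + 1)
    )
    p_value = tail / comb(universe_size, n_b)
    return shared, jaccard, p_value


if __name__ == "__main__":
    # Hypothetical toy lists, only to show the call.
    body_map_vs_tumor = {"TP53", "MYC", "EGFR", "BRCA1"}
    normal_vs_tumor = {"MYC", "EGFR", "KRAS"}
    shared, jaccard, p = overlap_stats(body_map_vs_tumor, normal_vs_tumor, 10)
    print(sorted(shared), round(jaccard, 3), round(p, 3))
```

A low Jaccard index together with a large p-value would suggest the two lists agree no better than chance. Note that a mismatch in identifier type between the two count tables (for example gene symbols in one and Ensembl IDs in the other) would also show up as a small overlap.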
Does this mean that something is wrong with my processing of the Illumina Body Map file, and/or is it due to differences in the protocol followed for the TCGA data?
Could anyone tell me how the read counts in the TCGA level 3 data are generated, and with which program?