Choosing between the TCGA Legacy Archive and harmonized data when comparing against GTEx in a differential expression analysis
5.0 years ago
Yun ▴ 230

I want to do a differential expression analysis between prostate cancer and normal tissue, using RNA-seq read counts from TCGA for the tumors and RNA-seq read counts from GTEx for the normal tissue. I realize that differences in the RNA-seq workflow and alignment may cause problems. TCGA has two separate parts, Legacy and Harmonized, and their reference genomes differ: the harmonized data use hg38, the legacy data use hg18 or hg19, and GTEx uses hg19. However, the raw counts in the TCGA Legacy Archive are RSEM values, which are decimal, while the GTEx counts are integers. I don't know whether I can use the TCGA harmonized data for the differential expression analysis. Which should I choose?
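To make the decimal-versus-integer issue concrete, here is a minimal Python sketch. The gene names and values are made-up toy data, and the helper functions are hypothetical names of my own, not part of any TCGA or GTEx tool. It checks whether a count table is integer-valued, rounds RSEM expected counts to integers, and finds the genes shared by both tables. Rounding only makes the values integer; it does not remove differences caused by the reference genome or the quantification pipeline.

```python
# Toy data (hypothetical): gene -> per-sample values.
tcga_rsem = {
    "GENE_A": [10.37, 250.0, 0.0],
    "GENE_B": [3.92, 7.08, 12.6],
    "GENE_C": [1.2, 0.0, 5.4],
}
gtex_counts = {
    "GENE_A": [12, 240, 1],
    "GENE_B": [4, 9, 11],
    "GENE_D": [0, 3, 2],
}


def is_integer_valued(counts):
    """Return True if every value in the table is a whole number."""
    return all(float(v).is_integer() for row in counts.values() for v in row)


def round_expected_counts(counts):
    """Round each value to the nearest integer (Python rounds exact ties to even)."""
    return {gene: [int(round(v)) for v in row] for gene, row in counts.items()}


def shared_genes(a, b):
    """Return the sorted list of gene identifiers present in both tables."""
    return sorted(set(a) & set(b))


tcga_rounded = round_expected_counts(tcga_rsem)
common = shared_genes(tcga_rounded, gtex_counts)

print(is_integer_valued(tcga_rsem))    # RSEM values contain decimals
print(is_integer_valued(gtex_counts))  # GTEx values are whole numbers
print(common)
```

In real data the two tables would also need matching gene identifiers (same annotation and ID type) before they can be combined, which this sketch simply assumes.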

