How to handle data_RNA_Seq_v2_expression_median from TCGA
6.7 years ago

Hi, I recently downloaded data_RNA_Seq_v2_expression_median.txt from cBioPortal. The values are not integers, and I don't know how to process this kind of normalized data with DESeq2.

Should I download the lower-level (raw) data from the portal and run DESeq2 from scratch, or is there another package that can find differentially expressed genes from this file?

Hugo_Symbol  TCGA-BJ-A0YZ-01  TCGA-BJ-A0Z0-01  TCGA-BJ-A0Z2-01
UBE2Q2P2     1.8867           2.6927           10.0867
HMGB1P1      139.6335         181.2141         203.7297
LOC155060    45.3978          131.8725         248.4856
RNU12-2P     0.4165           0.3948           0.9502
SSX9         0                0                0
CXORF67      0                1.1845           2.3756

I hope someone with experience handling pre-processed data from TCGA can answer this question.

Many thanks!

Michael

rna-seq TCGA • 3.3k views
5.9 years ago

Edit: 14th May, 2020:

It is better to obtain the HTSeq raw counts from the UCSC Xena Browser and process those in DESeq2, following this guidance: A: Normalisation of RNAseq data from UCSC Xena Browser
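One thing to watch for with the Xena route: as far as I know, the GDC HTSeq count matrices on Xena are stored as log2(count + 1) rather than as plain integers, so they need to be back-transformed before DESeq2 will accept them. Please check the unit listed on the dataset page for the file you download. A minimal sketch of the back-transformation, assuming that unit (the function name and example values are hypothetical):

```python
def xena_to_counts(log_values):
    """Invert log2(count + 1) and round to the nearest integer count.

    Assumes the input really is log2(count + 1); check the dataset's
    stated unit before applying this.
    """
    return [int(round(2 ** v - 1)) for v in log_values]


# Hypothetical values: log2(count + 1) for counts of 0, 7 and 1023
example = [0.0, 3.0, 10.0]
print(xena_to_counts(example))  # [0, 7, 1023]
```

The rounding is there because floating-point storage can leave the back-transformed values a hair away from whole numbers.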

Original answer:

------------------------------

For DESeq2, you should obtain the raw counts. This data from cBioPortal is already normalised.
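A quick way to tell whether a matrix could be raw counts is to test that every value is a non-negative whole number. The sketch below does that on the first three rows of the table in the question; the helper name and the comparison rows are hypothetical, and passing the check does not prove the data are untransformed counts, only that they are not obviously normalised.

```python
def looks_like_raw_counts(rows):
    """Return True if every value is a non-negative whole number."""
    return all(v >= 0 and float(v).is_integer() for row in rows for v in row)


# Values copied from the table in the question (first three genes)
cbioportal_rows = [
    [1.8867, 2.6927, 10.0867],
    [139.6335, 181.2141, 203.7297],
    [45.3978, 131.8725, 248.4856],
]

# Hypothetical raw-count rows, for comparison only
raw_rows = [[0, 12, 5], [140, 181, 204]]

print(looks_like_raw_counts(cbioportal_rows))  # False
print(looks_like_raw_counts(raw_rows))         # True
```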

However, if you obtain the Z-scores from cBioPortal instead, then, according to cBioPortal, you can infer that a gene is more highly expressed in a tumour sample if its Z-score is >= 2.
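As a rough illustration of that rule, the sketch below lists, for each sample, the genes whose Z-score meets the cutoff. The Z-score values are made up for the example (only the gene and sample names come from the question), and the function name is hypothetical. Note that this is a simple threshold, not a statistical test for differential expression.

```python
def genes_above_threshold(z_scores, threshold=2.0):
    """Map each sample to the genes whose Z-score is >= threshold."""
    flagged = {}
    for gene, per_sample in z_scores.items():
        for sample, z in per_sample.items():
            if z >= threshold:
                flagged.setdefault(sample, []).append(gene)
    return flagged


# Hypothetical Z-scores; not real cBioPortal values
z = {
    "HMGB1P1": {"TCGA-BJ-A0YZ-01": 0.4, "TCGA-BJ-A0Z2-01": 2.3},
    "LOC155060": {"TCGA-BJ-A0YZ-01": -1.1, "TCGA-BJ-A0Z2-01": 2.0},
}

print(genes_above_threshold(z))
# {'TCGA-BJ-A0Z2-01': ['HMGB1P1', 'LOC155060']}
```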

If you want the raw counts, you can obtain those from the GDC Data Portal.

Kevin
