RNA-seq cBioPortal differential gene expression
4.4 years ago

Hello, I am very new to RNA-seq data and have no bioinformatics background. I have downloaded the TCGA data from cBioPortal. I would like to know about the file ‘data_RNA_Seq_v2_expression_median’. I presume these are absolute expression values. Why are the values zero for some tumors?

Can these values be converted to log scale to compare expression between different histologic subtypes of tumors (as for microarrays)? I would like to identify differentially expressed genes and generate heat maps from the expression data. How do I go about it, and which software would be useful for this?

Thanks, Deepti

4.4 years ago
Ron ★ 1.0k

- Yes, those should be RSEM expression values. A value can be 0 for a gene that is not detected in a given sample.

- cBioPortal also provides z-scores, which can be negative.

- Expression values can be converted to log scale for heat maps and other visualizations.

- For differential expression analysis, you can follow these tutorials:

https://www.r-bloggers.com/tutorial-rna-seq-differential-expression-pathway-analysis-with-sailfish-deseq2-gage-and-pathview/

http://www.bioconductor.org/help/workflows/rnaseqGene/
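To illustrate why z-scores can be negative, here is a minimal sketch of a plain per-gene z-score across samples. It assumes the sample standard deviation over all listed samples as the scale; cBioPortal's own z-scores are computed against a reference population that may differ, so this is not a reproduction of its method. The function name `z_scores` and the input values are illustrative only.

```python
import statistics


def z_scores(values):
    """Center and scale one gene's expression values across samples."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumption)
    if sd == 0:
        # A gene with identical values everywhere has no spread to scale by;
        # returning zeros here is a choice made for this sketch.
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


# Illustrative expression values for one gene in three samples.
expression = [1017.1038, 639.2311, 742.4344]
scores = z_scores(expression)
for value, score in zip(expression, scores):
    print(f"{value:>10.4f} -> z = {score:+.3f}")
```

Samples below the gene's mean get a negative z-score and samples above it get a positive one, which is why the z-score tables contain negative numbers even though the expression values themselves never do.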


Careful, because log(0) = -Inf! You need to add a small constant if you want to go down that road.


Yes, usually we use log2(RPKM + 1) or log2(FPKM + 1). Some people add 0.1 instead of 1. Both work.
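A minimal sketch of the pseudocount transform described above, using only the Python standard library. The function name `log2_pseudo` and the input values are illustrative assumptions, not part of any cBioPortal tooling.

```python
import math


def log2_pseudo(value, pseudocount=1.0):
    """Return log2(value + pseudocount); the pseudocount keeps zeros finite."""
    return math.log2(value + pseudocount)


# Illustrative expression values, including a zero.
values = [1.5051, 26.412, 0.0]

# Note: math.log2(0.0) raises ValueError in Python (R returns -Inf instead),
# so the pseudocount is needed before taking the log.
with_one = [log2_pseudo(v) for v in values]
with_tenth = [log2_pseudo(v, pseudocount=0.1) for v in values]

for v, a, b in zip(values, with_one, with_tenth):
    print(f"{v:>8.4f}  log2(x+1) = {a:.3f}  log2(x+0.1) = {b:.3f}")
```

With a pseudocount of 1, a zero stays at exactly 0 on the log scale; with 0.1, zeros become negative, which spreads out the low end of a heat map more.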


Dear Tris,

As mentioned above, I have downloaded the TCGA data from cBioPortal. I would like to know about ‘data_RNA_Seq_v2_expression_median’. It has values like the following:

Hugo_Symbol     TCGA-2V-A95S-01  TCGA-2Y-A9GS-01  TCGA-2Y-A9GT-01
UBE2Q2P3        1.5051           26.412           0
UBE2Q2P3        3.7074           2.6663           4.4833
LOC149767       90.1124          71.0054          95.5122
TIMM23          1017.1038        639.2311         742.4344
MOXD2           0                0                0
LOC155060       141.3911         22.7206          95.046
LOC100132347    1285.5514        1281.4194        535.306
EFCAB12         20.1987          27.5998          151.5355
LOC147680       22282.261        22641.9271       77669.9484
A1CF            583.1569         1572.6959        1280.4304


For differential expression analysis, "Ron" mentioned some tutorials, but those work with raw count data. I would like to know how to do differential analysis with the table above. Do I need to transform the data before using it for differential analysis? Could you please advise?

Thank you
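As a possible starting point for the question above, here is a minimal sketch that parses a table in this layout, applies log2(x + 1), and computes a per-gene log2 fold change between two groups of samples. Several things are assumptions: the fields are split on whitespace, the assignment of samples to groups is hypothetical (the thread does not say which tumors belong to which subtype), and the function names are made up for illustration. A fold change on its own is not a differential expression test; a real analysis would add a statistical test and multiple-testing correction, which this sketch leaves out.

```python
import math
import statistics

# A few rows copied from the table above, whitespace-delimited.
TABLE = """\
Hugo_Symbol TCGA-2V-A95S-01 TCGA-2Y-A9GS-01 TCGA-2Y-A9GT-01
TIMM23 1017.1038 639.2311 742.4344
MOXD2 0 0 0
EFCAB12 20.1987 27.5998 151.5355
"""


def parse_table(text):
    """Return (sample_ids, rows), where rows is a list of (gene, values)."""
    lines = text.strip().splitlines()
    samples = lines[0].split()[1:]
    rows = []
    for line in lines[1:]:
        fields = line.split()
        # Rows are kept in a list because gene symbols can repeat in the file.
        rows.append((fields[0], [float(x) for x in fields[1:]]))
    return samples, rows


def log2_fold_change(values, group_a, group_b):
    """Mean log2(x + 1) in group_b minus mean log2(x + 1) in group_a."""
    logged = [math.log2(v + 1.0) for v in values]
    mean_a = statistics.mean([logged[i] for i in group_a])
    mean_b = statistics.mean([logged[i] for i in group_b])
    return mean_b - mean_a


samples, rows = parse_table(TABLE)

# Hypothetical grouping by column index: first two samples vs. the third.
group_a, group_b = [0, 1], [2]

results = {gene: log2_fold_change(vals, group_a, group_b) for gene, vals in rows}
for gene, lfc in results.items():
    print(f"{gene:<10} log2FC = {lfc:+.3f}")
```

Genes that are zero in every sample, such as MOXD2 here, come out with a fold change of 0 and are usually filtered out before any testing.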


Hi Bioinfo,

I would be really glad if you could also share the answer you found to your question.


Hello, Thank you for the replies.