Question: RNA-seq cBioPortal differential gene expression
2.9 years ago, kashyap_path10 wrote:

Hello, I am very new to RNA-seq data and have no bioinformatics background. I have downloaded the TCGA data from cBioPortal. I would like to know about ‘data_RNA_Seq_v2_expression_median’. I presume they are absolute expression values. Why is it that for some tumors, the values are zero?

Can these values be converted to log scale to compare expression between different histologic subtypes of tumors (as for microarrays)? I would like to identify differentially expressed genes and generate heat maps from the expression data. How do I go about it, and which software would be useful for this?

Thanks, Deepti

modified 2.9 years ago by Ron • written 2.9 years ago by kashyap_path10
2.9 years ago, Ron (United States) wrote:

- Yes, those should be RSEM expression values. The value can be 0 for a gene with no detected expression in a given sample.

- cBioPortal also provides z-scores, which can be negative.

- Expression values can be converted to log scale for heat maps and other visualizations.

- For differential expression analysis, you can follow these tutorials:

written 2.9 years ago by Ron
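
To illustrate why z-scores can be negative, here is a minimal sketch that scales one gene's values across samples to mean 0 and standard deviation 1, as is often done for heat map rows. The function name `row_z_scores` and the example values are made up for illustration, and this is not necessarily how cBioPortal computes its own z-score files, which may use a different reference set of samples.

```python
from statistics import mean, stdev

def row_z_scores(values):
    """Scale one gene's values across samples to mean 0 and sample SD 1."""
    if len(values) < 2:
        return [0.0 for _ in values]
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        # A gene with identical values in every sample (e.g. all zeros)
        # has no spread to scale by, so return zeros (an assumption of this sketch).
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]

example = [1.0, 2.0, 3.0]      # made-up values for one gene in three samples
print(row_z_scores(example))   # [-1.0, 0.0, 1.0]
```

Samples below the gene's mean get negative z-scores, and samples above it get positive ones.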

Careful, because log(0) = -Inf! You need to add a small constant if you want to go down that road.

modified 2.9 years ago • written 2.9 years ago by TriS

Yes, usually we use log2(RPKM + 1) or log2(FPKM + 1). Some people add 0.1 instead of 1. Both work.

written 2.9 years ago by Ron
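
The pseudocount transform discussed in the comments above can be sketched as follows. This is a minimal illustration using only the Python standard library. The function name `log2_pseudocount` and the example values are hypothetical. Note that Python's `math.log2(0)` raises a `ValueError` rather than returning -Inf as R does.

```python
import math

def log2_pseudocount(values, pseudocount=1.0):
    """Return log2(value + pseudocount) for each non-negative expression value."""
    return [math.log2(v + pseudocount) for v in values]

values = [0.0, 1.0, 3.0, 1023.0]  # made-up expression values, including a zero

# Without a pseudocount the zero cannot be log-transformed.
try:
    math.log2(values[0])
except ValueError:
    print("log2(0) is undefined")

print(log2_pseudocount(values))       # [0.0, 1.0, 2.0, 10.0]
print(log2_pseudocount([0.0], 0.1))   # roughly [-3.32]
```

With a pseudocount of 1, zeros stay at 0 on the log scale. With 0.1, zeros map to about -3.32, which pushes them further from low non-zero values.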

Dear Tris,

As mentioned above, I have downloaded the TCGA data from cBioPortal. I would like to know about ‘data_RNA_Seq_v2_expression_median’. It has values like the following:

Hugo_Symbol     TCGA-2V-A95S-01    TCGA-2Y-A9GS-01    TCGA-2Y-A9GT-01
UBE2Q2P3        1.5051             26.412             0
UBE2Q2P3        3.7074             2.6663             4.4833
LOC149767       90.1124            71.0054            95.5122
TIMM23          1017.1038          639.2311           742.4344
MOXD2           0                  0                  0
LOC155060       141.3911           22.7206            95.046
LOC100132347    1285.5514          1281.4194          535.306
EFCAB12         20.1987            27.5998            151.5355
LOC147680       22282.261          22641.9271         77669.9484
A1CF            583.1569           1572.6959          1280.4304

For differential expression analysis, Ron mentioned some tutorials, but those use raw count data. I would like to know how to do differential analysis with the table above. Do I need to transform the data before using it for differential analysis? Could you please advise?

Thank you

modified 2.2 years ago • written 2.2 years ago by Biologist160
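
The thread does not record an answer to this question. As a rough exploratory sketch only, and not a method endorsed by anyone in the thread, one simple way to compare two groups on already-normalized values is to log-transform with a pseudocount and compute a per-gene log2 fold change and Welch's t statistic. The group labels "A" and "B", the function names, and the example values are all hypothetical. A t statistic is not a p-value, and testing many genes would also need multiple-testing correction.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic for two lists of values (each needs at least 2 values)."""
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    if se == 0:
        return math.nan  # no variation in either group, statistic undefined
    return (mean(group_a) - mean(group_b)) / se

def compare_gene(values, labels, pseudocount=1.0):
    """Log-transform one gene's values and compare hypothetical groups A and B."""
    logged = [math.log2(v + pseudocount) for v in values]
    a = [x for x, lab in zip(logged, labels) if lab == "A"]
    b = [x for x, lab in zip(logged, labels) if lab == "B"]
    return {"log2_fc": mean(a) - mean(b), "t": welch_t(a, b)}

# Made-up values for one gene in four samples, two per hypothetical subtype.
result = compare_gene([3.0, 7.0, 15.0, 31.0], ["A", "A", "B", "B"])
print(result["log2_fc"])       # -2.0
print(round(result["t"], 3))   # -2.828
```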

Hi Bioinfo,

I would be really glad if you could also share the answer you found to your question.

Thanks in advance

written 2.1 years ago by gokce.ouz50

Hello, Thank you for the replies.

modified 2.8 years ago • written 2.8 years ago by kashyap_path10