Question: Analysis of gene expression data
7 months ago, shivangi.agarwal800 wrote:

Hi

I have a list of transcripts (~70,000) with their expression values in cancer and adjacent normal samples. I want to calculate a p-value for each pair of transcripts. How can I do that? Thanks in advance.

Regards

p-value • 636 views
2

You should provide further information, such as:

  • the source of your data (e.g. web resource, microarray, RNA-seq, NanoString, or something else)
  • how many samples you have
  • any pre-processing that has been performed

Nobody can help you sufficiently with the information that you have currently provided.

written 7 months ago by Kevin Blighe

OK, thanks for the information. The data is raw TCGA expression data and contains the transcript ID and the expression value (TPM) in cancer and adjacent normal samples:

Transcript_id   Expression value (TPM) in cancer   Expression value (TPM) in adjacent normal
uc001aaa.3      0.519993743                        2.07946602736613E-95
uc001aab.3      1.195267176                        0.4213373079
uc001aac.3      0.2816408622                       0
uc001aae.3      4.16436457205392E-67               6.37487575711405E-207
uc001aah.3      1.24270443179236E-241              0.1938754949
uc001aai.1      16.1663933608                      5.3783318364
uc001aak.2      0                                  0.0286585119
modified 7 months ago by Devon Ryan • written 7 months ago by shivangi.agarwal800

Hello again. With this data, you cannot produce a p-value per gene, because you have just a single value for each gene in the tumour and normal samples. Can you elaborate on (describe further) the source of the data? Many third-party websites (i.e. outside of the National Cancer Institute of the USA) host TCGA data, which is in various stages of processing.

written 7 months ago by Kevin Blighe

Hi. The data was taken from TCGA by Cancer RNA-seq Nexus, and we obtained it from there. Can I apply Student's t-test to it?

written 7 months ago by shivangi.agarwal800
1

No, with the data that you have, you cannot use the Student's t-test. For example, if you wanted to derive a p-value for the uc001aaa.3 gene, your comparison would just be 0.519993743 vs. 2.07946602736613E-95. A p-value cannot be derived from just 2 values, because a single value per group gives no estimate of the variation within each group.

From Cancer RNA-seq Nexus, you should try to obtain the expression values for the genes across all tumours and all normal samples. Then, you could begin to think about conducting differential expression analysis.
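
To illustrate why one value per group is not enough, here is a minimal Python sketch (standard library only). It uses an exact permutation test on the difference in means, which is a stand-in chosen for simplicity and is not a recommended method for differential expression analysis. The function name `permutation_p_value` and the replicate values in `tumour_reps` and `normal_reps` are hypothetical and invented for illustration; they are assumed to be log-scale expression values and do not come from TCGA or Cancer RNA-seq Nexus.

```python
from itertools import combinations
from statistics import StatisticsError, mean, variance


def permutation_p_value(tumour, normal):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(tumour) + list(normal)
    n_tumour = len(tumour)
    observed = abs(mean(tumour) - mean(normal))
    hits = total = 0
    # Enumerate every way of labelling n_tumour of the pooled values as "tumour".
    for idx in combinations(range(len(pooled)), n_tumour):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count relabellings at least as extreme as the observed difference.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            hits += 1
    return hits / total


# One value per group, as in the uc001aaa.3 row of the posted table.
single_tumour = [0.519993743]
single_normal = [2.07946602736613e-95]

try:
    variance(single_tumour)
except StatisticsError:
    print("A single value gives no variance estimate.")

# Only two relabellings exist and both are equally extreme, so p is always 1.0.
print(permutation_p_value(single_tumour, single_normal))

# Hypothetical replicate values (invented for illustration only).
tumour_reps = [4.1, 4.3, 3.9, 4.4, 4.0]
normal_reps = [2.4, 2.6, 2.2, 2.5, 2.3]
print(permutation_p_value(tumour_reps, normal_reps))
```

With a single value in each group, the result is 1.0 no matter how far apart the two numbers are. Only with several samples per group can the test distinguish a real difference from sample-to-sample variation.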

modified 7 months ago • written 7 months ago by Kevin Blighe

If you do not have much experience with bioinformatics, then can I suggest that you reach out to (that is, contact) a local collaborator (in your university / college, or elsewhere) and ask them for assistance?

Also, there are web-based GUIs that allow you to analyse TCGA data, such as cBioPortal.

written 7 months ago by Kevin Blighe
1

If you have count data, try following one of the RNA-seq expression tutorials online. Here's a good one to start with: https://f1000research.com/articles/4-1070/v1

written 7 months ago by James Ashmore

I want to do it using a t-test or another statistical test in SPSS.

written 7 months ago by shivangi.agarwal800
1

Those tests are not appropriate for expression data.

written 7 months ago by WouterDeCoster