Question: Analysing gene expression across clinical parameters
22 months ago, elizabethR wrote:

Hi

I have been doing some bioinformatic analysis of TCGA data as an adjunct to my PhD. I want to take this a little further and also analyse RNA-seq expression data according to clinical parameters such as disease stage, age, sex, and histological markers of invasion. Has anyone done this kind of analysis? What sort of statistical tests do you use for it? I am assuming that ANOVA is not appropriate for such a large dataset, given how many multiple comparisons are made across it? Bioinformatic statistics is a very new area to me.

Also, could anyone recommend packages for this type of analysis? I have started using the amazing RStudio and TCGAbiolinks, but there doesn't appear to be a package in the Bioconductor guide that is suitable for this.

Really grateful for any advice guys :-)
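
Editor's note on the multiple-comparisons concern: a common pattern is to run one test per gene and then adjust the resulting p-values for the number of genes tested. The sketch below uses base R only and simulated log-scale values, so `expr`, `stage`, and the effect size are all made-up placeholders rather than TCGA data. It is meant to show the shape of the correction step, not a recommended RNA-seq workflow.

```r
# Simulated data only: 200 genes x 30 samples of log-scale expression.
set.seed(1)
n_genes <- 200
stage <- factor(rep(c("I", "II", "III"), each = 10))
expr <- matrix(rnorm(n_genes * length(stage)), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

# Make the first 10 genes truly stage-dependent (higher in stage III).
expr[1:10, stage == "III"] <- expr[1:10, stage == "III"] + 3

# One one-way ANOVA per gene, keeping only the p-value for the stage term.
p_raw <- apply(expr, 1, function(y) {
  anova(lm(y ~ stage))[["Pr(>F)"]][1]
})

# Benjamini-Hochberg adjustment controls the false discovery rate
# across all genes tested.
p_bh <- p.adjust(p_raw, method = "BH")

sum(p_raw < 0.05)  # number of hits before correction
sum(p_bh < 0.05)   # number of hits after correction
```

Comparing the two counts shows how many nominal hits survive once the number of tests is taken into account. Dedicated packages such as limma and edgeR report an adjusted p-value (FDR) column in their result tables.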

clinical rna-seq tcga • 697 views

Hi elizabethR, I am not very familiar with TCGA data, but if you want to do a class comparison test between two or more phenotypes, you should first preprocess your data (including log transformation, summarization, and normalization). If you want to use GEO data, I suggest using InsilicoDB (https://insilicodb.com) to retrieve preprocessed data; it also provides a pipeline that simplifies the job. In addition, limma is a robust package for analyzing gene expression data; it can fit a linear model to find markers of each phenotype: https://bioconductor.org/packages/release/bioc/html/limma.html
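
Editor's note: for RNA-seq counts, one common way to apply the limma linear model described above is the voom route. This is a minimal sketch that assumes the limma and edgeR Bioconductor packages are installed. The count matrix and the two-level `pheno` factor are simulated placeholders, not real TCGA or GEO data.

```r
library(limma)
library(edgeR)  # used here only for DGEList() and calcNormFactors()

set.seed(2)
# Simulated counts: 500 genes x 12 samples (placeholder for real data).
counts <- matrix(rnbinom(500 * 12, mu = 100, size = 5), nrow = 500,
                 dimnames = list(paste0("gene", 1:500), paste0("s", 1:12)))
pheno <- factor(rep(c("A", "B"), each = 6))   # hypothetical phenotype labels

design <- model.matrix(~ pheno)               # columns: (Intercept), phenoB

dge <- DGEList(counts = counts)
dge <- calcNormFactors(dge)                   # TMM normalization factors
v   <- voom(dge, design)                      # log2-CPM with precision weights
fit <- eBayes(lmFit(v, design))               # per-gene linear model, moderated t

# Full result table for the B-versus-A coefficient.
res <- topTable(fit, coef = "phenoB", number = Inf)
head(res)
```

The `adj.P.Val` column of `res` holds Benjamini-Hochberg adjusted p-values, which is the usual column to threshold on.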

written 22 months ago by Shamim Sarhadi
22 months ago, elizabethR wrote:

Thank you guys. EagleEye, that link was very useful. The edgeR manual says you should use raw counts that haven't been normalised, because normalisation is handled within its statistical model rather than by transforming the data beforehand. I've used edgeR to perform differential expression analysis across clinical parameters.
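
Editor's note: the poster's own code is not shown in the thread, so the following is only a sketch of what an edgeR analysis across a clinical parameter can look like. It assumes the edgeR package is installed and uses simulated raw counts. The `clin` data frame, its `stage` and `sex` columns, and the sample sizes are hypothetical.

```r
library(edgeR)

set.seed(3)
# Simulated raw counts: 500 genes x 16 samples (placeholder for real data).
counts <- matrix(rnbinom(500 * 16, mu = 50, size = 5), nrow = 500,
                 dimnames = list(paste0("gene", 1:500), paste0("s", 1:16)))

# Hypothetical clinical annotation, one row per sample.
clin <- data.frame(
  stage = factor(rep(c("I", "II", "III", "IV"), each = 4)),
  sex   = factor(rep(c("F", "M"), times = 8))
)

y <- DGEList(counts = counts)                 # raw, unnormalised counts
design <- model.matrix(~ sex + stage, data = clin)
# Design columns: (Intercept), sexM, stageII, stageIII, stageIV

keep <- filterByExpr(y, design)               # drop very low-count genes
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)                       # TMM factors used as model offsets
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)

# ANOVA-like test: any difference among stages II-IV relative to stage I,
# adjusting for sex.
qlf <- glmQLFTest(fit, coef = 3:5)
topTags(qlf)
```

Testing several coefficients at once (`coef = 3:5`) asks whether expression differs between any of the stages. A single pairwise comparison would instead test one coefficient or a contrast.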


In that case have a look at this post,

A: How to work with Level 3 data (RPKM values) from TCGA database

written 22 months ago by EagleEye

Hi elizabethR,

I am working on TCGA lung adenocarcinoma data. Referring to your question and the comments above: you said that you used edgeR to perform differential expression analysis across clinical parameters, and I am working on the same type of analysis. I want to find the DEGs between tumor stages (Stage I to IV), but I have not been able to find the right way to do so. Here is my post on Biostars, where I have provided the R code I used and described how I got the samples from the FireBrowse data:

Not able to get DEG from FireBrowse data using R

I think I'm going wrong at some point in the analysis. If you have sample R code, please share it with me, or please help me find what I am doing wrong in the DEG analysis.

Thank you in advance

written 20 days ago by lawarde.ankita
22 months ago, EagleEye (Sweden) wrote:

Check this out, A: Use of TCGA database to get information on protein Expression
