RNA-seq analysis (pre-processing)
3.4 years ago
leticia ▴ 20

Hi everyone, I really need your help. I am a beginner in R and in bioinformatics. I want to do a differential expression analysis in R. I have a dataset from TCGA with 27 genes from 1083 patients, and I want to compare the expression of these genes between two conditions. While searching, I found that I have to start with a pre-processing step, but the scripts I found work on data that look very different from my dataset, and I don't know how to start. Can anyone help me find a solution? Thank you in advance.

RNA-Seq R • 1.1k views

Well, nobody can help you yet, because we cannot see the format in which your data are stored. To maximise your chances of receiving help, you need to provide a minimal reproducible example [of data]. You should also show some lines of code that you have already tried, so that we are sure that we are not just doing your work for you. You should also explain from where, exactly, you got your data. Finally, which "scripts" did you find? Thanks.

With only 27 genes you will have a hard time normalizing data properly, but as Kevin says, please add details and clarity.

Thank you very much for your reply. In fact, I noticed that I first need the control database of TCGA, but I don't know how to find it. Do you have any idea?

It's not clear what you want - is it control samples?

I would recommend following a user-friendly workflow, such as those provided by TCGAbiolinks: https://bioconductor.org/packages/release/bioc/html/TCGAbiolinks.html

Yes, exactly. I am looking for normal samples in TCGA that are matched to the cancer samples.

I see. If you are a beginner with minimal experience, I really encourage you to try TCGAbiolinks.

I will, thank you Kevin.

3.2 years ago
Elucidata ▴ 270

The following sources should help you get tumor samples along with control samples from TCGA:

  • Xena Browser hosts a collection of processed TCGA, GDC, and other public cancer genomics data. It has a dataset with a combined cohort of GTEx, TCGA, and TARGET samples.
    1. TCGA is a database of 33 cancer types, with matched control samples available for a subset of patients.
    2. The TARGET database focuses on childhood cancers, with the aim of developing more effective treatments for pediatric patients.
    3. The GTEx project focuses on characterizing tissue-specific gene expression and regulation. A batch correction was performed when combining these large datasets.

The datasets to download are: TPM Gene Expression and Phenotype. Differential expression can then be performed between any cohorts of interest using the limma package in R.
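
To give a feel for what a two-group comparison computes, here is a minimal sketch in plain Python. It is not limma: limma fits linear models and uses moderated t-statistics, whereas this uses an ordinary Welch t statistic per gene. The gene names and values are made up for illustration, and the values are assumed to be already log2-transformed, so the difference in means can be read as a log2 fold change.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Return (difference in means, Welch t statistic) for two lists of log2 values."""
    diff = mean(group_a) - mean(group_b)
    # Standard error without assuming equal variances in the two groups
    se = math.sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    t = diff / se if se > 0 else float("nan")
    return diff, t

# Hypothetical log2 expression values (NOT real TCGA data)
expression = {
    "GENE_A": {"tumor": [8.0, 9.0, 10.0], "normal": [5.0, 6.0, 7.0]},
    "GENE_B": {"tumor": [4.0, 5.0, 6.0], "normal": [4.0, 5.0, 6.0]},
}

results = {gene: welch_t(v["tumor"], v["normal"]) for gene, v in expression.items()}
for gene, (log2fc, t) in results.items():
    print(f"{gene}: log2FC={log2fc:.2f}, t={t:.2f}")
```

For a real analysis you would still want p-values and multiple-testing correction, which limma provides in R.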

  • TCGAbiolinks is an R package, freely available through Bioconductor. It handles data query and retrieval from the TCGA and GDC databases. It also provides multiple methods for data analysis (e.g., differential expression analysis, identifying differentially methylated regions) and visualization (e.g., survival plots, volcano plots, starburst plots).
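
On the matched-normal question raised in the comments: whichever source you download from, TCGA sample barcodes encode the sample type in the two digits at the start of the fourth field (01 to 09 are tumor, 10 to 19 are normal, 20 to 29 are control). A minimal sketch of using that to find patients with both a tumor and a normal sample is below. The barcodes are shortened, made-up examples, and the helper names `sample_class` and `matched_pairs` are hypothetical, not part of any package.

```python
from collections import defaultdict

def sample_class(barcode):
    """Classify a TCGA barcode from its sample-type code (4th field, first two digits)."""
    code = int(barcode.split("-")[3][:2])  # e.g. '01A' -> 1, '11A' -> 11
    if 1 <= code <= 9:
        return "tumor"
    if 10 <= code <= 19:
        return "normal"
    if 20 <= code <= 29:
        return "control"
    return "other"

def matched_pairs(barcodes):
    """Group barcodes by patient and keep patients with both tumor and normal samples."""
    by_patient = defaultdict(lambda: defaultdict(list))
    for bc in barcodes:
        patient = "-".join(bc.split("-")[:3])  # TCGA-<site>-<participant>
        by_patient[patient][sample_class(bc)].append(bc)
    return {
        patient: dict(groups)
        for patient, groups in by_patient.items()
        if "tumor" in groups and "normal" in groups
    }

# Made-up barcodes for illustration only
barcodes = ["TCGA-AA-0001-01A", "TCGA-AA-0001-11A", "TCGA-BB-0002-01A"]
pairs = matched_pairs(barcodes)
print(pairs)
```

Only the first patient is kept here, since the second has a tumor sample but no normal one. Expect the same in real data: many patients have no matched normal sample.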