Question: testing set for RNA-seq AML
4 weeks ago by Ali0, South Africa

Hi everyone, I am using GDC TCGA data to train a machine learning model. Because the dataset has so few samples, I cannot hold out a testing set from it. So I would like to get a testing set of AML RNA-seq data from a different resource.

Kind regards

rna-seq • 137 views

Have you tried GEO or cBioPortal to see if they have any dataset you could use?

written 4 weeks ago by newbio17240

Thanks for your reply.

The TCGA data was easy to download and analyze because the samples and genes are gathered in a single CSV file. GEO is not organized the same way. I am looking for something similar.

Kind regards

written 4 weeks ago by Ali0

If the model that you have developed is 'robust' (yes, that magic word again...), then it should replicate in another dataset, including one from GEO. Perhaps think through your experimental design again.
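
As a rough illustration of what "replicating in another dataset" could look like, here is a minimal sketch, not a recommended pipeline. It assumes the two cohorts share only some genes and sit on different scales, so it restricts both to the shared genes and z-scores each gene within its own cohort before scoring. The nearest-centroid classifier is only a stand-in for whatever model you trained, and all values, labels and helper names are made up for the example.

```python
import math

def align_genes(genes_a, X_a, genes_b, X_b):
    """Restrict both cohorts to their shared genes, in the same column order."""
    idx_a = {g: i for i, g in enumerate(genes_a)}
    idx_b = {g: i for i, g in enumerate(genes_b)}
    shared = [g for g in genes_a if g in idx_b]
    A = [[row[idx_a[g]] for g in shared] for row in X_a]
    B = [[row[idx_b[g]] for g in shared] for row in X_b]
    return shared, A, B

def zscore_per_gene(X):
    """Standardize each gene (column) within a single cohort."""
    scaled_columns = []
    for column in zip(*X):
        mean = sum(column) / len(column)
        sd = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column)) or 1.0
        scaled_columns.append([(v - mean) / sd for v in column])
    return [list(row) for row in zip(*scaled_columns)]

def fit_centroids(X, y):
    """Stand-in model: the mean expression profile of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, X):
    """Assign each sample to the class with the closest centroid."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda c: sq_dist(row, centroids[c])) for row in X]

# Toy training cohort (stands in for TCGA); values are invented.
train_genes = ["FLT3", "NPM1", "CEBPA"]
train_X = [[9.0, 2.0, 5.0], [8.5, 2.5, 5.5], [3.0, 7.0, 5.0], [2.5, 7.5, 4.5]]
train_y = ["A", "A", "B", "B"]

# Toy external cohort: different gene order, different scale, one gene missing.
test_genes = ["NPM1", "GATA2", "FLT3"]
test_X = [[20.0, 1.0, 95.0], [75.0, 1.0, 30.0], [22.0, 2.0, 90.0], [70.0, 3.0, 28.0]]
test_y = ["A", "B", "A", "B"]

shared, A, B = align_genes(train_genes, train_X, test_genes, test_X)
centroids = fit_centroids(zscore_per_gene(A), train_y)
predictions = predict(centroids, zscore_per_gene(B))
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"shared genes: {shared}, external accuracy: {accuracy:.2f}")
```

Per-cohort z-scoring is only one simple way to lessen platform and batch differences, and it assumes the class mix is similar in both cohorts. Whether it is appropriate depends on your design.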

written 4 weeks ago by Kevin Blighe66k

To test the model, I also have to use the clinical file, where I can specify the sample class. All the issues I am facing relate to the clinical file (sample information).
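
To make the clinical-file step concrete, here is a minimal sketch of attaching a class label to each expression row. It assumes the expression table is keyed by sample barcode and the clinical table by case ID, which is the first three dash-separated fields of a TCGA barcode. The barcodes, values and the column names `sample`, `case_id` and `risk_group` are placeholders for the example, not the real GDC column names.

```python
import csv
import io

# Invented example tables; real files would be read from disk instead.
EXPRESSION_CSV = """sample,FLT3,NPM1
TCGA-AB-0001-03A,9.1,2.0
TCGA-AB-0002-03A,3.2,7.4
TCGA-AB-0003-03A,8.7,2.3
"""

CLINICAL_CSV = """case_id,risk_group
TCGA-AB-0001,favorable
TCGA-AB-0002,poor
"""

def case_id(sample_barcode):
    """Case (patient) ID: the first three dash-separated fields of the barcode."""
    return "-".join(sample_barcode.split("-")[:3])

def attach_labels(expression_csv, clinical_csv, label_column):
    """Join expression rows to clinical labels; report samples with no label."""
    labels = {
        row["case_id"]: row[label_column]
        for row in csv.DictReader(io.StringIO(clinical_csv))
    }
    labelled, unlabelled = [], []
    for row in csv.DictReader(io.StringIO(expression_csv)):
        label = labels.get(case_id(row["sample"]))
        if not label:
            # No clinical record (or an empty value): cannot be used for testing.
            unlabelled.append(row["sample"])
            continue
        labelled.append({**row, "label": label})
    return labelled, unlabelled

labelled, unlabelled = attach_labels(EXPRESSION_CSV, CLINICAL_CSV, "risk_group")
print(len(labelled), "labelled samples;", "no label for:", unlabelled)
```

Listing the unlabelled samples explicitly makes it easier to see whether a mismatch comes from the ID format or from missing clinical records.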

written 4 weeks ago by Ali0

Great, but please provide a minimal reproducible example of the problem; otherwise, I can only speculate about the exact problem that you are facing.

written 4 weeks ago by Kevin Blighe66k

Your TCGA expression data was probably generated using the GDC workflow for RNA-Seq. You should be able to generate the dataset on your own by following the workflow.

written 4 weeks ago by newbio17240

Thanks for your reply. The analysis is a classification task in which samples are grouped by clinical information, so any data I use to test the model must come with clinical information as well.

Regards

written 4 weeks ago by Ali0

Realistically, it's going to be tough to find what you need. Since you're working with TCGA, I think you might want to try pan-cancer models first to increase the size of your dataset. Or you can really dig through GEO to find RNA-Seq data and accompanying clinical information in the published manuscript, assuming it is available.
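
If you do go through GEO, the per-sample annotation usually sits in the `!Sample_characteristics_ch1` lines of the series matrix file. Below is a minimal sketch of flattening such a file into one row per sample, similar to the single CSV you have for TCGA. The accessions, characteristics and values are invented, and the function names are mine. For RNA-seq series the expression table in this file is often empty, with counts provided as supplementary files, so treat the table-parsing part as an assumption about the particular series.

```python
import csv
import io

def parse_series_matrix(text):
    """Return one dict per sample: accession, characteristics, then expression."""
    samples, characteristic_rows, table = [], [], []
    in_table = False
    for fields in csv.reader(io.StringIO(text), delimiter="\t"):
        if not fields:
            continue
        key = fields[0]
        if key == "!series_matrix_table_begin":
            in_table = True
        elif key == "!series_matrix_table_end":
            in_table = False
        elif in_table:
            table.append(fields)
        elif key == "!Sample_geo_accession":
            samples = fields[1:]
        elif key == "!Sample_characteristics_ch1":
            characteristic_rows.append(fields[1:])

    rows = {gsm: {"sample": gsm} for gsm in samples}
    # Characteristics look like "name: value", one entry per sample column.
    for values in characteristic_rows:
        for gsm, value in zip(samples, values):
            name, sep, val = value.partition(":")
            if sep:
                rows[gsm][name.strip()] = val.strip()
    # The table header is ID_REF followed by the sample accessions.
    if table:
        header = table[0][1:]
        for fields in table[1:]:
            for gsm, value in zip(header, fields[1:]):
                if gsm in rows and value != "":
                    rows[gsm][fields[0]] = float(value)
    return [rows[gsm] for gsm in samples]

def to_csv(rows):
    """Write the per-sample rows as CSV text, with columns in first-seen order."""
    fieldnames = []
    for row in rows:
        for name in row:
            if name not in fieldnames:
                fieldnames.append(name)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Invented miniature series matrix (tab-separated, quoted strings).
EXAMPLE = "\n".join([
    '!Sample_geo_accession\t"GSM0000001"\t"GSM0000002"',
    '!Sample_characteristics_ch1\t"disease state: AML"\t"disease state: healthy"',
    '!Sample_characteristics_ch1\t"tissue: bone marrow"\t"tissue: bone marrow"',
    '!series_matrix_table_begin',
    '"ID_REF"\t"GSM0000001"\t"GSM0000002"',
    '"FLT3"\t9.1\t3.2',
    '"NPM1"\t2.0\t7.4',
    '!series_matrix_table_end',
])

rows = parse_series_matrix(EXAMPLE)
print(to_csv(rows))
```

Even when the expression table is empty, the same parsing still gives you a sample sheet whose characteristics you can check against the classes used in your model.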

written 4 weeks ago by newbio17240

Thanks, I will try.

Regards

written 4 weeks ago by Ali0