It is often hard to find the right omics data for a precision oncology research project. After learning about the impact of next-generation sequencing and the explosive growth of publicly available data, you might wonder where the RNA-seq datasets on cancer are and how easy it is to find what you are looking for.
While interacting with many students during our OmicsLogic educational programs, we realized the need for high-quality data sources that anyone can learn about and use. Good data comes from collections that follow a consistent level of metadata annotation, impose minimal restrictions, and provide easy access to all the files. Useful annotation includes, for example, detailed phenotypic information associated with the samples, as well as file sizes and the sequencing instruments used. Another criterion is the number of replicates, whether technical or biological: the best repositories contain many samples per study.
We compiled a small list of resources where you can find RNA-seq data to start your oncology bioinformatics project:
1. Elixir’s Expression Atlas
2. NCBI – National Center for Biotechnology Information (https://www.ncbi.nlm.nih.gov/bioproject)
While it is not the easiest place to find a dataset you are interested in, once you learn to navigate the NCBI site, you can find a lot of good datasets. A BioProject is a collection of biological data related to a single initiative, originating from a single organization or a consortium. A BioProject record gives users a single place to find links to the diverse data types generated for that project. As you search, you can narrow down the results by data type (RNA-seq), type of cancer, and the organisms you want to include.
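The same kind of narrowing can also be expressed as a search term for NCBI's E-utilities `esearch` endpoint. The sketch below only builds the query string and URL and does not contact NCBI. The helper names `build_bioproject_query` and `build_esearch_url` are hypothetical, and the choice of the `[Organism]` and `[All Fields]` field tags is an assumption about what works well for BioProject searches, so check the resulting term in the website's search box before relying on it.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_bioproject_query(cancer_type, organism, assay="RNA-seq"):
    """Combine assay, cancer type and organism into one Entrez search term.

    Hypothetical helper: the field tags used here are an assumption,
    not a rule taken from the article.
    """
    parts = [
        f'"{assay}"[All Fields]',
        f'"{cancer_type}"[All Fields]',
        f'"{organism}"[Organism]',
    ]
    return " AND ".join(parts)


def build_esearch_url(term, max_results=20):
    """Return an esearch URL that looks up the term in the BioProject database."""
    params = {
        "db": "bioproject",
        "term": term,
        "retmode": "json",
        "retmax": max_results,
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    term = build_bioproject_query("breast cancer", "Homo sapiens")
    print(term)
    print(build_esearch_url(term))
```

Pasting the printed term into the BioProject search box should behave like the filters described above, and opening the printed URL in a browser should return the matching record IDs as JSON.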
3. TCGA – The Cancer Genome Atlas
Finally, we cannot ignore The Cancer Genome Atlas, a huge repository of data that can be very useful for a variety of reasons.
Did we forget to mention a major RNA-seq resource for cancer research that you like? Let us know by posting a comment below!
Mohit Mazumder, Ph.D., Machine Learning & Computational Biology, Pine Biotech, Inc., USA