I want to find differentially expressed miRNAs between tumor and normal tissues in TCGA data sets.
However, when I downloaded the whole miRNA transcriptomic data sets, the number of samples belonging to adjacent normal tissues or healthy individuals was extremely low (for example, 52 tumor samples against only 3 normal samples for Asians).
I want to know if I have downloaded the miRNA data sets correctly.
Yes, you have most likely downloaded them correctly. For RNA-seq and micro-RNA-seq, the number of normals that were sequenced was minimal. For some of the TCGA cancers, there were no (zero) normals profiled for gene expression. One thing to remember is that most of the 'normal' samples are just from the tissue surrounding the primary tumour, so they are not representative of true normal tissue.
The TCGA dataset with the most matched tumour-normal pairs is breast cancer (BRCA), with ~108 pairings.
You may have to look for other public data if you want to increase your number of normals.
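If you want to double-check the counts yourself, one option is to tally the sample-type code in each TCGA barcode (the first two digits of the fourth field: 01-09 are tumour types, 10-19 are normal types, 20-29 are controls). Below is a minimal Python sketch of that idea. The function names and the example barcodes are made up for illustration, and it assumes your sample IDs are standard TCGA barcodes rather than UUIDs.

```python
from collections import Counter


def sample_class(barcode: str) -> str:
    """Classify a TCGA barcode by its sample-type code (4th field, first two digits)."""
    fields = barcode.split("-")
    if len(fields) < 4 or not fields[3][:2].isdigit() or len(fields[3]) < 2:
        raise ValueError(f"no sample-type field in barcode: {barcode!r}")
    code = int(fields[3][:2])
    if 1 <= code <= 9:
        return "tumor"    # e.g. 01 = primary solid tumour
    if 10 <= code <= 19:
        return "normal"   # e.g. 11 = solid tissue normal
    if 20 <= code <= 29:
        return "control"
    return "other"


def count_classes(barcodes):
    """Count how many barcodes fall into each class."""
    return Counter(sample_class(b) for b in barcodes)


# Hypothetical barcodes, for illustration only (not real samples).
example = [
    "TCGA-AA-0001-01A-11R-0000-13",
    "TCGA-AA-0001-11A-11R-0000-13",
    "TCGA-AA-0002-01A-11R-0000-13",
]
print(count_classes(example))
```

If the tumour and normal counts from this kind of tally match what you see in your downloaded files, the download is probably complete and the imbalance is simply what TCGA contains.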
Thank you.
So I will try to add normal samples from other databases.
You mean add normal samples into your TCGA dataset? That may be problematic because the other normals may have been processed very differently.
If you need to look at gene expression in normal tissues, then maybe look at GTEx.
Also, you may be more interested in the normal samples listed at the ICGC Data Portal. Here I have configured the search for you: https://dcc.icgc.org/repositories?filters=%7B%22file%22.... There appear to be ~50,000 normal samples of different types, spread across patients with different cancers, and these samples also represent different data types. This number will include the original TCGA normals, which are in the minority.
To access that data, you will most likely have to request access via the DAC.