Filtering somatic point mutations and CNV alterations on the gene level for multi-omics data integration
3.4 years ago
svlachavas ▴ 790

Dear Biostars community,

I am working on an unsupervised multi-omics data integration approach to detect molecular subtypes in a specific cancer type. I have three omics layers for the same 360 patients: RNA-seq expression data, CNV, and somatic point mutations.

All omics layers are at the gene level, with roughly 20k features each for gene expression and mutations. Before fitting the model, I would like to perform feature reduction to lower the number of features.

For the expression data I could apply a non-specific intensity and/or variance filter, but how should I handle the filtering of the mutational data? The CNV values range from -2 to 2 (GISTIC values), and the somatic point mutations are coded as 0 for silent mutations and 1 otherwise. One possible approach would be to filter the gene expression data first, and then keep only those genes in the mutational data that overlap with the filtered expression set. This would restrict the analysis to mutated genes that are expressed in at least a minimal number of samples.
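To make the overlap idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the toy gene-by-sample values, the helper names (`expressed_genes`, `top_variance`, `restrict_to`), and the thresholds (`min_value`, `min_samples`, `n_keep`) are hypothetical and not taken from any specific package.

```python
from statistics import pvariance

def expressed_genes(expr, min_value=1.0, min_samples=2):
    """Genes whose expression exceeds min_value in at least min_samples samples."""
    return {
        gene for gene, values in expr.items()
        if sum(v > min_value for v in values) >= min_samples
    }

def top_variance(expr, genes, n_keep):
    """Among the given genes, keep the n_keep with the highest variance."""
    ranked = sorted(genes, key=lambda g: pvariance(expr[g]), reverse=True)
    return set(ranked[:n_keep])

def restrict_to(layer, keep):
    """Keep only the genes of an omics layer that are in the set `keep`."""
    return {gene: values for gene, values in layer.items() if gene in keep}

# Toy data (hypothetical): gene -> values across the same 4 samples.
expr = {
    "TP53":  [5.0, 7.5, 6.1, 8.2],
    "KRAS":  [3.0, 3.1, 9.0, 2.5],
    "OR4F5": [0.0, 0.1, 0.0, 0.0],   # essentially not expressed
}
mut = {"TP53": [1, 0, 1, 0], "KRAS": [0, 0, 1, 0], "OR4F5": [0, 1, 0, 0]}
cnv = {"TP53": [-1, 0, -2, 0], "KRAS": [0, 2, 1, 0], "OR4F5": [0, 0, 0, 0]}

# Step 1: intensity filter, then variance filter, on expression only.
kept = top_variance(expr, expressed_genes(expr), n_keep=2)

# Step 2: propagate the surviving gene set to the other layers.
mut_filtered = restrict_to(mut, kept)
cnv_filtered = restrict_to(cnv, kept)
```

With this design the expression layer alone decides which genes survive, so a gene with an informative mutation pattern but flat expression would be dropped.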

Alternatively, could a separate filtering approach be applied directly to the mutational data? My main concern, especially for the somatic point mutations, is that if I filter on the frequency of 0s (i.e. no mutation events), I might lose genes that are mutated in only a small number of samples but "within" a specific subtype...
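As an illustration of what such a direct filter might look like, below is a minimal plain-Python sketch that keeps a gene if it shows an event in at least an absolute number of samples, rather than a fraction of the cohort. The helper name `min_count_filter`, the toy data, the threshold of 3 samples, and the choice to count only CNV values of -2 or 2 as events are all assumptions for illustration. A low absolute threshold is one way to retain genes altered in a small subtype, but it does not guarantee it.

```python
def min_count_filter(layer, is_event, min_samples):
    """Keep genes with an event (as defined by is_event) in >= min_samples samples."""
    return {
        gene: values for gene, values in layer.items()
        if sum(1 for v in values if is_event(v)) >= min_samples
    }

# Toy data (hypothetical): gene -> values across the same 10 samples.
mut = {
    "GENE_A": [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],   # mutated in 3 samples
    "GENE_B": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],   # mutated in 1 sample
    "GENE_C": [0] * 10,                          # never mutated
}
cnv = {
    "GENE_A": [0, 0, 1, -1, 0, 0, 0, 0, 0, 0],  # only low-level changes
    "GENE_B": [2, 2, 2, 0, 0, 0, -2, 0, 0, 0],  # 4 high-level events
    "GENE_C": [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
}

# Somatic mutations: an event is a value of 1.
mut_kept = min_count_filter(mut, lambda v: v == 1, min_samples=3)

# CNV: here only -2 / 2 count as events (an assumption; any nonzero
# value could be used instead).
cnv_kept = min_count_filter(cnv, lambda v: abs(v) == 2, min_samples=3)
```

Each layer is filtered independently here, so the surviving gene sets can differ between the mutation and CNV layers.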

Thank you in advance,

Efstathios

feature reduction somatic mutations multiomics • 701 views