Question: Multi-Omics integration for non-cancer disease
9 weeks ago by
madkraus0 wrote:

Dear all,

I've got some data and no research question to answer, but some spare time and some students to explore it :-D In total, it covers 60 human individuals with a chronic disease (non-genetic, non-infectious, non-cancer):

  • 38 x whole genome (derived from blood) + 0 controls
  • 25 x RNAseq (derived from affected tissue biopsy) + 7 healthy controls (of which only one has no matched proteome)
  • 29 x shotgun proteome (derived from affected tissue biopsy) + 16 healthy controls

14 patients have the complete set of all omics, some patients have only 2 matched omics, some only 1, some 0. Preprocessing and quality control went fairly well, and differential expression analyses were done on the expression sets separately. Additionally, we've got an armada of clinical parameters for the patients.

So far I have considered doing eQTL and pQTL analyses and have done some research on multi-omics integration and unsupervised disease subgroup detection, but so many tools were developed for, or at least only tested on, cancer data. Additionally, our data is now so fat and short (p>>>n) that, although it's great to have it, analyses are likely to fail (?).

  • Do you have any ideas, hints, or links on fruitful analyses to make the best of it?
  • Do you have a pessimistic/optimistic opinion on whether analysis of the data is meaningful at all?
  • Do you have a strategic opinion regarding research on the set? (E.g. publish all results separately? Or as a whole? First in a data set journal, then results? ...)

I'd appreciate any hint, opinion, help, guidance :-) Milena

snp rna-seq genome • 144 views
written 9 weeks ago by madkraus0

I am curious as to why this data was generated in the first place. Has some other analysis been done on it to answer the original question? (I assume the DE analysis may be part of it.)

Since this is now a fishing expedition, I suppose you could try to see if you can find correlations between the expressed component (RNAseq) and the proteome data. The latter is likely to be sparse, so it may be a difficult challenge. You may as well focus on the 14 patients that have the complete datasets, which removes unmatched samples as a variable.
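
To make the RNA/protein correlation idea concrete, here is a minimal sketch in plain Python. It is only an illustration under assumptions of mine, not a recommended pipeline: expression values are held in nested dicts (gene -> sample -> value), missing proteome measurements are stored as `None`, Spearman rank correlation is my choice of measure, and the names `rna_protein_correlation` and `min_samples` as well as the gene and sample IDs are hypothetical. With only 14 matched patients, any resulting rho will be noisy and would still need multiple-testing correction across genes.

```python
from statistics import mean


def rank(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho = Pearson correlation of the ranks; None if undefined."""
    rx, ry = rank(x), rank(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return None  # one side is constant, so rho is not defined
    return cov / (vx * vy) ** 0.5


def rna_protein_correlation(rna, protein, min_samples=5):
    """Per-gene rho over samples measured in both layers.

    rna, protein: dict gene -> dict sample -> value (None = not detected).
    Returns dict gene -> (rho, number of samples used).
    """
    result = {}
    for gene in rna.keys() & protein.keys():
        shared = sorted(
            s for s in rna[gene].keys() & protein[gene].keys()
            if rna[gene][s] is not None and protein[gene][s] is not None
        )
        if len(shared) < min_samples:
            continue  # too sparse in the matched set to say anything
        rho = spearman([rna[gene][s] for s in shared],
                       [protein[gene][s] for s in shared])
        if rho is not None:
            result[gene] = (rho, len(shared))
    return result


if __name__ == "__main__":
    # Toy values only; gene and patient IDs are made up.
    rna = {"GENE_A": {"P01": 5.1, "P02": 6.3, "P03": 7.0, "P04": 8.2, "P05": 9.9}}
    protein = {"GENE_A": {"P01": 0.2, "P02": 0.4, "P03": 0.3, "P04": 0.9, "P05": 1.1}}
    for gene, (rho, n) in rna_protein_correlation(rna, protein).items():
        print(gene, round(rho, 3), n)
```

The `min_samples` cutoff is there because the proteome is sparse: a protein detected in only three or four of the matched biopsies gives a correlation that is not worth interpreting.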

written 9 weeks ago by genomax59k