Hello people! I am completely new to ML and omics, and to bioinformatics in general. To build up some knowledge, I started working through a book I found online, and it contains the following task:
"Produce a 10-component ICA from the expression data set. Remove each component and measure the reconstruction error without that component. Rank the components by decreasing reconstruction-error. [Difficulty: Advanced]"
So I more or less understand what the reconstruction error is (the difference between the points that are "reconstructed" from my component analysis and the actual points), but I am struggling to think the task through. Could someone help me with this and explain it along the way for a total beginner? :) That would be awesome!
Some background for the task:
- expression data is from leukemia patients (ALL, CML, AML, CLL and no-leukemia)
- rows are the genes (ENSG ....)
- columns are the cells (ALL_GSM330151.CEL)
- dim(df) = 60 1000
And here is the book with the exercise:
(second question from the bottom of the page)
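
To make the task more concrete, here is a minimal sketch of how I currently read it, written in Python with scikit-learn's `FastICA`. Everything in it is an assumption on my side: the data is synthetic (random numbers with the same 60 x 1000 shape, not the leukemia set), the 60 rows are treated as samples and the 1000 columns as genes (transpose first if your matrix is the other way round), and mean squared error is just one possible choice of reconstruction error. The idea is: fit 10 components, rebuild the data from the sources and the mixing matrix, then set one source to zero at a time and see how much worse the rebuild gets.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Synthetic stand-in for the expression matrix (assumption: 60 samples x 1000 genes).
rng = np.random.default_rng(0)
n_samples, n_genes, n_comp = 60, 1000, 10
true_sources = rng.laplace(size=(n_samples, n_comp))   # non-Gaussian hidden signals
true_mixing = rng.normal(size=(n_comp, n_genes))
X = true_sources @ true_mixing + 0.1 * rng.normal(size=(n_samples, n_genes))

# Fit a 10-component ICA.
ica = FastICA(n_components=n_comp, random_state=0, max_iter=1000)
S = ica.fit_transform(X)   # estimated sources, shape (60, 10)
A = ica.mixing_            # mixing matrix, shape (1000, 10)


def reconstruction_error(sources):
    """Mean squared difference between X and its rebuild from the given sources."""
    X_hat = sources @ A.T + ica.mean_
    return float(np.mean((X - X_hat) ** 2))


# Error when all 10 components are used (reference value).
baseline = reconstruction_error(S)

# Remove one component at a time by zeroing its column in S.
errors = {}
for k in range(n_comp):
    S_drop = S.copy()
    S_drop[:, k] = 0.0
    errors[k] = reconstruction_error(S_drop)

# Rank components by decreasing reconstruction error.
ranking = sorted(errors, key=errors.get, reverse=True)

print(f"baseline (all components): {baseline:.4f}")
for k in ranking:
    print(f"without component {k}: {errors[k]:.4f}")
```

A component whose removal causes a large error carries a lot of the structure in the data, so the ranking is a rough measure of how important each component is. I am not sure this is exactly what the book intends, so corrections are very welcome.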