Feature Selection for DNA methylation data
7 weeks ago
Ahmed • 0

Hi all, I am building a supervised model on DNA methylation data for liver cancer. A common practice in feature selection is to remove highly correlated columns. However, I am concerned that doing this with methylation data could discard important features, which may lead to incorrect or biased predictions.
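For reference, the kind of correlation filter being asked about can be sketched roughly as below. This is only an illustration, not a recommended pipeline: the 0.9 threshold, the function name `drop_correlated`, the toy data, and the cg names are all made up, and the greedy "keep the first, drop later near-duplicates" rule is just one of several possible choices. It also assumes no column is constant, since a constant column has an undefined correlation.

```python
import numpy as np

def drop_correlated(beta, names, threshold=0.9):
    """Greedy filter: keep a site only if its absolute Pearson
    correlation with every already-kept site is <= threshold."""
    # beta has samples as rows and cg sites as columns
    corr = np.abs(np.corrcoef(beta, rowvar=False))
    kept = []
    for j in range(corr.shape[0]):
        if all(corr[j, k] <= threshold for k in kept):
            kept.append(j)
    return [names[j] for j in kept]

# Toy beta values (hypothetical): 50 samples, 3 independent sites
rng = np.random.default_rng(0)
base = rng.uniform(0.1, 0.9, size=(50, 3))
# Fourth column is a noisy near-copy of the first site
beta = np.column_stack([base, base[:, 0] + rng.normal(0, 0.01, 50)])
names = ["cg_a", "cg_b", "cg_c", "cg_a_copy"]

kept = drop_correlated(beta, names)
print(kept)
```

Note that which member of a correlated pair survives depends only on column order here, which is exactly why the filter can discard a site that is biologically more interesting than the one it keeps.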

Cancer Feature-selection methylation • 348 views

You will need to provide a bit more information about what you are doing and what specifically you want help with.

What platform is your data from (WGBS, methylation array, etc.)?

How is your data structured: what are the rows and columns?

What are you correlating and why?

What is your model supposed to be used for?

  1. I am training two models to identify the top CpG (cg) sites associated with age and race, using methylation beta values from the Illumina 450K array. The data used for feature selection has samples as rows and cg sites (beta values) as columns.
  2. I have applied a standard-deviation filter to reduce the 450k+ cg sites to 5,000.
  3. Model 1 will identify the cg sites involved in causing cancer in the middle and older age groups.
  4. Model 2 will identify the cg sites involved in cancer in two races (Asian and White).
  5. This will be used to identify potential biomarkers.
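
The standard-deviation step described above could look something like the sketch below. It assumes the filter keeps the sites with the largest standard deviation across samples, which the post does not state explicitly. The function name `top_variable_sites`, the toy data, and the cg names are hypothetical; on the real matrix `k` would be 5000.

```python
import numpy as np

def top_variable_sites(beta, names, k):
    """Return the names of the k sites with the largest sample
    standard deviation, most variable first."""
    # beta has samples as rows and cg sites as columns
    sd = beta.std(axis=0, ddof=1)
    order = np.argsort(sd)[::-1][:k]
    return [names[j] for j in order]

# Toy beta values (hypothetical): 40 samples, 3 sites
rng = np.random.default_rng(1)
n = 40
beta = np.column_stack([
    0.5 + rng.normal(0, 0.01, n),   # nearly constant site
    rng.uniform(0.05, 0.95, n),     # highly variable site
    0.2 + rng.normal(0, 0.05, n),   # moderately variable site
])
names = ["cg_flat", "cg_variable", "cg_moderate"]

selected = top_variable_sites(beta, names, k=2)
print(selected)
```

Because this filter never looks at the age or race labels, it is less prone to leakage than a supervised filter, but it is probably still safest to fit it on the training folds only.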