I am currently working on a university project on thyroid disease. The goal is to understand how parameters (age, pregnancy, TSH, T4...) influence the patient's diagnosis (hypothyroid, hyperthyroid...).
I have a huge matrix with:
-rows: the patients
-columns: the parameters below, plus the diagnosis
age: continuous. sex: M,F. on_thyroxine: f,t. query_on_thyroxine: f,t. on_antithyroid_medication: f,t. thyroid_surgery: f,t. query_hypothyroid: f,t. query_hyperthyroid: f,t. pregnant: f,t. sick: f,t. tumor: f,t. lithium: f,t. goitre: f,t. TSH_measured: y,n. TSH: continuous. T3_measured: y,n. T3: continuous. TT4_measured: y,n. TT4: continuous. T4U_measured: y,n. T4U: continuous. FTI_measured: y,n. FTI: continuous. TBG_measured: y,n. TBG: continuous.
DIAGNOSIS (hyperthyroid, hypothyroid...)
1) My first question: I am not sure I understand the data. What do the parameters with "query" in their name mean?
For example, does "query_on_thyroxine" mean that the patient takes thyroxine as medication (in that case, what does "on_thyroxine" mean?), or that he asks to have thyroxine (t would mean yes, he asks the doctor for the drug)?
I tried to find the related article to understand the data ("See the following for a discussion of relevant experiments and related work: | Quinlan,J.R., Compton,P.J., Horn,K.A., & Lazurus,L. (1986).") but I couldn't find it.
2) My second question: I built networks with several algorithms: ARACNE, PC, HC, MMHC.
Each time I get a different network, in which the links between parameters differ. So I don't know what information I can draw from these networks, or which one is correct. Which method do you recommend for comparing these networks?
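To make concrete what I mean by "different", here is a minimal sketch (in Python, only for illustration) of how I would count the links on which two networks disagree. It ignores edge direction, because ARACNE gives an undirected network while the other algorithms can give directed edges. The edge lists and the names `skeleton` and `edge_difference` are made up for this example; they are not my real results or any library's API.

```python
def skeleton(edges):
    """Return the set of undirected edges (edge direction is ignored)."""
    return {frozenset(edge) for edge in edges}


def edge_difference(edges_a, edges_b):
    """Count the edges that appear in exactly one of the two networks."""
    return len(skeleton(edges_a) ^ skeleton(edges_b))


# Hypothetical edge lists, invented for illustration only
net_pc = [("TSH", "FTI"), ("age", "TSH"), ("TT4", "FTI")]
net_hc = [("FTI", "TSH"), ("TT4", "FTI"), ("T3", "TT4")]

# The two networks share TSH-FTI and TT4-FTI and disagree on the other two
print(edge_difference(net_pc, net_hc))  # prints 2
```

This only tells me how many links differ, not which network is closer to the truth, which is the part I am unsure about.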
3) I built a network for each diagnosis:
-hyperthyroid: there are 400 patients
-hypothyroid: there are 160 patients
I want to know whether the network is merely a result of reducing my sample size (9000 --> 400 patients) or whether the diagnosis has a real influence on the network. My idea was to randomly select, for example, 400 patients from the data, and repeat this several times. I would then calculate (a) the distance between the network built from the 9000 patients and each random 400-patient network, and (b) the distance between the network built from the 9000 patients and the network built from my real 400 patients. But I don't know how to compare these distances.
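Here is a rough Python sketch of the procedure I have in mind, on synthetic data. Everything in it is an assumption for illustration: `learn_network` is a simple correlation-threshold stand-in (not ARACNE, PC, HC or MMHC), the distance is just the number of differing edges, and the "diagnosis" group is an arbitrary subset of the toy data. The last step, the fraction of random subsamples that are at least as far from the full network as the real group, is only my guess at how the distances might be compared.

```python
import random


def learn_network(rows, threshold=0.3):
    """Stand-in learner (an assumption, NOT ARACNE/PC/HC/MMHC): link two
    columns when their absolute Pearson correlation exceeds `threshold`."""
    cols = list(zip(*rows))
    n = len(rows)
    means = [sum(c) / n for c in cols]
    centered = [[x - m for x in c] for c, m in zip(cols, means)]
    norms = [sum(x * x for x in c) ** 0.5 for c in centered]
    edges = set()
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            if norms[i] == 0 or norms[j] == 0:
                continue  # constant column: correlation undefined
            dot = sum(a * b for a, b in zip(centered[i], centered[j]))
            if abs(dot / (norms[i] * norms[j])) > threshold:
                edges.add(frozenset((i, j)))
    return edges


def distance(net_a, net_b):
    """Number of edges present in exactly one of the two networks."""
    return len(net_a ^ net_b)


# Synthetic data standing in for the patient matrix (made up)
rng = random.Random(0)
all_rows = []
for _ in range(2000):
    a = rng.gauss(0, 1)
    b = a + rng.gauss(0, 1)   # correlated with a
    c = rng.gauss(0, 1)
    d = c + rng.gauss(0, 2)   # weakly correlated with c
    all_rows.append([a, b, c, d])

# Arbitrary subset playing the role of one diagnosis group
group_rows = [row for row in all_rows if row[0] > 1.0]

full_net = learn_network(all_rows)
observed = distance(full_net, learn_network(group_rows))

# Distances for random subsamples of the same size as the group
n_repeats = 100
random_distances = [
    distance(full_net, learn_network(rng.sample(all_rows, len(group_rows))))
    for _ in range(n_repeats)
]

# Fraction of random subsamples at least as far from the full network
fraction = sum(d >= observed for d in random_distances) / n_repeats
print(observed, fraction)
```

If the fraction is small, the real group's network is further from the full network than random subsamples of the same size usually are. Whether this is a sound way to compare the distances is exactly what I would like to know.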
Thank you for your help