I am trying to derive a binary label (0/1) for a supervised classification task from two fields available for each patient, dfs.t and dfs.e, in several cancer-related studies. My main concern is how researchers fill in the dfs.e column for each patient. This is my understanding:
dfs.e = 0: no relapse/recurrence/distant metastasis within the dfs.t time frame
dfs.e = 1: relapse/recurrence/distant metastasis/death caused by cancer occurred at dfs.t
Is this interpretation right? Also, is there a conventional way of dealing with data like this?
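To make the question concrete, here is a minimal sketch of one possible labeling rule under the interpretation above. It is only an illustration, not an established convention: the function name `label_at_horizon`, the `horizon` cutoff, the example values, and the assumption that dfs.t is in months are all hypothetical. Patients with dfs.e = 0 and dfs.t shorter than the cutoff get no label, because under this interpretation their status at the cutoff is unknown.

```python
from typing import Optional


def label_at_horizon(dfs_t: float, dfs_e: int, horizon: float) -> Optional[int]:
    """Return a 0/1 label at a fixed time horizon, or None if undetermined.

    Assumes dfs_e == 1 means an event occurred at dfs_t, and dfs_e == 0
    means no event was observed up to dfs_t (hypothetical interpretation).
    """
    # Event observed at or before the cutoff: positive label.
    if dfs_e == 1 and dfs_t <= horizon:
        return 1
    # Followed up to or beyond the cutoff with no event before it: negative label.
    if dfs_t >= horizon:
        return 0
    # No event observed, but follow-up ended before the cutoff: status unknown.
    return None


# Hypothetical (dfs.t, dfs.e) pairs, with dfs.t assumed to be in months.
patients = [(12.0, 1), (70.0, 0), (30.0, 0), (80.0, 1)]
labels = [label_at_horizon(t, e, horizon=60.0) for t, e in patients]
```

With a rule like this, the third patient (no event, followed for only 30 months) would have to be dropped or handled separately, which is part of what I am unsure about.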
Thanks in advance,