Question: How to understand Instance and Array in UKB phenotype data
6 months ago
Shicheng Guo
wrote:

Dear All,

I would like some help understanding Instance and Array in the following context:

UKBB: The standard dataset (downloadable directly by researchers) contains a record of all the bulk data-files approved, however only the data-file IDs are present rather than the actual contents of the files themselves. These data-file IDs have the format "F_I_A" where F is the field ID, I is the instance index and A is the array index. Hence 8034_4_2 corresponds to Field 8034, Instance 4, Array 2.
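To make the quoted "F_I_A" format concrete, here is a minimal sketch that splits such an ID into its three parts. The names `FieldRef` and `parse_field_id` are made up for illustration and are not part of any UKB tool.

```python
from typing import NamedTuple


class FieldRef(NamedTuple):
    """Hypothetical container for the three parts of an 'F_I_A' ID."""
    field: int
    instance: int
    array: int


def parse_field_id(identifier: str) -> FieldRef:
    """Split an 'F_I_A' identifier into field, instance and array indices."""
    parts = identifier.split("_")
    if len(parts) != 3:
        raise ValueError(f"expected 'F_I_A', got {identifier!r}")
    field, instance, array = (int(p) for p in parts)
    return FieldRef(field, instance, array)


ref = parse_field_id("8034_4_2")
print(ref)  # FieldRef(field=8034, instance=4, array=2)
```

So 8034_4_2 reads as field 8034, instance 4, array 2, matching the example in the quoted text.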


phenotype instance ukb field array
6 months ago
New York
Sam wrote:

Array is for fields that can hold multiple entries, e.g. ICD10 codes. It basically splits those entries across multiple array indices. For example, if someone has 20 ICD codes, those 20 codes will be found in 20 different array entries.

Instance relates to repeated measurement. Some of the participants were assessed multiple times, and the instance indicates at which assessment the phenotype was measured. For example, instance 0 is the one usually used, as it is the baseline measurement and has the most data points. Instance 1 is the first follow-up, instance 2 is the second follow-up, etc. Hope this helps.
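As a rough illustration of how the two indices fit together, here is a small sketch that groups one participant's values by (field, instance) and orders them by array index. It assumes column names in the "F_I_A" format from the question. The field number 1234, the values and the function name `group_by_instance` are all made up, so adapt it to however your own export names its columns.

```python
from collections import defaultdict


def group_by_instance(row: dict) -> dict:
    """Collect non-missing values per (field, instance), ordered by array index."""
    grouped = defaultdict(list)
    for column, value in row.items():
        if value is None:
            continue  # unused array slot for this participant
        field, instance, array = (int(p) for p in column.split("_"))
        grouped[(field, instance)].append((array, value))
    # Sort by array index, then keep only the values
    return {key: [v for _, v in sorted(items)] for key, items in grouped.items()}


# Hypothetical participant: field 1234 has three array slots at instance 0
# (one of them empty) and one slot at instance 1.
row = {
    "1234_0_0": "I10",
    "1234_0_1": "E11",
    "1234_0_2": None,
    "1234_1_0": "J45",
}
result = group_by_instance(row)
print(result)  # {(1234, 0): ['I10', 'E11'], (1234, 1): ['J45']}
```

The point of the sketch is that the array index only separates multiple entries within one assessment, while the instance separates the assessments themselves.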

6 months ago

Data-Field 5986 has 114 array indices. How can I figure out how these 114 arrays are generated?

5,758,563 items of data are available, covering 95,140 participants. Defined-instances run from 0 to 1, labelled using Instancing 2. Array indices run from 0 to 113. Units of measurement are seconds.
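A quick back-of-the-envelope check on those numbers, assuming each "item" is one non-missing value for a given participant, instance and array index (my reading of the description, not something the showcase states):

```python
items = 5_758_563       # items of data reported for the field
participants = 95_140   # participants covered
instances = 2           # defined instances run from 0 to 1
array_slots = 114       # array indices run from 0 to 113

mean_items = items / participants   # average items per participant
max_items = instances * array_slots # upper bound if every slot were filled

print(round(mean_items, 1))  # 60.5
print(max_items)             # 228
```

Under that assumption, a participant has about 60 values on average out of a possible 228, so most array slots are empty for most people, which suggests the number of array entries varies between participants.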

Thanks Sam!!

4 weeks ago by Shicheng Guo

I have never worked with this phenotype, so I don't have any idea. Their documentation on this also isn't very clear. If I had to guess, I would go by the notes section, which states: "The Bike Test consists of many phases. A phase is generally divided into number of stages. At various points during the test, called trends, readings about heart rate, workload etc are recorded. This field contains the time spent within the phase of the trend entry."

I'd reckon each array entry represents the time spent in each stage of each phase.

21 days ago by Sam
