Question: How to understand Instance and Array in UKB phenotype data
6 months ago by
Shicheng Guo (8.5k) wrote:

Dear All,

I would like some help understanding what Instance and Array mean in the following context:

UKBB: The standard dataset (downloadable directly by researchers) contains a record of all the bulk data-files approved, however only the data-file IDs are present rather than the actual contents of the files themselves. These data-file IDs have the format "F_I_A" where F is the field ID, I is the instance index and A is the array index. Hence 8034_4_2 corresponds to Field 8034, Instance 4, Array 2.
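
The "F_I_A" format in the quote can be unpacked mechanically. Below is a minimal Python sketch, assuming the three parts are always integers separated by underscores as the quote describes; the names `DataFileId` and `parse_data_file_id` are made up for illustration and are not part of any UK Biobank tool.

```python
from typing import NamedTuple


class DataFileId(NamedTuple):
    """Hypothetical container for the three parts of an "F_I_A" ID."""
    field: int     # F: field ID
    instance: int  # I: instance index
    array: int     # A: array index


def parse_data_file_id(data_file_id: str) -> DataFileId:
    """Split an ID such as "8034_4_2" into field, instance and array."""
    parts = data_file_id.split("_")
    if len(parts) != 3:
        raise ValueError(f"expected 'F_I_A', got {data_file_id!r}")
    # int() raises ValueError if any part is not a whole number
    field, instance, array = (int(part) for part in parts)
    return DataFileId(field, instance, array)


print(parse_data_file_id("8034_4_2"))
# DataFileId(field=8034, instance=4, array=2)
```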


Tags: phenotype, instance, ukb, field, array
6 months ago by
Sam (3.3k), New York, wrote:

Array is for fields that can hold multiple entries, e.g. ICD10 codes. It basically splits the entries of such a field across multiple array indices. For example, if someone has 20 ICD codes, those 20 codes will be found in 20 different array entries, one code per array index.

Instance relates to repeated measurements. Some participants were assessed multiple times, and the instance indicates at which assessment the phenotype was measured. For example, instance 0 is the one most commonly used, because it is the baseline measurement and has the most data points. Instance 1 is the first follow-up, instance 2 is the second follow-up, and so on. Hope this helps.
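
To make the two ideas concrete, here is a small Python sketch that groups one participant's values first by instance and then by array index. Everything in it is an assumption for illustration: the field IDs 9999 and 1234, the values, and the helper `values_by_instance` are toy inventions, and the keys simply reuse the "F_I_A" layout from the question rather than any real UKB export format.

```python
from collections import defaultdict
from typing import Dict, List


def values_by_instance(record: Dict[str, str], field: int) -> Dict[int, List[str]]:
    """Collect the values of one field, grouped by instance and
    ordered by array index within each instance."""
    grouped: Dict[int, Dict[int, str]] = defaultdict(dict)
    for key, value in record.items():
        f, instance, array = (int(part) for part in key.split("_"))
        if f == field:
            grouped[instance][array] = value
    return {
        instance: [entries[a] for a in sorted(entries)]
        for instance, entries in sorted(grouped.items())
    }


# Toy record for a single participant (hypothetical field IDs and values).
record = {
    "9999_0_0": "I10",  # instance 0 (baseline), array index 0
    "9999_0_1": "E11",  # instance 0, array index 1
    "9999_0_2": "J45",  # instance 0, array index 2
    "9999_1_0": "I10",  # instance 1 (first follow-up), array index 0
    "1234_0_0": "42",   # a different field, ignored below
}

result = values_by_instance(record, 9999)
print(result[0])  # baseline entries: ['I10', 'E11', 'J45']
print(result[1])  # first follow-up entries: ['I10']
```

In this toy case the participant has three entries at baseline and one at the first follow-up, so the array length can differ between instances.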


Data-Field 5986 has 114 array indices. How can I figure out how these 114 array entries are generated?

5,758,563 items of data are available, covering 95,140 participants. Defined-instances run from 0 to 1, labelled using Instancing 2. Array indices run from 0 to 113. Units of measurement are seconds.

Thanks Sam!!

Reply written 4 weeks ago by Shicheng Guo (8.5k)

I have never worked with this phenotype, so I can't say for sure. Their documentation on it also isn't very clear. If I had to guess, I would go by the notes section, which states: "The Bike Test consists of many phases. A phase is generally divided into number of stages. At various points during the test, called trends, readings about heart rate, workload etc are recorded. This field contains the time spent within the phase of the trend entry."

I'd reckon each array index represents the time spent in each stage of each phase.
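
As a rough sanity check on the counts quoted for Data-Field 5986, the arithmetic below compares the number of available slots per participant with the average number of items actually present. The numbers are taken from the quoted field description; the interpretation in the comments is only my reading, not something the documentation states.

```python
items = 5_758_563      # items of data available for the field
participants = 95_140  # participants covered
instances = 2          # defined instances run from 0 to 1
array_slots = 114      # array indices run from 0 to 113

# Upper bound on entries one participant could have for this field.
max_slots_per_participant = instances * array_slots

# Average number of entries actually present per participant.
mean_items_per_participant = items / participants

print(max_slots_per_participant)             # 228
print(round(mean_items_per_participant, 1))  # 60.5
```

On average only about 60 of the 228 possible slots are filled, which would fit with the number of entries varying from participant to participant and 114 being the largest count seen at any one instance. That is a guess from the totals alone.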

Reply written 21 days ago by Sam (3.3k)