Question: post-mortem brain gene expression, comparing tissue sample points across brains
3.6 years ago by
avari70
Germany
avari70 wrote:

Dear Biostar experts,

I want to analyze gene expression data from 6 adult brains belonging to the Allen adult human brain dataset. 

Each brain has up to 500 tissue samples with gene expression data per hemisphere, spread across multiple brain structures. For each brain I have a CSV file that looks like the following (each row gives the brain structure and the MNI space coordinates of one tissue sample).

structure_id  structure_name                               polygon_id  mni_x  mni_y  mni_z
9137          abducens nucleus, left                       978024      -5.1   -44.6  -42.7
9137          abducens nucleus, left                       977815      -3.9   -40    -37.9
4329          amygdalohippocampal transition zone, left    73464       -22.1  -7.6   -9.6
4329          amygdalohippocampal transition zone, left    73236       -19.1  -13.6  -11.8
4114          angular gyrus, left, inferior bank of gyrus  27442       -43.9  -76.3  27.4

           
           

Each sample has been taken in a slightly different position within each brain structure, and not all brain structures were sampled the same number of times. Within each brain, I have ordered the samples in each structure by their MNI coordinates. I have also counted the number of samples per structure, per brain, like so:

                       brain_1  brain_2  brain_3  brain_4  brain_5  brain_6
putamen, left          3        3        3        3        3        2
middle temporal gyrus  3        3        3        3        2        2
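For reference, a count table like the one above could be built roughly as follows. This is only a minimal sketch using the Python standard library: the helper names `count_samples` and `count_table` are hypothetical, the toy CSV text stands in for the real per-brain files, and its coordinates are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_samples(csv_text):
    """Count tissue samples per structure_name in one brain's CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["structure_name"] for row in reader)


def count_table(brain_csvs):
    """Return {structure_name: [count in each brain]}, brains in input order."""
    per_brain = {label: count_samples(text) for label, text in brain_csvs.items()}
    structures = sorted(set().union(*per_brain.values()))
    # A Counter returns 0 for structures that a brain never sampled.
    return {s: [per_brain[b][s] for b in per_brain] for s in structures}


# Toy stand-ins for two per-brain files; the coordinates are made up.
HEADER = "structure_name,mni_x,mni_y,mni_z\n"
toy_csvs = {
    "brain_1": HEADER
    + '"putamen, left",-24.1,2.0,4.3\n' * 3
    + '"middle temporal gyrus",-60.2,-30.5,-8.1\n' * 3,
    "brain_6": HEADER
    + '"putamen, left",-25.0,1.5,3.9\n' * 2
    + '"middle temporal gyrus",-59.8,-31.0,-7.6\n' * 2,
}

for structure, counts in count_table(toy_csvs).items():
    print(structure, counts)
```

Structure names contain commas, so they need to be quoted in the CSV (as in the toy text) for `csv.DictReader` to keep them in one field.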

               

With this data I want to calculate a mean coordinate for each sample across brains. However, because the number of sample points per brain structure differs between brains, some rows have fewer data points than others.

One approach would be to take the intersection (i.e. in the example below every brain region was sampled at least 2 times in every brain, so generate 2 mean coordinates per region). The problem is that this might result in losing a lot of data points. Another approach could be to allow 1 (or a certain proportion of) missing data points across brains, so in this example generate 3 mean coordinates for the putamen and 2 mean coordinates for the middle temporal gyrus.

My question is whether you think this is a reasonable strategy, and whether you have any alternative suggestions. I want my final list of coordinates to be as representative as possible of the underlying data. Thanks very much!

                       brain_1  brain_2  brain_3  brain_4  brain_5  brain_6  mean coord
putamen, left          x,y,z    x,y,z    x,y,z    x,y,z    x,y,z    x,y,z
                       x,y,z    x,y,z    x,y,z    x,y,z    x,y,z    x,y,z
                       x,y,z    x,y,z    x,y,z    x,y,z    x,y,z
middle temporal gyrus  x,y,z    x,y,z    x,y,z    x,y,z    x,y,z    x,y,z
                       x,y,z    x,y,z    x,y,z    x,y,z    x,y,z    x,y,z
                       x,y,z    x,y,z    x,y,z    x,y,z
...
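To make the two options concrete, here is a rough sketch of averaging the k-th sample of a structure across brains, with a threshold on how many brains may lack that sample. It assumes the samples are already ordered within each brain as described above, and that the k-th sample in one brain corresponds to the k-th sample in another. The function name `rankwise_means`, the `max_missing` parameter, and the toy coordinates are all made up for illustration.

```python
from statistics import mean


def rankwise_means(samples_by_brain, max_missing=0):
    """Average the k-th sample of one structure across brains.

    samples_by_brain: one list per brain of (x, y, z) tuples, already
    ordered within that brain. A rank is kept only if at most max_missing
    brains lack a sample at that rank.
    """
    n_brains = len(samples_by_brain)
    max_rank = max(len(s) for s in samples_by_brain)
    means = []
    for k in range(max_rank):
        present = [s[k] for s in samples_by_brain if len(s) > k]
        if n_brains - len(present) <= max_missing:
            # Average each axis separately over the brains that have rank k.
            means.append(tuple(round(mean(axis), 2) for axis in zip(*present)))
    return means


# Toy data matching the counts above: putamen 3,3,3,3,3,2 and
# middle temporal gyrus 3,3,3,3,2,2 (coordinates are invented).
three = [(-24.0, 2.0, 4.0), (-26.0, 4.0, 2.0), (-28.0, 6.0, 0.0)]
putamen = [three] * 5 + [three[:2]]
mtg = [three] * 4 + [three[:2]] * 2

print(len(rankwise_means(putamen, max_missing=0)))  # intersection only
print(len(rankwise_means(putamen, max_missing=1)))  # allow 1 missing brain
print(len(rankwise_means(mtg, max_missing=1)))
```

With `max_missing=0` this is the intersection approach. With `max_missing=1` the putamen keeps its third rank (averaged over 5 brains), while the middle temporal gyrus drops it because 2 brains are missing there.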

 

 

 

 

 

 

 

               

 

 

 
