9.4 years ago
dmw610

This is more of a general statistics question, but I've gotten awesome help here before so I figured I'd try it out again.

I noticed that in my department, older professors seem to have more male-dominated labs than younger professors do. I collected the professor's age and the gender makeup of 100 labs, and there is indeed a small correlation between the age of the professor and the proportion of his or her employees who are male (Pearson's r = .27). The problem is that the professors all have different numbers of workers. One professor might have 1 male and 0 female employees, while another professor has 14 male and 2 female employees. But calling the first lab "100% male" and the second "87.5% male" obscures the fact that the gender bias is much worse in the second lab. I want to do a correlation analysis that takes into account the different sizes of the labs. Does anyone have any thoughts on how to do this? Thanks!

Two important statistical points that I've learned. One: is your sample a random sample? Statistical inference is only valid for a random sample, unless you have access to the whole population. Two: was the correlation (Pearson's r = 0.27) significant? What p-value did you obtain? I'm eager to learn more about this myself. Can any pro pitch in?
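
For what it's worth, the significance question can be roughed out from the numbers already in the post (r = .27, 100 labs). The sketch below uses the Fisher z-transformation with a normal approximation. The function name is made up for illustration, and the test assumes the labs are independent and ignores the lab-size issue raised in the question, so treat it as a ballpark only.

```python
import math

def fisher_z_p_value(r, n):
    """Approximate two-sided p-value for H0: correlation = 0.

    Uses the Fisher z-transformation: atanh(r) * sqrt(n - 3) is roughly
    standard normal under H0 when the n observations are independent.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided tail area of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Values taken from the question: r = .27 across 100 labs.
z_stat, p_value = fisher_z_p_value(0.27, 100)
print(f"z = {z_stat:.3f}, approximate two-sided p = {p_value:.4f}")
```

Note that this treats a 1-person lab and a 16-person lab as equally informative, which is exactly the limitation the original poster is worried about.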

9.4 years ago

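It sounds like what you really want is a logistic regression: treat each employee's gender as the binary outcome and the professor's age as the predictor, so that larger labs automatically contribute more information than smaller ones. This is relatively easy in R and here's a nice tutorial from UCLA.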
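
To make the idea concrete, here is a minimal sketch of a grouped (binomial) logistic regression in plain Python, fitted by Newton-Raphson. The lab data are invented for illustration (only the 1 male / 0 female and 14 male / 2 female labs come from the question), and the function name is hypothetical. In R, the usual equivalent would be something like `glm(cbind(males, females) ~ age, family = binomial)`.

```python
import math

def fit_grouped_logistic(ages, males, females, iterations=25):
    """Fit logit(P(male)) = b0 + b1 * (age - mean_age) by Newton-Raphson.

    Each lab contributes a binomial count (males out of males + females),
    so a lab with 16 people carries more weight than a lab with 1.
    Returns (b0, b1, standard error of b1).
    """
    mean_age = sum(ages) / len(ages)
    xs = [a - mean_age for a in ages]  # centre age for numerical stability
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, m, f in zip(xs, males, females):
            n = m + f
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = m - n * p          # observed minus expected males
            w = n * p * (1.0 - p)      # binomial variance, scales with lab size
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system (information matrix) * step = gradient.
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < 1e-10:
            break
    # Standard error from the inverse information matrix of the last iteration.
    se_b1 = math.sqrt(h00 / det)
    return b0, b1, se_b1

# Hypothetical data: one entry per lab (professor age, male and female staff).
ages    = [35, 40, 45, 50, 55, 60, 65]
males   = [2, 3, 5, 1, 14, 6, 9]
females = [3, 3, 4, 0, 2, 2, 2]

b0, b1, se_b1 = fit_grouped_logistic(ages, males, females)
print(f"slope per year of age = {b1:.4f} (SE {se_b1:.4f})")
print(f"odds ratio per 10 years = {math.exp(10 * b1):.2f}")
```

A positive slope means the odds that an employee is male rise with the professor's age. The slope divided by its standard error gives an approximate z-statistic for testing whether that trend differs from zero.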