How to standardize iHS, NSL and XP-EHH scores
5.7 years ago
eyb ▴ 230

I have calculated a bunch of scores for my data using the selscan software. How do I standardize the results? At the moment I am using the following Python line:

-np.log10(st.mstats.zscore(t[7]))

Am I doing this right?

Description of the function zscore: Calculates the z score of each value in the sample, relative to the sample mean and standard deviation.

selection SNP • 3.2k views
5.7 years ago
patel.ravip ▴ 60

Hi eyb

selscan itself has a built-in standardizing binary you can use ("norm"). However, in the past I have done it in the following way:

1) Read all derived (or ancestral; it doesn't matter as long as you are consistent across all sites) allele frequencies and iHS scores into a table you can manipulate (pandas in Python works well).
2) Bin your sites by allele frequency in 1% bins, or larger bins depending on your total number of sites. With fewer sites, use larger bins; if you have entire chromosomes or a large number of sites with iHS scores (e.g., 100,000+), use 1%.
3) Apply the zscore method you have been using to each bin separately. Alternatively, calculate the mean and standard deviation of the iHS scores in each AF bin manually and compute the z-scores from those. I am not familiar with the zscore method in scipy, but I imagine it does exactly this.
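The steps above could be sketched roughly as follows in pandas. This is only an illustration under assumptions: the function name, the column names "daf" and "ihs", and the example numbers are all made up here and are not selscan's actual output format, so adapt them to however you load your table.

```python
import numpy as np
import pandas as pd


def standardize_by_frequency_bin(df, score_col="ihs", freq_col="daf", bin_width=0.01):
    """Z-score `score_col` separately within allele-frequency bins.

    Column names and defaults are hypothetical; rename them to match your table.
    """
    out = df.copy()
    # Equal-width bin edges covering [0, 1]; linspace avoids float drift.
    n_bins = int(round(1.0 / bin_width))
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Integer bin index per site; include_lowest puts a frequency of 0 in bin 0.
    out["freq_bin"] = pd.cut(out[freq_col], bins=edges, include_lowest=True, labels=False)
    grouped = out.groupby("freq_bin")[score_col]
    bin_mean = grouped.transform("mean")
    # ddof=0 (population std) matches the default of scipy.stats.zscore.
    bin_std = grouped.transform(lambda s: s.std(ddof=0))
    # Bins with a single site (or identical scores) have std 0 and yield NaN.
    out["std_score"] = (out[score_col] - bin_mean) / bin_std
    return out


# Synthetic example (made-up numbers, not real selscan output).
sites = pd.DataFrame({
    "daf": [0.101, 0.102, 0.103, 0.501, 0.502, 0.503],
    "ihs": [1.0, 2.0, 3.0, -1.0, 0.0, 1.0],
})
standardized = standardize_by_frequency_bin(sites, bin_width=0.05)
print(standardized)
```

One thing to keep in mind with the line in the question: np.log10 of a negative z-score is NaN, so taking -log10 of the z-scores directly will lose every site with a score below the bin mean.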

Voight et al. noted in the iHS paper that scores should be standardized among sites with a similar AF, which is why you bin the sites by AF before calculating z-scores. What you are currently doing is standardizing across the whole AF spectrum at once.