Question: P-Value calculation from iHS and XP-EHH scores
2.3 years ago by
SOHAIL260
Beijing Institute of Genomics, CAS.
SOHAIL260 wrote:

Hi Everyone,

I am new to this, so forgive me if I am asking anything stupid. I calculated genome-wide standardized iHS and XP-EHH scores with the selscan software. Both statistics take negative and positive values. I have read in some papers that people use |iHS| > 2 as a significance cut-off, but I have not found any cut-off for XP-EHH.

For the XP-EHH values, I calculated p-values using 'pnorm' in R, something like:

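# Read the normalized XP-EHH output; column 4 is used as the standardized score.
data <- read.table("CHR.xpehh-norm.reformat1.txt", header = FALSE, sep = " ")
score <- data[, 4]
# One-tailed normal p-value taken in the direction of each score's sign:
# upper tail for positive scores, lower tail otherwise.
p <- pnorm(abs(score), lower.tail = FALSE)
write.table(p, file = "xpehh.p.chr.txt")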

I later read about p-value/Z-score calculation at C: A Database Of Signatures Of Selection In The 1000 Genomes Dataset

Problems:

  1. Can anyone please guide me on how to correctly calculate p-values for XP-EHH and iHS scores? Is what I am doing above correct?

  2. For iHS we consider only absolute values. How should p-values be calculated in that case?

  3. Is there a general cut-off on the standardized scores (like |iHS| > 2) for these two tests, so that selective outliers can be identified?

Thanks a lot in advance!

selection ngs statistics selscan R • 2.0k views
2.2 years ago by
United Kingdom
GabrielMontenegro520 wrote:

In the paper introducing iHS they recommended that threshold, but it still won't give you a P-value. If you check the paper you will see that they computed empirical P-values using an outlier approach. To do that you simply rank all the scores genome-wide, from most to least extreme, and then divide each rank by the total number of values in your distribution.

For iHS you can use the absolute standardised iHS scores. For XP-EHH, because this test is directional, you should use only the positive values.
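The outlier approach described above can be sketched in R as follows. This is only a minimal illustration: the function name `empirical_p` and the toy score vectors are made up, and using the number of positive XP-EHH scores as the denominator is one possible reading of the advice, not something the paper or selscan prescribes.

```r
# Empirical (outlier-based) p-value: rank each score among all genome-wide
# scores, largest first, and divide the rank by the number of scores.
empirical_p <- function(scores) {
  # ties.method = "max" gives tied scores the same, conservative p-value
  rank(-scores, ties.method = "max") / length(scores)
}

# Toy values standing in for standardized genome-wide scores (made up).
ihs   <- c(-2.5, 0.3, 1.1, -0.4, 3.0)
xpehh <- c(2.2, -1.7, 0.5, -0.1, 1.3)

# iHS: rank on the absolute value, so both signs count as extreme.
p_ihs <- empirical_p(abs(ihs))

# XP-EHH: keep only the positive scores before ranking.
xpehh_pos <- xpehh[xpehh > 0]
p_xpehh <- empirical_p(xpehh_pos)
```

With real data, `ihs` and `xpehh` would be the standardized score columns pooled across all chromosomes, so that each rank is taken against the genome-wide distribution rather than a single chromosome.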

Another option is to compute approximate P-values by simulating the distribution of your selection statistic under a neutral demographic model. For this you need a fairly good understanding of the history of your population, so that you can reproduce its demography accurately.
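Once neutral simulations are available, turning them into approximate P-values could look roughly like the sketch below. The function `sim_p` is hypothetical, the `rnorm()` draws are only a placeholder for scores computed on simulated neutral data, and the +1 correction is a common convention for Monte Carlo P-values rather than something stated in this thread.

```r
# Approximate p-value from a neutral null distribution:
# the fraction of simulated scores at least as extreme as the observed one.
sim_p <- function(observed, neutral) {
  # The +1 terms keep the p-value from being exactly zero.
  sapply(observed, function(x) (1 + sum(neutral >= x)) / (1 + length(neutral)))
}

set.seed(1)
# PLACEHOLDER: in practice these would be scores computed on data simulated
# under a neutral demographic model; rnorm() is only a stand-in here.
neutral_scores <- abs(rnorm(10000))
observed_scores <- c(0.5, 2.0, 3.5)  # made-up |iHS| values

p_sim <- sim_p(observed_scores, neutral_scores)
```

The smallest P-value this can return is 1 / (number of simulated scores + 1), so the number of neutral replicates sets the resolution of the test.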

Hope this helps!


Hi @JM88,

Thanks for your response. However, in your answer you mentioned:

"For iHS you can use the absolute standardised iHS scores. For XP-EHH -because this test is directional-, you should only use positive values." "Another option which is to compute approximate P-values by simulating the distribution of your selection statistics under a neutral demographic model"

Questions:

  1. Can you please explain why we need to use only positive values for XP-EHH?

  2. Can you please suggest any established software or methods with which I can accurately reproduce the demographic history and integrate the results with a genome-wide selection model?

Thanks!

written 2.2 years ago by SOHAIL260

Check the paper where Sabeti et al. introduced the XP-EHH statistic. If I remember correctly you will find a detailed explanation of how XP-EHH is computed. Basically it is the (log) ratio of the iHH of popA to the iHH of popB, so its sign is directional. Also, if I remember correctly (again), the selscan manual has a summary of all the statistics included in the software, including XP-EHH.
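To see why the sign matters, here is a small sketch of the unstandardized statistic as a log ratio of iHH values. The iHH numbers are made up, and which population ends up in the numerator is an assumption here; it depends on how the two populations were passed to the software.

```r
# Unstandardized XP-EHH as the log ratio of integrated haplotype homozygosity
# (iHH) in two populations. The iHH values below are made up.
ihh_popA <- c(0.12, 0.05, 0.08)
ihh_popB <- c(0.04, 0.05, 0.20)

xpehh_raw <- log(ihh_popA / ihh_popB)

# The sign indicates direction: > 0 means longer haplotypes in popA,
# < 0 means longer haplotypes in popB.
direction <- ifelse(xpehh_raw > 0, "popA",
                    ifelse(xpehh_raw < 0, "popB", "none"))
```

Under this convention, positive scores point to selection in popA, and strongly negative scores point to popB instead.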

The simulator ms (Hudson, 2002) would probably be a good place to start. Also check papers that used this method instead of the outlier approach; I think the nSL paper used a simulation approach.

written 2.2 years ago by GabrielMontenegro520