Does The iHS Calculator From Pritchard Support Missing Data?
10.9 years ago
seth hearld ▴ 30

Hello.

My data will sometimes lack the ancestral/derived allele encoding in the .ihshap file, because the regions I'm interested in vary per individual. If I keep only the regions where all individuals overlap, I lose a lot of data and am not sure the iHS score will be valid. Stitching the blocks of overlap together into one coherent .ihshap file would mean that SNPs at the borders of those blocks have their EHH scores integrated over SNPs from different blocks, which are sometimes very far away physically on the chromosome.

e.g.

Normal .ihshap input file for iHS:

1 0 1 0 1 0 0 0 0 1 1 1 0 1 1
0 1 1 0 1 0 1 0 1 0 0 0 0 1 0
0 1 0 1 0 1 0 0 0 0 1 0 0 0 1
1 0 0 0 1 0 1 1 1 1 1 0 0 0 1

My data:

? ? ? ? ? ? 0 0 0 1 1 1 0 1 1
0 1 1 0 ? ? ? ? ? ? 0 0 0 1 0
0 1 ? ? ? ? ? ? ? ? 1 0 0 0 1
1 0 0 0 1 0 ? ? ? ? ? ? 0 ? ?

Out of 100,000 SNPs I would have ~80-90% missing data.
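
To make the cost of the "regions of overlap" approach concrete, here is a minimal Python sketch run on the four example rows above. It assumes each row is one haplotype and each column one SNP, and that `?` marks a missing call; the function names are hypothetical and are not part of the iHS program.

```python
def parse_haplotypes(text):
    """Parse whitespace-separated rows; '?' becomes None, other tokens become int."""
    rows = []
    for line in text.strip().splitlines():
        rows.append([None if tok == "?" else int(tok) for tok in line.split()])
    return rows


def missing_fraction(rows):
    """Fraction of all cells that are missing."""
    total = sum(len(r) for r in rows)
    missing = sum(tok is None for r in rows for tok in r)
    return missing / total


def complete_columns(rows):
    """Zero-based indices of SNP columns observed in every haplotype."""
    n_cols = len(rows[0])
    return [j for j in range(n_cols) if all(r[j] is not None for r in rows)]


# The "My data" example from the post.
EXAMPLE = """
? ? ? ? ? ? 0 0 0 1 1 1 0 1 1
0 1 1 0 ? ? ? ? ? ? 0 0 0 1 0
0 1 ? ? ? ? ? ? ? ? 1 0 0 0 1
1 0 0 0 1 0 ? ? ? ? ? ? 0 ? ?
"""

rows = parse_haplotypes(EXAMPLE)
print(round(missing_fraction(rows), 3))  # 0.467
print(complete_columns(rows))            # [12]
```

Even in this small example, where under half of the cells are missing, only one of the 15 SNP columns is observed in all four rows.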

10.9 years ago

It does not support missing data.

There is probably a way to do it, but it depends on how much data is missing. You could always impute the missing genotypes.


Thanks for your reply. In this case I have information on all genotypes, but I am willing to exclude the majority of it so as to look only at the portions of each individual's genome that have the same ancestry (e.g. European). If I included all the data, iHS would be comparing SNPs from different ancestries, and my results wouldn't make sense in terms of "this SNP was positively selected for in Europeans". If my understanding is flawed, or if there is a reasonable way to implement missing data support (short of writing my own iHS calculator, which I am considering), I would appreciate hearing it.
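
For illustration, the ancestry masking described above could be sketched in Python as follows. This assumes a per-SNP local-ancestry label is available for each haplotype; the labels (`EUR`, `AFR`), the example values, and the function name are all hypothetical, and the output is only the `?`-masked row, not something the iHS program is known to accept.

```python
def mask_by_ancestry(alleles, ancestry, target):
    """Return a copy of one haplotype with non-target-ancestry sites set to '?'."""
    if len(alleles) != len(ancestry):
        raise ValueError("alleles and ancestry must have the same length")
    return [a if anc == target else "?" for a, anc in zip(alleles, ancestry)]


# Hypothetical haplotype and per-SNP local-ancestry calls.
alleles = ["1", "0", "1", "0", "1", "0"]
ancestry = ["AFR", "AFR", "EUR", "EUR", "EUR", "AFR"]

masked = mask_by_ancestry(alleles, ancestry, "EUR")
print(" ".join(masked))  # ? ? 1 0 1 ?
```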
