Question: Help finding all allele combinations using Python

Hi, I am a master's student, and for my dissertation we are thinking about using Python to find all of the possible combinations of genotypes for a set of 24 SNPs. I am very unfamiliar with coding, so I have no idea where to start or how to get the outcome we are looking for. Any help or advice would be greatly appreciated! :)

Our samples came from ancient DNA, so genome coverage was low to begin with. As a result, many of the SNPs we are looking at (the 24 SNPs required for use with HIrisPlex) were either missing or had low coverage. Here is what I have so far for one of our samples (the possible numbers of minor alleles for each SNP):

```
SNP 1 - 0
SNP 2 - 0
SNP 3 - 1
SNP 4 - 0, 1
SNP 5 - 0, 1
SNP 6 - 0
SNP 7 - 0, 1
```

...

We need to find out all of the possible combinations for all 24 SNPs, so we can obtain a range of probabilities for each hair and eye color so that we can estimate the predicted phenotype for each individual. Any help or advice anyone has on how we would go about doing this would be greatly appreciated! Thank you in advance! :)

Amanda

modified 21 months ago by finswimmer13k • written 21 months ago by amandaskinner9410

I added markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in your toolbar when you compose or edit a post.

1
21 months ago by
finswimmer13k
Germany
finswimmer13k wrote:

Hello,

to get all combinations of values drawn from several lists, one can use itertools.product(), which returns the Cartesian product of its input iterables:

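```
import itertools

# Possible minor-allele counts for each SNP
snp1 = [0, 1]
snp2 = [0, 1, 2]
snp3 = [0, 2]

# Cartesian product: one value per SNP, in every combination
print(list(itertools.product(snp1, snp2, snp3)))
```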
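With 24 SNPs it is probably easier to keep the options in one list and unpack it into itertools.product() than to name 24 variables. The following is only a sketch: it uses the seven SNPs shown in the question (the rest are not listed there), and the names `snp_options` and `count_combinations` are made up for illustration.

```python
import itertools
import math

# Possible minor-allele counts per SNP, taken from the seven SNPs
# listed in the question; the other SNPs are not shown there.
snp_options = [[0], [0], [1], [0, 1], [0, 1], [0], [0, 1]]

def count_combinations(options):
    """Return how many combinations exist without generating them."""
    return math.prod(len(choices) for choices in options)

print(count_combinations(snp_options))  # 8

# product() is lazy, so combinations can be processed one at a time
# instead of building the whole list in memory.
for combo in itertools.product(*snp_options):
    print(combo)
```

Checking the count first is worthwhile because it is the product of the number of options per SNP. If all 24 SNPs were completely unknown, that would be 3**24 = 282,429,536,481 combinations, which is too many to hold in a list.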

fin swimmer

0
21 months ago by
sacha1.9k
France
sacha1.9k wrote:

I'm not sure I understand. Could you explain what "0,1,2" means exactly? And could you post a tiny example with input and output for only 2 SNPs?

One idea would be to build a graph and find all paths from start to end.
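One way to read the graph idea is as a layered graph with one layer per SNP, where every start-to-end path picks one value from each layer. A rough sketch under that assumption (the function name `all_paths` and the list-of-lists input are hypothetical); it should produce the same tuples as itertools.product():

```python
import itertools

def all_paths(layers, prefix=()):
    """Depth-first walk: yield every path that takes one node per layer."""
    if not layers:
        yield prefix
        return
    for node in layers[0]:
        yield from all_paths(layers[1:], prefix + (node,))

layers = [[0, 1], [0, 1, 2]]
print(list(all_paths(layers)))

# Sanity check against the built-in Cartesian product
print(list(all_paths(layers)) == list(itertools.product(*layers)))  # True
```

Since the result is identical, itertools.product() is likely the simpler choice here; the explicit walk is mainly useful if some paths need to be pruned along the way.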

Thank you so much for the reply! The "0", "1", and "2" refer to the number of minor alleles present in the genotype for a particular SNP. This is what the program HIrisPlex, which we will be using, uses to estimate eye and hair color. For example, if the minor allele for a SNP is A and the genotype for that SNP in your sample is AA, then the number of minor alleles is 2. If it's AG, then it's 1, and if it's GG, then it's 0.
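The encoding described above could be written as a small helper. This is just an illustration, assuming genotypes are given as two-letter strings; the function name `minor_allele_count` is made up.

```python
def minor_allele_count(genotype, minor_allele):
    """Count how many of the two alleles in a genotype are the minor allele."""
    return genotype.upper().count(minor_allele.upper())

print(minor_allele_count("AA", "A"))  # 2
print(minor_allele_count("AG", "A"))  # 1
print(minor_allele_count("GG", "A"))  # 0
```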

Here is an example using only 2 SNPs:

SNP 1 - 0, 1
SNP 2 - 0, 1, 2

All of the possible combinations of genotypes for these two SNPs would be 00, 01, 02, 10, 11, and 12. We need to find all of the possible combinations for all of our 24 SNPs. I hope this helps clear some things up! Thank you so much for your help!!!
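For reference, the two-SNP example can be reproduced with itertools.product(). A small sketch; joining the counts into strings such as "02" is only done here to match the notation in the example:

```python
import itertools

snp1 = [0, 1]
snp2 = [0, 1, 2]

# Join each combination into a string such as "02"
combos = ["".join(str(count) for count in combo)
          for combo in itertools.product(snp1, snp2)]
print(combos)  # ['00', '01', '02', '10', '11', '12']
```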