Help finding all allele combinations using Python
5.3 years ago

Hi, I am a master's student, and for my dissertation we are thinking about using Python to find all of the possible combinations of genotypes for a set of 24 SNPs. I am very unfamiliar with coding, so I have no idea where to start or how to get the outcome we are looking for. Any help or advice would be greatly appreciated! :)

Our samples came from ancient DNA, so genome coverage was low to begin with. As a result, many of the SNPs we are looking at (the 24 SNPs required for use with HIrisPlex) were either missing or had low coverage. Here is what I have so far for one of our samples (possible number of alleles for each SNP):

SNP 1 - 0
SNP 2 - 0
SNP 3 - 1
SNP 4 - 0, 1
SNP 5 - 0, 1
SNP 6 - 0
SNP 7 - 0, 1


...

We need to find out all of the possible combinations for all 24 SNPs, so we can obtain a range of probabilities for each hair and eye color so that we can estimate the predicted phenotype for each individual. Any help or advice anyone has on how we would go about doing this would be greatly appreciated! Thank you in advance! :)

Amanda

SNP python combinations HIrisPlex • 1.6k views

I added markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in the toolbar when you compose or edit a post.

5.3 years ago

Hello,

To get all combinations of the values from several lists, you can use itertools.product():

import itertools

# Possible values for each SNP
snp1 = [0, 1]
snp2 = [0, 1, 2]
snp3 = [0, 2]

# Every combination that takes one value from each list
print(list(itertools.product(snp1, snp2, snp3)))
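With 24 SNPs it may be easier to keep the lists in a dictionary than in 24 separate variables. The following is only a sketch: it uses the first seven SNPs listed in the question, and the names `possible_counts`, `n_combinations` and `combinations` are made up for illustration.

```python
import itertools
import math

# Possible values per SNP, copied from the first seven SNPs listed in the
# question. The remaining SNPs would be added in the same way.
possible_counts = {
    "SNP 1": [0],
    "SNP 2": [0],
    "SNP 3": [1],
    "SNP 4": [0, 1],
    "SNP 5": [0, 1],
    "SNP 6": [0],
    "SNP 7": [0, 1],
}

# The number of combinations is the product of the list lengths, so it can
# be checked before anything is enumerated (math.prod needs Python 3.8+).
n_combinations = math.prod(len(values) for values in possible_counts.values())
print(n_combinations)  # 8

# itertools.product yields one tuple at a time, in the same order as the
# dictionary keys, so each tuple can be zipped back onto the SNP names.
combinations = []
for combo in itertools.product(*possible_counts.values()):
    combinations.append(dict(zip(possible_counts, combo)))

print(combinations[0])
```

Checking the count first is worthwhile: if many SNPs are uncertain the number grows quickly (24 SNPs with three possible values each would give 3**24, roughly 2.8e11 combinations), in which case it is better to process each combination inside the loop instead of storing them all.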


fin swimmer

5.3 years ago
sacha ★ 2.4k

I'm not sure I understand... Could you explain what "0,1,2" means exactly? And could you post a tiny example, with input and output, for just 2 SNPs?

One idea would be to create a graph and find all paths from start to end.
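One way to read the graph idea, as an assumption about what is meant: treat each SNP as a layer of nodes (its possible values), connect every node to every node in the next layer, and enumerate all paths from the first layer to the last. A minimal recursive sketch, where `all_paths` and `layers` are hypothetical names:

```python
def all_paths(layers, prefix=()):
    """Yield every path that visits one node from each layer, in order."""
    if not layers:
        # No layers left: the prefix is a complete start-to-end path.
        yield prefix
        return
    first, rest = layers[0], layers[1:]
    for node in first:
        # Extend the current path with this node and continue with the rest.
        yield from all_paths(rest, prefix + (node,))


# Two layers, one per SNP, each holding that SNP's possible values.
layers = [[0, 1], [0, 1, 2]]
paths = list(all_paths(layers))
print(paths)  # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
```

Because every node connects to every node in the next layer, the set of paths is the same as the Cartesian product of the layers.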


Thank you so much for the reply! The "0, 1, and 2" refer to the number of minor alleles present in the genotype for a particular SNP. This is what the program HIrisPlex, which we will be using, takes as input to estimate eye and hair color. For example, if the minor allele for a SNP is A and the genotype for that SNP in your sample is AA, then the number of minor alleles would be 2. If it's AG, then it's 1, and if it's GG, then it's 0.
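The counting rule can be written as a small helper. This is just a sketch: `minor_allele_count` is a made-up name, and it assumes the genotype is given as a two-letter string reported on the same strand as the minor allele.

```python
def minor_allele_count(genotype, minor_allele):
    """Return how many copies of the minor allele the genotype carries."""
    # Compare in upper case so that "ag" and "AG" are treated the same.
    return genotype.upper().count(minor_allele.upper())


# The three genotypes from the example, with A as the minor allele.
for genotype in ("AA", "AG", "GG"):
    print(genotype, minor_allele_count(genotype, "A"))
# AA 2
# AG 1
# GG 0
```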

Here is an example using only 2 SNPs:

SNP 1 - 0, 1
SNP 2 - 0, 1, 2

All of the possible combinations of genotypes for these two SNPs would be 00, 01, 02, 10, 11, and 12. We need to find all of the possible combinations for all of our 24 SNPs. I hope this helps clear some things up! Thank you so much for your help!!!
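For reference, this two-SNP example can be reproduced with itertools.product. The sketch below joins each combination into a string so the output matches the notation used here; the variable names are only illustrative.

```python
import itertools

snp1 = [0, 1]      # SNP 1 - 0, 1
snp2 = [0, 1, 2]   # SNP 2 - 0, 1, 2

# Each combination is a tuple such as (0, 2); join its values into "02".
labels = [
    "".join(str(count) for count in combo)
    for combo in itertools.product(snp1, snp2)
]
print(labels)  # ['00', '01', '02', '10', '11', '12']
```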