Question

Allele frequencies in pedigrees

0

Entering edit mode

8.4 years ago

literature.02 • 0

Imagine 10 alleles (A-J), each with a 10% population frequency at one locus.

If you know that your sibling had alleles A,B, what are the probabilities of observing any allele at your position in the pedigree.
If your nephew (on your mother's side) has alleles C,D what are the probabilities of observing any allele at your position in the pedigree.
If your nephew (on your mother's side) has alleles C,D, and your aunt (on your father's side) has allele C,E, what are the probabilities of observing any allele at your position in the pedigree.

Please suggest the fastest method of solving this problem for any arbitrary combination of individuals in a pedigree. One approach would be to catalog all of the observed alleles in the family and re normalize the set back to 1, using degree of separation to weight the contribution of an observation.

frequencies IBD pedigrees allele • 2.8k views

ADD COMMENT • link updated 20 months ago by Ram 43k • written 8.4 years ago by literature.02 • 0

3

Entering edit mode

I think the point of homework is that you do it yourself, not post on Biostars for people to do it for you.

ADD REPLY • link 8.4 years ago by User 59 13k

0

Entering edit mode

Thanks for the snide and entirely unhelpful remark, but this is not homework. I am building this algorithm and seeking input from others with experience. Or is that also not the point of Biostars?

ADD REPLY • link 8.4 years ago by literature.02 • 0

0

Entering edit mode

You were lucky I didn't close the question as it looks like population genetics to me rather than bioinformatics. Biostars get a number of posts with phrasing that is clearly lifted from people's homework assignments, and there was little here to differentiate it from one of those posts.

If you're building the algorithm why ask for the fastest solution, the assumption here that the problem is solved? Perhaps you would get more benefit from the exercise by sharing your work so far.

ADD REPLY • link 8.4 years ago by User 59 13k

0

Entering edit mode

Then I suppose that is a compliment to my meticulous question phrasing which appears to be from a textbook :)

I am seeking a fast solution (not essential that it be the fastest) since this algorithm must be run against a large number of scenarios. See original post above edited to suggest one route.

ADD REPLY • link 8.4 years ago by literature.02 • 0

0

Entering edit mode

So what part of the algorithm did you get stuck on? How long does it take to run? If you need a faster computer, perhaps you can run part of the samples on the Amazon Cloud.

ADD REPLY • link updated 4.4 years ago by Ram 43k • written 8.4 years ago by karl.stamm 4.1k

0

Entering edit mode

The current implementation runs for close to 20 hours (and growing with incoming data). Am unable to send the data outside of the organization.

ADD REPLY • link 8.4 years ago by literature.02 • 0

score 0 · Answer 1 · 2015-11-27

0

Entering edit mode

8.4 years ago

H.Hasani ▴ 990

IMHO, it seems to me like a graph problem, where the edges weight is the probability of observing an allele i (= nodes), depth indicates the generation!

If you could modulate this way, fastest path or maybe shortest path algorithm might be what you want?!

ADD COMMENT • link 8.4 years ago by H.Hasani ▴ 990