Algorithm to find PL indices containing a specific allele
8.2 years ago
donfreed ★ 1.6k

I'd like to find all of the PL indices in a VCF line that correspond to genotypes containing a specific allele, at sites with an arbitrary number of alleles and arbitrary sample ploidy. For diploid samples the algorithm is straightforward, but things get more complicated for ploidies of 3 and above. The related question of finding the PL index for a given genotype already has a nice algorithm: http://gatkforums.broadinstitute.org/gatk/discussion/2157/gatk-vcf-pl-field-ordering-for-pooled-polyploid-samples.
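To make the problem concrete, here is a minimal brute-force sketch in Python. It is not the code from the gist; the function names `genotype_index` and `pl_indices_with_allele` are made up for illustration. It assumes the genotype ordering from the VCF specification, where a genotype with sorted allele indices k_1 <= ... <= k_P sits at PL index sum over m of C(k_m + m - 1, m), and it requires Python 3.8+ for `math.comb`.

```python
from itertools import combinations_with_replacement
from math import comb


def genotype_index(genotype):
    """Return the PL index of a genotype given as allele indices (any order).

    Assumes the VCF spec ordering: with the alleles sorted ascending as
    k_1 <= ... <= k_P, the index is sum_{m=1..P} C(k_m + m - 1, m).
    """
    # enumerate() is 0-based, so m = pos + 1 and C(k + m - 1, m) = C(k + pos, pos + 1).
    return sum(comb(k, 1 + pos) if pos == 0 else comb(k + pos, pos + 1)
               for pos, k in enumerate(sorted(genotype)))


def pl_indices_with_allele(allele, n_alleles, ploidy):
    """Return the sorted PL indices of every genotype containing `allele`.

    Brute force: enumerates all C(n_alleles + ploidy - 1, ploidy) genotypes.
    """
    return sorted(
        genotype_index(g)
        for g in combinations_with_replacement(range(n_alleles), ploidy)
        if allele in g
    )


# Diploid, 3 alleles: PL order is 00, 01, 11, 02, 12, 22.
print(pl_indices_with_allele(1, n_alleles=3, ploidy=2))  # [1, 2, 4]
```

This baseline is handy for checking a faster implementation against, since it touches every genotype and so scales poorly with ploidy and allele count.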

I already have some code to do this (see below), but it's not very elegant. Does anyone have a better algorithm?

VCF sequencing algorithm • 1.7k views
8.2 years ago
donfreed ★ 1.6k

I was able to come up with a solution and I have updated the gist with a recursive function, non-recursive function and profiling of the results. I am posting here in case anyone finds the code useful.
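For readers who cannot reach the gist, the following is a rough sketch of what a recursive approach could look like. It is an assumption-laden illustration, not the gist's code: it follows the recursive genotype-ordering pseudocode in the VCF specification, and the names `genotypes_in_pl_order` and `pl_indices_with_allele_recursive` are hypothetical.

```python
def genotypes_in_pl_order(ploidy, max_allele, suffix=()):
    """Yield genotypes (tuples of allele indices) in VCF PL order.

    Mirrors the recursive ordering in the VCF spec: the highest allele of
    the genotype varies slowest, and the remaining positions are filled
    recursively with alleles no larger than it.
    """
    for a in range(max_allele + 1):
        if ploidy == 1:
            yield (a,) + suffix
        else:
            yield from genotypes_in_pl_order(ploidy - 1, a, (a,) + suffix)


def pl_indices_with_allele_recursive(allele, n_alleles, ploidy):
    """Return PL indices whose genotype contains `allele` at least once."""
    return [
        index
        for index, genotype in enumerate(genotypes_in_pl_order(ploidy, n_alleles - 1))
        if allele in genotype
    ]


# Triploid, 2 alleles: PL order is 000, 001, 011, 111.
print(pl_indices_with_allele_recursive(1, n_alleles=2, ploidy=3))  # [1, 2, 3]
```

Because the generator walks the genotypes in PL order, the index is just the enumeration position, so no closed-form index formula is needed here.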

Now if only I could figure out how to embed the gist...
