Hi, I would like to take a VCF file and output a tab-delimited file with one line per individual-site combination, containing the individual, site (chromosome and position), GT, reference and alternate alleles, DP, and each PL value. It doesn't matter what order the sites end up in.
Any suggestions on how to do this most efficiently and/or any links to code that does something like this?
Example input:
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  samp1 samp2
chr1   100  .       C       T       3106.72 SnpCluster      .       GT:AD:DP:GQ:PL  0/0:1,0:1:3:0,3,42      0/0:3,0:3:9:0,9,132
chr1   120  .       C       G       3106.72 SnpCluster      .       GT:AD:DP:GQ:PL 0/1:3,1:4:30:30,0,123   1/1:0,1:1:3:45,3,0
Example output:
samp1    chr1    100    0/0   C   T   1    0   3   42
samp2    chr1    100    0/0   C   T   3    0    9    132
samp1    chr1    120    0/1   C   G   4    30   0    123
samp2    chr1    120    1/1   C   G   1    45   3    0
If you're into Python, PyVCF makes this sort of task quite easy.
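If you'd rather not pull in a dependency, here is a minimal sketch in plain Python (standard library only, so this is not PyVCF's API). The function name `vcf_to_long` is just something I made up for illustration. It assumes the file is tab-delimited as the VCF spec requires, that sites are biallelic with three PL values as in your example, and that GT, DP and PL are listed in the FORMAT column; any field that is missing is written as `.`.

```python
def vcf_to_long(lines):
    """Yield one tab-delimited row per (sample, site) from VCF text lines.

    Hypothetical helper: assumes tab-delimited input and that the
    FORMAT column names the per-sample fields (GT, DP, PL, ...).
    """
    samples = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information headers
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the 9 fixed columns
            continue
        chrom, pos, ref, alt = fields[0], fields[1], fields[3], fields[4]
        keys = fields[8].split(":")  # e.g. ["GT", "AD", "DP", "GQ", "PL"]
        for name, call in zip(samples, fields[9:]):
            # Map each FORMAT key to this sample's value for the site
            values = dict(zip(keys, call.split(":")))
            pls = values.get("PL", ".").split(",")
            row = [name, chrom, pos, values.get("GT", "."), ref, alt,
                   values.get("DP", ".")] + pls
            yield "\t".join(row)
```

You would feed it an open file handle, e.g. `for row in vcf_to_long(open("calls.vcf")): print(row)`, where `calls.vcf` is a placeholder name. It streams line by line, so memory use stays flat even for large files, and rows come out grouped by site rather than by sample, which should be fine since order doesn't matter to you.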