Question: update genotype calls with PyVCF
0
gravatar for biosol
6 weeks ago by
biosol150
biosol150 wrote:

Hi everyone, I'm parsing my vcf files with PyVCF and I would like to update my genotype calls: for every position in which the DP field is smaller than 10, genotype should be "./.". For the moment I've generated a list with the updated calls as I wish, like this:

vcf_reader = vcf.Reader(filename='myfile.vcf.gz')
new_gt = []
for record in vcf_reader:
    if record.samples[0].data[2] > 10:
        new_gt.append(record.samples[0].data[0])
    elif record.samples[0].data[2] < 10:
        new_gt.append('./.')

However, I've also tried to directly modifying the genotypes in the vcf_reader object, but I get an error message saying 'CallData' object does not support item assignment. In this link they have a similar problem in the FORMAT field. I was wondering: has someone found a way to directly modify genotype calls in PyVCF or to insert a list with the desired output? Thanks a lot in advanced!

genotypes pyvcf • 124 views
ADD COMMENTlink modified 6 weeks ago by Pierre Lindenbaum133k • written 6 weeks ago by biosol150
1
gravatar for Pierre Lindenbaum
6 weeks ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum133k wrote:

a one liner using http://lindenb.github.io/jvarkit/VcfFilterJdk.html

 java -jar dist/vcffilterjdk.jar -e 'return new VariantContextBuilder(variant).genotypes(variant.getGenotypes().stream().map(G->G.isCalled() && G.hasDP() && G.getDP()<10?GenotypeBuilder.createMissing(G.getSampleName(),G.getPloidy()):G).collect(Collectors.toList())).make();' --recalc  input.vcf
ADD COMMENTlink modified 6 weeks ago • written 6 weeks ago by Pierre Lindenbaum133k
1

Yep! Works super fast, thanks a lot!

ADD REPLYlink written 6 weeks ago by biosol150
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 982 users visited in the last hour