Question: How do I deal with duplicate reads when using vg to analyse WGS data?
9 months ago, lhomas wrote:

I am analyzing WGS data with vg, following the recommendations on the vgTeam GitHub wiki pages ("Working with a whole genome variation graph" and "Whole-genome calling and genotyping"), and I am unsure how to handle duplicate reads. Do the commands recommended on those pages already take care of this? Or is there another vg tool I should run on the GAMs before variant calling?

Thanks in advance.

9 months ago, glenn.hickey wrote:

This is a great question. There is indeed no vg tool yet to mark duplicates. The only workaround, which isn't great, is to use a BAM file to detect duplicates. Please make a feature request on GitHub! We are working on some changes to replace GAM as the default format, which should soon make it possible to write such a tool more efficiently.
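To illustrate what "detecting duplicates" on a BAM amounts to, here is a minimal Python sketch. It assumes the reads have already been projected onto linear reference coordinates (for example with vg surject) and that each read pair is summarised as one record. The `Alignment` type, its fields, and `duplicate_names` are hypothetical names made up for this example. This is not what any existing duplicate-marking tool does internally, just the basic idea: pairs sharing the same contig, position, strand and mate position are treated as duplicates, and only the best-scoring one is kept.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical summary of one read pair in linear coordinates."""
    name: str
    contig: str
    pos: int        # leftmost position of the first mate
    reverse: bool   # strand of the first mate
    mate_pos: int   # leftmost position of the second mate
    score: int      # assumed quality measure, e.g. sum of base qualities


def duplicate_names(alignments):
    """Return names of read pairs that look like duplicates of a better pair."""
    groups = defaultdict(list)
    for aln in alignments:
        # Pairs with identical placement are candidate duplicates.
        key = (aln.contig, aln.pos, aln.reverse, aln.mate_pos)
        groups[key].append(aln)

    dups = set()
    for group in groups.values():
        # Keep the highest-scoring pair; on ties, the first one seen.
        best = max(group, key=lambda a: a.score)
        for aln in group:
            if aln is not best:
                dups.add(aln.name)
    return dups


reads = [
    Alignment("r1", "chr1", 100, False, 350, 70),
    Alignment("r2", "chr1", 100, False, 350, 90),  # same placement as r1
    Alignment("r3", "chr1", 100, True, 350, 80),   # different strand
    Alignment("r4", "chr2", 500, False, 720, 60),
]
print(sorted(duplicate_names(reads)))
```

The resulting set of read names could then be used to drop or flag the corresponding reads before calling. How best to apply that to the GAM is left open here, since vg has no dedicated tool for it.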
