Question: How to filter out duplicate records in a vcf with bcftools?
Asked 2.6 years ago by vitor.aguiar (Brazil):

Is there a bcftools command to remove duplicate records in a vcf file?

If a variant has duplicate records, I'd like to remove all entries of such variant in the vcf.

Tags: bcftools, vcf

Besides uniq, have you looked at variant normalization?

http://genome.sph.umich.edu/wiki/Variant_Normalization

http://blog.goldenhelix.com/grudy/variant-normalization-underappreciated-critical-infrastructure/

http://genome.sph.umich.edu/wiki/Vt

Comment by Medhat, 2.6 years ago

https://github.com/vcflib/vcflib#vcfuniq

Comment by cpad0112, 20 months ago
Answer by Brice Sarver (United States), 2.6 years ago:

Exactly duplicated records?

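If the records are truly identical, sort the records and run them through uniq. Set the header lines (those starting with #) aside first, so that sorting does not mix them in with the records.

Make sure the VCF is uncompressed before you pass or pipe it to sort and uniq!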
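
To make the sort-and-uniq suggestion concrete, here is a minimal sketch. The function name dedup_exact is made up for illustration, and it assumes an uncompressed, tab-delimited VCF in a regular file. It only collapses byte-identical lines to one copy, and it sorts chromosomes in plain byte order, which may not match the contig order of your header.

```shell
# dedup_exact IN.vcf (hypothetical helper name)
# Prints the VCF with byte-identical records collapsed to a single copy.
dedup_exact() {
    # Header lines start with '#'; print them first, in their original order.
    grep '^#' "$1"
    # Sort records by CHROM (byte order), then numerically by POS.
    # Identical lines end up adjacent, so uniq can collapse them.
    grep -v '^#' "$1" \
        | LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n \
        | LC_ALL=C uniq
}
```

Usage would be something like `dedup_exact input.vcf > dedup.vcf`, where both file names are placeholders. For a bgzipped or gzipped file, decompress it first, for example with `gzip -dc input.vcf.gz > input.vcf`. If you would rather stay inside bcftools, `bcftools norm` has a `-d` option that keeps only the first instance of a duplicated record; check the help output of your version for the modes it supports.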

Edit: I should clarify that you don't need bcftools for this. Use plain old uniq, not "bcftools uniq"; the latter doesn't exist.
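
Note that the question asks for something stronger than uniq provides: dropping every copy of a duplicated variant, not keeping one. A rough sketch of that with awk follows. The function name drop_all_dups is invented here, and treating CHROM, POS, REF and ALT (columns 1, 2, 4 and 5) as the variant key is my assumption about what counts as a duplicate; adjust the key if you mean something else. The file is read twice, so it must be an uncompressed regular file rather than a pipe.

```shell
# drop_all_dups IN.vcf (hypothetical helper name)
# Prints the VCF without any record whose CHROM/POS/REF/ALT key
# occurs more than once. Record order is preserved.
drop_all_dups() {
    awk -F'\t' '
        # Pass 1: count how often each variant key occurs (skip header lines).
        NR == FNR { if ($0 !~ /^#/) n[$1 FS $2 FS $4 FS $5]++; next }
        # Pass 2: print header lines, plus records whose key was seen exactly once.
        /^#/ || n[$1 FS $2 FS $4 FS $5] == 1
    ' "$1" "$1"
}
```

Run it as `drop_all_dups input.vcf > filtered.vcf` (placeholder file names). Unlike the sort-based approach, this needs no sorting, at the cost of holding one counter per distinct variant in memory.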
