Need help with BCFtools annotate; not all information is carried forward
4.8 years ago
Hadi M ▴ 50

Hi everyone,

I have a custom tab-delimited annotation file that I used to annotate a VCF file with bcftools annotate. It worked fine, except that when several annotation rows hit the same position, only the first one is carried into the VCF file. Here's an example:

This is an example of my annotation file:

#CHROM  FROM    TO  TRAIT
chr3    100001  100010  Disease A
chr3    100005  100005  Disease B
chr3    100005  100005  Disease C

And here's an example of my VCF file:

#CHROM  POS ID  REF ALT
chr3    100005  .   A   T

Annotating the VCF file produces:

#CHROM  POS ID  REF ALT INFO
chr3    100005  .   A   T   Disease A

As you can see, only the first annotation is carried forward. My ideal output is:

#CHROM  POS ID  REF ALT INFO
chr3    100005  .   A   T   Disease A
chr3    100005  .   A   T   Disease B
chr3    100005  .   A   T   Disease C

Or:

#CHROM  POS ID  REF ALT INFO
chr3    100005  .   A   T   Disease A | Disease B | Disease C

Is there an option in bcftools annotate that would let me get output like this? If there is an alternative tool, please recommend it as well. Cheers.

4.8 years ago
Ram 43k

It is not legal for a VCF file to have multiple entries for the same chr-pos-ref-alt combination.

Option 1:

Try the --merge-logic (-l) option of bcftools annotate. I've never tried it, but it looks like it might work when used as --merge-logic TRAIT:unique.

Option 2:

You should be able to use R (dplyr) to get a new annotation file from your existing annotation file. Group by CHROM, POS, ID, REF, ALT and aggregate TRAIT to paste(TRAIT, collapse = " | "). This, of course, has the downside that range annotations and point annotations cannot be aggregated together (the group-by will only group by identical values, not overlapping ranges), forcing you to convert all range annotations to point annotations before aggregating them.
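As a rough illustration of Option 2, here is a sketch in plain Python rather than R/dplyr (so a different tool from the one suggested above). Instead of grouping identical rows, it looks up every annotation range that overlaps each VCF position and writes one point annotation per position, which sidesteps the range-versus-point problem. The function names, the " | " separator, and the in-memory lists are assumptions for illustration; a real script would read the annotation file and the VCF positions from disk, and the linear scan would be slow for large files.

```python
def traits_for_position(annotations, chrom, pos):
    """Return distinct traits whose [start, end] range (1-based, inclusive) covers pos."""
    traits = []
    for a_chrom, start, end, trait in annotations:
        if a_chrom == chrom and start <= pos <= end and trait not in traits:
            traits.append(trait)  # keep first-seen order, skip duplicates
    return traits

def build_point_annotations(annotations, positions, sep=" | "):
    """Collapse all overlapping annotations into one row per VCF position."""
    rows = []
    for chrom, pos in positions:
        traits = traits_for_position(annotations, chrom, pos)
        if traits:
            rows.append((chrom, pos, pos, sep.join(traits)))
    return rows

# Data copied from the example in the question.
annotations = [
    ("chr3", 100001, 100010, "Disease A"),
    ("chr3", 100005, 100005, "Disease B"),
    ("chr3", 100005, 100005, "Disease C"),
]
positions = [("chr3", 100005)]

for row in build_point_annotations(annotations, positions):
    print("\t".join(str(field) for field in row))
```

The resulting rows have one line per position, so they can be written out as a new tab-delimited annotation file and passed to bcftools annotate in the usual way.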
