concatenate VCFs with same sample set but with varying order
18 months ago
mernster • 0

Hello,

for each chromosome, I have a multisample VCF containing the same set of samples. However, the order of the sample columns varies among the VCFs.

For example:

VCF file 1:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  14  11  13  12  3   6   10  7   9   5   8   2   4   1

VCF file 2:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  5   9   11  13  2   7   12  6   10  14  4   3   8   1

I'd like to merge them so that the samples with the same ID are concatenated. However, since the sample order differs, I can't use bcftools concat...

Any suggestions of how I could either

  1. change the order of the samples in every multisample-vcf so that I can use bcftools
  2. use a different tool to concatenate the VCF files

Cheers

bcftools vcf concatenate merge • 1.1k views
18 months ago

use a different tool to concatenate the VCF files

picard MergeVcfs https://broadinstitute.github.io/picard/command-line-overview.html#MergeVcfs

or

picard GatherVcfs (faster if you don't have any overlapping regions) https://broadinstitute.github.io/picard/command-line-overview.html#GatherVcfs

Hi Pierre,

thanks for your answer!

I tried picard GatherVcfs

java -jar $PICARD_HOME/picard.jar GatherVcfs I=35.vcf I=36.vcf O=all.vcf

The VCFs have the same sample IDs but in different order:

bcftools query -l 35.vcf

14 11 13 12 3 6 10 7 9 5 8 2 4 1

bcftools query -l 36.vcf

5 9 11 13 2 7 12 6 10 14 4 3 8 1

I get an error message from Picard saying that the sample lists don't match. However, it doesn't report any samples unique to either file, which makes me think that the problem is, again, the order.

There was a problem with gathering the INPUT.
java.lang.IllegalArgumentException: VCFs do not have identical sample lists. Samples unique to first file: []. Samples unique to 36.vcf: [].
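
The empty "unique" lists in that message suggest that both files hold the same samples and only the column order differs. A minimal sketch for checking this directly, assuming uncompressed VCFs with a tab-separated `#CHROM` header line. The helper names `vcf_samples`, `same_sample_set` and `same_sample_order` are made up for illustration; `bcftools query -l` prints the same list as `vcf_samples`.

```shell
# Print the sample names of an uncompressed VCF, one per line, in file order.
# (hypothetical helper: reads columns 10 onwards of the #CHROM header line)
vcf_samples() {
    awk -F'\t' '$1 == "#CHROM" { for (i = 10; i <= NF; i++) print $i; exit }' "$1"
}

# Succeed if both VCFs contain the same samples, regardless of column order.
same_sample_set() {
    [ "$(vcf_samples "$1" | sort)" = "$(vcf_samples "$2" | sort)" ]
}

# Succeed only if the sample columns are also in the same order.
same_sample_order() {
    [ "$(vcf_samples "$1")" = "$(vcf_samples "$2")" ]
}
```

For the two files above, `same_sample_set 35.vcf 36.vcf` should succeed while `same_sample_order 35.vcf 36.vcf` should fail, if the order really is the only difference.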

Okay, I have tried

java -jar $PICARD_HOME/picard.jar MergeVcfs I=35.vcf I=36.vcf O=all.vcf

instead of

java -jar $PICARD_HOME/picard.jar GatherVcfs I=35.vcf I=36.vcf O=all.vcf

and it seems to work!! thank you very much
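
For the first option in the question (reordering the samples so that `bcftools concat` can be used), here is a minimal sketch, assuming bcftools is installed. `bcftools view -S` writes the sample columns in the order given in the sample file, so the sample list of the first VCF can serve as a template for the others. The function name `concat_reordered` and the temporary file names are illustrative, not part of any tool.

```shell
# Reorder the samples of every input VCF to match the first one, then concatenate.
# Usage: concat_reordered OUT.vcf FIRST.vcf OTHER.vcf [...]
concat_reordered() {
    out=$1
    shift
    tmpdir=$(mktemp -d) || return 1

    # The sample order of the first VCF becomes the template for all files.
    bcftools query -l "$1" > "$tmpdir/order.txt" || return 1

    n=0
    for vcf in "$@"; do
        n=$((n + 1))
        # -S keeps the listed samples and writes them in the listed order;
        # it fails if a listed sample is missing from the input file.
        bcftools view -S "$tmpdir/order.txt" -Ov -o "$tmpdir/$n.vcf" "$vcf" || return 1
        echo "$tmpdir/$n.vcf" >> "$tmpdir/list.txt"
    done

    # All intermediate files now share one sample order, so concat accepts them.
    bcftools concat -f "$tmpdir/list.txt" -Ov -o "$out" || return 1
    rm -rf "$tmpdir"
}
```

With the files from this thread, the call would be `concat_reordered all.vcf 35.vcf 36.vcf`. The inputs are concatenated in the order given, so per-chromosome files should be listed in the desired chromosome order.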
