Question: Merge vcf files from Multiple Samples
0
5.6 years ago by
Ron990
United States
Ron990 wrote:

Hi,

I want to merge VCF files that share the same format, from multiple samples, using vcftools. Which of vcf-merge or vcf-concat should I use to view the combined results? The sample names do not matter in the output.

This is the current format of each vcf file.

#CHROM    POS    ID    REF    ALT    QUAL    FILTER    INFO    FORMAT    SampleName

Thanks

vcftools vcf • 4.9k views
ADD COMMENTlink modified 8 months ago by Biostar ♦♦ 20 • written 5.6 years ago by Ron990
5
5.6 years ago by
Devon Ryan93k
Freiburg, Germany
Devon Ryan93k wrote:

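You would want vcf-merge, which combines the per-sample files by position into a single multi-sample VCF. Concatenating means linking together like in a chain (so end-to-end), which suits files that hold the same samples but different regions, for example one file per chromosome.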
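
As a rough sketch of how such a merge is usually prepared: vcf-merge expects inputs that are sorted, bgzip-compressed and tabix-indexed. The function name `merge_vcfs` and all file names below are placeholders, and the sketch assumes that vcf-sort, vcf-merge, bgzip and tabix are on the PATH and that file names contain no spaces.

```shell
# Sketch only: merge_vcfs is a hypothetical helper, not part of vcftools.
# Usage: merge_vcfs merged.vcf.gz sample1.vcf sample2.vcf ...
merge_vcfs() {
    out=$1; shift
    gz_list=""
    for vcf in "$@"; do
        gz="${vcf%.vcf}.sort.vcf.gz"
        # Sort by position, then compress with bgzip (plain gzip cannot be indexed by tabix)
        vcf-sort "$vcf" | bgzip -c > "$gz" || return 1
        # Build the tabix index next to the compressed file
        tabix -p vcf "$gz" || return 1
        gz_list="$gz_list $gz"
    done
    # Word splitting of $gz_list is intentional (no spaces in file names assumed)
    vcf-merge $gz_list | bgzip -c > "$out"
}
```

Calling `merge_vcfs merged.vcf.gz 51764.vcf 51769.vcf` would leave `51764.sort.vcf.gz` and `51769.sort.vcf.gz` (plus their indexes) beside the inputs and write the merged result to `merged.vcf.gz`.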

ADD COMMENTlink written 5.6 years ago by Devon Ryan93k

I used vcf-merge after sorting, compressing with bgzip, and indexing with tabix. Those steps went well, but the merge itself fails with the error below. My sample names are Clk51764 and Clk51769, as I am only testing the merge on 2 samples.

Using column name 'Clk51764' for 51764.sort.vcf.gz:Clk51764
Using column name 'Clk51769' for 51769.sort.vcf.gz:Clk51769
Could not determine the ploidy (nals=2, nvals=10). (TODO: ploidy bigger than 2)
6
 at /vcftools_0.1.12/perl//Vcf.pm line 172
    Vcf::throw('Vcf4_1=HASH(0x28a4e00)', 'Could not determine the ploidy (nals=2, nvals=10). (TODO: plo...', 6) called at /vcftools_0.1.12/perl//Vcf.pm line 2404
    VcfReader::guess_ploidy('Vcf4_1=HASH(0x28a4e00)', 2, 10) called at /vcftools_0.1.12/perl//Vcf.pm line 1760
    VcfReader::parse_AGtags('Vcf4_1=HASH(0x28a4e00)', 'HASH(0x2909ab8)') called at /vcftools_0.1.12/bin/vcf-merge line 464
    main::merge_vcf_files('HASH(0x289cc20)') called at /vcftools_0.1.12/bin/vcf-merge line 12

ADD REPLYlink written 5.6 years ago by Ron990
1

What's the ploidy of your organism? Going by the error message, VCFtools apparently still does not support ploidy above 2.
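
One quick way to see what ploidy the genotype calls in a file imply is to count the alleles in each GT value. This is only a sketch: `gt_ploidy_counts` is a made-up helper name, it looks at the single sample in column 10, and it assumes GT is the first FORMAT sub-field (which the VCF specification requires whenever GT is present). It cannot tell whether the error has some other cause, such as a per-genotype tag with an unexpected number of values.

```shell
# Hypothetical helper: print "<alleles per genotype> <number of records>".
# Usage: gt_ploidy_counts sample.vcf   (uncompressed VCF, one sample in column 10)
gt_ploidy_counts() {
    awk -F'\t' '
        /^#/ { next }                        # skip header lines
        {
            split($10, f, ":")               # sample sub-fields; f[1] is GT
            n = split(f[1], a, "[/|]")       # alleles are separated by / or |
            count[n]++
        }
        END { for (p in count) print p, count[p] }
    ' "$1" | sort -n
}
```

For a purely diploid call set the output is a single line beginning with `2`; any other first column points at records worth inspecting by hand.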

ADD REPLYlink written 5.6 years ago by Devon Ryan93k

I am getting a similar error message. Did you ever manage to get vcf-merge to work?

ADD REPLYlink written 5.6 years ago by devenvyas600
0
5.6 years ago by
mtaschuk0
mtaschuk0 wrote:

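If you can disregard the sample name (i.e. you're only interested in whether the SNPs are present, not where they came from), you can cut the first 9 columns from each file, concatenate them, and then remove duplicate rows from the resulting file. Note that two rows collapse only when all 9 columns are identical, so the same site with a different QUAL or INFO in another sample will appear more than once.

rm -f fullfile.vcf uniqfile.vcf; for i in *.vcf; do grep -v '^#' "$i" | cut -f1-9 >> fullfile.vcf; done; grep '^#' "$i" | cut -f1-9 > uniqfile.vcf; sort -u fullfile.vcf >> uniqfile.vcf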

Not tested so syntax might not be exact, but something like that will serve.
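
A variation on the same idea, as a sketch rather than a tested recipe: identify each site by CHROM, POS, REF and ALT only, so that differences in QUAL or INFO between samples do not produce duplicates, and keep just the 8 fixed columns so the result is a sites-only table without FORMAT. The name `sites_only_union` is invented for this example, the first row seen for a site wins, and chromosomes are ordered as text (so 10 sorts before 2).

```shell
# Hypothetical helper: sites_only_union out.txt in1.vcf in2.vcf ...
sites_only_union() {
    out=$1; shift
    tab=$(printf '\t')
    {
        # Header from the first input, trimmed to the 8 fixed columns
        grep '^#' "$1" | cut -f1-8
        # Data rows from every input: keep the first row per CHROM,POS,REF,ALT
        cat "$@" | grep -v '^#' | cut -f1-8 |
            awk -F'\t' '!seen[$1 FS $2 FS $4 FS $5]++' |
            sort -t "$tab" -k1,1 -k2,2n
    } > "$out"
}
```

Writing the output to a name that does not end in `.vcf` (or to another directory) keeps it from being picked up by a later `*.vcf` glob.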

Edit: trying to make the code properly formatted

ADD COMMENTlink modified 5.6 years ago • written 5.6 years ago by mtaschuk0