Question: Comparing vcf files
2.0 years ago by
paraskevopou20 wrote:

Hi there! Is there a way to find the SNPs shared between 4 different VCF files, each created from a different library using the same de novo assembly? The SuperTranscript method was used to create a "reference" for the GATK pipeline. I filtered out only the heterozygous SNPs, and now I want to compare which SNPs are shared among my 4 libraries/treatments. I tried:

 vcftools --vcf ./snps_filt_lib05.recode.vcf --diff ./snps_filt_lib02.vcf --diff-site --out snps_shared_lib02_vs_lib05

but I get the following error:

Found TRINITY_DN9213_c0_g2 in file 1 and TRINITY_DN15627_c1_g1 in file 2.
Use option --not-chr to filter out chromosomes only found in one file.

The --not-chr filter, if I understood it correctly, requires knowing a priori which chromosomes (Trinity genes) you want to exclude. Moreover, I cannot find any vcf-compare command in v. 0.1.15, which is the version I use. Any help would be appreciated. Thanks a lot!

snp rna-seq • 1.2k views
2.0 years ago by
Medhat8.6k wrote:

I think you need to use vcf-isec:

Create intersections, unions, complements on bgzipped and tabix indexed VCF or tab-delimited files.

Use it with the -n option:

-n, --nfiles [+-=]<int> Output positions present in this many (=), this many or more (+), or this many or fewer (-) files.
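To make the three forms of the -n argument concrete, here is a minimal Python sketch of the selection rule as the help text above describes it. It is not vcf-isec's own code: the function name `matches_nfiles`, the example contig names, and the choice to require an explicit prefix are all assumptions made for illustration.

```python
import re


def matches_nfiles(count: int, spec: str) -> bool:
    """Return True if a site seen in `count` files satisfies an -n style spec.

    Following the quoted help text: '=N' means exactly N files,
    '+N' means N or more, '-N' means N or fewer.
    Assumption: this sketch requires one of the three prefixes.
    """
    m = re.fullmatch(r"([+\-=])(\d+)", spec)
    if m is None:
        raise ValueError(f"expected [+-=]<int>, got {spec!r}")
    op, n = m.group(1), int(m.group(2))
    if op == "=":
        return count == n
    if op == "+":
        return count >= n
    return count <= n


# Hypothetical data: number of files in which each (contig, position) was seen.
site_counts = {("contigA", 101): 4, ("contigA", 250): 2, ("contigB", 17): 1}

# Sites found in 4 or more files, i.e. in all of them when 4 files are compared.
shared_in_all = [site for site, c in site_counts.items() if matches_nfiles(c, "+4")]
```

With exactly 4 input files, "+4" and "=4" select the same sites under this rule, since no site can occur in more than 4 files.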

So in your case, once each file is bgzipped and tabix indexed, it would be:

vcf-isec -n +4 f1.vcf f2.vcf ....
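If it helps to sanity-check the result, the same idea can be sketched in plain Python for uncompressed VCF files. This is only an illustration under stated assumptions, not a replacement for vcf-isec: the function names `read_sites` and `shared_sites` are made up here, sites are compared by CHROM and POS only (alleles are ignored), and contigs present in only one file simply never appear in the intersection.

```python
def read_sites(vcf_path):
    """Return the set of (CHROM, POS) keys found in an uncompressed VCF file."""
    sites = set()
    with open(vcf_path) as handle:
        for line in handle:
            # Skip meta-information lines, the column header, and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            chrom, pos = line.split("\t", 2)[:2]
            sites.add((chrom, int(pos)))
    return sites


def shared_sites(vcf_paths):
    """Return the sorted list of sites present in every one of the given files."""
    site_sets = [read_sites(path) for path in vcf_paths]
    if not site_sets:
        return []
    return sorted(set.intersection(*site_sets))
```

A call would look like `shared_sites(["a.vcf", "b.vcf", "c.vcf", "d.vcf"])`, where the file names are placeholders for the four filtered library files.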

Source can be downloaded from:
