Question: Comparing vcf files
8 months ago by
paraskevopou10 wrote:

Hi there! Is there a way to find the SNPs shared between 4 different VCF files, each created from a different library but all called against the same de novo assembly? The SuperTranscripts method was used to create a "reference" for the GATK pipeline. I filtered the calls down to heterozygous SNPs only, and now I want to see which SNPs are shared among my 4 libraries/treatments. I tried:

 vcftools --vcf ./snps_filt_lib05.recode.vcf --diff ./snps_filt_lib02.vcf --diff-site --out snps_shared_lib02_vs_lib05

but I get the following error:

Found TRINITY_DN9213_c0_g2 in file 1 and TRINITY_DN15627_c1_g1 in file 2.
Use option --not-chr to filter out chromosomes only found in one file.

The --not-chr filter, if I understood it correctly, requires knowing a priori which chromosomes (Trinity genes) you want to exclude. Moreover, I cannot find any vcf-compare command in v0.1.15, which is the version I use. Any help would be appreciated. Thanks a lot!

snp rna-seq • 391 views
8 months ago by
Medhat7.9k wrote:

I think you need to use vcf-isec

Create intersections, unions, complements on bgzipped and tabix indexed VCF or tab-delimited files.

Use it with the -n option:

-n, --nfiles [+-=]<int> Output positions present in this many (=), this many or more (+), or this many or fewer (-) files.

So in your case, once each file is bgzipped and tabix-indexed, it will be:

vcf-isec -n +4 f1.vcf f2.vcf ....
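
As a quick cross-check that does not need VCFtools at all, the same "present in all files" idea can be sketched with standard shell tools. This is only a rough sketch under stated assumptions: `shared_sites` is a hypothetical helper name, not part of any package, and a site is matched on CHROM and POS alone, so differing alleles at the same position still count as shared. It reads plain (uncompressed) VCF files.

```shell
# Hypothetical helper (not part of VCFtools): print the sites found in
# every VCF given as an argument. Assumption: a "site" is CHROM + POS
# only; REF/ALT alleles and genotypes are ignored.
shared_sites() {
    n=$#    # number of input files
    for f in "$@"; do
        # Skip header lines, keep CHROM and POS, de-duplicate within the file
        awk -F'\t' '!/^#/ { print $1 "\t" $2 }' "$f" | LC_ALL=C sort -u
    done |
        LC_ALL=C sort | uniq -c |    # count how many files carry each site
        awk -v n="$n" '$1 == n { print $2 "\t" $3 }'    # keep sites seen in all n files
}
```

It would be called as `shared_sites f1.vcf f2.vcf f3.vcf f4.vcf` (placeholder file names, as in the command above) and prints one tab-separated CHROM/POS pair per shared site. Because contigs that occur in only some files simply never reach the required count, no equivalent of --not-chr is needed here.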

The source can be downloaded from:
