Comparing vcf files
3.8 years ago
paraskevopou ▴ 20

Hi there! Is there a way to find the SNPs shared between 4 different VCF files, each created from a different library using the same de novo assembly? The SuperTranscripts method was used to create a "reference" for the GATK pipeline. I kept only the heterozygous SNPs, and now I want to see which SNPs are shared among my 4 libraries/treatments. I tried

 vcftools --vcf ./snps_filt_lib05.recode.vcf --diff ./snps_filt_lib02.vcf --diff-site --out snps_shared_lib02_vs_lib05

but I get the following error:

Found TRINITY_DN9213_c0_g2 in file 1 and TRINITY_DN15627_c1_g1 in file 2.
Use option --not-chr to filter out chromosomes only found in one file.

The --not-chr filter, if I understood it correctly, requires knowing a priori which chromosomes (Trinity genes) you want to exclude. Moreover, I cannot find any vcf-compare command in v0.1.15, which is the version I use. Any help would be appreciated. Thanks a lot!

RNA-Seq SNP • 1.8k views
3.8 years ago
Medhat 9.0k

I think you need to use vcf-isec

Create intersections, unions, complements on bgzipped and tabix indexed VCF or tab-delimited files.

With -n option

-n, --nfiles [+-=]<int> Output positions present in this many (=), this many or more (+), or this many or fewer (-) files.

So in your case it would be:

vcf-isec -n +4 f1.vcf.gz f2.vcf.gz ....

The source can be downloaded from:

https://github.com/vcftools/vcftools
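
Since vcf-isec works on bgzipped and tabix-indexed files (see the description quoted above), plain .vcf files need to be compressed and indexed first. Below is a rough sketch of that preparation step, plus a simple awk cross-check that lists the CHROM/POS pairs present in every file. The helper names prepare_vcf and shared_sites and the lib1..lib4 file names are made up for illustration. It assumes bgzip and tabix (from htslib) are on your PATH, and the awk part compares positions only, not alleles, so treat it as a sanity check rather than a replacement for vcf-isec.

```shell
# Compress and index one VCF so that vcf-isec can read it.
# Assumption: bgzip and tabix (htslib) are installed and on PATH.
prepare_vcf() {
    bgzip -c "$1" > "$1.gz" && tabix -p vcf "$1.gz"
}

# Hypothetical cross-check (not part of vcftools): print the CHROM/POS
# pairs that occur in every VCF given as an argument.
# Only positions are compared; REF/ALT alleles are ignored.
shared_sites() {
    awk -F'\t' -v n="$#" '
        /^#/ { next }                          # skip header lines
        {
            key = $1 "\t" $2                   # CHROM and POS
            if (!((FILENAME, key) in seen)) {  # count each site once per file
                seen[FILENAME, key] = 1
                count[key]++
            }
        }
        END {
            for (key in count)
                if (count[key] == n) print key # present in all n files
        }
    ' "$@" | sort -k1,1 -k2,2n
}

# Usage (file names are placeholders):
#   for f in lib1.vcf lib2.vcf lib3.vcf lib4.vcf; do prepare_vcf "$f"; done
#   vcf-isec -n +4 lib1.vcf.gz lib2.vcf.gz lib3.vcf.gz lib4.vcf.gz > shared.vcf
#   shared_sites lib1.vcf lib2.vcf lib3.vcf lib4.vcf > shared_sites.tsv
```

If the number of records in shared.vcf and the number of lines in shared_sites.tsv differ a lot, it is worth checking whether the same contigs and coordinates are really used in all four files.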
