vcftools flags --mac and --max-mac giving an empty output file
7.3 years ago

Hi everyone! I'm new to genomic data analysis, so I'm asking for help with something that may turn out to be simple. I have a gzipped VCF file that contains human chromosome 20 sequences for 269 individuals. I want to filter out singletons and doubletons for subsequent analysis. I am using vcftools v0.1.15 installed on a server. Here is what I do:

vcftools --gzvcf chr20_269ind.vcf.gz --mac 1 --max-mac 1 --recode --stdout | gzip -c > output_test.vcf.gz

However, I get an empty output file (only sample names, no sequence information) and the following message:

Outputting VCF file...
After filtering, kept 0 out of a possible 991704 Sites
No data left for analysis!
Run Time = 24.00 seconds

I tried experimenting with the --mac and --max-mac flags. First I ran the following line:

vcftools --gzvcf chr20_269ind.vcf.gz --max-mac n --recode --stdout | gzip -c > output_tesmaxnt.vcf.gz

where I tried n = 1, 10, and 100. All three attempts gave me the same output file (not empty this time), and the log said:

Outputting VCF file...
After filtering, kept 991704 out of a possible 991704 Sites
Run Time = 95.00 seconds

In fact, I get the same output if I run this (i.e. with no --mac or --max-mac flags):

vcftools --gzvcf chr20_269ind.vcf.gz --recode --stdout | gzip -c > output_tesmaxnt.vcf.gz

Then I tried running

vcftools --gzvcf chr20_269ind.vcf.gz --mac 1 --recode --stdout | gzip -c > output_test.vcf.gz

and got an empty output again. Then I ran

vcftools --gzvcf chr20_269ind.vcf.gz --mac 0 --recode --stdout | gzip -c > output_test.vcf.gz

and got the same file as with --max-mac n. It seems to me that these flags 'see' my file as if it contained only zeros, which is not the case (I have inspected the file contents manually). If I filter by minor allele frequency instead of allele count (not what I want to do, but I was experimenting to better understand what is going on), I get this:

Outputting VCF file...
Error: Require Genotypes in variant file to filter by frequency and/or call rate
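Since that error mentions genotypes, one thing that could be checked is whether the file actually has a FORMAT column with a GT key. Below is a minimal sketch of such a check. `has_genotypes` is a hypothetical helper name (it is not a vcftools command), and it only looks at the column header line and the first data record, assuming tab-separated VCF text on stdin.

```shell
# Hypothetical helper (not part of vcftools): reads VCF text on stdin and
# prints "yes" if the header has a FORMAT column and the first data record
# lists GT among its FORMAT keys, otherwise prints "no".
has_genotypes() {
  awk -F'\t' '
    /^##/ { next }                       # skip meta-information lines
    /^#CHROM/ {                          # column header line
      header_ok = (NF > 9 && $9 == "FORMAT")
      next
    }
    {                                    # first data record only
      n = split($9, keys, ":")
      found = 0
      for (i = 1; i <= n; i++) if (keys[i] == "GT") found = 1
      print ((header_ok && found) ? "yes" : "no")
      seen = 1
      exit
    }
    END { if (!seen) print "no" }        # no data records at all
  '
}

# Usage with the file from this post:
# gzip -dc chr20_269ind.vcf.gz | has_genotypes
```

A "no" here would be consistent with the error above, but it would not by itself explain why the file looks fine on manual inspection.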

I also tried vcftools version 0.1.13, with no difference.
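To back up the manual inspection, the minor allele counts could also be recomputed directly from the genotype columns, independently of vcftools. The following is only a sketch under stated assumptions: `minor_allele_counts` is a hypothetical helper name, it skips multiallelic sites, and it assumes GT is the first key in each sample field.

```shell
# Hypothetical helper (not part of vcftools): for each biallelic record in
# VCF text on stdin, print CHROM, POS and the minor allele count derived
# from the GT subfields. Missing alleles (".") are not counted.
minor_allele_counts() {
  awk -F'\t' '
    /^#/ { next }                        # skip all header lines
    $5 ~ /,/ { next }                    # skip multiallelic sites
    {
      ref = 0; alt = 0
      for (s = 10; s <= NF; s++) {
        split($s, fields, ":")           # GT assumed to be the first subfield
        n = split(fields[1], alleles, "[|/]")
        for (a = 1; a <= n; a++) {
          if (alleles[a] == "0") ref++
          else if (alleles[a] == "1") alt++
        }
      }
      mac = (ref < alt) ? ref : alt      # the rarer allele is the minor one
      print $1 "\t" $2 "\t" mac
    }
  '
}

# Usage with the file from this post (first ten sites only):
# gzip -dc chr20_269ind.vcf.gz | minor_allele_counts | head
```

If this prints non-zero counts for sites that vcftools drops with --mac 1, the problem is more likely in how the file is being parsed than in the data.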

Any hints would be greatly appreciated.

Best,

Vasili


snp software error next-gen • 3.3k views