Changing to ancestral/derived states using vcftools
7.6 years ago

I am working with VCF files from the 1000 Genomes Project.

I downloaded the files from here: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/

I found that vcftools has an option, --derived, to change the REF and ALT alleles to the ancestral and derived alleles respectively, provided the VCF file has a field specifying the ancestral allele.

However, sometimes the ancestral allele is not specified (this line is from the beginning of my file):

22  16050075    .   A   G   100 PASS    AA=.||| GT  0|0 0|0

Given that the ancestral allele here is unknown (AA=.|||), will the --derived option just keep the REF allele as it is? If not, how does it handle such cases?
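To make the question concrete, here is a minimal Python sketch of how I would classify a site by its AA tag. It does not describe what vcftools does internally. The function names `ancestral_allele` and `polarize` are my own, and I am assuming the AA value is a pipe-separated string whose first part is the ancestral base, with anything other than A, C, G or T (such as `.`) meaning unknown. Lowercase bases are upper-cased and treated as known, which is also an assumption.

```python
def ancestral_allele(info):
    """Return the ancestral base from a VCF INFO string, or None if unknown."""
    for entry in info.split(";"):
        if entry.startswith("AA="):
            # Assumed layout: 'allele|...|...|...'; keep only the first part.
            allele = entry[3:].split("|")[0].upper()
            return allele if allele in {"A", "C", "G", "T"} else None
    return None  # no AA tag in this INFO field


def polarize(ref, alt, info):
    """Say which allele (if any) matches the ancestral state at a biallelic site."""
    aa = ancestral_allele(info)
    if aa is None:
        return "unknown"
    if aa == ref.upper():
        return "ref_ancestral"
    if aa == alt.upper():
        return "alt_ancestral"
    return "unknown"  # AA is known but matches neither REF nor ALT


# The site from my file: AA is '.', so the ancestral state cannot be assigned.
print(polarize("A", "G", "AA=.|||"))
```

Sites that come back as "unknown" are exactly the ones I am asking about.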

Also, I found the following lines in the manual, but I am not sure I understand them correctly. If I want to use the --derived option, should I also use --freq2 and --counts2?

OUTPUT ALLELE STATISTICS

--freq
--freq2

Outputs the allele frequency for each site in a file with the suffix ".frq". The second option is used to suppress output of any information about the alleles.

--counts
--counts2

Outputs the raw allele counts for each site in a file with the suffix ".frq.count". The second option is used to suppress output of any information about the alleles.

--derived

For use with the previous four frequency and count options only. Re-orders the output file columns so that the ancestral allele appears first. This option relies on the ancestral allele being specified in the VCF file using the AA tag in the INFO field.