Question: Changing to ancestral/derived states using vcftools
2.5 years ago, GabrielMontenegro (United Kingdom) wrote:

I am working with VCF files from the 1000 Genomes Project, which I downloaded from here: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/

I found that vcftools has an option, --derived, that treats the ancestral allele as the first allele and the derived allele as the second (instead of REF and ALT), provided the VCF file specifies the ancestral allele in the AA tag of the INFO field.

However, sometimes the ancestral allele is not specified, as in this record from the beginning of my file:

22  16050075    .   A   G   100 PASS    AA=.||| GT  0|0 0|0

Given that the ancestral allele is unknown here (AA=.|||), will the --derived option simply keep the REF allele as it is? If not, how does it handle these cases?
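To make the cases concrete, here is a minimal Python sketch that checks whether a record has a usable ancestral allele. It is not what vcftools does internally; it only assumes that the first "|"-separated element of the AA value is the ancestral allele and that ".", "-", "N" or an empty value mean "unknown". The function names are hypothetical.

```python
def parse_ancestral_allele(info):
    """Return the upper-cased ancestral allele from an INFO string, or None if unknown."""
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "AA":
            # Assumption: the allele is the first '|'-separated element of the AA value.
            allele = value.split("|")[0].upper()
            # Assumption: these placeholders all mean "no ancestral call".
            return allele if allele not in {"", ".", "-", "N"} else None
    return None  # no AA tag at all


def classify_site(ref, alt, info):
    """Label a biallelic site by which allele, if any, matches the ancestral allele."""
    ancestral = parse_ancestral_allele(info)
    if ancestral is None:
        return "unknown"
    if ancestral == ref.upper():
        return "ref_ancestral"
    if ancestral == alt.upper():
        return "alt_ancestral"
    return "neither"


# The record from my file, split on tabs as in a real VCF line.
record = "22\t16050075\t.\tA\tG\t100\tPASS\tAA=.|||\tGT\t0|0\t0|0"
fields = record.split("\t")
print(classify_site(fields[3], fields[4], fields[7]))  # prints "unknown"
```

Sites labelled "unknown" or "neither" are exactly the ones I am unsure about.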

Also, I found the following lines in the manual, but I am not sure I understand them correctly. If I want to use the --derived option, should I also use --freq2 and --counts2?

OUTPUT ALLELE STATISTICS

--freq
--freq2

Outputs the allele frequency for each site in a file with the suffix ".frq". The second option is used to suppress output of any information about the alleles.

--counts
--counts2

Outputs the raw allele counts for each site in a file with the suffix ".frq.count". The second option is used to suppress output of any information about the alleles.

--derived

For use with the previous four frequency and count options only. Re-orders the output file columns so that the ancestral allele appears first. This option relies on the ancestral allele being specified in the VCF file using the AA tag in the INFO field.
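As I read the description above, the re-ordering for a biallelic site would amount to something like the following Python sketch. This is only my interpretation, not vcftools code: the function name is hypothetical, and leaving the REF/ALT order untouched when the ancestral allele is unknown or matches neither allele is an assumption (this is the case I am asking about).

```python
def order_ancestral_first(ref, alt, ref_count, alt_count, ancestral):
    """Return [(allele, count), (allele, count)] with the ancestral allele first.

    `ancestral` is the ancestral allele as a string, or None if unknown.
    """
    pairs = [(ref, ref_count), (alt, alt_count)]
    # Swap only when the ancestral allele is the ALT allele.
    # Assumption: unknown or non-matching ancestral alleles keep the REF/ALT order.
    if ancestral is not None and ancestral.upper() == alt.upper():
        pairs.reverse()
    return pairs


# ALT allele G is ancestral, so it moves to the first position.
print(order_ancestral_first("A", "G", 3, 1, "G"))   # prints [('G', 1), ('A', 3)]
# Ancestral allele unknown: order is left as REF, ALT.
print(order_ancestral_first("A", "G", 4, 0, None))  # prints [('A', 4), ('G', 0)]
```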