Question: Get a table of minor and major alleles from a VCF file
shinken123 (México) wrote:

Hi

This question probably has a simple answer, but I have not been able to find it so far. I have a VCF file that combines SNP data from 90 individuals, and I would like to obtain a list containing the ID of each SNP, its position, and the major and minor allele at that position.

Something like this:

|    ID    | chr | position | major allele | minor allele |
|S1_10045  |  1  | 10045    |       A      |        C     |
|S1_157465 |  1  | 157465   |       C      |        G     |
|S1_267848 |  1  | 267848   |       T      |        C     |

Do you know how I could get something like this from a VCF file?

Tags: snp, sequence, genome
written 2.6 years ago by shinken123 • modified 2.6 years ago by alexisdereeper

You can use PLINK to recode the VCF file into a major/minor allele format.

written 2.6 years ago by vakul.mohanty

Ok, thank you, I will try it.

written 2.6 years ago by shinken123
alexisdereeper wrote:

You can use the SNiPlay web application (http://sniplay.southgreen.fr/cgi-bin/home.cgi), which takes VCF files as input. It generates a table with this kind of information.

written 2.6 years ago by alexisdereeper

Thank you for the information.

written 2.6 years ago by shinken123
RamRS (Houston, TX) wrote:

A VCF gives you REF and ALT alleles. The reference allele matches the reference genome at that position; the alternate alleles are the bases observed there that differ from the reference.

To decide which alleles are major and minor, you need population allele frequencies. Typically, a data source such as 1000 Genomes or ExAC gives you the allele frequencies at that position, and you can use them to identify the major (AF > 0.5) and minor (AF < 0.5) alleles. If you restrict yourself to the VCF file alone, your frequencies will be biased toward the samples in that file.

written 2.6 years ago by RamRS
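
The AF > 0.5 rule from the answer above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the site is biallelic, the INFO column carries the standard `AF` key (frequency of the ALT allele), and a tie at exactly 0.5 is resolved in favour of REF. The function names `parse_alt_af` and `major_minor_from_af` are made up for this example.

```python
def parse_alt_af(info):
    """Return the ALT allele frequency from a VCF INFO string, or None.

    Assumes a biallelic site, so only the first AF value is read.
    """
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "AF" and value:
            return float(value.split(",")[0])
    return None  # no AF annotation on this record


def major_minor_from_af(ref, alt, alt_af):
    """Return (major, minor) for a biallelic site given the ALT frequency.

    Assumption: a tie at exactly 0.5 keeps REF as the major allele.
    """
    if alt_af > 0.5:
        return alt, ref
    return ref, alt


# Hypothetical INFO string; the values are invented for illustration.
af = parse_alt_af("DP=100;AF=0.7")
print(major_minor_from_af("C", "A", af))
```

Whether the result is meaningful depends on where the `AF` value comes from: a frequency annotated from a reference panel and one computed from the samples in the file can disagree.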

Thank you very much. I forgot to mention that the VCF file is a multi-sample file with data from about 90 individuals.

written 2.6 years ago by shinken123
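
For a multi-sample VCF like the one described, the table from the question can also be built by counting alleles directly from the genotype (GT) fields. The following is a minimal sketch, not a replacement for PLINK or SNiPlay. It assumes an uncompressed, tab-delimited VCF with a FORMAT column containing GT. The function name `major_minor_table` and the demo records are invented for illustration. As noted above, the result reflects only the individuals in the file.

```python
import io
from collections import Counter


def major_minor_table(vcf_lines):
    """Yield (ID, chrom, pos, major, minor) for each variant record.

    Alleles are counted from the GT field of every sample. Ties are
    broken by the order in which alleles are first seen.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip meta-information, the header line and blanks
        fields = line.rstrip("\n").split("\t")
        chrom, pos, snp_id, ref, alt = fields[:5]
        alleles = [ref] + alt.split(",")  # index 0 is REF, 1.. are ALT
        fmt_keys = fields[8].split(":")
        if "GT" not in fmt_keys:
            continue  # no genotypes on this record
        gt_index = fmt_keys.index("GT")
        counts = Counter()
        for sample in fields[9:]:
            gt = sample.split(":")[gt_index]
            for code in gt.replace("|", "/").split("/"):
                if code.isdigit():  # "." marks a missing call
                    counts[alleles[int(code)]] += 1
        ranked = [allele for allele, _ in counts.most_common()]
        major = ranked[0] if ranked else "NA"
        minor = ranked[1] if len(ranked) > 1 else "NA"  # NA if monomorphic
        yield snp_id, chrom, int(pos), major, minor


# Made-up demo data with three samples; not taken from any real file.
DEMO_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
               "INFO", "FORMAT", "S1", "S2", "S3"]),
    "\t".join(["1", "10045", "S1_10045", "C", "A", ".", "PASS", ".",
               "GT", "1/1", "0/1", "1/1"]),
    "\t".join(["1", "157465", "S1_157465", "C", "G", ".", "PASS", ".",
               "GT", "0/0", "0/1", "./."]),
])

for row in major_minor_table(io.StringIO(DEMO_VCF)):
    print("\t".join(map(str, row)))
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` object; a gzip-compressed VCF would need `gzip.open(path, "rt")`. Sites with more than two observed alleles report only the two most frequent ones.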