ShapeIT output File Allele Info
4 months ago
prmshakya • 0

Hi,

I recently used SHAPEIT3 to phase some data. It outputs the phased haplotypes in haps/sample format. I want to convert this to VCF format, which can be done with SHAPEIT2, but I am curious how the conversion distinguishes the reference allele from the alternate allele in the haps/sample format. The haps file after phasing has the following layout:

7 SNP1 123 A G 0 0 1 0 0 0 1 1
7 SNP2 456 T C 0 1 1 0 0 1 0 1
7 SNP3 789 A T 0 1 1 0 1 1 1 1

The SHAPEIT2 documentation mentions the following:

This file is SPACE delimited. Each line corresponds to a single SNP. The first five columns are:

1) Chromosome number [integer]
2) SNP ID [string]
3) SNP Position [integer]
4) First allele [string]
5) Second allele [string]
Then the successive column pair (6, 7), (8, 9), (10, 11) and (12, 13) corresponds to the alleles carried at the 4 SNPs by each haplotype of a single individual. For example a pair "1 0" means that the first haplotype carries the B allele while the second carries the A allele. The haplotypes are given in the same order than in the SAMPLE file. This file should have L lines and 2N+5 columns, where L and N are the numbers of SNPs and individuals respectively.

There's no information on REF/ALT alleles. (I'm relying on the SHAPEIT2 documentation because SHAPEIT3 is very poorly documented and the authors claim it is highly similar to SHAPEIT2.) Since I didn't use a reference panel in the phasing process, does it matter which of the alleles in the haps file is assigned as REF or ALT during the conversion?
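
To make the question concrete, here is a minimal Python sketch of the mapping I have in mind. It is not SHAPEIT's actual conversion code. It assumes that code 0 means the first allele (column 4) and code 1 means the second allele (column 5), as the quoted documentation suggests, and that the first allele is written as REF. The function name `haps_line_to_vcf_record` and the `first_allele_is_ref` switch are hypothetical, and the QUAL/FILTER/INFO values are placeholders.

```python
def haps_line_to_vcf_record(line, first_allele_is_ref=True):
    """Turn one haps line into a minimal VCF data line (hypothetical helper).

    Assumption: 0 = first allele (column 4), 1 = second allele (column 5).
    """
    fields = line.split()
    chrom, snp_id, pos, allele_a, allele_b = fields[:5]
    hap_codes = fields[5:]
    if len(hap_codes) % 2 != 0:
        raise ValueError("expected two haplotype columns per individual")

    if first_allele_is_ref:
        ref, alt = allele_a, allele_b
    else:
        # Swap the labels and flip every code, so that each haplotype
        # still carries the same nucleotide as before.
        ref, alt = allele_b, allele_a
        flip = {"0": "1", "1": "0"}
        hap_codes = [flip.get(code, code) for code in hap_codes]

    # Consecutive column pairs are the two haplotypes of one individual.
    genotypes = [
        f"{hap_codes[i]}|{hap_codes[i + 1]}"
        for i in range(0, len(hap_codes), 2)
    ]
    # QUAL, FILTER and INFO are placeholders in this sketch.
    return "\t".join([chrom, pos, snp_id, ref, alt, ".", "PASS", ".", "GT"] + genotypes)


if __name__ == "__main__":
    print(haps_line_to_vcf_record("7 SNP1 123 A G 0 0 1 0 0 0 1 1"))
    print(haps_line_to_vcf_record("7 SNP1 123 A G 0 0 1 0 0 0 1 1",
                                  first_allele_is_ref=False))
```

If this mapping is right, both calls describe the same haplotypes and only the REF/ALT labels differ. What I can't tell is whether downstream tools would still expect REF to match the reference genome.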

Shapeit vcf impute2 shapeit3 shapeit2 • 226 views