Missing genotypes in a case-control study using PLINK
9.9 years ago

Hello,

I have sequencing data from an Ion PGM run.

We sequenced 96 barcodes (individuals) and 310 amplicons (chromosomal regions).

32 barcodes are controls and 64 are cases.

We did the variant calling and got 96 VCF files.

We combined them into a single VCF file using GATK. There are 90 different SNPs across the samples.

We converted the single VCF file to PLINK format (map and ped files) using vcftools.

Now we want to use PLINK to run an association test.

The ped file looks like this:

1 1 0 0 0 2 0 0 0 0 0 0 0 0 0 0
2 2 0 0 0 1 0 0 0 0 0 0 0 0 T C
3 3 0 0 0 2 0 0 0 0 0 0 0 0 T C
4 4 0 0 0 2 0 0 C T 0 0 0 0 T C
5 5 0 0 0 1 0 0 0 0 0 0 0 0 0 0
6 6 0 0 0 2 0 0 0 0 0 0 0 0 0 0
7 7 0 0 0 2 0 0 0 0 0 0 0 0 0 0
8 8 0 0 0 1 0 0 0 0 0 0 0 0 T C
9 9 0 0 0 2 0 0 0 0 0 0 0 0 T C
10 10 0 0 0 2 0 0 0 0 0 0 0 0 T C
11 11 0 0 0 1 0 0 0 0 0 0 0 0 0 0
12 12 0 0 0 2 0 0 C T 0 0 0 0 T C

You can see that there are a lot of missing genotypes. I would like to know what the standard practice is in this case.
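To put a number on "a lot", here is a minimal Python sketch that computes the missing-call rate per SNP from ped-style lines. It assumes the usual ped layout (six leading columns, then two allele columns per SNP, with `0` meaning missing) and uses only the first four rows shown above; the function name `missing_rate_per_snp` is just for illustration. PLINK's own `--missing` report should give the equivalent numbers for the full fileset.

```python
# Sketch: per-SNP missing-call rate from PED-style lines.
# Assumes 6 leading columns (FID IID PAT MAT SEX PHENO), then 2 alleles per SNP,
# with "0" as the missing-allele code.
PED_LINES = """\
1 1 0 0 0 2 0 0 0 0 0 0 0 0 0 0
2 2 0 0 0 1 0 0 0 0 0 0 0 0 T C
3 3 0 0 0 2 0 0 0 0 0 0 0 0 T C
4 4 0 0 0 2 0 0 C T 0 0 0 0 T C
""".splitlines()


def missing_rate_per_snp(ped_lines):
    """Return the fraction of individuals with a missing call at each SNP."""
    rows = [line.split() for line in ped_lines if line.strip()]
    n_snps = (len(rows[0]) - 6) // 2
    missing = [0] * n_snps
    for fields in rows:
        alleles = fields[6:]
        for i in range(n_snps):
            a1, a2 = alleles[2 * i], alleles[2 * i + 1]
            # A genotype counts as missing if either allele is the "0" code.
            if a1 == "0" or a2 == "0":
                missing[i] += 1
    return [m / len(rows) for m in missing]


rates = missing_rate_per_snp(PED_LINES)
print(rates)  # one rate per SNP, in map-file order
```

In these four rows, three of the five SNPs have no calls at all, which is the pattern you would expect if per-sample VCFs only recorded non-reference sites.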

Should we assume that the missing genotypes are homozygous reference? Most of them probably are, while others could be truly missing data, but I guess we can only tell the two apart by checking the BAM files.

If we assume the missing genotypes are reference, is there a PLINK command to fill them in automatically?

thanks

Cristian

missing statistics plink genotypes • 4.3k views
9.9 years ago

The VCF reference merge documentation describes how to do this for a single genome (it should be pretty straightforward to extend this to all your samples):

https://www.cog-genomics.org/plink2/data#merge_vcf_example

The key step is --merge with --merge-mode 5, which keeps the base genotype if the --merge genotype is missing, and otherwise uses the data in the --merge file. So make the base fileset contain just reference information (and copy the real FID/IID over the reference FID/IID so they match), and you're golden.
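As a rough illustration of the rule described above (keep the base call when the incoming call is missing, otherwise take the incoming call), here is a small Python sketch. It is not PLINK's implementation, only the per-genotype logic; the function name `merge_prefer_new` and the reference alleles in `reference` are hypothetical, made up for the example, and `sample_4` mirrors individual 4 from the ped file in the question.

```python
# Sketch of the merge rule described above, applied to one individual.
# Not PLINK code: just the per-genotype decision, with "0 0" as missing.
MISSING = ("0", "0")


def merge_prefer_new(base_genotypes, new_genotypes):
    """Per variant: use the new call unless it is missing, else keep the base call."""
    if len(base_genotypes) != len(new_genotypes):
        raise ValueError("base and new genotype lists must cover the same variants")
    return [
        base if new == MISSING else new
        for base, new in zip(base_genotypes, new_genotypes)
    ]


# Hypothetical homozygous-reference calls for the five variants (assumed alleles).
reference = [("A", "A"), ("C", "C"), ("G", "G"), ("A", "A"), ("T", "T")]

# Individual 4 from the question: two real calls, three missing.
sample_4 = [("0", "0"), ("C", "T"), ("0", "0"), ("0", "0"), ("T", "C")]

merged = merge_prefer_new(reference, sample_4)
print(merged)
```

Note that this fills every missing call with the reference genotype, including sites that had no coverage, so it only makes sense if you are comfortable with that assumption.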
