Question: Select one allele at heterozygote position
0
gravatar for stolarek.ir
3.7 years ago by
stolarek.ir640
Poland
stolarek.ir640 wrote:

Hi,

Having a vcf or plink ped file with called variants I want in positions, where there are 2 possible alleles select only one of them (based on which one had better coverage in the sample) and from this forming a plink format file, where this allele is ina form of diploid homozygote.

Is there any straightforward way for this or I have to code my way out of this?

Kind regards

snp plink vcf • 1.1k views
ADD COMMENTlink modified 3.6 years ago by Biostar ♦♦ 20 • written 3.7 years ago by stolarek.ir640

it depends of the VCF INFO and FORMAT available in your input.

ADD REPLYlink written 3.7 years ago by Pierre Lindenbaum124k
CHROM   POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  Sample_028_bwa_LongSeed_sorted_hg19.bam
1   2713327 SNPA2252455 C   T   9.6729  .   .   GT:PL   0/1:37,3,0
1   2865778 SNPA2060958 T   .   67  .   .   GT  0/0
ADD REPLYlink modified 3.7 years ago by Pierre Lindenbaum124k • written 3.7 years ago by stolarek.ir640

If anyone will be interested I worked this out on ped plink files. First I recode A,C,T,G coding to 0,1,2 with --recode12,

then parse the ped file with this python script:

import random
import itertools

infile = open("subset.ped","r")
outfile = open("new_genotypes.ped","w")

check_11 = ('1','1')
check_22 = ('2','2')
def random_gen():
    choice_m = [check_11,check_22]
    res = random.choice(choice_m)
    return res

def parse_ped(ped_infile):
    check_12 = ('1','2')
    check_21 = ('2','1')

for line in ped_infile:
    row = line.split()
    head = row[0:6]
    genotype_old = row[6:]
    genotype_f = row[6:]
    it = iter(genotype_f)
    genotype_t = zip(it, it)
    for n,i in enumerate(genotype_t):
        if i == check_12 or i == check_21:
            genotype_t[n]=random_gen()
    print genotype_t[0:10]
    genotype_t = list(itertools.chain(*genotype_t))
    print genotype_t[0:10]

    new_line = head+genotype_t
    outfile.write(" ".join(new_line)+"\n")
outfile.close()

parse_ped(infile)
ADD REPLYlink modified 3.7 years ago • written 3.7 years ago by stolarek.ir640
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1210 users visited in the last hour