I have whole-genome variant data from several individuals in (1) .TFAM and (2) .TPED format, as used by plink, as well as (3) a list of variants that show allele-specific expression (ASE). All of the ASE variants are included in the plink .TPED. I need to find out which variants in the .TPED are in LD with each of the ASE variants. Each ASE variant may have anywhere from zero to several hundred variants in LD with it.
Eventually I want to have the original list of ASE variants and, for each one, the variants that are in strong LD with it (defined as having an r-squared greater than a certain threshold).
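To make the desired output concrete, here is a minimal Python sketch of the grouping I have in mind. It assumes genotypes have already been converted to allele counts (0, 1 or 2) per individual, uses the squared Pearson correlation of those counts as r-squared, and takes 0.8 as a placeholder threshold. The function names, the `dosages` dictionary and the threshold are all hypothetical, missing genotypes are not handled, and this is not necessarily how plink computes LD. A brute-force comparison like this would also be far too slow genome-wide, so it is only meant to show the shape of the result.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists of allele counts."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Monomorphic variant: correlation is undefined, treated here as no LD.
        return 0.0
    return (cov * cov) / (var_x * var_y)


def ld_partners(ase_ids, dosages, threshold=0.8):
    """Map each ASE variant ID to the other variants whose r-squared with it exceeds threshold.

    dosages: dict of variant ID -> list of allele counts, one per individual (assumed input).
    threshold: assumed cutoff for "strong" LD; 0.8 is only a placeholder.
    """
    result = {}
    for ase in ase_ids:
        result[ase] = [
            other
            for other, counts in dosages.items()
            if other != ase and r_squared(dosages[ase], counts) > threshold
        ]
    return result
```

An ASE variant with no partners simply maps to an empty list, which covers the "zero variants in LD" case.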
So far, I have been looking at plink's LD calculation documentation:
but I'm not 100% sure how to go about this using plink. Any help, from general advice to a step-by-step tutorial, would be appreciated.
I have another question now: the output of the latter gives me the file plink.tags, which contains the names of all the original ASE SNPs as well as the SNPs in LD with them.
The .TPED contains the data for all the SNPs. I want to extract from this file only the data for the SNPs listed in plink.tags. How might I go about doing this?
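For reference, this is the kind of filtering I mean, as a rough Python sketch. It assumes plink.tags holds one SNP ID per line and that the .TPED is whitespace-delimited with the SNP ID in the second column. The function names and file paths are hypothetical.

```python
def read_tag_ids(tags_path):
    """Return the set of SNP IDs in the tags file (first token on each non-empty line)."""
    ids = set()
    with open(tags_path) as handle:
        for line in handle:
            fields = line.split()
            if fields:
                ids.add(fields[0])
    return ids


def filter_tped(tped_path, tags_path, out_path):
    """Write to out_path only the TPED rows whose SNP ID (column 2) is in the tags file.

    Returns the number of rows kept.
    """
    keep = read_tag_ids(tags_path)
    kept = 0
    with open(tped_path) as src, open(out_path, "w") as dst:
        for line in src:
            # Split off only the first two columns; genotype columns stay untouched.
            fields = line.split(None, 2)
            if len(fields) >= 2 and fields[1] in keep:
                dst.write(line)
                kept += 1
    return kept
```

It would be called as something like `filter_tped("mydata.tped", "plink.tags", "subset.tped")`, with the .TFAM reused unchanged since it describes individuals rather than SNPs. plink's own `--extract` option, which takes a file of SNP IDs, may do the same job without any scripting, but I have not confirmed that it accepts plink.tags directly.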