Question: Merging variant replicates rather than filtering
dylkot wrote (7 weeks ago):

I'm analyzing some Illumina genotype data and noticing that almost 2% of the variants on the array are typed more than once. I consider measurements to be replicates if they share the same chromosome, position, reference allele, and alternate allele as specified in the PLINK BIM file. It seems that the typical approach is to remove all but the first replicate in the file using the --list-duplicate-vars option in PLINK. Are there any tools that implement something slightly smarter than this? For example, a desirable behavior might be to take a consensus vote among the calls from the replicates to decide which one to use. Thanks in advance!

Tags: vcftools, plink, gwas
chrchang523 wrote (7 weeks ago):

You can use PLINK to merge a fileset with itself.


Thanks for the tip! Are you referring to the --merge-equal-pos option in PLINK 1.9? If so, do you know how it performs the merge? The documentation is ambiguous, stating:

If two variants have the same position, PLINK 1.9's merge commands will always notify you. If you wish to try to merge them, use --merge-equal-pos. (This will fail if any of the same-position variant pairs do not have matching allele names.) Unplaced variants (chromosome code 0) are not considered by --merge-equal-pos.

Note that you are permitted to merge a fileset with itself; doing so with --merge-equal-pos can be worthwhile when working with data containing redundant loci for quality control purposes.

There is no reference to this in PLINK 2, as far as I'm aware.

written 7 weeks ago by dylkot

Also, as a quick note, the command:

plink --bmerge inputdata --bfile inputdata --merge-equal-pos --out outputdata

fails when, as in my case, some variants at the same position are replicates while others are multi-allelic variants and so have different ALT alleles.

written 7 weeks ago by dylkot

For now, I'd use PLINK 2.0's --set-all-var-ids flag to assign brand new chrom/pos/ref/alt-based IDs to every variant, and then follow up with PLINK 1.9's merge function (since yes, merge is not yet implemented in 2.0).

written 7 weeks ago by chrchang523
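The two-step suggestion above might look roughly like the following. This is a sketch under assumptions, not a tested recipe: `merge_replicates`, the `PLINK2`/`PLINK1` variables, and the `_renamed` suffix are hypothetical names chosen for illustration, and the `@:#:$r:$a` template is one way to build chrom/pos/ref/alt-based IDs with --set-all-var-ids.

```shell
# Hypothetical wrapper around the two-step workflow.
# PLINK2 and PLINK1 name the binaries to call (default: plink2 and plink).
merge_replicates() {
  in_prefix=$1
  out_prefix=$2

  # Step 1 (PLINK 2.0): rename every variant to chrom:pos:ref:alt so that
  # replicates of the same variant end up with the same ID.
  "${PLINK2:-plink2}" --bfile "$in_prefix" \
    --set-all-var-ids '@:#:$r:$a' \
    --make-bed --out "${out_prefix}_renamed" || return 1

  # Step 2 (PLINK 1.9): merge the renamed fileset with itself.
  "${PLINK1:-plink}" --bfile "${out_prefix}_renamed" \
    --bmerge "${out_prefix}_renamed" \
    --out "$out_prefix"
}

# Usage (prefixes are placeholders):
#   merge_replicates inputdata outputdata
```

How conflicting calls between replicates are resolved depends on PLINK 1.9's --merge-mode setting, so it is worth checking that option in the documentation before relying on the merged genotypes.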