Question: Remove non-biallelic SNPs from ped
jan.breitling (United Kingdom), 4.8 years ago, wrote:

Hello,

When merging a list of bed files in plink I get an error:

Error: 1785 variants with 3+ alleles present.

The files were originally in VCF format and were converted to bed files using vcftools (with the --plink flag). Is there a simple way to scan through the bed files with plink or another tool and remove these variants?

snp sequence R • 10k views
modified 4.2 years ago by fengchongwang • written 4.8 years ago by jan.breitling
Scott (Canada), 4.8 years ago, wrote:

Not sure about your exact question, but in VCFtools you can quickly filter for only bi-allelic sites using:

vcftools --vcf file1.vcf --min-alleles 2 --max-alleles 2 --recode --out output_file_name

written 4.8 years ago by Scott

Is there no way to do it directly on the ped file? Otherwise it has to be reconverted first.

reply written 4.8 years ago by jan.breitling

I just thought if they were originally in VCF format this would be easiest. I am most familiar with VCF, that's why I suggested this. I'm sure there are other ways. 

reply written 4.8 years ago by Scott
chrchang523 (United States), 4.8 years ago, wrote:

If you want to keep some or all of those variants, and then treat least common alternate allele calls as missing, you should perform the merge with another tool, and then use plink --vcf to import from the merged VCF.

However, if you just want to get rid of all the triallelic variants, refer to the last batch of sample commands under https://www.cog-genomics.org/plink2/data#merge3 .  The .missnp file generated during the failed merge is designed to be used with --exclude.

written 4.8 years ago by chrchang523

I am merging a whole list of files, and the problem is that I don't know which subfile each variant in the .missnp file you mention comes from. Is there still a way to use this file to exclude the SNPs?

reply written 4.8 years ago by jan.breitling

Yes, it's safe to --exclude [prefix].missnp on every single fileset.  Nothing bad happens if a variant named in the .missnp file is not in the current fileset.
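
For a long merge list, the same --exclude step can be scripted. What follows is only a sketch under some assumptions: the filesets are binary (.bed/.bim/.fam), the merge list holds one fileset prefix per line, and the failed merge left behind a .missnp file. The function name, the PLINK variable, and the "_tmp" suffix are made up for illustration.

```shell
# Sketch: drop the variants listed in a .missnp file from every fileset
# named in a merge list, and write a new merge list for the retry.
# PLINK, the function name, and the "_tmp" suffix are illustrative choices.
PLINK="${PLINK:-plink}"

exclude_missnp_all() {
    local merge_list="$1"   # input: one binary fileset prefix per line
    local missnp="$2"       # .missnp file written by the failed merge
    local new_list="$3"     # output: merge list of the filtered prefixes
    local prefix

    : > "$new_list"
    while IFS= read -r prefix || [ -n "$prefix" ]; do
        # Skip blank lines in the merge list.
        if [ -z "$prefix" ]; then
            continue
        fi
        # Variant IDs in the .missnp file that are absent from this
        # fileset are simply ignored by --exclude.
        "$PLINK" --bfile "$prefix" --exclude "$missnp" \
                 --make-bed --out "${prefix}_tmp" < /dev/null || return 1
        printf '%s\n' "${prefix}_tmp" >> "$new_list"
    done < "$merge_list"
}
```

A possible call, with hypothetical file names, would be: exclude_missnp_all all_files.txt merged-merge.missnp filtered_files.txt. The merge can then be retried on the prefixes listed in filtered_files.txt.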

reply written 4.8 years ago by chrchang523
fengchongwang (Germany), 4.2 years ago, wrote:

Not sure what you mean exactly. But have you tried

--biallelic-only strict

in plink 1.90_beta_3o? Further information can be found at https://www.cog-genomics.org/plink2/input
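
As far as I can tell from that page, --biallelic-only is applied when plink imports a VCF (--vcf), so it would belong at the conversion step rather than being run on existing ped/bed files. A minimal sketch, with a made-up function name and placeholder file names:

```shell
# Sketch: convert one VCF to a binary fileset while skipping every variant
# that lists two or more ALT alleles. Names here are illustrative only.
PLINK="${PLINK:-plink}"

vcf_to_bed_biallelic() {
    local vcf="$1"          # input VCF file
    local out_prefix="$2"   # prefix for the .bed/.bim/.fam output
    "$PLINK" --vcf "$vcf" --biallelic-only strict --make-bed --out "$out_prefix"
}
```

For example: vcf_to_bed_biallelic file1.vcf file1. Note that this filters each file on its own; a site that is biallelic in every file but carries different alternate alleles in different files could presumably still trigger the 3+ alleles error at merge time.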

modified 4.2 years ago • written 4.2 years ago by fengchongwang