Plink: Variant names are limited to 16000 characters
8 weeks ago
skjobs ▴ 190

Hi all, I'm trying to convert a VCF obtained from the Michigan Imputation Server to bfile (the plink binary file format) for further processing. When I looked at the VCF file, I found variants with very long REF/ALT alleles, and therefore very long variant names. I want to limit variant names to 100 characters.

I have tried all of the following plink options, but none of them works:

--set-missing-var-ids @:#[b37] 
--set-missing-var-ids @:#[b37]\$1,\$2
--new-id-max-allele-len 50
--snps-only 
--biallelic-only 

At the end, I'm getting the following error: Error: Variant names are limited to 16000 characters.

plink vcftools Varient GWAS bcftools • 546 views
7 weeks ago

You want to add the 'missing' modifier, i.e. use --new-id-max-allele-len 50 missing together with --set-all-var-ids or --set-missing-var-ids. 'missing' specifies that when a longer allele is present, the variant ID is set to "." rather than causing the program to error out. Note that this requires plink 2.0.

After you've done this, you may want to filter out or rename the variants which still have "." IDs.
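
The follow-up step above could be sketched roughly as follows, assuming the output was written with --make-bed so that variant IDs sit in column 2 of the .bim file. The function names and file names are illustrative placeholders, not part of plink.

```shell
# Count the variants in a .bim file (column 2 = variant ID) that still have a "." ID.
count_missing_ids() {
  awk '$2 == "." { n++ } END { print n + 0 }' "$1"
}

# List the IDs of all variants whose ID is not ".".
# The resulting list could be passed to plink's --extract to drop unnamed variants.
named_variant_ids() {
  awk '$2 != "." { print $2 }' "$1"
}

# Example (file names are placeholders):
#   count_missing_ids check.bim
#   named_variant_ids check.bim > named_ids.txt
#   plink2 --bfile check --extract named_ids.txt --make-bed --out check_named
```

Counting first is a cheap way to see how many variants would be lost before deciding whether to filter them out or rename them.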


We have imputed data sets from the Michigan Imputation Server (MIS), and plink2 throws the following error when reading them: "Error: Header line 11 of --vcf file does not have expected FORMAT:GT format."

We want to convert the imputed data to ped files.

  1. Please update to a newer plink2 build.
  2. If a similar error message remains, can you post what line 11 of your VCF looks like?
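
For the second point, a small helper along these lines can print a single line of a VCF. It is only a sketch: the function name `vcf_line` and the example file name are placeholders, and it assumes the file is either plain text or gzip/bgzip-compressed with a .gz extension.

```shell
# Print line N of a VCF; handles plain or gzip/bgzip-compressed (.gz) input.
vcf_line() {
  n="$1"
  file="$2"
  case "$file" in
    *.gz) gzip -dc "$file" | sed -n "${n}p" ;;
    *)    sed -n "${n}p" "$file" ;;
  esac
}

# Example (placeholder file name):
#   vcf_line 11 imputed.vcf.gz
```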

Hi, can you suggest how to set ",", for longer alleles in VCF files? I used your suggested syntax in plink2 but still got a similar error: "Variant has longer than 16000 characters". I used the following syntax in plink2. Please let me know if I'm doing anything wrong. I would appreciate your time and effort.

PLINK v2.00a3.6LM AVX2 Intel Options in effect:
--double-id --make-bed --new-id-max-allele-len 60 missing --out check --real-ref-alleles --set-hh-missing --set-missing-var-ids @:#,,


You omitted a crucial piece of information: what's the input?

If it's a VCF with an overly long variant ID, you have to clear that ID first with e.g. a bash one-liner, because VCF import happens before --set-missing-var-ids in the order of operations.
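
One way to clear such IDs before import is sketched below with awk, assuming a tab-delimited VCF streamed on standard input. The function name `clear_long_ids`, the 100-character threshold, and the file names are illustrative choices, not anything provided by plink.

```shell
# Replace any variant ID (VCF column 3) longer than max_len characters with ".".
# Header lines (starting with "#") pass through unchanged.
clear_long_ids() {
  max_len="$1"
  awk -v max="$max_len" 'BEGIN { FS = OFS = "\t" }
       /^#/ { print; next }
       length($3) > max { $3 = "." }
       { print }'
}

# Examples (placeholder file names):
#   clear_long_ids 100 < input.vcf > cleaned.vcf
#   gzip -dc input.vcf.gz | clear_long_ids 100 | gzip > cleaned.vcf.gz
```

The cleaned file can then be imported with --set-missing-var-ids and --new-id-max-allele-len as discussed earlier in the thread, since the overly long IDs are gone by the time plink2 reads the VCF.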
