Handling scaffold names for plink bed file
3.6 years ago
nitinra ▴ 50

Hello all,

I am trying to convert a VCF file to a PLINK bed file. I used the following command: plink --vcf input.vcf --make-bed --out output

However, I first had to change my chromosome names to plain numbers, using this awk command:

awk '{gsub(/^Chromosome_/,""); print}' no_chr1.vcf > no_chr2.vcf

I also have a lot of scaffolds in the genome such as

modScaffold_4_1

How can I rename these scaffolds so that plink does not throw the error below and the new names do not clash with the existing chromosome numbers?

Invalid chromosome code '4_1' on line 13386275 of .vcf file.

My final step is to convert it to a bed file and use it in admixture software.

bed plink snp genome • 1.3k views

Did you try the --allow-extra-chr option?


I did, and the conversion runs. However, ADMIXTURE then throws an error because the scaffold names are still present in the output.
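
To see which names are still present, one option is to list the chromosome codes in the .bim file that are not plain integers. This is only a minimal sketch: it assumes the .bim file is whitespace-delimited with the chromosome code in column 1 (the usual PLINK layout), and `list_nonnumeric_chr` is an illustrative helper name, not part of plink or ADMIXTURE.

```shell
# List each chromosome code in column 1 of a .bim file that is not a
# plain integer, together with the number of variants carrying it.
# Assumption: the .bim file is whitespace-delimited, chromosome in column 1.
list_nonnumeric_chr() {
  awk '$1 !~ /^[0-9]+$/ { seen[$1]++ }
       END { for (c in seen) print c "\t" seen[c] }' "$1" | sort
}
```

Running `list_nonnumeric_chr output.bim` (with `output.bim` being whatever prefix was passed to plink) prints one line per leftover scaffold name; no output means every code is already an integer.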


Hi, I am currently having the same issue. Have you managed to solve this?


I finally got it to work by setting --allow-extra-chr 0. This seems to recode all scaffold names as 0, but it does work.
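
For reference, the full conversion with that setting could look like the sketch below. It only combines the flags already mentioned in this thread; `vcf_to_bed` is a hypothetical wrapper name and the file names are placeholders. As I understand the PLINK 1.9 documentation, the `0` modifier makes plink treat every unrecognized chromosome code as 0, so all scaffold variants end up on chromosome 0 in the .bim file.

```shell
# Hypothetical wrapper: convert a VCF to a PLINK binary fileset
# (.bed/.bim/.fam), recoding every unrecognized contig name as chromosome 0.
#   $1 = input VCF path, $2 = output prefix
vcf_to_bed() {
  plink --vcf "$1" --allow-extra-chr 0 --make-bed --out "$2"
}
```

Called as `vcf_to_bed input.vcf output`, this should write output.bed, output.bim and output.fam. Keep in mind that scaffold identity is lost in the .bim file, which matters if a later step needs per-scaffold positions.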
