Question: Remove SNPs with missing names
2.1 years ago, bha (60) wrote:

I pruned the 1000G data with MAF and some LD filtering. I noticed that some SNPs have "." as their name (the SNP identifier is a dot). Any suggestions on how I should remove or pull out those SNPs?

plink genetics • 661 views

Use bcftools. There are two ways to select the records whose ID is "." (copy/pasted from the bcftools manual):

 "." to test missing values

Example:

 bcftools view -i 'ID=="."' test.vcf

    -n, --novel
        print novel sites only (ID column is ".")

Example:

    bcftools view -n test.vcf
modified 2.1 years ago • written 2.1 years ago by cpad0112 (13k)
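Both commands quoted above select the records whose ID is "."; removing them needs the inverse. A minimal sketch, assuming bcftools is installed: `-k`/`--known` is the documented counterpart of `-n`/`--novel`, and `-e` is the exclude form of `-i`, assumed here to accept the same expression. The output file names and the two-record demo VCF are made up for illustration.

```shell
#!/bin/sh
# Tiny demo VCF (hypothetical data): one named site, one site with ID ".".
printf '##fileformat=VCFv4.2\n##contig=<ID=1>\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' > test.vcf
printf '1\t100\trs1\tA\tG\t.\t.\t.\n1\t200\t.\tC\tT\t.\t.\t.\n' >> test.vcf

if command -v bcftools >/dev/null 2>&1; then
    # Pull out the sites with a missing ID (ID column is ".").
    bcftools view -n test.vcf > missing_id.vcf
    # Remove them: keep only known sites (ID column is not ".").
    bcftools view -k test.vcf > named.vcf
    # The same removal as an exclude expression, inverting the -i example.
    bcftools view -e 'ID=="."' test.vcf > named_expr.vcf
else
    echo "bcftools not found; install it to run this sketch" >&2
fi
```

With the demo input, named.vcf should keep only the rs1 record and missing_id.vcf only the record at position 200.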

How can I remove these IDs (".") from the datasets?

written 2.1 years ago by bha (60)

Are these datasets in VCF format? If not, please post an example dataset or a few records here.

modified 2.1 years ago • written 2.1 years ago by cpad0112 (13k)

Yes, these are in VCF format.

written 2.1 years ago by bha (60)

As cpad0112 said, use bcftools. bcftools view writes the selected records to standard output, so redirect the output to a file to save the subset.

written 2.1 years ago by RamRS (27k)
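Redirection is one way to save the subset; `bcftools view` also takes `-o` for the output file and `-O z` for compressed VCF. A sketch, assuming bcftools is on the PATH and using made-up file names and demo records; the last lines count the records with ID "." before and after as a sanity check.

```shell
#!/bin/sh
# Hypothetical two-record input so the sketch runs on its own.
printf '##fileformat=VCFv4.2\n##contig=<ID=1>\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' > kg_demo.vcf
printf '1\t100\trs1\tA\tG\t.\t.\t.\n1\t200\t.\tC\tT\t.\t.\t.\n' >> kg_demo.vcf

if command -v bcftools >/dev/null 2>&1; then
    # Keep known sites only (ID is not ".") and write a compressed VCF directly.
    bcftools view -k -O z -o kg_demo.named.vcf.gz kg_demo.vcf
    # Count records whose ID is "." before and after (-H drops the header).
    before=$(bcftools view -H -n kg_demo.vcf | wc -l)
    after=$(bcftools view -H -n kg_demo.named.vcf.gz | wc -l)
    echo "missing-ID records: before=$before after=$after"
fi
```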

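Replace test.vcf with your dataset.vcf. Note that -n keeps only the sites whose ID is "."; to remove them instead, use -k (--known), which prints only the sites whose ID is not ".".

Example code:

 bcftools view -n kg.vcf > missing_ids.vcf
 bcftools view -k kg.vcf > new.vcf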
written 2.1 years ago by cpad0112 (13k)
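If bcftools is not available, the same removal can be sketched with awk on an uncompressed VCF. This is a different technique from the one used in the thread: it only inspects the third tab-separated column (ID) and does no VCF validation. The file names and demo records are made up for illustration.

```shell
#!/bin/sh
# Hypothetical two-record input: one named site, one site with ID ".".
printf '##fileformat=VCFv4.2\n##contig=<ID=1>\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' > demo.vcf
printf '1\t100\trs1\tA\tG\t.\t.\t.\n1\t200\t.\tC\tT\t.\t.\t.\n' >> demo.vcf

# Keep header lines (starting with "#") and records whose ID column is not ".".
awk -F'\t' '/^#/ || $3 != "."' demo.vcf > demo.named.vcf

# The complement: header lines plus the records whose ID column is ".".
awk -F'\t' '/^#/ || $3 == "."' demo.vcf > demo.missing_id.vcf
```

For compressed input, the same awk filter can read from `gzip -dc file.vcf.gz` through a pipe.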