Question: Remove SNPs with missing names
15 months ago, bha wrote:

I pruned the 1000G data by MAF and applied some LD filtering. I noticed that some SNPs have "." as their name (the SNP identifier is a dot). Any suggestions on how I should remove or pull out those records?

Tags: plink, genetics

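Use bcftools. There are two ways to select the records with a missing ID (both copy/pasted from the bcftools manual). Note that both commands print only the records whose ID is ".".

The first is a filter expression, where "." tests for missing values.

Example:

    bcftools view -i 'ID=="."' test.vcf

The second is the --novel option:

    -n, --novel
        print novel sites only (ID column is ".")

Example:

    bcftools view -n test.vcf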
written 15 months ago by cpad0112
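
Before filtering, it can help to check how many records are affected. The following is a minimal sketch, assuming an uncompressed, tab-delimited VCF. It uses awk rather than bcftools, and the function name count_missing_ids and the demo file are hypothetical, made up for illustration only.

```shell
# Count data records whose ID column (3rd tab-separated field) is ".".
# Header lines (starting with "#") are skipped.
count_missing_ids() {
    awk -F'\t' '!/^#/ && $3 == "." { n++ } END { print n + 0 }' "$1"
}

# Tiny hypothetical VCF used only for demonstration: three records, one without an ID.
demo_vcf=$(mktemp)
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n1\t100\trs1\tA\tG\t.\tPASS\t.\n1\t200\t.\tC\tT\t.\tPASS\t.\n1\t300\trs3\tG\tA\t.\tPASS\t.\n' > "$demo_vcf"

count_missing_ids "$demo_vcf"   # prints 1
```

A bgzip-compressed file would need to be decompressed first (for example with gzip -dc) before being passed to awk.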

How can I remove these IDs (".") from the datasets?

written 15 months ago by bha

Are these datasets in VCF format? If not, please post an example dataset or a few records here.

written 15 months ago by cpad0112

Yes, these are in VCF format.

written 15 months ago by bha

Like cpad0112 said, use bcftools. bcftools view can be used to subset data when the output is redirected to a file.

written 15 months ago by RamRS

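Replace test.vcf with your own VCF file. Note that -n keeps only the records whose ID is "."; to remove them instead, use the complementary option -k (--known), which prints only the sites whose ID column is not ".".

Example code to pull out the records with a missing ID:

    bcftools view -n kg.vcf > novel.vcf

Example code to remove them:

    bcftools view -k kg.vcf > new.vcf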
written 15 months ago by cpad0112
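
To make the difference between removing and pulling out the "." records concrete, here is a rough awk equivalent of the two operations. It is only a sketch, assuming an uncompressed, tab-delimited VCF; the function names and the demo file are hypothetical, and bcftools remains the more robust route for real data.

```shell
# Keep the header plus every record whose ID column (field 3) is NOT ".".
remove_missing_ids() {
    awk -F'\t' '/^#/ || $3 != "."' "$1"
}

# Keep the header plus only the records whose ID column IS ".".
extract_missing_ids() {
    awk -F'\t' '/^#/ || $3 == "."' "$1"
}

# Tiny hypothetical VCF used only for demonstration.
work_dir=$(mktemp -d)
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n1\t100\trs1\tA\tG\t.\tPASS\t.\n1\t200\t.\tC\tT\t.\tPASS\t.\n1\t300\trs3\tG\tA\t.\tPASS\t.\n' > "$work_dir/demo.vcf"

remove_missing_ids "$work_dir/demo.vcf" > "$work_dir/known.vcf"
extract_missing_ids "$work_dir/demo.vcf" > "$work_dir/novel.vcf"

# Sanity check: count the data records in each output.
grep -vc '^#' "$work_dir/known.vcf"
grep -vc '^#' "$work_dir/novel.vcf"
```

Both outputs keep the header lines, so each is still a readable VCF on its own.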