Question: Keeping only common variants in the merged VCF file
seta (Sweden) wrote 12 months ago:

Hi all,

After merging my VCF file, which contains specific variants, with the 1000 Genomes VCF, the ID column of the merged VCF file looks like this:

chr1:39440410:SG

rs6722104

rs60323161;chr1:39244787:SG

Only entries like rs60323161;chr1:39244787:SG, which carry both an rs number and my own ID, are common variants. How can I keep only the common variants in the merged VCF file?

I tried bcftools view -T to keep just the common variants, but it did not work well: records like the ones below are still in the file, and chr1:39448418:SG should have been removed.

rs3118014;chr1:39448418:SG

chr1:39448418:SG

I also tried grep -Fwvf and grep -vf to remove those variants, but neither worked well. Could you please share your solution?

Thanks

Tags: bcftools, merge, vcf
husensofteng (Sweden) wrote 12 months ago:

I am not sure I understand the question correctly, but it sounds like a line-filtering issue to me. So:

awk -F'\t' '/^#/ || ($3 ~ /rs/ && $3 ~ /chr/)' inputfile > outputfile

*This keeps only lines that start with # (header lines) or that have both an rs ID and chr information in the third column (the ID column) of the file.
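To try the filter before running it on real data, here is a minimal sketch on a tiny, made-up merged VCF. The three ID values are taken from the question; the positions, alleles, file names and the `demo` variable are assumptions for illustration only. The patterns are slightly stricter than a bare "rs"/"chr" match: they require the rs number and the chr label to start an ID entry (at the beginning of the field or after a semicolon).

```shell
# Hypothetical toy input: two header lines and three records.
demo=$(mktemp -d)
{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t39440410\tchr1:39440410:SG\tA\tG\t.\t.\t.\n'
  printf 'chr1\t39244787\trs60323161;chr1:39244787:SG\tC\tT\t.\t.\t.\n'
  printf 'chr2\t100\trs6722104\tG\tA\t.\t.\t.\n'
} > "$demo/merged.vcf"

# Keep header lines, plus records whose ID column (field 3) holds
# both an rs number and a chr:pos style label.
awk -F'\t' '/^#/ || ($3 ~ /(^|;)rs[0-9]+/ && $3 ~ /(^|;)chr[^;]*:/)' \
  "$demo/merged.vcf" > "$demo/common.vcf"

# Only the rs60323161;chr1:39244787:SG record survives, with both headers.
cat "$demo/common.vcf"
```

Note that this relies entirely on how the merge wrote the ID column, so it is worth checking a few records by hand if your own IDs follow a different naming scheme.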


seta replied 12 months ago:

Many thanks for your nice solution.
harold.smith.tarheel (United States) wrote 12 months ago:

Two options:

1) Use BEDtools 'intersect' on the two original VCFs.

2) Use VCFtools 'vcf-annotate' to add the 1000 Genomes rs numbers, then 'grep' to keep the variants that were annotated as such.
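
To illustrate the idea behind option 1, here is a rough sketch of intersecting the two original files. It is a plain awk stand-in, not BEDtools itself: it matches records on CHROM, POS, REF and ALT, so two different alleles at the same position are not treated as the same variant. The file names, the `isec` variable and the toy records are assumptions, and the sketch assumes uncompressed, tab-separated VCFs with the same chromosome naming (chr1 vs 1) and one ALT allele per record.

```shell
# Hypothetical toy inputs: my own variants and a 1000 Genomes subset.
isec=$(mktemp -d)
{
  printf '#CHROM\tPOS\tID\tREF\tALT\n'
  printf 'chr1\t39244787\tchr1:39244787:SG\tC\tT\n'
  printf 'chr1\t39440410\tchr1:39440410:SG\tA\tG\n'
} > "$isec/mine.vcf"
{
  printf '#CHROM\tPOS\tID\tREF\tALT\n'
  printf 'chr1\t39244787\trs60323161\tC\tT\n'
  printf 'chr1\t39448418\trs3118014\tG\tA\n'
} > "$isec/kg.vcf"

# Pass 1 (NR == FNR): remember each variant key in the 1000 Genomes file.
# Pass 2: print header lines and the records of my file whose key was seen.
awk -F'\t' '
  NR == FNR { if ($0 !~ /^#/) seen[$1 FS $2 FS $4 FS $5] = 1; next }
  /^#/      { print; next }
  { key = $1 FS $2 FS $4 FS $5; if (key in seen) print }
' "$isec/kg.vcf" "$isec/mine.vcf" > "$isec/common.vcf"

# Only the chr1:39244787 record is present in both files.
cat "$isec/common.vcf"
```

Because it works on the original files, this avoids depending on how the merged ID column was written; for real, compressed or multi-allelic data a dedicated tool is the safer choice.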
