Comparing rsIDs in two VCF files
6.9 years ago
BAGeno ▴ 190

Hi

I have two VCF files and I want to find the rsIDs common to both. I first took the list of rsIDs from one file and ran grep with it against the other file, but the output included some extra rsIDs that were not in the list. I also tried bcftools isec, but the intersection file came out empty. Finally, I tried vcf-isec, but it fails with an error about mixed VCF formats. I checked my files: one is VCF v4.1 and the other is v4.2.

What should I do to find the rsIDs common to both files?

rsid vcf • 1.8k views

Since you haven't shown how you extracted the rs IDs, only that you used grep, the following will find the rs IDs common to the two files:

  • Prepare two files, each containing the rs IDs from one of the input VCFs, then run:

    grep -Fwf file1_rs_ids.txt file2_rs_ids.txt
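The preparation step above could be sketched roughly as follows. This is only an illustration: the helper names `extract_ids` and `common_ids_grep` and the tiny demo VCFs are made up for the example, and it assumes the rs IDs sit in the third (ID) column of each VCF. It uses `-x` (whole-line match) so that a pattern such as rs123 cannot match rs1234 or rs12345, which is one likely source of "extra" hits when grepping with a list of IDs.

```shell
#!/bin/sh
# Sketch only: helper names and demo data are hypothetical.
set -eu

# Hypothetical demo data: two tiny VCFs in a temporary directory.
tmp=$(mktemp -d)
printf '##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\n1\t100\trs123\tA\tG\n1\t200\trs1234\tC\tT\n1\t300\t.\tG\tA\n' > "$tmp/a.vcf"
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\n1\t100\trs123\tA\tG\n1\t250\trs12345\tC\tT\n1\t300\t.\tG\tA\n' > "$tmp/b.vcf"

extract_ids() {
  # Skip header lines (starting with '#'), print column 3,
  # and drop records whose ID is missing ('.').
  awk -F'\t' '!/^#/ && $3 != "." { print $3 }' "$1" | sort -u
}

common_ids_grep() {
  ids1=$(mktemp)
  ids2=$(mktemp)
  extract_ids "$1" > "$ids1"
  extract_ids "$2" > "$ids2"
  # -F: fixed strings, -x: match the whole line, -f: read patterns from a file.
  # '|| true' keeps 'set -e' from aborting when there is no common ID.
  grep -Fxf "$ids1" "$ids2" || true
  rm -f "$ids1" "$ids2"
}

common_ids_grep "$tmp/a.vcf" "$tmp/b.vcf"   # prints rs123
```

Running the patterns against the extracted ID lists rather than the whole VCF also avoids accidental matches elsewhere in a record, for example in the INFO column.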

P.S. Please post what you have tried, so it is easier to point you towards a solution.

6.9 years ago

Use comm:

comm  \
  <(grep -v "^#" input1.vcf | cut -f 3 | sort -u)  \
  <(grep -v "^#" input2.vcf | cut -f 3 | sort -u)

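This produces three columns: rs IDs unique to file 1, rs IDs unique to file 2, and rs IDs common to both files. Passing -12 to comm suppresses the first two columns, so only the common rs IDs are printed.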
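If only the shared IDs are needed, a variant along these lines might do. It is a sketch under a few assumptions: the function name `common_ids_comm` and the demo files are invented for the example, temporary files replace the process substitution so it also runs in a plain POSIX shell, and `LC_ALL=C` is set so that sort and comm agree on the collation order.

```shell
#!/bin/sh
# Sketch only: function name and demo data are hypothetical.
set -eu

# Hypothetical demo data: two tiny VCFs in a temporary directory.
tmp=$(mktemp -d)
printf '##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\n1\t100\trs123\tA\tG\n1\t200\trs1234\tC\tT\n1\t300\t.\tG\tA\n' > "$tmp/a.vcf"
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\n1\t100\trs123\tA\tG\n1\t250\trs12345\tC\tT\n1\t300\t.\tG\tA\n' > "$tmp/b.vcf"

common_ids_comm() {
  s1=$(mktemp)
  s2=$(mktemp)
  # Drop header lines, keep the ID column, sort and de-duplicate.
  grep -v '^#' "$1" | cut -f 3 | LC_ALL=C sort -u > "$s1"
  grep -v '^#' "$2" | cut -f 3 | LC_ALL=C sort -u > "$s2"
  # -12 hides the "only in file 1" and "only in file 2" columns.
  # The final grep removes the missing-ID placeholder '.', which would
  # otherwise be reported as "common" to both files.
  LC_ALL=C comm -12 "$s1" "$s2" | grep -vFx '.' || true
  rm -f "$s1" "$s2"
}

common_ids_comm "$tmp/a.vcf" "$tmp/b.vcf"   # prints rs123
```

Because this compares only the ID column, the v4.1 versus v4.2 header difference between the two files does not matter here.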
