Hi there all! I have 2 vcf files with filtered SNPs and now I want to extract a. The genes that are shared between the 2 vcf files b. The SNPs that are different on the shared genes between these two vcf files. for example 1.vcf
TRINITY_DN17985_c2_g1    539            0      G     A
TRINITY_DN17985_c2_g1   576         0   G   A 
TRINITY_DN17985_c2_g1   710         0   A   T
TRINITY_DN17985_c2_g1   722         0   T   C
TRINITY_DN17985_c2_g1   822         0   A   G
TRINITY_DN17985_c2_g1   1608    0   T   G
TRINITY_DN17985_c2_g1   2353    0   T   A
TRINITY_DN17985_c2_g1   4371    0   T   A
TRINITY_DN17985_c2_g1   6200    0   C   T
TRINITY_DN17985_c2_g1   7444    0   G   T
TRINITY_DN17985_c2_g1   7456    0   A   T
TRINITY_DN17985_c2_g1   2434    0   G   T
TRINITY_DN17985_c2_g1   4190    0   G   A
TRINITY_DN17985_c2_g1   4275    0   A   G
TRINITY_DN15509_c1_g1   626         0   A   C
TRINITY_DN15509_c1_g1   1647    0   G   A
TRINITY_DN15425_c1_g4   741         0   G   A
TRINITY_DN12988_c0_g1   1553    0   T   A
TRINITY_DN14521_c2_g1   826         0   T   A
2.vcf
TRINITY_DN17985_c2_g1   539         0      G     T
TRINITY_DN17985_c2_g1   576         0   G   T 
TRINITY_DN17985_c2_g1   710         0   A   G
TRINITY_DN17985_c2_g1   722         0   T   C
TRINITY_DN17985_c2_g1   822         0   A   G
TRINITY_DN17985_c2_g1   1608    0   T   G
TRINITY_DN17985_c2_g1   2353    0   T   A
TRINITY_DN17985_c2_g1   4371    0   T   A
TRINITY_DN17985_c2_g1   6200    0   C   T
TRINITY_DN17985_c2_g1   7444    0   G   T
TRINITY_DN17985_c2_g1   7456    0   A   T
TRINITY_DN16743_c0_g1   5739    0   G   A
TRINITY_DN16743_c0_g1   5764    0   T   A
TRINITY_DN16743_c0_g2   199         0   A   T
TRINITY_DN16295_c0_g1   982         0   A   G
and I want to extract only the following for which the ALT allele is different between 1.vcf and 2.vcf files.
TRINITY_DN17985_c2_g1   539         0      G     A
TRINITY_DN17985_c2_g1   576         0   G   A 
TRINITY_DN17985_c2_g1   710         0   A   T
TRINITY_DN17985_c2_g1   539         0      G     T
TRINITY_DN17985_c2_g1   576         0   G   T 
TRINITY_DN17985_c2_g1   710         0   A   G
I know that I can use bcftools isec for getting the SNPs that are shared between the two or the SNPs that are unique in the first and second file respectively. But how I can get first the genes that are present in both files?
Thanks a lot in advance for any help! Sofia
I suppose this is not the format that the post should have. The VCF format doesn't look like a VCF format. Can you please wrap it around the code flag? (little button with 101010)
I have just done that (wrapped it in code format)