How to filter file 1 to keep only rows whose gene matches file 2
3.2 years ago

I have two .tsv files. The first contains variant calling data, in which the column "Gene.knownGene" holds the gene name for every variant. The second lists genes that are related to diabetes. I want to filter file 1 so that it contains only the rows whose gene matches an entry in file 2, but I don't know how to do it. Thanks in advance.

SNP gene sequence • 580 views

Please provide some example files. As far as I can understand, you can use grep or a Perl/Python script.


Suppose this is my file 1, where the column "Gene.knownGene" contains all my variants. File 2 contains the variants related to a particular disease. I want to compare file 1 against file 2 and delete every variant in file 1 that has no match in file 2. Note: the real files are much bigger than this example, and file 1 looks like a normal VCF file.

Thanks in advance

File 1

Chr Start End Ref Alt Func.knownGene Gene.knownGene

chr1 183401 183401 C G intronic FO538757.3
chr1 183629 183629 G A intronic ABCC8 (6833)
chr1 601515 601515 T C ncRNA_exonic RP5-857K21.4
chr1 601544 601544 G A ncRNA_exonic RP5-857K21.4
chr1 601606 601606 G T ncRNA_intronic RP5-857K21.4
chr1 610767 610767 G A ncRNA_intronic AKT2 (208)
chr1 610795 610795 A G ncRNA_intronic RP5-857K21.4
chr1 611072 611072 A C ncRNA_intronic RP5-857K21.4
chr1 611073 611073 G A ncRNA_intronic RP5-857K21.4
chr1 611317 611317 A G ncRNA_intronic RP5-857K21.4


Can you share an example of the second file?


File 2

ABCC8 (6833)
HYMAI (57061)
HYMAI (57061)
KCNJ11 (3767)
PLAGL1 (5325)
ZFP57 (346171)
HNF1B (6928)
AKT2 (208)
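One possible way to do this in Python, as suggested earlier in the thread, is sketched below. It is only a sketch under some assumptions: file 1 is tab-separated with a header row containing "Gene.knownGene", file 2 has one gene label per line (blank lines and duplicates are ignored), and a row is kept only when its gene cell is exactly equal to a file 2 entry, including the number in parentheses. The function names `load_gene_set` and `filter_rows` are made up for illustration, and the sample data is a shortened copy of the rows posted above.

```python
import csv
import io


def load_gene_set(lines):
    """Collect non-empty, stripped gene labels; duplicates collapse in the set."""
    return {line.strip() for line in lines if line.strip()}


def filter_rows(variant_lines, genes, column="Gene.knownGene"):
    """Yield the header, then every row whose `column` value is in `genes`."""
    reader = csv.DictReader(variant_lines, delimiter="\t")
    header = reader.fieldnames
    yield header
    for row in reader:
        # Exact match on the whole cell (assumption: one gene label per cell).
        if (row.get(column) or "").strip() in genes:
            yield [row[name] for name in header]


# Shortened copies of the example files from this thread (tabs assumed).
FILE1 = (
    "Chr\tStart\tEnd\tRef\tAlt\tFunc.knownGene\tGene.knownGene\n"
    "chr1\t183401\t183401\tC\tG\tintronic\tFO538757.3\n"
    "chr1\t183629\t183629\tG\tA\tintronic\tABCC8 (6833)\n"
    "chr1\t601515\t601515\tT\tC\tncRNA_exonic\tRP5-857K21.4\n"
    "chr1\t610767\t610767\tG\tA\tncRNA_intronic\tAKT2 (208)\n"
)
FILE2 = "ABCC8 (6833)\n\nHYMAI (57061)\n\nHYMAI (57061)\n\nAKT2 (208)\n"

genes = load_gene_set(io.StringIO(FILE2))
kept = list(filter_rows(io.StringIO(FILE1), genes))

for fields in kept:
    print("\t".join(fields))
```

For real files, the two `io.StringIO(...)` objects could be replaced with `open(path, newline="")` handles, and the kept rows written out with `csv.writer(out, delimiter="\t")`. If a "Gene.knownGene" cell can list several genes at once, the cell would need to be split before the lookup; the sketch does not handle that case.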
