Question

Compare pairs of key/value of perl hash tables

6.9 years ago

Amy • 0

Hi Guys,

Does anyone know how to compare key/value pairs in two hash tables? I'm currently working with two tab-delimited files. Each one contains a list of SNPs, with the location and position of each. I need to compare both files and get the location/position pairs that are common to both. The input files look like this:

-File1:

LOCATION             POSITION 
LOC105032014         221                 
LOC105032014         222                 
LOC105032014         371                 
LOC105032014         434                 
LOC105032014         1271

-File2:

LOCATION             POSITION          
LOC105032014         193                 
LOC105032014         371                 
LOC105032014         1097                
LOC105032014         1102                
LOC105032014         1111                
LOC105032014         1119                               
LOC105032014         1271

My output should give something like:

LOCATION             POSITION 
LOC105032014         1271                
LOC105032014         371

Any help will be welcome. Thanks !

Perl SNPs Hashes • 5.1k views

ADD COMMENT • link updated 6.9 years ago by EagleEye 7.5k • written 6.9 years ago by Amy • 0

It seems your key is the combination of the location and the position. I would use a hash like $hash{"$location:$position"} = 1.
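A minimal sketch of the composite-key idea, not the commenter's own code. The sub name common_pairs and the in-memory row lists are assumptions made for illustration; real code would read the rows from File1 and File2 and skip the header line.

```perl
use strict;
use warnings;

# Return the rows of @$rows_b whose location/position pair also occurs in @$rows_a.
# common_pairs is a hypothetical helper name, not part of any module.
sub common_pairs {
    my ($rows_a, $rows_b) = @_;

    # Index the first file by a joined "location:position" key.
    my %seen;
    for my $row (@$rows_a) {
        my ($location, $position) = split ' ', $row;
        $seen{"$location:$position"} = 1;
    }

    # Keep every row of the second file whose key was seen in the first.
    my @common;
    for my $row (@$rows_b) {
        my ($location, $position) = split ' ', $row;
        push @common, "$location\t$position" if $seen{"$location:$position"};
    }
    return @common;
}

# Data rows copied from the question (header lines left out).
my @file1 = map { "LOC105032014\t$_" } 221, 222, 371, 434, 1271;
my @file2 = map { "LOC105032014\t$_" } 193, 371, 1097, 1102, 1111, 1119, 1271;

print "LOCATION\tPOSITION\n";
print "$_\n" for common_pairs(\@file1, \@file2);
```

With the sample data this prints the rows for positions 371 and 1271, in the order they appear in the second file.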

ADD REPLY • link 6.9 years ago by abascalfederico ★ 1.2k

Not so much an issue here, probably, but watch out for duplicate keys when you make a custom key like this.
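One possible reading of this caution is that the same location/position pair can occur more than once in a file, and a plain "= 1" assignment hides that. The sketch below is an assumption about how one might check for it: it counts pairs in a nested hash, which also avoids choosing a separator character for a joined key. The sub name count_pairs and the repeated sample row are hypothetical.

```perl
use strict;
use warnings;

# Count how often each location/position pair occurs.
# A nested hash ($count{location}{position}) needs no separator character.
sub count_pairs {
    my (@rows) = @_;
    my %count;
    for my $row (@rows) {
        my ($location, $position) = split ' ', $row;
        $count{$location}{$position}++;
    }
    return \%count;
}

# Hypothetical input in which one row is repeated.
my $count = count_pairs(
    "LOC105032014\t371",
    "LOC105032014\t371",
    "LOC105032014\t1271",
);

# Report only the pairs that occur more than once.
for my $location (sort keys %$count) {
    for my $position (sort { $a <=> $b } keys %{ $count->{$location} }) {
        my $n = $count->{$location}{$position};
        print "$location\t$position\tseen $n times\n" if $n > 1;
    }
}
```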

ADD REPLY • link 6.9 years ago by Alex Reynolds 35k

score 3 · Answer 1 · 2017-05-22

6.9 years ago

Jean-Karim Heriche 27k

You're looking for lines that are common to two files. There are plenty of solutions on the net already. You could use the comm command (which expects sorted input) or, if you want to use Perl and both files fit in memory, you can read them into arrays and use List::Compare.

List::Compare is great.

ADD REPLY • link 6.9 years ago by BioinfGuru ★ 1.7k

score 3 · Answer 2 · 2017-05-23

6.9 years ago

EagleEye 7.5k

I would prefer to solve it in an easier and quicker way:

grep -w -Ff File2.txt File1.txt > commonFile1File2.txt

Sorry if I understood your question wrong.

You got it right. And it worked as well as the perl script. Such a magic trick. Thank you. :)

ADD REPLY • link 6.9 years ago by Amy • 0

score 2 · Answer 3 · 2017-05-22

6.9 years ago

Alex Reynolds 35k

I wrote a script that suggests one approach:

Usage:

$ intersect.pl --fileA="A.txt" --fileB="B.txt" > answer.txt

This uses Perl's experimental "smartmatch" feature, which can emit an annoying warning message; the warning can be discarded by redirecting standard error to /dev/null. If you're not comfortable using experimental features, there is the List::Util module, which offers limited set-style operations.

Just because you mention sets, I'll also mention the Set::Scalar module.

ADD REPLY • link 6.9 years ago by Jean-Karim Heriche 27k

It gave me exactly what I wanted. Thank you so much for your help. :)

ADD REPLY • link 6.9 years ago by Amy • 0

score 1 · Answer 4 · 2017-05-22

You could iterate through one hash and, for each gene ID (location) key, check whether it is present in the other hash and whether the values are the same, printing the location and position whenever you find matching values. But you would have to make hashes of arrays, because hash keys in Perl have to be unique, so each gene ID can appear only once as a hash key. This may not be the most efficient way to go about this.
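The hash-of-arrays approach described above could be sketched roughly as follows. The sub names positions_by_location and shared_pairs are hypothetical, and the rows are held in memory rather than read from the two files, so treat this as an illustration under those assumptions.

```perl
use strict;
use warnings;

# Build a hash of arrays: location => [position, position, ...].
sub positions_by_location {
    my (@rows) = @_;
    my %hoa;
    for my $row (@rows) {
        my ($location, $position) = split ' ', $row;
        push @{ $hoa{$location} }, $position;
    }
    return %hoa;
}

# For each location in the first hash that also exists in the second,
# collect the positions that occur in both lists.
sub shared_pairs {
    my ($first, $second) = @_;
    my @shared;
    for my $location (sort keys %$first) {
        next unless exists $second->{$location};
        for my $position (@{ $first->{$location} }) {
            # Linear scan of the other list: simple, but slow for large inputs.
            push @shared, "$location\t$position"
                if grep { $_ == $position } @{ $second->{$location} };
        }
    }
    return @shared;
}

# Data rows copied from the question (header lines left out).
my %file1 = positions_by_location(map { "LOC105032014\t$_" } 221, 222, 371, 434, 1271);
my %file2 = positions_by_location(map { "LOC105032014\t$_" } 193, 371, 1097, 1102, 1111, 1119, 1271);

print "LOCATION\tPOSITION\n";
print "$_\n" for shared_pairs(\%file1, \%file2);
```

The inner grep rescans the second list for every position, which is where the inefficiency mentioned above comes from; a hash lookup on a combined location/position key avoids that scan.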