Why is OMIM's mim2gene.txt file more inclusive than OMIM's genemap?
11.0 years ago
rolyata47 ▴ 40

To make things less confusing, a description of these files can be found here: http://www.omim.org/downloads

I am getting these files from the link that is emailed to me after I subscribe to the site. When you run a few simple commands on them, you'll see that each has a different number of distinct OMIM IDs.

$ wc -l genemap
13890 genemap

$ cut -f 1 mim2gene.txt | uniq | wc -l
22840

In the genemap, each row has a unique OMIM ID. In the mim2gene file, the first column is the OMIM ID, so we get the unique OMIM IDs and a count of them... and, voila, the two counts are very different!

Why would this be? That is, why does mim2gene account for far more OMIM IDs than genemap? Is mim2gene more inclusive? If so, how? And if not... is this an error on the part of OMIM?

I appreciate any feedback :-)

11.0 years ago
Christian ★ 3.0k

Try sorting before using uniq, since uniq only collapses adjacent duplicate lines:

$ cut -f 1 mim2gene.txt | sort | uniq | wc -l

I have no idea what the data looks like or whether it is already sorted, so I cannot tell you for sure if it makes a difference.
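To see why the sort can matter, here is a small sketch on made-up data. The file name sample_mim2gene.txt and its three rows are hypothetical and only stand in for the real mim2gene.txt; the point is just that uniq without a preceding sort counts a repeated ID once per non-adjacent occurrence.

```shell
# Hypothetical stand-in for mim2gene.txt: tab-separated, ID in column 1.
# The ID 100050 appears twice, but not on adjacent lines.
printf '100050\tphenotype\n100640\tgene\n100050\tphenotype\n' > sample_mim2gene.txt

# uniq alone only removes adjacent duplicates, so 100050 is counted twice.
unsorted_count=$(cut -f 1 sample_mim2gene.txt | uniq | wc -l)

# sort -u sorts and drops duplicates in one step (same as sort | uniq).
sorted_count=$(cut -f 1 sample_mim2gene.txt | sort -u | wc -l)

echo "uniq only: $unsorted_count"
echo "sort -u:   $sorted_count"
```

If both commands give the same number on the real file, the first column was already grouped and sorting is not the explanation for the difference.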
