Algorithm For Mining The Primary Key In A Comma-Separated File Having 20,000 Entries
12.2 years ago
Sameer • 0

Hello, I am working on a project in which I have to write an algorithm for comparing two large databases by defining association rules, so first I want to identify the primary attributes. That is why I need some help finding the primary key in a large database of at least 20,000 attributes. Please give me some suggestions on what I should do. Sorry if I have not explained this properly.
BR Sameer

algorithm • 3.5k views

How is this related to bioinformatics? As currently worded, you would be better off asking on stackoverflow.com.

Yes, please indicate bioinformatics relevance, or we'll have to close this one.

Also, provide an example of the data you want to analyze and the output you want to get.

A problem like this might occur in bioinformatics when trying to map datasets based on identifiers; in that case it is important to look for an attribute that could serve as a primary key. Still, without more explanation, I believe it is not specific enough to bioinformatics to warrant posting on BioStar instead of Stackoverflow.

I think the relevance to bioinformatics is plainly obvious. I run into this problem a lot in bioinformatics, and find the question and potential answer informative for anyone in this field.

Perhaps I should not have said "relevance". What I want to see is a specific bioinformatics problem, not a general computing problem. And the question could be clearer - it's CSV in the title, but then databases in the main text.

4.2 years ago

Use awk's associative (indexed) arrays. With awk, it is also possible to read two files in the same command and cross-compare the information between them.

Kevin
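
The same approach can be sketched in Python, where a `set` plays the role of awk's associative array. This is only a rough sketch under stated assumptions, not a definitive method: the toy file contents, the column names, and the helper names `read_csv`, `candidate_key_columns`, and `compare_on_key` are all hypothetical. A column is treated as a candidate key if every value is non-empty and unique. Uniqueness within one file only suggests a key; it does not prove one.

```python
import csv
import io

# Hypothetical toy data standing in for the two large CSV files.
FILE_A = "gene_id,symbol,chrom\nG1,TP53,17\nG2,BRCA1,17\nG3,EGFR,7\n"
FILE_B = "gene_id,score\nG2,0.9\nG3,0.4\nG4,0.1\n"


def read_csv(text):
    """Parse CSV text into (header, rows)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    return header, list(reader)


def candidate_key_columns(header, rows):
    """Return columns whose values are all non-empty and unique."""
    seen = {name: set() for name in header}  # like awk's seen[col, value]
    candidates = set(header)
    for row in rows:
        for name, value in zip(header, row):
            if name not in candidates:
                continue  # already ruled out by an earlier row
            if value == "" or value in seen[name]:
                candidates.discard(name)  # empty or duplicate value
            else:
                seen[name].add(value)
    return [name for name in header if name in candidates]


def compare_on_key(header_a, rows_a, header_b, rows_b, key):
    """Return key values only in A, only in B, and shared by both."""
    ia, ib = header_a.index(key), header_b.index(key)
    keys_a = {row[ia] for row in rows_a}
    keys_b = {row[ib] for row in rows_b}
    return keys_a - keys_b, keys_b - keys_a, keys_a & keys_b


if __name__ == "__main__":
    header_a, rows_a = read_csv(FILE_A)
    header_b, rows_b = read_csv(FILE_B)
    keys_a = candidate_key_columns(header_a, rows_a)
    keys_b = candidate_key_columns(header_b, rows_b)
    # Columns that look like keys in both files are candidates for joining.
    shared = [name for name in keys_a if name in keys_b]
    print("candidate keys in A:", keys_a)
    print("candidate keys in B:", keys_b)
    for key in shared:
        print(key, compare_on_key(header_a, rows_a, header_b, rows_b, key))
```

Each row is visited once and only one set per surviving column is kept in memory, so 20,000 rows should not be a problem. Columns that fail the uniqueness check are dropped early and cost nothing afterwards.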
