Question: Working with big network data
3.3 years ago by
moranr250
Ireland
moranr250 wrote:

Hi,

I have ~9.5 TB of data files. Each file contains a network in the format for Gephi. It is numerical data in ASCII format. I want to be able to work on compressed files to do some quantitative surveys on each file and some other analyses.
I would like to get the data below 4 TB so that it can at least be stored on a single hard drive. I would also like something fast, as I will be working with the data continuously. So far I have found lzop (default compression) to be the best. Does anyone know of anything better for this? Or have any advice for working with data like this?

Thanks, R

big data networks • 815 views
modified 3.3 years ago by Giovanni M Dall'Olio26k • written 3.3 years ago by moranr250
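
On working with the compressed files directly: a minimal sketch of a streaming survey, which reads a compressed network line by line and never writes a decompressed copy to disk. Two things here are assumptions and not from the question: the file is taken to be a plain whitespace-separated edge list ("source target" per line, "#" for comments), and gzip stands in for lzop because Python's standard library has no LZO module. With lzop the same loop could probably read from a pipe fed by something like `lzop -dc`.

```python
import gzip
from collections import Counter


def survey_edge_list(path):
    """Stream a gzip-compressed edge list and return basic counts.

    Assumed layout (hypothetical): one "source target" pair per line.
    """
    degree = Counter()
    n_edges = 0
    # Mode "rt" decompresses on the fly and yields text lines.
    with gzip.open(path, "rt", encoding="ascii") as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 2 or line.startswith("#"):
                continue  # skip blank, malformed, or comment lines
            source, target = fields[0], fields[1]
            degree[source] += 1
            degree[target] += 1
            n_edges += 1
    return {
        "nodes": len(degree),
        "edges": n_edges,
        "max_degree": max(degree.values(), default=0),
    }
```

The per-node counter is held in memory, so this only suits files whose node set fits in RAM; the edges themselves are never all loaded at once.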
3.3 years ago by
London, UK
Giovanni M Dall'Olio26k wrote:

A couple of suggestions:

  • remove any unnecessary metadata
  • use a graph database, e.g. neo4j (which by the way works quite well with Gephi)
written 3.3 years ago by Giovanni M Dall'Olio26k

Never knew about graph databases. Amazing, thank you.

written 3.3 years ago by moranr250

One of the limitations of neo4j is that you only have one database per installation. The common practice for when you have multiple graphs is to put them all together, and have a flag or property to differentiate them.

written 3.3 years ago by Giovanni M Dall'Olio26k

So just something simple, like concatenating the node definitions together, then the edge lists together, and combining the two? Could the flag be an extra column added to each network file with the file ID or something? Or does neo4j have a way of setting flags?

written 3.3 years ago by moranr250
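
The "extra column with the file ID" idea can be sketched as below. This is only an illustration under assumptions: each input is taken to be a whitespace-separated edge list, the column name `graph_id` and the function `merge_edge_lists` are made up for the example, and the file name (without extension) is used as the flag. The merged CSV could then be imported so that `graph_id` becomes a property on each node and relationship.

```python
import csv
from pathlib import Path


def merge_edge_lists(paths, out_path):
    """Concatenate edge lists into one CSV, tagging each row with its source file.

    Returns the number of edges written.
    """
    n_rows = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["graph_id", "source", "target"])
        for path in paths:
            # Hypothetical choice: the file name without extension is the flag.
            graph_id = Path(path).stem
            with open(path) as handle:
                for line in handle:
                    fields = line.split()
                    if len(fields) < 2:
                        continue  # skip blank or malformed lines
                    writer.writerow([graph_id, fields[0], fields[1]])
                    n_rows += 1
    return n_rows
```

One thing to watch: if node IDs are reused across files, node "1" in one network is not node "1" in another, so queries on the combined data need to match on `graph_id` together with the node ID.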