Tool: DelimitedReader: Explore your CSV files like a database (regular expression supported)
Posted 5.7 years ago by Changwang Zhang (United Kingdom):

DelimitedReader is a sophisticated reader that explores CSV (Comma-Separated Values) files like a database.

It enables you to set the following conditions:

  • The row at which to start reading
  • Which columns to read
  • The valid data pattern for each column, e.g. number, text, non-empty, positive, etc.
  • The row at which to stop reading

You can use regular expressions in the above settings.

According to the conditions set, DelimitedReader can either read the valid rows in a CSV file one by one or return all of them together in a dataset.
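To make the idea of per-column patterns concrete, here is a minimal plain-Java sketch. It is not DelimitedReader's source code: the class name RowPatternSketch and the methods isValidRow and readValidRows are hypothetical, and treating a null pattern as "accept any value" is an assumption modeled on the null entries passed to setValidRowPattern in the usage example further down.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative only: hypothetical names, not DelimitedReader's implementation.
class RowPatternSketch {

    // Returns true when every cell fully matches the regex for its column.
    // A null pattern is assumed to mean "accept any value".
    static boolean isValidRow(String[] row, String[] patterns) {
        for (int i = 0; i < patterns.length; i++) {
            if (patterns[i] == null) continue;
            if (i >= row.length || !Pattern.matches(patterns[i], row[i])) return false;
        }
        return true;
    }

    // Collects all valid rows from startRow (inclusive) to stopRow (exclusive).
    static List<String[]> readValidRows(List<String[]> rows, int startRow,
                                        int stopRow, String[] patterns) {
        List<String[]> valid = new ArrayList<>();
        for (int i = startRow; i < Math.min(stopRow, rows.size()); i++) {
            if (isValidRow(rows.get(i), patterns)) valid.add(rows.get(i));
        }
        return valid;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"mm1", "Leo", "25"},
            new String[] {"mm2", "Emily", "18"},
            new String[] {"mm3", "", "abc"});
        // Any id, a non-empty name, a positive integer age.
        String[] patterns = {null, ".+", "[1-9][0-9]*"};
        for (String[] row : readValidRows(rows, 0, rows.size(), patterns)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

Run as written, the sketch keeps the rows for Leo and Emily and drops the third row, whose name is empty and whose age is not a number.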

You can put all your data into one single CSV file, and then use DelimitedReader to fetch the data sections you need. This is especially useful when two CSV datasets share related fields: DelimitedReader helps you find the connection between the two datasets and read the relevant data.

The reader is part of LeoTask, a lightweight, productive, and reliable MapReduce framework for computational research on a multicore computer.

Here are the source code and usage demo of the DelimitedReader.

----Example usage: Get data from related data sets----------

The CSV file "dreader.csv":

id,name,age
mm1,Leo,25
mm2,Emily,18
id,date,blood pressure,heart rate,mood
mm1,01-Apr,100,50,Happy
mm1,05-Apr,120,60,Sad
mm2,01-Apr,80,40,
mm2,03-Apr,90,,Sad
mm2,05-Aprl,,50,Happy

The code to get the clinical data for "Emily" from the CSV file:

  DelimitedReader dr = new DelimitedReader("dreader.csv");

  // Step 1: look up Emily's id in the data set with the columns "id" and "name".
  dr.prep(null, new String[] { "id", "name" });
  dr.setValidRowPattern(new String[] { null, "Emily" });
  String[] row = dr.readValidRow();
  String id = row[0];

  // Step 2: read every row of the clinical data set whose id matches.
  dr.prep(null, new String[] { "id", "date", "mood", "blood pressure", "heart rate" });
  dr.setValidRowPattern(new String[] { id });
  DataTable dt = dr.readValidDataTable();

The resulting data table dt is:

id,date,mood,blood pressure,heart rate
mm2,01-Apr,,80,40
mm2,03-Apr,Sad,90,
mm2,05-Aprl,Happy,,50

Note: DelimitedReader can return the data columns in whatever order you request. In the example, "mood" is moved before "blood pressure".
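For readers who want to see the logic behind the two-step lookup, here is a rough plain-Java sketch of the same idea: locate a section by its header names, filter its rows on a key column, and return the columns in the requested order. It does not use or reproduce DelimitedReader. The class SectionLookupSketch and its methods findHeader, select and sampleRows are hypothetical, and it assumes that the sample file is comma-separated and that a section ends where the column count changes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical names, not DelimitedReader's implementation.
class SectionLookupSketch {

    // The sample data from this post, assumed to be comma-separated.
    static List<String[]> sampleRows() {
        String csv = "id,name,age\nmm1,Leo,25\nmm2,Emily,18\n"
            + "id,date,blood pressure,heart rate,mood\n"
            + "mm1,01-Apr,100,50,Happy\nmm1,05-Apr,120,60,Sad\n"
            + "mm2,01-Apr,80,40,\nmm2,03-Apr,90,,Sad\nmm2,05-Aprl,,50,Happy";
        List<String[]> rows = new ArrayList<>();
        // Limit -1 keeps trailing empty cells such as the missing mood.
        for (String line : csv.split("\n")) rows.add(line.split(",", -1));
        return rows;
    }

    // Index of the first row that contains all the given column names, or -1.
    static int findHeader(List<String[]> rows, String... names) {
        for (int i = 0; i < rows.size(); i++) {
            if (Arrays.asList(rows.get(i)).containsAll(Arrays.asList(names))) return i;
        }
        return -1;
    }

    // Rows of the matching section whose keyColumn equals keyValue,
    // with cells returned in the order given by 'columns'.
    static List<String[]> select(List<String[]> rows, String[] columns,
                                 String keyColumn, String keyValue) {
        List<String[]> out = new ArrayList<>();
        int h = findHeader(rows, columns);
        if (h < 0) return out;
        List<String> header = Arrays.asList(rows.get(h));
        int key = header.indexOf(keyColumn);
        if (key < 0) return out;
        // Assumption: the section ends where the column count changes.
        for (int i = h + 1; i < rows.size() && rows.get(i).length == header.size(); i++) {
            String[] row = rows.get(i);
            if (!row[key].equals(keyValue)) continue;
            String[] picked = new String[columns.length];
            for (int c = 0; c < columns.length; c++) {
                picked[c] = row[header.indexOf(columns[c])];
            }
            out.add(picked);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> rows = sampleRows();
        String id = select(rows, new String[] {"id", "name"}, "name", "Emily").get(0)[0];
        String[] wanted = {"id", "date", "mood", "blood pressure", "heart rate"};
        for (String[] r : select(rows, wanted, "id", id)) {
            System.out.println(String.join(",", r));
        }
    }
}
```

Because each output cell is looked up by header name, reordering the requested columns is enough to reorder the result, which is how "mood" ends up before "blood pressure" in this sketch.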

Here is the code to print the obtained data table dt:

    for (int i = 0, mi = dt.nRows(); i < mi; i++) {

Here is the output:

[id,date,mood,blood pressure,heart rate]
[mm2,01-Apr, ,80,40]


For more details, please refer to this more comprehensive demo code.

