Tool:DelimitedReader: Explore your CSV files like a database (regular expression supported)
9.0 years ago

DelimitedReader is a sophisticated reader that explores CSV (Comma-Separated Values) files like a database.

It enables you to set the following conditions:

  • The row at which to start reading
  • Which columns to read
  • The valid data pattern for each column, e.g. number, text, non-empty, positive, etc.
  • The row at which to stop reading

You can use regular expressions in any of the above settings.
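
To make the idea of per-column validity patterns concrete, here is a minimal plain-Java sketch using java.util.regex. It is not DelimitedReader's implementation: the class RowFilterSketch and its methods are hypothetical names, and the choice to require a full match of each cell (and to treat a null pattern as "accept anything") is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of LeoTask: keeps rows whose cells match per-column regexes.
class RowFilterSketch {

    // A row is valid when every non-null pattern fully matches the cell in the same column.
    // A null pattern places no constraint on its column (an assumption of this sketch).
    static boolean isValidRow(String[] row, Pattern[] patterns) {
        if (row.length < patterns.length) {
            return false; // too few cells to check every pattern
        }
        for (int i = 0; i < patterns.length; i++) {
            if (patterns[i] != null && !patterns[i].matcher(row[i]).matches()) {
                return false;
            }
        }
        return true;
    }

    // Returns only the valid rows, in their original order.
    static List<String[]> filter(List<String[]> rows, Pattern[] patterns) {
        List<String[]> valid = new ArrayList<>();
        for (String[] row : rows) {
            if (isValidRow(row, patterns)) {
                valid.add(row);
            }
        }
        return valid;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] { "mm1", "Leo", "25" },
            new String[] { "mm2", "Emily", "18" },
            new String[] { "mm3", "", "-4" });
        Pattern[] patterns = {
            null,                         // id: any value
            Pattern.compile(".+"),        // name: non-empty
            Pattern.compile("[1-9]\\d*")  // age: positive integer
        };
        for (String[] row : filter(rows, patterns)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

In this sketch the third row is rejected because its name is empty and its age is not a positive integer.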

According to the conditions set, DelimitedReader can either read the valid rows in a CSV file one by one or return all of them together in a dataset.

You can put all your data into one single CSV file, and then use DelimitedReader to fetch the data sections you need. This is especially useful when two CSV datasets share related fields: DelimitedReader lets you look up a value in one dataset and use it to read the matching rows in the other.

The reader is part of LeoTask, a lightweight, productive, and reliable MapReduce framework for computational research on a multicore computer.

Here are the source code and usage demo of DelimitedReader.

----Example usage: Get data from related data sets----------

The CSV file dreader.csv:

id    name    age
mm1   Leo     25
mm2   Emily   18
id    date      blood pressure   heart rate   mood
mm1   01-Apr    100              50           Happy
mm1   05-Apr    120              60           Sad
mm2   01-Apr    80               40            
mm2   03-Apr    90                            Sad
mm2   05-Apr                     50           Happy

The code to get the clinical data for "Emily" from the CSV file:

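DelimitedReader dr = new DelimitedReader("dreader.csv");
// Step 1: read the "id" and "name" columns and find the row whose name is "Emily".
dr.prep(null, new String[] { "id", "name" });
dr.setValidRowPattern(new String[] { null, "Emily" });
String[] row = dr.readValidRow();
String id = row[0]; // Emily's id ("mm2" in this file)
// Step 2: read the clinical columns (in the requested order) for rows with that id.
dr.prep(null, new String[] { "id", "date", "mood", "blood pressure", "heart rate" });
dr.setValidRowPattern(new String[] { id });
DataTable dt = dr.readValidDataTable();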

The resulting data table dt:

id    date      mood    blood pressure   heart rate
mm2   01-Apr            80               40
mm2   03-Apr    Sad     90                
mm2   05-Apr    Happy                    50

Note: DelimitedReader can return the columns in whatever order you request. In this example, "mood" is placed before "blood pressure".

Here is the code to print the obtained data table dt:

  log(dt.getColNames());
  for (int i = 0, mi = dt.nRows(); i < mi; i++) {
    log(dt.getRow(i));
  }

Here is the output:

[id,date,mood,blood pressure,heart rate]
[mm2,01-Apr, ,80,40]
[mm2,03-Apr,Sad,90,]
[mm2,05-Apr,Happy,,50]
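
For readers who want to see the logic of the two-step lookup without LeoTask, the following plain-Java sketch reproduces it on in-memory lines. It is an illustration only, not how DelimitedReader works internally: the class TwoStepLookupSketch and the method select are hypothetical, the data is assumed to be comma-separated, and a new section is assumed to begin at any line starting with "id,". Unlike DelimitedReader, it does not reorder columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration of the two-step lookup; not DelimitedReader's implementation.
class TwoStepLookupSketch {

    // Collects the rows below the line equal to 'header' whose cell at index 'col'
    // equals 'value', stopping at the next section header (a line starting with "id,")
    // or at the end of the input.
    static List<String[]> select(List<String> lines, String header, int col, String value) {
        List<String[]> out = new ArrayList<>();
        int start = lines.indexOf(header);
        if (start < 0) {
            return out; // section not found
        }
        for (int i = start + 1; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.startsWith("id,")) {
                break; // next section begins
            }
            String[] cells = line.split(",", -1); // -1 keeps trailing empty cells
            if (col < cells.length && cells[col].equals(value)) {
                out.add(cells);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "id,name,age",
            "mm1,Leo,25",
            "mm2,Emily,18",
            "id,date,blood pressure,heart rate,mood",
            "mm1,01-Apr,100,50,Happy",
            "mm1,05-Apr,120,60,Sad",
            "mm2,01-Apr,80,40,",
            "mm2,03-Apr,90,,Sad",
            "mm2,05-Apr,,50,Happy");
        // Step 1: find Emily's id in the first section (name is column 1).
        String id = select(lines, "id,name,age", 1, "Emily").get(0)[0];
        // Step 2: fetch that id's rows from the second section (id is column 0).
        for (String[] row : select(lines, "id,date,blood pressure,heart rate,mood", 0, id)) {
            System.out.println(String.join(",", row));
        }
    }
}
```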

For more details, please refer to this more comprehensive demo code.

CSV Java • 2.3k views