4.2 years ago by
Seattle, WA USA
Can you use disk as intermediate storage?
If you don't want to keep the entire matrix in (system) memory before writing it out, perhaps you can write placeholder bytes to disk directly, and copy over one cell at a time:
1. Walk through your input n x m matrix to determine maximum precision.
2. Write out an intermediate m x n CSV matrix file, containing zero-padded elements to the precision found in step 1.

   This creates an empty CSV matrix with equally-sized cells. In C, you could use one-byte char elements for commas and numbers. Because every cell has the same width, you can write a function that maps a position in your input matrix directly to a byte offset in the empty CSV matrix.
3. Walk through the input matrix again, writing cell ij from your input matrix over cell ji in your empty CSV output matrix.

   In C, use fseek(), for example, to jump to the calculated byte offset for cell ji. Then fwrite() a zero-padded string from the original matrix's cell ij at that offset in the stream (leaving commas in place).
4. Clean things up: Walk through the intermediate CSV matrix one row at a time, stripping zero-pad characters and writing this stripped matrix to the final, transposed output.
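As a rough illustration of the offset mapping and the overwrite step, here is a minimal C sketch. The function names (cell_offset, write_placeholder, put_cell, strip_leading_zeros) are made up for this example, and it assumes every cell is a non-negative number left-padded with zeros to exactly `width` bytes, with each row ending in a single newline byte.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Byte offset of cell (row, col) in a CSV file whose rows each hold `cols`
   fixed-width cells of `width` bytes, each followed by a comma or a newline. */
static long cell_offset(long row, long col, long cols, long width)
{
    long row_bytes = cols * (width + 1);
    return row * row_bytes + col * (width + 1);
}

/* Write a rows x cols grid of zero-filled placeholder cells to fp. */
static int write_placeholder(FILE *fp, long rows, long cols, int width)
{
    for (long r = 0; r < rows; r++) {
        for (long c = 0; c < cols; c++) {
            for (int k = 0; k < width; k++)
                if (fputc('0', fp) == EOF) return -1;
            if (fputc(c + 1 < cols ? ',' : '\n', fp) == EOF) return -1;
        }
    }
    return 0;
}

/* Overwrite one placeholder cell with `text`, which must already be padded
   to exactly `width` bytes. The separators around the cell are not touched. */
static int put_cell(FILE *fp, long row, long col, long cols, int width,
                    const char *text)
{
    assert(strlen(text) == (size_t)width);
    if (fseek(fp, cell_offset(row, col, cols, width), SEEK_SET) != 0)
        return -1;
    return fwrite(text, 1, (size_t)width, fp) == (size_t)width ? 0 : -1;
}

/* Skip leading pad zeros, keeping one digit before a '.' or the end of the
   string (assumes non-negative values, so "000" becomes "0"). */
static const char *strip_leading_zeros(const char *cell)
{
    while (cell[0] == '0' && cell[1] != '\0' && cell[1] != '.')
        cell++;
    return cell;
}
```

For input cell (i, j) of an n x m matrix, the call would be put_cell(fp, j, i, n, width, text), since the transposed file has n columns. The file should be opened in binary update mode (for example "w+b") so that byte offsets match what was written. For files larger than a long can address, a 64-bit seek such as POSIX fseeko() would be needed instead of fseek().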
The time complexity is O(nm), from making four passes over the cells of your n x m matrix. Since cells in the intermediate output matrix are equally spaced, mapping a cell from the input to the output matrix is O(1). This should be very (system) memory efficient, as you're only storing one cell at a time in memory, reading/writing that cell from disk. You may need a fair amount of disk storage, however, for caching the intermediate matrix, depending on the desired precision.
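To get a feel for the disk cost, the intermediate file size can be estimated up front. This is a small sketch with a hypothetical helper name (intermediate_bytes), assuming one byte per character and one separator byte (comma or newline) after every cell.

```c
#include <assert.h>

/* Estimated size in bytes of the intermediate file for an n x m input:
   each of the n*m cells takes `width` bytes plus one separator byte. */
static unsigned long long intermediate_bytes(unsigned long long n,
                                             unsigned long long m,
                                             unsigned long long width)
{
    assert(width > 0);  /* a zero-width cell cannot hold a value */
    return n * m * (width + 1ULL);
}
```

For example, a 10,000 x 10,000 matrix padded to 12 bytes per cell gives 10,000 * 10,000 * 13 = 1,300,000,000 bytes, or roughly 1.3 GB of temporary disk space.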