Effectively pivoting data

Select data from database and transpose it into one line per part containing several data.

That process is better known as Pivot Table. The principle is not that exciting, once understood, but it is hardly done efficiently. One of our last tasks was to create a report over all production data for each part regarding a configuration that defines data to use and its sort order.

The next picture will show an example of how the data looks in database and what it should look like in report view.

pivot1

I’ve highlighted the corresponding values in both tables to give you an idea of where each value from the left is used on the right side.

Development

Your task as a developer would be to transform the data as fast as possible using a decent amount of memory.

It’s hard to tell what went wrong during the first implementation, but during review it wouldn’t meet my expectations.
I’ll try to describe the process in a few words:

  1. Query data using SqlDataReader converting each record into an object and storing all objects in a List<DataObject>.
  2. Creating Pivot data structure by looping through all DataObject and fetching all VALUEID  values.
    Storing all results in a List<ColumnDefinition> containing display name and sort order information.
  3. Calculating Pivot Table by looping all DataObject entries creating a DataTable ReturnTable.
  4. Looping through all cells in ReturnTable and replace null values with “<No data available>”.
  5. Display ReturnTable as Pivot Report.

That code caused a lot of troubles, neither performance nor memory consumption were acceptable. In my test setup the test began to fail for 10 million data entries.

pivot_nonoptimized_results

The code used for my tests can be found here: https://github.com/dwonisch/PivotExample

Optimization

I’ll try to optimize that code and show up some things that went wrong in the next parts of this series.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>