Loading data from Excel – welcome to my new friend, PALO.SETDATA_BULK

 I have a pathological aversion to large spreadsheets. Probably the most common issue I encounter in consulting is people trying to read or write 20,000+ rows * 150 column spreadsheets from their OLAP system. Then wondering why everything is slow on their client machine.

In my opinion, if you have a decent BI system, and you have managed to create some cultural change in your organisation around how data is surfaced, you don’t really need a 20,000 row spreadsheet on a daily basis. All you do is create further chaos and completely miss the point of a BI system, which should be to simplify how users interact with your organisational data, not make that interaction more complex.

If there is one thing worse than Excel Hell, it is OLAP Hell.

But no matter how much I rant, the fact and the reality is that people do use and need big spreadsheets on a daily basis in their organisations. I cannot escape the reality of reporting pressures that my clients are put under to conform, especially to ‘the way things have always been done’.

So, in light of this, I was pretty impressed with the latest small but important functionality addition in Jedox 5.0 SR1 – the PALO.SETDATA_BULK function. Up until now, we have only had 2 options in Excel: PALO.SETDATA (slow for lots of cells) and Jedox Importer (slow for lots of rows of data, and a bit counter-intuitive for a management accountant). Both of these you can you use in  either spreadsheet cells or vba.

The PALO.SETDATA_BULK allows you to take advantage of the same api’s that the ETL uses to load data in excel. PALO.SETDATA_BULK also takes advantage from a single array formula in a single cell (as opposed to 20,000 PALO.SETDATA formulas all trying to execute at the same time).

Excel Spreadsheet with PALO.SETDATA_BULK

From my testing, it can send in over 20k rows in excel in less than 1.5 seconds. The same data sent in via normal PALO.SETDATA formulas take over 30 seconds .

Both tests were done against a localhost, so against a client server environment there will be other latencies like networks, etc. This is a massive improvement in anyone’s language and certainly will assist people who have the need for those large spreadsheets.

One note: you need to structure the data in a fact-table format to allow the array formulas to work. See the example against the Demo/Sales cube.

EDIT: I was asked to post the example spreadsheet, sans IF statement. You can grab it from here.

5 thoughts on “Loading data from Excel – welcome to my new friend, PALO.SETDATA_BULK

  1. Jos van der Meer

    I have 235K rows (9 columns including the value column). The setdata_bulk handles a maximum (at my computer) of approx. 120K rows. These rows are processed in only a few seconds.

    Reply

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s