Sometimes vendors do get it (mostly) right. Hewlett-Packard put together a brief white paper in February of this year laying out their view of Business Intelligence (BI) for 2009 (and beyond), and I think they got it largely right. Their #10 trend notes the increasing integration of Complex Event Processing (CEP) engines into traditional data warehouse (DWH) and BI platforms. Below is a summary of the trend, my thoughts on whether HP got it right, and what the trend may mean for HIT.
HP Predicts: As with HP's Trend #9, this is not so much a prediction as an observation. A little background is necessary; for those of you who know this story in detail, forgive my ellipses and simplifications.
Historically, transactional and operational systems were about fleeting data and fleeting queries; that is, they were optimized to carry out individual transactions at very high rates and with very high fidelity. This posed a problem for management when it wanted to look for patterns in the transactional/operational data, and so the DWH was born. In the DWH, data was captured, organized, and archived (i.e., it was “warehoused”) to make the data persistent, with the expectation that queries would be fleeting, subject to the whim of management and the acumen of the analyst. As managers and analysts began to run the same reports over and over, looking for exceptions as drivers of cost and profit, queries began to persist. These persisting queries put an ever-increasing burden on the DWH, and so BI was born. BI leveraged the persistence and structure of the data in the DWH but offloaded reporting and analysis. BI, then, was about mixed workloads of persistent queries (periodic reporting) and fleeting queries (drill-down on exceptions) against persistent data (the DWH).
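To make the "persistent data, mixed queries" pattern concrete, here is a toy sketch (hypothetical schema and table names, not from the white paper): the data sits in a warehouse table, one query is the canned report run over and over, and another is a one-off drill-down on an exception that report surfaced.

```python
# Toy DWH/BI workload: persistent data, mixed persistent/fleeting queries.
# Schema and values are invented purely for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 100.0), ("east", 250.0), ("west", 40.0)])

# Persistent query: the same periodic report management runs repeatedly.
PERIODIC_REPORT = "SELECT region, SUM(amount) FROM sales GROUP BY region"

# Fleeting query: an ad-hoc drill-down on an exception the report exposed.
ad_hoc = "SELECT * FROM sales WHERE region = 'west' AND amount < 50"

report = conn.execute(PERIODIC_REPORT).fetchall()      # runs every period
exceptions = conn.execute(ad_hoc).fetchall()           # runs once, then gone
```

The point of the sketch is only the shape of the workload: the table persists between queries, `PERIODIC_REPORT` persists between runs, and `ad_hoc` is written, run, and discarded.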
All was well in the world of DWH/BI until data volumes began to grow exponentially, and the traditional ETL tools and architectures used to move data from the transactional and operational systems into the DWH began to groan under the weight of gigabytes and terabytes of data per day. What’s more, line managers and even operators began to ask for access to the DWH in order to make better (near) real-time decisions. To meet these new needs, Operational BI (OBI) was born. In OBI, the queries are still mixed (hence the BI), but the data is fleeting. OBI, however, still requires a human to consume a report and make a decision, and it is therefore bandwidth-limited, especially for decisions and insights that could be automated. At last we come to CEP!
In CEP, the queries are persistent: they are algorithms proactively scanning fleeting, (near) real-time data, looking for an exception or connection from which to trigger an action. There are no fleeting queries in CEP, so there is no mixed workload, and the data is pulled straight from the transactional and operational systems (often joined with baseline, benchmark, or reference data from the DWH). In short, as stated in the title, DWH is fleeting queries and persistent data, while CEP is persistent queries and fleeting data.
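The inversion can be sketched in a few lines. This is a minimal, hypothetical CEP-style engine (the class and event names are mine, not any vendor's API): a rule is registered once and stands forever, while each event is examined against the standing rules and then discarded.

```python
# Minimal sketch of "persistent queries, fleeting data": rules persist,
# events flow through and are never stored. All names are illustrative.
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Event:
    source: str   # e.g. an operational system feeding the engine
    metric: str
    value: float

class CepEngine:
    """Holds standing rules; events are checked, acted on, and dropped."""

    def __init__(self) -> None:
        self.rules: List[Tuple[Callable[[Event], bool],
                               Callable[[Event], None]]] = []

    def register(self, predicate, action) -> None:
        # Registering once is what makes the query "persistent".
        self.rules.append((predicate, action))

    def ingest(self, event: Event) -> None:
        # Each event is tested against every standing rule, then forgotten.
        for predicate, action in self.rules:
            if predicate(event):
                action(event)

# Usage: alert when a latency reading exceeds a threshold (in practice the
# threshold might be baseline or reference data drawn from the DWH).
alerts = []
engine = CepEngine()
engine.register(lambda e: e.metric == "latency_ms" and e.value > 500,
                lambda e: alerts.append(f"{e.source}: {e.value}"))

for ev in [Event("order-svc", "latency_ms", 120),
           Event("order-svc", "latency_ms", 742)]:
    engine.ingest(ev)

print(alerts)  # only the out-of-bounds reading triggers an action
```

Note what is absent: there is no table of events to query after the fact, and no ad-hoc query interface at all. The only "query" is the predicate registered up front, which is exactly the persistent-query/fleeting-data trade described above.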