
Everything You Know About Business Intelligence, Data Warehousing and ETL is Wrong — Part II


A Snapshot of Today and a Glimpse of Tomorrow

So, here we are in 2010 (if you want to know where we've been since 1965, take a look at my last blog post). Between the Open Source Software (OSS) movement and The Cloud, businesses are increasingly asking why they should pay for restrictive software licenses, and for the dedicated hardware to run that software on, when there are technologies and business models which can meet 80% or more of their business needs for a fraction of the dollar cost and implementation/maintenance effort. This is classic disruptive innovation. Taken to its logical conclusion, commercial, proprietary, best-of-breed enterprise software, especially in the data warehousing market, will be relegated to the extreme high-performance margins of the market. The remainder of the market will be owned by services vendors and a handful of mega-vendors.

The services vendors will offer a value proposition based around their ability to integrate, support, innovate and indemnify the existing Free OSS and Hybrid OSS offerings running on or across various flavors of The Cloud. The mega-vendors will offer a value proposition based around “owning-the-stack”, that is, a unified suite of semi-proprietary hardware and software tightly integrated and presented via a consistent graphical user interface (GUI) in order to optimize and simplify the management of the entire data warehouse ecosystem. The most ambitious of the mega-vendors-to-be are already racing to grab everything from the transactional systems which create and collect the data, to the virtual hardware which processes it, to the GUIs which present it, and every architectural component in between.

Yet the turn-of-the-millennium belief and expectation still very much persists that data warehousing is a tool-driven exercise, characterized by multiple distinct stages of processing and preparation; that it is a big and expensive undertaking requiring as much political and change-management skill as technical and project-management expertise. Such beliefs and expectations set unreasonably high barriers to the initiation and successful implementation of data warehouse initiatives, and they perpetuate the manufactured divide between IT and the business (in our case, the business of clinical care). These expectations primarily benefit the best-of-breed holdouts, the services organizations and the mega-vendors who seek to justify their too-high costs and too-low success rates. They certainly do not benefit you or your organization (whether provider or payer or device-maker).

As I stated in the first paragraph of the first post in this series, data warehousing begins with data and it ends with data, and there is nothing in between but data. Data is the nervous system of your enterprise. Your source and transactional systems are the “senses” through which your enterprise perceives and reflexively responds to its internal health, its patients and/or customers, its competitive market and the larger economy. Your reporting and analytical systems, whatever their form, are the “brains” through which your employees understand, innovate and adapt to changes in the enterprise's internal health, its patients, its customers, its competitive market and the larger economy. Finally, your employees are the muscles and your physical plant the bones through which innovation, adaptation and day-to-day operations are performed.

Any data warehousing initiative should focus first and foremost on the business need, in this case the clinical need: what process, regulation or opportunity requires data in order to generate, either directly or indirectly, value for the enterprise. And it should focus a very close second on the management and governance of that value-generating data, because that data is the raw material of information products: the actionable outputs from which human and automated decisions are made. From this vantage point, there is no inherent need for an ETL tool or a BI platform or an AA engine or any other ecosystem acronym. But there is an absolute need for careful, considered and correct management of the technologies, methodologies and processes through which data is collected, processed and presented as meaningful and useful, high-value information.



Comments

Marc, I'm enjoying this.

I'm voting for a Part III.

If Part II includes the "glimpse of tomorrow", what's your vision of where BI, DW, and ETL are going? Do you see the same kinds of evolution of governance that characterize Web 2.0 (democratization of data, people like you and me publishing blogs, etc.)? There's a vision out there, for example with ICD-11 (eleven), of the development process being more participative and evolutionary. I appreciate that it's heresy to propose vision when we haven't tackled ICD-10 in the US, and some BI/DW/ETL projects have fallen short of their goals. That said, I suspect you have some insights:

1) Do coxswains need real-time and historical BI input?
2) How about the rowers?
3) How about those who are putting on the multi-competitor race?

What might that look like for HCIT in the 2015 to 2025 timeframe? Sounds like a Part III to me.
