ICU Data Mart: A Non-IT Approach

October 25, 2011
A Team of Clinicians, Researchers and Informatics Personnel at the Mayo Clinic Has Taken a Homegrown Approach to Building an ICU Data Mart

This was the start. We now had version 1.0! No beta versions, no releases. Of course, each new “piece” required careful testing and validation, performed by comparing our automated results against the source EMRs on manual review of the medical records. This step was mandatory before moving newly developed data elements into production. Additional statistical controls were also used to check for unanticipated gaps in the data, as well as potential outliers.
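The article does not describe the statistical controls in detail, but the idea can be sketched with two simple checks: flagging values outside an expected physiologic range, and detecting breaks in a timestamped feed. This is a minimal illustration in Python; the function names, thresholds, and heart-rate example are our own assumptions, not the team's actual code.

```python
# Illustrative sketch only: range checks and gap detection of the kind
# a data-mart validation step might run. Thresholds are hypothetical.
from datetime import datetime, timedelta

def flag_outliers(values, low, high):
    """Return values falling outside the expected physiologic range."""
    return [v for v in values if v < low or v > high]

def find_gaps(timestamps, max_gap=timedelta(hours=1)):
    """Return (start, end) pairs where consecutive samples are farther
    apart than max_gap, suggesting a break in the data feed."""
    ts = sorted(timestamps)
    return [(a, b) for a, b in zip(ts, ts[1:]) if b - a > max_gap]

# Example: heart-rate samples with one implausible value and one gap.
hr = [72, 80, 310, 68]               # 310 bpm is physiologically implausible
print(flag_outliers(hr, 20, 250))    # -> [310]

t0 = datetime(2011, 1, 1, 8, 0)
times = [t0, t0 + timedelta(minutes=30), t0 + timedelta(hours=3)]
print(find_gaps(times))              # one gap: 08:30 -> 11:00
```

Checks like these run automatically, while the manual chart review described above remains the gold standard for promoting an element to production.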

Having moved the initial piece into a production phase, we immediately began working on the next data element. Since we needed to identify arterial blood gas results, our next focus was the source database housing laboratory data. Piece by piece, the database grew (and continues to grow). All the while, previously tested and validated data have been available to the end users. Without this approach, it would have taken years to realize a functional “fully integrated EMR/database.” In contrast, this system was functional from the very beginning. Additional data are simply added to the existing database and the process continues to move forward.

CONCEPT 2: UNIX

While some people fondly remember the command line, most database end users prefer a Windows-based interface. This works well for standard office tasks, but it is often inadequate for working with complex databases. Furthermore, developing multiple interfaces adds layers of complexity, cost, and potential error.

Indeed, complex database solutions often require a custom-built query interface. Such an interface generally translates its own query language into SQL commands. Database end users must not only understand the interface, they must also learn its query language. Moreover, these interfaces often require additional resources such as web servers and a team that can develop, support, and improve the interface over time, an iterative, ongoing process.

For the ICU data mart, we chose to explore query building tools that reside in the statistical software. Most of these embedded query building tools have the ability to interrogate databases using open database connectivity (ODBC). Microsoft Excel is an example of one such tool.
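The pattern is the same in any environment with a database connection: the query runs from inside the analysis tool, and results land directly in the session. As a hedged sketch, here is the equivalent in Python using the standard DB-API, with an in-memory SQLite database standing in for an ODBC source; the table and column names (`icu_labs`, `patient_id`, etc.) are hypothetical.

```python
# Illustrative only: the article uses JMP/SAS query tools over ODBC.
# sqlite3 (in-memory) stands in here for an ODBC-connected source, and
# the schema below is invented for the example.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE icu_labs (patient_id INT, test TEXT, value REAL)")
conn.executemany("INSERT INTO icu_labs VALUES (?, ?, ?)",
                 [(1, "pH", 7.31), (1, "pCO2", 48.0), (2, "pH", 7.44)])

# The result lands directly in the analysis session -- no export step.
rows = conn.execute(
    "SELECT patient_id, value FROM icu_labs WHERE test = 'pH'"
).fetchall()
print(rows)  # -> [(1, 7.31), (2, 7.44)]
```

Against a production source, only the connection line would change (an ODBC driver and connection string instead of `sqlite3.connect`); the query-and-analyze workflow is identical.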

For most of our analytic needs, we have found that JMP statistical software (from SAS Institute Inc., Cary, N.C.) was quite adequate. Embedded query tools require no additional interfaces and no data export step. The data are simply right there, residing within a powerful statistical program and immediately available for the desired analyses. For those few circumstances where more robust analyses were needed, we used SAS Institute's SAS Data Management software.

CONCEPT 3: MATRIX

Do you remember the nice green-on-black screens from the Wachowski brothers' movie, The Matrix? How the data visually cascaded down the screen? Beautiful, raw data! The concept of the Matrix is all about storing raw data: no pre-processing, no massaging, no normalizing. Only the original data are stored.

Don't get us wrong: data parsing, processing, and normalization are extremely important, but the process varies depending on the specific data need. Moreover, pre-processing and normalization result in an unnecessary loss of data, and this loss often proves to be a barrier when future data needs arise. In contrast, post-processing and normalization allow the end users (or applications) to tailor the data to their specific needs while keeping the full complement of data elements available for future use.
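A small sketch of "store raw, normalize on read": the stored rows keep their original values and units, and each consumer applies its own normalization at query time. The field names and the creatinine unit-conversion example are illustrative assumptions, not drawn from the article.

```python
# Hedged sketch: raw rows retain the source's original values and units;
# normalization happens when the data are read, not when they are stored.
# Field names and the conversion factor (88.4 umol/L per mg/dL) are
# standard chemistry but the schema itself is invented for this example.
raw_rows = [
    {"test": "creatinine", "value": "1.2", "units": "mg/dL"},
    {"test": "creatinine", "value": "106", "units": "umol/L"},
]

def to_mg_dl(row):
    """Normalize creatinine to mg/dL at query time, not at ingestion."""
    v = float(row["value"])
    return v / 88.4 if row["units"] == "umol/L" else v

print([round(to_mg_dl(r), 2) for r in raw_rows])  # -> [1.2, 1.2]
```

Because the raw strings and units survive in storage, a future consumer who needs µmol/L, or the original verbatim text, loses nothing.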

Importantly, filtering data feeds may be necessary, as you will likely not need (or want) to store all aspects of the technical data. Rather, what you really want to store are the meaningful data. We advise that you take some time to determine which data elements are meaningful and which are unnecessary and can be filtered out. Ultimately, when the meaningful raw data are available, organizing, using, and summarizing the data becomes far more powerful. For example, if report requirements change, it is much easier to modify existing code within the data mart than to modify the interfaces with the various source databases.

An additional key element regarding data acquisition is the timing of its availability. Due to the increasingly fast-paced nature of medicine, particularly in high-acuity environments such as the operating room and ICU, near real-time feeds are of increasing importance. However, real-time data feeds can come at a cost, particularly with regard to resource utilization and the stability of the source databases. Therefore, you must determine just how time-sensitive your data needs might be.

Generally, data requirements for quality initiatives, reports, and research do not require real-time data feeds. In most clinical systems, real-time data are not truly real-time; for example, “real-time” clinical notes appear only after they are transcribed and finalized by the authoring clinicians, and ICD-9 codes are generally assigned only after a patient is discharged. Are these data sources ever truly “real-time?” Often, choosing an appropriate time interval for data retrieval can save significant resources without sacrificing a system's usefulness.
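One common way to implement interval-based retrieval is an incremental pull: at each scheduled run, query the source only for rows newer than a stored watermark. This is a minimal sketch of that idea, assuming a hypothetical `source_vitals` table; it is not the Mayo team's actual feed logic.

```python
# Illustrative incremental pull: instead of a continuous real-time feed,
# query the source at a chosen interval for rows newer than the last
# watermark. Table and column names are hypothetical; sqlite3 stands in
# for the source database.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source_vitals (recorded_at TEXT, hr INT)")
conn.executemany("INSERT INTO source_vitals VALUES (?, ?)",
                 [("2011-10-25 08:00", 72), ("2011-10-25 09:00", 75)])

def pull_since(conn, watermark):
    """Fetch only rows recorded after the last successful pull, and
    advance the watermark to the newest timestamp seen."""
    rows = conn.execute(
        "SELECT recorded_at, hr FROM source_vitals WHERE recorded_at > ?",
        (watermark,)).fetchall()
    new_watermark = max((r[0] for r in rows), default=watermark)
    return rows, new_watermark

rows, wm = pull_since(conn, "2011-10-25 08:30")
print(rows, wm)  # -> [('2011-10-25 09:00', 75)] 2011-10-25 09:00
```

Lengthening the interval between pulls trades freshness for a lighter load on the source system, which is exactly the trade-off the paragraph above describes.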

In summary, our group of clinicians, researchers, and informatics personnel has developed an ICU data mart that contains a near real-time copy of pertinent patient information for a population of 206 ICU beds, with an average of 15,000 ICU admissions per year, including historical data going back to 2003. Having been in use for almost five years, the approach taken by our team has proved efficient, adaptable, and very well-suited to time-sensitive environments such as the ICU.
