Three hospital systems provide details about how technology has influenced the way they prepare for disasters and what they have learned from their experiences.
Disasters can strike at any time, and there is really no way provider organizations can completely insulate themselves from unforeseen or large-scale natural events such as hurricanes, floods, and fires. Nonetheless, as hospitals continue on their steady march to becoming paperless organizations, many are following strategies that are minimizing their risk of unplanned downtime.
Key to any disaster recovery effort is the ability to protect electronic data, whether in the core clinical information systems or in ancillary systems such as imaging or business functions, according to experts interviewed for this article. Jeff White, a principal of the Pittsburgh-based Aspen Advisors, LLC, notes that disaster planning is typically a top-down process that includes the clinical and business units in a hospital organization. The IT department, he says, should play a central role as implementer, charged with enacting plans, making the investments in technologies, and architecting systems that meet the clinical and business requirements.
How is technology driving better preparedness? Healthcare provider organizations are following various strategies to prepare against unplanned downtime. White provides a few trends that help to explain progress in the disaster recovery arena. More hospital systems are moving toward multiple data center environments, purely for the sake of disaster recovery and business continuity.
Core electronic health record (EHR) systems have an architecture that lends itself to real-time or near real-time replication of data. The data exists as a single database, so it can be replicated in its entirety from a primary site to a secondary site. Replication can be done in near real time, resulting in minimal data loss in case of an interruption.
For ancillary systems, many providers are moving to a virtualized environment. Some providers, especially larger ones, have invested in storage area network (SAN) replication, with a duplicate SAN at a remote site. This can be an expensive set-up, and some mid-sized hospitals are still on that migration path. The advantage is that replication can occur very quickly.
In White’s view, most hospitals do a good job planning, configuring, and testing their disaster recovery capabilities, particularly with their core EHR systems. However, he adds that many organizations struggle with their ancillary systems, because they often lack the people, bandwidth, and time to test major changes adequately on an annual basis.
Disaster recovery system testing should happen annually, White says, adding that regular testing helps train the IT staff in proper procedures. In addition, disaster recovery plans should be revised whenever there is a change in the technology. “Once a disaster is declared, the staff may react differently because the technology has changed. If they don’t have that documented, then it becomes more difficult for them to react once the disaster has happened,” he says.
The following case studies discuss how three hospital systems prepare for potential disasters and the lessons learned from past experiences.
Lessons Learned in Joplin
In May 2011, an EF-5 tornado slammed into St. John’s Regional Medical Center in Joplin, Mo., part of the Mercy Health System, leaving a mile-wide path of destruction. Mike McCreary, chief of services at Mercy Technology Services in St. Louis, says the hospital’s disaster planning is an integrated effort. The IT infrastructure component at the corporate level provides redundancy and connectivity; and a local component operates at the community level. “We follow both hospital emergency and state command systems,” he says. Failover drills are done quarterly, and local disaster drills are done annually in conjunction with the city.
When the tornado struck, it destroyed Joplin’s communication infrastructure. Cell towers were destroyed, removing voice communication (there remained enough bandwidth for text messaging). To fill the gap, the hospital established a command center with a satellite link to provide phone and Internet connectivity, he says. As a result, Mercy now has a mobile communication center with satellite capability and satellite phones. Its plans now designate text messaging as the primary means of communication when a disaster happens.
McCreary says the hospital’s patient record systems fared well, partly the result of timing and partly due to the remote location of its data center and failover site.
At the time of the tornado, the hospital had been part of the Mercy system for about two and a half years. It was in the process of moving older equipment, including hardware and a variety of systems such as nursing documentation and legacy accounts receivable systems, from Joplin to a data center in Washington, Mo., about 250 miles away. “Our model is to have a central suite of applications that is standard on the Mercy system; and the transition was complete at Joplin except for some clean-up,” he says.
The hospital was already live with its EHR (supplied by Epic Systems Corp., Verona, Wis.), which was fully functional when the tornado struck. Had it struck prior to the go-live, it would have been much worse from a data standpoint, McCreary says: “We would have lost all of the systems; and even though there were backups, once something like that happens you are restoring new equipment, and there are always complications.”
McCreary notes that the data center in Washington, which was up and running at the time, works with a failover site in St. Louis. He adds that there is also a local component for each of the hospitals, where local servers would be used for storage and faster access. Those were destroyed at Joplin.
In the short term, the availability of patient data is crucial, he says. During recovery, “You want as much detail around that type of information, because you are dealing with casualties first off, and the more of that information you have available, the better care everybody gets in the beginning,” he says. After that, the hospital moved as quickly as possible to get back to a normal mode.
“Our goal is one patient, one record, and that record should include every encounter he has ever had with a Mercy hospital or clinic,” McCreary says. Since the Joplin tornado, Mercy has hardened its data center in Washington, he says, adding that it centralizes data as much as possible.
In addition to running its own data centers, Mercy is also a data center vendor, selling disaster recovery advisory services to other health organizations, McCreary says. He also thinks that large healthcare organizations should look for ways to work together to back up each other’s data.
Florida Hospital Goes Virtual
Florida Hospital, part of the Adventist Health System, is a 2,247-bed acute care organization with seven campuses in the Orlando metropolitan area. In 2004, the hospital experienced a “hurricane trifecta”—three hurricanes in one year—according to Robert Goodman, the hospital’s disaster recovery coordinator. At the time, the hospital used a “tapes and trucks” process, in which data backup tapes were transported physically to a secondary data center.
Although the hospital did not declare a disaster that season, had the data center been destroyed, the backed up data from the tapes would have been several days old, Goodman says. This prompted the hospital to explore alternatives for backing up data, which are centered on remote replication and virtualization.
In 2006, the hospital abandoned the use of backup tapes in favor of continuously replicating data to a secondary data center (operated by SunGard Availability Services, Wayne, Pa., which provides all data recovery services to the hospital) located nearly 1,000 miles from the hospital’s primary data center—and a safe distance from regional disasters. “Basically we mirror asynchronously to our hot site, and that keeps our data current,” Goodman says.
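The idea behind asynchronous mirroring can be illustrated with a minimal Python sketch. It is not a description of the hospital's or SunGard's actual setup; the class `AsyncMirror` and its method names are hypothetical, and plain dictionaries stand in for the two sites. The point it shows is that a write is acknowledged at the primary site right away and shipped to the remote site later, so the remote copy can trail by however many writes are still queued.

```python
from collections import deque


class AsyncMirror:
    """Toy model of asynchronous mirroring (hypothetical, for illustration)."""

    def __init__(self):
        self.primary = {}       # stands in for the home data center
        self.secondary = {}     # stands in for the remote hot site
        self.pending = deque()  # writes applied locally but not yet shipped

    def write(self, key, value):
        # The write completes locally without waiting for the remote site.
        self.primary[key] = value
        self.pending.append((key, value))

    def ship(self, max_writes=None):
        """Apply queued writes to the secondary in order; return how many."""
        shipped = 0
        while self.pending and (max_writes is None or shipped < max_writes):
            key, value = self.pending.popleft()
            self.secondary[key] = value
            shipped += 1
        return shipped

    def lag(self):
        """Number of writes that would be lost if the primary failed now."""
        return len(self.pending)
```

In this model, the value returned by `lag()` is the exposure: the smaller the queue is kept, the more current the hot site stays.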
In addition, Goodman reports that Florida Hospital has begun to virtualize its environments in both its home data center and its secondary site. Virtualization of its servers allows applications to be deployed very quickly, because they are not tied to a specific piece of hardware, he says, adding that virtualization also provides scalability, an important factor because the hospital has more than 100 disaster recovery applications. “We’re getting scalability because we don’t have as many physical servers, and replication is quick,” he says.
The hospital conducts a business impact analysis to classify which systems get backed up. “We look at what the impact would be to the enterprise if those applications were down for a certain period of time,” he says. The applications are tiered accordingly.
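A tiering step of this kind might be sketched as follows in Python. The tier cutoffs, the `Application` class, and the function names are assumptions made for illustration; the hospital's actual thresholds and tooling are not described in this article.

```python
from dataclasses import dataclass

# Hypothetical cutoffs: (tier, maximum tolerable downtime in hours).
# These values are illustrative, not Florida Hospital's actual tiers.
TIER_LIMITS_HOURS = [(1, 4), (2, 24), (3, 72)]


@dataclass
class Application:
    name: str
    max_tolerable_downtime_hours: float


def assign_tier(app: Application) -> int:
    """Return the recovery tier; lower tiers are restored first."""
    for tier, limit in TIER_LIMITS_HOURS:
        if app.max_tolerable_downtime_hours <= limit:
            return tier
    return 4  # anything that can wait longer than the last cutoff


def recovery_order(apps):
    """Sort applications by tier, then by how little downtime they tolerate."""
    return sorted(
        apps,
        key=lambda a: (assign_tier(a), a.max_tolerable_downtime_hours),
    )
```

Given a list of applications with their tolerable downtime, `recovery_order` returns them in the order a recovery team would bring them back under these assumed cutoffs.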
Goodman says data is written to disk, which is then replicated to the secondary site. With remote replication using virtualized servers, data replication time has gone down by a factor of 10. “That’s a good thing, because underlying technologies are becoming more complex,” he says.
In addition, Goodman says the hospital has begun to use remote access for its IT staff, which he calls “virtualizing the workforce.” Using an encrypted virtual private network (VPN) connection over the Internet, IT staff can access the recovery site and the home site remotely. At one point, Florida Hospital required 19 individuals to be on site during a recovery exercise. During its last exercise it sent four technicians; and in the next two years he hopes to be able to work 100 percent remotely.
In June, Florida Hospital subscribed to SunGard’s Managed Recovery Program (MRP), which Goodman says is in line with the disaster recovery strategies the hospital has taken so far. Under MRP, the hospital has access to a technical team that is a counterpart to the hospital’s IT team. The teams work together during disaster recovery exercises.
The bottom line, Goodman says, is that “we now deploy very few people in disaster recovery; we are current in our backup, and our disk data; and we have individuals trained on the other [MRP] end.” He notes that the hospital’s IT team and its MRP counterpart team work in an integrated way to solve problems. The MRP team also participates in the hospital’s weekly (virtual) change control meetings, where new recovery procedures resulting from changes to the system’s hardware or software are discussed.
Florida Hospital runs disaster recovery exercises twice a year for the purposes of quality improvement. “We want to learn from all of our mistakes,” Goodman says. The hospital’s disaster recovery plans are stored on a server on a private cloud.
Cooley Dickinson’s Centralized Approach
Luckily, Cooley Dickinson Hospital, a 140-bed facility in Northampton, Mass., has never experienced a disaster in its data center. Nevertheless, several years ago, it re-evaluated its readiness to reduce the possibility of unplanned downtime. At the time the hospital did not use an enterprise backup system, according to Kipling Morris, manager of systems engineering; it took a silo approach to backup, in which each server was backed up to individual tape drives, he says.
In a disaster recovery situation, the speed with which a hospital can restore its files is crucial, Morris says. “Any advantages you can get in backing up data, that’s the name of the game,” he says. The hospital decided to move to a centralized backup system, choosing a dedicated storage appliance (supplied by STORServer Inc., Colorado Springs, Colo.), which sits on top of IBM’s Tivoli system.
Backup is now done on disks, and the data is spooled off to tape. “You increase the throughput on the front end, which eliminates bottlenecks and reduces the overall backup window,” he says. The underlying Tivoli system uses “Incremental Forever” technology that tracks files and file versions, so it backs up only the most recent files or those that have changed.
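The general principle of backing up only new or changed files can be sketched in a few lines of Python. This is not how Tivoli is implemented; the function names and the use of a SHA-256 content hash as the change detector are assumptions chosen for illustration. A catalog remembers what was seen on the previous run, and each later run selects only the files whose contents differ from the catalog.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return a SHA-256 hex digest of the file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_to_back_up(root: Path, catalog: dict) -> list:
    """Return files under root that are new or changed, updating the catalog.

    `catalog` maps a relative path to the digest recorded at the last backup.
    It is a hypothetical stand-in for a backup system's file-version database.
    """
    changed = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        key = str(path.relative_to(root))
        digest = file_digest(path)
        if catalog.get(key) != digest:
            changed.append(path)
            catalog[key] = digest  # record the version just selected
    return changed
```

After one full pass, every later call returns only what changed since the previous call, which is why the backup window shrinks. Reading every file to hash it is a simplification; a production system would normally rely on cheaper signals such as timestamps or change journals.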
The hospital’s business data and primary EMR for its physician practices (supplied by Westborough, Mass.-based eClinicalWorks) are backed up at the hospital’s data center on campus. Remote users access the application from the main campus database to work off site. Business and clinical systems are stored on the hospital’s campus. The main HIS (supplied by Chicago-based Allscripts) is hosted off site by a third party. PACS is located on site, but the hospital keeps secondary copies off site.
In addition to the main data center, the hospital is developing a secondary site on campus, which is used for real-time replication of the hospital’s systems. “This is not so much for backup and recovery as it is for business continuity, because that would allow us to be extremely agile in the event of lost systems at the primary site,” Morris says.
Parting Advice for Future Events
While there is no foolproof way to prepare for every disaster threat, each of those interviewed offered advice for minimizing the risk of unplanned downtime.
Jeff White of Aspen Advisors says that hospital organizations should constantly revise their disaster recovery procedures to reflect changes in the IT systems. Hospitals need to make sure they test their recovery plans frequently.
Robert Goodman of Florida Hospital says that once a hospital has made a decision to go with a paperless system, patient care depends on a hospital’s ability to recover quickly. Hospital administrators and CIOs need to think through what they need to do to bring their systems back to parity. “When you go to these systems, you need to support them from a recovery standpoint,” he says.
Kipling Morris of Cooley Dickinson says that when it comes to backup and data recovery, there is no single approach that satisfies everyone’s needs. “Everyone wants to make sure they are identifying all of the appropriate use cases in their environment and are addressing them appropriately,” he says.
Finally, Mike McCreary of Mercy Technology Services says the disaster recovery procedures that were in place were up to the task, adding that “it’s unfortunate that we had to put them to use.” He calls the mobile communications center a worthwhile improvement.