
Speaking of VISTA and voice recognition

November 1, 2007
by James Feldbaum

First, let me begin with a confession. I can’t type. Well, I do, but with only a few fingers and an overreliance on auto-correct. I took typing in high school (over four decades ago), but at the time could see no future advantage in perfecting the skill. Furthermore, it exposed my inability to spell, which my terrible handwriting often obfuscated.

Although the new generation of computer users has substantial typing skills, natural language interfaces have become a viable, and often preferred, computer interface. Microsoft’s VISTA has a built-in voice recognition engine (Bill Gates interview by CNET) which could compete with more costly add-on commercial products.

The compelling need is for physicians to accurately input information electronically in a manner and setting consistent with work and thought-flow. Of course, the driving forces behind medical speech recognition are the increasingly prohibitive cost of transcription and a growing concern over the mouse and keyboard as possible sources of cross-infection.

How have you been using speech recognition in your practices? Any tips or warnings? Have you had any experience with the functionality in VISTA? Where is this segment of IT headed?

Interesting Quote: “Talk low, talk slow, and don’t talk too much.” John Wayne, advice on acting



I've read a few of your posts regarding healthcare IT and how it should enable "thoughtflow" and not disrupt clinician efficiency. I believe you reference this also in your Ambulatory EMR post (Office EMR's may be more Affordable - or something to that effect).

"The compelling need is for physicians to accurately input information electronically in a manner and setting consistent with work and thought flow."

My ponderings are along these lines: If the above assumption is true - where is the break point between tablet PCs and digital pens?

To be sure, the two are very different and yet could play together well in healthcare. Tablets have the advantage of being able to drive applications - and multiple ones at that. They can also be a source of data retrieval (yet another benefit). But digital pens speak to technology that is intuitive by its very nature - we've been using writing instruments for millennia. Adding transcription to the data collected by the digital pen is a feature that will need some careful consideration (the accuracy rate is only 90% - not good enough for healthcare standards) and needs feedback at some point through a thin client application monitored by the clinician. However, if the image is simply captured and sent directly to the EMR - is that also beneficial?

Admittedly, the usefulness of digital pens is not all-encompassing. They are at most a nice solution for clinicians (like yourself, perhaps) who cannot type well, or for siloed departments (like surgery or anesthesia) that need to capture drawings or complete forms with little narrative - to minimize errors in transcription.

But might the next mental jump in "Intuitive Technology" be, perhaps, Surface Technology (like that of Microsoft Surface or iPhone)?

If we are constantly looking for ways to embrace the natural ways that clinicians use and move information, why keep asking them to use input devices that require skillsets that need to be developed and learned? If the paper chart is the tradition (and books are the civilian medium we use to hold and seek information), then Surface technology speaks to this ability to use and move information in a way that is very familiar to us - and has been forever. In that kind of environment, we then begin to see that the possibilities for interface become endless (patient walls, desktops, hallways, etc).

Finally (getting back to a familiar theme in your article) we come to data retrieval and data entry. In a world of Surface - how might that be achieved? Well, for hard copy data (items the patient brings with them), the absorption model works with Surface (meaning you can place the Medication List, or Consent, or Biopsy directly on the Surface and it will "absorb" into a digital image).

But for data manipulation we will need something faster - and that brings voice to mind. We communicate every day through speech, and speech recognition transcription technologies are much more accurate than those of the digital pen (nearing 99%). If clinicians (nurses and practitioners alike) can access "voice-activated charting" during their rounds, imagine how much time would be saved throughout each 12-hour shift, and how much more accurate charting would be. Imagine, too, what this method would do for patient safety in scenarios where time is critical (like Cardiac Arrest, where a patient's room could be recording all voices and translating the speech into data to be displayed a few seconds later on the Surfaces in the patient's room).

I'll end with one final thought about today's HIT. When Epic rolled out at my facility, the nursing staff in ICU had to be doubled. One nurse for patient care. One nurse to chart in Epic.

So I wonder - how efficient and "patient safe" are our clinical documentation systems - how intuitive are they, really?

You bring up a number of good points. We all share frustration as we try to adapt to a technology that, by its inherent limitations, does not match the way we think or work. True, most of us were trained to practice with pen and paper. Most of us made the transition to dictation without difficulty, as speech is a natural form of communication and dictation is merely speaking in a prescribed format.

As I move from one consulting engagement to another, I am always impressed by the variety of solutions that have been tried to turn our thoughts into retrievable data. What is clear, however, is that handwriting with pen on paper is most often illegible and therefore useless. Scanning in (digitizing) the written record is a temporizing measure at best, since it only makes the written record more retrievable. Our need and goal must be the creation of data elements that can be reviewed, trended, and shared.

Since the posting of this blog, I have begun dictating with Dragon NaturallySpeaking. In fact, I am dictating this response. I was very impressed by how quickly I adapted to dictating my e-mail as well as Word documents. Dictation is quite natural for most of us, and now that the software is accurate and affordable, it becomes a viable though still not perfect solution. I suspect that in the future we will need to develop a hybrid form of transcription that also captures discrete data fields. I have demoed software that, through the use of templates and branching logic, does well at capturing retrievable data. Sadly, however, the resulting documents are often so full of "canned" verbiage that their utility is diminished by the preponderance of boilerplate. I am not dismissing this technology, as it holds great promise, but I can't help but comment that its present use does more to document the level of service provided than to transmit important patient information.

Lastly, your comment about the number of nurses it takes to provide care in your ICU is an example of a process gone astray. When clients ask me whether the software they are considering is beta software, I always respond that "all of our software is beta." At some point our software, our devices, our workflow, and our thought flow will all be in sync, but for now, we must settle for "less imperfect."

James Feldbaum

Jim Feldbaum is a physician consultant specializing in clinical transformation, CPOE, and...