
Plucked, Cooked, Enjoyed—Swan 1, Turkey 0

November 19, 2010
by Joe Bormel
Don't let the narrative fallacy cloud healthcare data


Lesson from a Turkey
Tim was born about three years ago. Curious from birth, Tim was particularly good at making astute observations and had a charmingly playful approach to the world. This was very adaptive for Tim. He was well liked by his peers and he was naturally very happy. He was also a great problem solver.

At two years old, Tim started to become a lot more self-aware. He was intensely interested in his environment and his future. He was very proud of his mental and physical growth. He wondered when he would be as big as some of his older neighbors and what he might do in the future. Fireman, policeman, Indian chief? Typical stuff for some two-year-olds.

Tim was careful about his observations. He learned to avoid things that got him into trouble. Part of his contentment came from these reassuring observations. For example, he observed that his living environment was almost always clean, warm, dry and well lit, and that meals were very reliably available at regular times every day. Everyone was pretty friendly.

Using this evidence-based experience and sound, empiric reasoning, Tim concluded that he had a great future. Tim passed away this week. Yes, he was a turkey, and yes, he was "sacrificed" for Thanksgiving. Tim had concluded that three years of consistent experience, the reliable availability of food, warmth, and safety, with narrow variation, portended more of the same. Tim was taken in by a number of fallacies, fallacies shared by most applications of secondary data use, comparative effectiveness research, and observational studies in general. He was unaware of the impact that the behaviors and plans of others had on the relevance of his observations.
 

The Narrative Fallacy
... the "narrative fallacy" (also called illusory correlation) which refers to our tendency to construct stories around facts, which in love for example may serve a purpose, but when someone begins to believe the stories and accommodate facts into the stories, they are likely to err.

- From Wikipedia, "The Black Swan (Taleb book)"

As we look at retrospective data and fashion a story to explain it, it's extremely important to realize that most humans, including doctor-doctors (MDs and PhDs, as well as executives with and without advanced degrees in every field), are highly prone to these fallacies: the narrative fallacy; the attempt to view data as linear (or Gaussian, with predictably low risk of falling outside of prior experience; see "Fooled by Randomness"); and the fallacy of failing to recognize the sampling and cognitive biases that financial incentives routinely cause in the best of us. Healthcare rarely follows relatively simple laws of physics like the law of gravity.

This holiday season, as we contemplate HCIT data collection (sometimes called documentation), assessments (sometimes called BI or quality and performance reporting), and interventions (sometimes called CPOE with EBM), it's important to stop now and then to think about the implications of Tim's life. We need to think about significant, rare events called black swans, and about including the "unknown unknowns" in our planning. We need to keep numerical risk management and prediction in its place: it is not an exhaustive framework to be blindly trusted, as Tim learned too late.
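To make the turkey problem concrete, here is a minimal sketch in Python (using only the standard library; the daily "well-being" scores and the six-sigma band are invented for illustration) of how a model fit to a reassuringly stable history confidently rules out the one event that matters:

```python
# A minimal sketch of Tim's reasoning error: 1,000 days of stable
# observations, a Gaussian model fit to them, and one day that the
# model calls impossible.
import random
import statistics

random.seed(42)

# 1,000 days of daily "well-being" scores: clean, warm, dry, well fed.
history = [random.gauss(100.0, 2.0) for _ in range(1000)]

mu = statistics.mean(history)
sigma = statistics.stdev(history)

# Tim's model: tomorrow will fall within a few standard deviations
# of the past. Even a generous six-sigma band looks safe.
lower_bound = mu - 6 * sigma

# Day 1,001: Thanksgiving. The event isn't drawn from the same
# process at all -- the farmer's plans never appeared in the data.
thanksgiving = 0.0

print(f"Model expects no day below {lower_bound:.1f}")
print(f"Day 1001 actually scores {thanksgiving:.1f}")
```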

Enjoy your holiday!

Credits:
1) The turkey story was adapted from The Black Swan by Nassim Nicholas Taleb.
2) Turkey graphic in title from easy-child-crafts.com
This post available at http://bit.ly/bH6XSx
 


Comments

Mark - thanks for your kind words. Your writing and searching for truth have been an inspiration to me. The concept of confirmation bias is clearly a hazard for professional journalists, as mock journalist Jon Stewart often points out. It is also clearly a challenge when learning from observational data.


Scott, thanks for your perspective as well. Your observation, "our brains are quite poorly designed for unbiased intuition," was a remarkably clear re-statement of the thesis of this post. For those interested in a terrific, recent, best-selling walk-through of the human intuition problem, Taleb's aforementioned black swan book is excellent. He also links that to a broad set of unpredictable events (Google, 9/11, Harry Potter) that have had impact beyond the predictable ones. Here's a short, humorous video interview of Taleb by Colbert that you'll probably enjoy. There's a very high-level summary of Taleb's book here.

The writer I've found helpful on the topic of common cognitive distortions of the emotional realm is David Burns.  He points out that we all tend to think in extremes...and when traumatic events happen we think that way even more. Here are some common cognitive distortions that he elaborates: 

  1. All-or-nothing thinking: You see things in black-and-white categories. If your performance falls short of perfect, you see yourself as a total failure. (Taleb makes the related point earlier in his book that how we classify the world determines what we see, what we don't see, and our prejudices around significance.)

  2. Overgeneralization: You see a single negative event as a never-ending pattern of defeat.

  3. Mental filter: You pick out a single negative detail and dwell on it exclusively so that your vision of all reality becomes darkened, like the drop of ink that discolors the entire beaker of water.

  4. Disqualifying the positive: You reject positive experiences by insisting they "don't count" for some reason or other. You maintain a negative belief that is contradicted by your everyday experiences.

  5. Jumping to conclusions: You make a negative interpretation even though there are no definite facts that convincingly support your conclusion.

     • Mind reading: You arbitrarily conclude that someone is reacting negatively to you and don't bother to check it out.

     • The Fortune Teller Error: You anticipate that things will turn out badly and feel convinced that your prediction is an already-established fact.

  6. Magnification (catastrophizing) or minimization: You exaggerate the importance of things (such as your goof-up or someone else's achievement), or you inappropriately shrink things until they appear tiny (your own desirable qualities or the other fellow's imperfections). This is also called the "binocular trick."

  7. Emotional reasoning: You assume that your negative emotions necessarily reflect the way things really are: "I feel it, therefore it must be true."

  8. Should statements: You try to motivate yourself with shoulds and shouldn'ts, as if you had to be whipped and punished before you could be expected to do anything. "Musts" and "oughts" are also offenders. The emotional consequence is guilt. When you direct should statements toward others, you feel anger, frustration, and resentment.

  9. Labeling and mislabeling: This is an extreme form of overgeneralization. Instead of describing your error, you attach a negative label to yourself: "I'm a loser." When someone else's behavior rubs you the wrong way, you attach a negative label to him: "He's a damn louse." Mislabeling involves describing an event with language that is highly colored and emotionally loaded.

  10. Personalization: You see yourself as the cause of some negative external event for which, in fact, you were not primarily responsible.

     - From: Burns, David D., MD. 1989. The Feeling Good Handbook. New York: William Morrow and Company, Inc.

In my experience, awareness of these common tendencies can go a long way toward keeping them in proper perspective. There is, of course, a huge emotional impact, where unclear thinking leads to sadness, depression, energy depletion, and bringing down those we are close to. So understanding cognitive distortions goes far beyond getting the secondary-data rights of HIEs correctly handled!

Joe

Dan,
Thanks for your comment.

I'm grateful to you for leading the critical development of our best hope for semantic interoperability (v3) and its promise of safe and productive re-use of information. (Russler bio)

Your work educating me and others, as well as working through the SDOs (standards development organizations) to train and educate the next generation, is greatly appreciated.

It was no surprise, therefore, that you ended your comments with a focus beyond tools and standards, moving to the critical issue of developing talent: the competent and energetic next generation of informaticians.

Since you're one of the leading experts and practitioners in Research HIEs, can you recommend a link for our readers?

Joe

Great treatment of this topic, here, by Dr Jack Hadley.

Comparative Effectiveness Research (CER): Promises and Pitfalls of Observational Data
Jack Hadley, Ph.D., College of Health and Human Services, George Mason University (link to Dr Hadley's biographical information)

Dr Hadley, both in his linked slides and in his presentation, did a fair and balanced job of painting the opportunity, the challenges, and the risks of using observational data for CER. The slides stand alone very well (they are quite understandable without the presenter speaking to them). The mini-tutorial and example of instrumental variable analysis applied to prostate cancer treatment illustrates the power of this statistical approach.
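For readers who want to see the mechanics behind that kind of example, here is a minimal sketch of instrumental variable estimation (two-stage least squares) on simulated data. It assumes only numpy; the scenario (unobserved frailty as the confounder, regional practice variation as the instrument) and all effect sizes are invented for illustration, not taken from Dr Hadley's slides:

```python
# A minimal two-stage least squares (2SLS) sketch with simulated data.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Unobserved confounder (e.g., frailty) drives both treatment and outcome.
frailty = rng.normal(size=n)

# Instrument: affects treatment choice but not the outcome directly
# (e.g., regional practice-pattern variation).
instrument = rng.normal(size=n)

treatment = (instrument + frailty + rng.normal(size=n) > 0).astype(float)

true_effect = 1.0
outcome = true_effect * treatment - 2.0 * frailty + rng.normal(size=n)

# Naive observational estimate: biased by the confounder.
naive = outcome[treatment == 1].mean() - outcome[treatment == 0].mean()

# Stage 1: predict treatment from the instrument.
# Stage 2: regress outcome on the prediction, which carries no confounding.
X = np.column_stack([np.ones(n), instrument])
stage1 = X @ np.linalg.lstsq(X, treatment, rcond=None)[0]
Z = np.column_stack([np.ones(n), stage1])
iv_effect = np.linalg.lstsq(Z, outcome, rcond=None)[0][1]

print(f"True effect: {true_effect:+.2f}")
print(f"Naive:       {naive:+.2f}  (confounded)")
print(f"IV (2SLS):   {iv_effect:+.2f}")
```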

His "Conclusions" slide (number 23) elaborates the same cautionary conclusions made in my original post and by the commenters.  

Readers should appreciate that we have an obligation to use observational data for surveillance. When it signals a previously unknown pattern, be it effectiveness, ineffectiveness, or toxicity, we need to respond in a timely and prudently automatic fashion. Dr Hadley's brief course on the promises and pitfalls, linked above, is therefore highly recommended.

(Thanks to Jim Oakes, David Main, HealthTechNet.org,  and Healthcare Information Consultants for facilitating awareness of Dr Hadley's important work and their decades of leadership facilitating healthcare improvement through informatics initiatives.)



As humans, we frequently use a good tool for the wrong task. I'm a fan of field biology, in which the first step of a good research program is good observational research, e.g. Jane Goodall and her chimpanzee observations. On the other hand, however critical good observational research is to good science, one frequently sees poorly substantiated assertions generated from observational research data by simple minds, e.g. Tim the Turkey.

As we move into the exciting era of larger observational research opportunities due to larger aggregations of healthcare data, it is right to caution that we should utilize the opportunities larger data aggregation capabilities give us for larger, well-controlled clinical trials as well.

But in an era which graduates more Tim the Turkeys than competent researchers from our schools, what likelihood is there for appropriate use of good tools in our future?

Dan Russler
- VP Clinical Informatics, Oracle Health Sciences Strategy

The temptation to treat retrospective data analysis as a substitute for hypothesis-driven research is almost overwhelming.  We know there isn't world enough (or time) to answer all of the questions of interest using randomized controlled trials (RCTs), RCTs are difficult (at best) to generalize beyond their specific selection criteria, and the retrospective data is essentially free.  On the other hand, our brains are quite poorly designed for unbiased intuition.  The attributes that make a narrative believable to us are typically orthogonal to the attributes that turn out to make it true.  With retrospective data those "truth attributes" often aren't even represented.

 

Another interesting attribute of voluminous retrospective data is that it can sometimes do a somewhat better job of representing the messy (and heterogeneous) "real world" than highly constrained or contrived RCT selection criteria.  The comparisons made are with real patients exhibiting real world compliance, comorbidities, and tolerance of side effects. 

 

I think we have a harder question to answer, though, than whether we prefer prospective or retrospective data.  We don't really get to pretend that RCTs can ever scale sufficiently to give us all the answers.  Indeed, by the time the results are available the clinical methodologies have often changed just enough to lend credibility to the claim of inapplicability - cf. different (better?) imaging techniques for breast cancer screening than those which led to recent changes in USPSTF screening recommendations.  The hard question is whether retrospective data is better than pure guesswork.  People will develop conditions that demand some sort of therapeutic behavior regardless of our state of data analysis.  Shall we shrug our clinical shoulders, or should we at least look for relevant clues, however misleading they will sometimes be?  Perhaps the real problem is that we do a poor job of modeling (and genuinely understanding and adjusting for) the weaknesses in the information available to us.


Scott W. Finley, MD, MPH

Physician Informaticist

Joe,

Fantastic post! Only now when I eat my Thanksgiving turkey, I'll be wondering what "our" turkey thought just before he/she was beheaded...!!!!

Seriously, though, you hit upon something very, very important, and that is that it is human nature (and probably turkey nature, too, to the extent that turkeys think about anything!) to try to conceptually fit current events and developments into a narrative that is understandable based on past events and developments, and past understandings. As we all know, we're "cursed to live in interesting times" these days, and one element of that is that it's become increasingly difficult to assert understandings of new developments based on past developments and narratives.

That realization should give us all pause. I, for one, believe that we are absolutely heading into "uncharted waters" and unprecedented new waves of developments. It's one of the things that keeps me passionate about what we cover here at the magazine.

Thank you again for your terrific insights, and, I must say, very entertaining way with articulating those insights!

Mark


Definition: Comparative Effectiveness Research (CER)

Comparative effectiveness research is the conduct and synthesis of systematic research comparing different interventions and strategies to prevent, diagnose, treat and monitor health conditions.

The purpose of this research is to inform patients, providers, and decision-makers, responding to their expressed needs, about which interventions are most effective for which patients under specific circumstances.

To provide this information, comparative effectiveness research must assess a comprehensive array of health-related outcomes for diverse patient populations. Defined interventions compared may include medications, procedures, medical and assistive devices and technologies, behavioral change strategies, and delivery system interventions.

This research necessitates the development, expansion, and use of a variety of data sources and methods to assess comparative effectiveness.

Joe, Interesting post.
 
The problem with the CER definition is the emphasis on finding things that work in an uncertain world of differing interventions, when, at this stage of the game, the first thing to do is detect and eliminate things that don't work.

This goes back to the discovery in the PROs (peer review organizations) of the '80s that there is no adequate definition of quality. By defining and inspecting against a limited set of obvious clinical and administrative non-quality measures, we accomplished in two years the quality improvement we had spent six years trying to define.

Cognitive medicine will continue to be practiced based on experience, a hard thing to change. So the second thing in CER is how to implement change. Change was profound and rapidly adopted in the procedure and imaging spheres: laparoscopic, endoscopic, and endovascular techniques, even staples, have changed surgery, and 2-D ultrasound, CT, and MRI have irreversibly changed imaging-supported diagnosis.

The cognitive side has changed significantly based on findings in arteriosclerosis driven by the introduction of statins. However, the great problems of lifestyle, obesity, NIDDM, and fitness remain, and proper antibiotic use grows increasingly difficult.

The domain of CER needs to expand to attract research into areas where little adequate research exists and where the need is great.

Jack

Jack, Thanks for your observation. I'd like to re-cast your terminology of "detect and eliminate things that don't work" as "identify over-use, under-use and misuse of diagnostic and therapeutic modalities."

In my personal experience, using existing data collected for another purpose (usually billing) and trying to understand the provider intention and/or clinical context is often impossible. Key clinical findings (or their absence, e.g. "there was no cardiac murmur") are usually not in the data.

Even if we recognize the turkey problem (i.e., prior experience doesn't necessarily portend future experience), we're still left with the challenges of incomplete-for-purpose observational data. Per your comment, there is enough data to detect many classes of things that don't work (e.g., plain films don't work for diagnosing low back pain; and, in the medication space, drugs with lower efficacy or more side effects in the real world than in trials are rarely studied and published).

A great opportunity is what Peter Honig (former director of the FDA's Office of Drug Safety, here: http://www.sa-pathways.com/changing-business-models/the-changing-fda?page=all) describes as a "Drug-Lag Relapse," where post-marketing detection of significant safety issues "brought scrutiny and criticism of the FDA" after well-known issues with SSRIs, Vioxx, and Bextra, and concerns with Avandia. He also references other drugs, namely Seldane, Propulsid, and Rezulin, that were withdrawn from the market because of poor demonstrated risk management by providers.

The responsibility for retrospective detection shouldn't fall solely on the FDA, or NIH research dollars as currently structured. I agree with you that a first step needs to be a better detection capability (from routinely captured data) with transparency around CER issues, starting with these existing data sources.
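As a concrete illustration of what such a detection capability might look like, here is a minimal sketch of a disproportionality screen using the proportional reporting ratio (PRR), a standard pharmacovigilance measure. The counts below are invented, and the PRR >= 2 threshold is a common screening rule of thumb, not a regulatory standard:

```python
# A minimal PRR (proportional reporting ratio) screen over a 2x2 table
# of spontaneous adverse-event reports. Counts are illustrative only.

def prr(a: int, b: int, c: int, d: int) -> float:
    """PRR from a 2x2 table of reports.

    a: reports of the event for the drug of interest
    b: reports of other events for the drug of interest
    c: reports of the event for all other drugs
    d: reports of other events for all other drugs
    """
    rate_drug = a / (a + b)
    rate_others = c / (c + d)
    return rate_drug / rate_others

# Hypothetical counts: cardiac events reported for a new drug vs.
# the rest of the reporting database.
signal = prr(a=60, b=940, c=1_200, d=98_800)

# Flag for human review -- detection, not proof of causation.
print(f"PRR = {signal:.1f} -> "
      f"{'flag for review' if signal >= 2 else 'no signal'}")
```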

Joe,

Thank you for the kind words.  Here is a link to a workshop that discussed Research HIE related material: http://rarediseases.info.nih.gov/PATIENT_REGISTRIES_WORKSHOP/

I presented, but others were more erudite.

A famous phrase from the workshop is that
"Every disease is a rare disease."

This was based on the observation that "Genetics + Environment + Timing >>>> Disease," and that every individual instance of disease is generally unique when described across this three-column matrix.

Dan Russler
- VP Clinical Informatics, Oracle Health Sciences Strategy

"Few observations and lots of rationale leads to error.

Lots of observations and little rationale leads to truth."

- Alexis Carrel (often misattributed to Lewis Carroll)

That aphorism seems to be at the heart of your post, Joe. You and your commenters have done a good job of calling out the fallacies in the nature of many observations. You've also called out that "leads to truth" is subject to the dangerous trap of applying unsubstantiated narratives. Very nicely told story.

I've been asked for the following elaboration:

"The Black Swan: The Impact of the Highly Improbable."

Taleb defines a 'black swan' as
(1) an outlier beyond the realm of our regular expectations,
(2) an event that carries an extreme impact, and
(3) a happening that, after the fact, our human nature enables us to accept by concocting explanations that make it seem predictable.

Events that are rare, extreme, and retrospectively predictable.

It's this retrospective predictability, which comes from data mining and is often neither useful nor predictive prospectively, that creates the hazard. Although surveillance is objectively valuable, reliable learning from historical data is tricky and often over-rated.
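Here is a minimal sketch of how cheap that retrospective predictability is, using only the Python standard library. The patients, outcomes, and candidate "biomarkers" are all random numbers by construction, so any pattern found in-sample is pure data-mining artifact:

```python
# Mine enough candidate predictors against pure noise and some will
# look prognostic retrospectively, then evaporate prospectively.
import random

random.seed(7)

n_patients, n_candidates = 200, 500
outcome = [random.random() < 0.5 for _ in range(n_patients)]

def association(feature, outcome):
    """Difference in outcome rate between feature-positive and -negative."""
    pos = [o for f, o in zip(feature, outcome) if f]
    neg = [o for f, o in zip(feature, outcome) if not f]
    if not pos or not neg:
        return 0.0
    return sum(pos) / len(pos) - sum(neg) / len(neg)

# Mine 500 random "biomarkers" and keep the best retrospective story.
best_score, best_feature = 0.0, None
for _ in range(n_candidates):
    feature = [random.random() < 0.5 for _ in range(n_patients)]
    score = abs(association(feature, outcome))
    if score > best_score:
        best_score, best_feature = score, feature

print(f"Best in-sample association:  {best_score:.2f}  (looks 'predictive')")

# Prospective check: the same "marker" against new, independent outcomes.
new_outcome = [random.random() < 0.5 for _ in range(n_patients)]
print(f"Out-of-sample association:   "
      f"{abs(association(best_feature, new_outcome)):.2f}")
```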

Joe Bormel

Healthcare IT Consultant
