
Plucked, Cooked, Enjoyed—Swan 1, Turkey 0

November 19, 2010
by Joe Bormel
Don't let the narrative fallacy cloud healthcare data

Lesson from a Turkey
Tim was born about three years ago. Curious from birth, Tim was particularly good at making astute observations and had a charmingly playful approach to the world. This was very adaptive for Tim. He was well liked by his peers and he was naturally very happy. He was also a great problem solver.

At two years old, Tim started to become a lot more self-aware. He was intensely interested in his environment and his future. He was very proud of his mental and physical growth. He wondered when he would be as big as some of his older neighbors and what he might do in the future. Fireman, Policeman, Indian Chief? Typical stuff for some two-year-olds.

Tim was careful about his observations. He learned to avoid things that got him into trouble. Part of his contentment came from these reassuring observations. For example, he observed that his living environment was almost always clean, warm, dry and well lit, and that meals were very reliably available at regular times every day. Everyone was pretty friendly.

Using this evidence-based experience and sound, empiric reasoning, Tim concluded that he had a great future. Tim passed away this week. Yes, he was a turkey and yes, he was "sacrificed" for Thanksgiving. Tim had concluded that three years of consistent experience (the availability of food, warmth and safety, with narrow variation) portended more of the same. Tim was taken in by a number of fallacies, fallacies that are shared by most applications of secondary data use, comparative effectiveness research, and observational studies in general. He was unaware of the impact the behaviors and plans of others had on the relevance of his observations.

The Narrative Fallacy
... the "narrative fallacy" (also called illusory correlation) which refers to our tendency to construct stories around facts, which in love for example may serve a purpose, but when someone begins to believe the stories and accommodate facts into the stories, they are likely to err.

- From Wikipedia, "The Black Swan (Taleb book)"

As we look at retrospective data and fashion a story to explain it, it's extremely important to realize that most humans, including "doctor-doctors" (MD-PhDs) and executives with and without advanced degrees in every field, are highly prone to these fallacies. They include the narrative fallacy; the attempt to view data as linear, or as Gaussian with predictably low risks of falling outside of prior experience (see "Fooled by Randomness"); and the failure to recognize the sampling and cognitive biases that financial incentives routinely cause in the best of us. Healthcare rarely follows relatively simple laws of physics like the law of gravity.
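To put rough numbers on Tim's reasoning, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than real data: the 1,000 days of history, the made-up "well-being" score centered on 100, and the final-day value of 0. It fits a Gaussian to the stable history and then asks how surprising the final day looks under that fitted model.

```python
import math
import random
import statistics


def simulate_history(days=1000, seed=0):
    """Hypothetical daily well-being scores: stable, with narrow variation."""
    rng = random.Random(seed)
    return [rng.gauss(100.0, 2.0) for _ in range(days)]


def fit_gaussian(history):
    """Estimate the mean and sample standard deviation of past observations."""
    return statistics.fmean(history), statistics.stdev(history)


def z_score(value, mean, sd):
    """Distance of a value from the historical mean, in standard deviations."""
    return abs(value - mean) / sd


def two_sided_tail(z):
    """Probability, under a normal model, of landing at least z deviations out."""
    return math.erfc(z / math.sqrt(2.0))


history = simulate_history()
mean, sd = fit_gaussian(history)

# The most extreme day Tim ever observed, measured against his own model.
worst_seen = max(z_score(x, mean, sd) for x in history)

# Assumed value for the final day; not part of the history he could learn from.
final_day = 0.0
final_z = z_score(final_day, mean, sd)

print(f"largest deviation in the history: {worst_seen:.1f} standard deviations")
print(f"deviation on the final day: {final_z:.1f} standard deviations")
# The tail probability is so small that it may underflow to 0.0.
print(f"probability the fitted model assigns to the final day: {two_sided_tail(final_z):.3g}")
```

The fitted model is not wrong about the history; it describes those days well. What it cannot represent is the farmer's plan, which never appears in the data, so the event that matters most is rated as essentially impossible.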

This holiday season, as we contemplate HCIT data collection (sometimes called documentation), assessments (sometimes called BI or quality and performance reporting), and interventions (sometimes called CPOE with EBM), it's important to stop now and then to think about the implications from Tim's life. We need to think about significant, rare events called black swans, and about including the "unknown, unknown" into our planning. We need to keep numerical risk management and prediction in its place. It's not an exhaustive framework that should be blindly trusted, as Tim learned too late.

Enjoy your holiday!

1) The turkey story was adapted from The Black Swan by Nassim Nicholas Taleb.
2) Turkey graphic in title from
This post available at

The temptation to treat retrospective data analysis as a substitute for hypothesis-driven research is almost overwhelming.  We know there isn't world enough (or time) to answer all of the questions of interest using randomized controlled trials (RCTs), RCTs are difficult (at best) to generalize beyond their specific selection criteria, and the retrospective data is essentially free.  On the other hand, our brains are quite poorly designed for unbiased intuition.  The attributes that make a narrative believable to us are typically orthogonal to the attributes that turn out to make it true.  With retrospective data those "truth attributes" often aren't even represented.


Another interesting attribute of voluminous retrospective data is that it can sometimes do a somewhat better job of representing the messy (and heterogeneous) "real world" than highly constrained or contrived RCT selection criteria.  The comparisons made are with real patients exhibiting real world compliance, comorbidities, and tolerance of side effects. 


I think we have a harder question to answer, though, than whether we prefer prospective or retrospective data.  We don't really get to pretend that RCTs can ever scale sufficiently to give us all the answers.  Indeed, by the time the results are available the clinical methodologies have often changed just enough to lend credibility to the claim of inapplicability - cf. different (better?) imaging techniques for breast cancer screening than those which led to recent changes in USPSTF screening recommendations.  The hard question is whether retrospective data is better than pure guesswork.  People will develop conditions that demand some sort of therapeutic behavior regardless of our state of data analysis.  Shall we shrug our clinical shoulders, or should we at least look for relevant clues, however misleading they will sometimes be?  Perhaps the real problem is that we do a poor job of modeling (and genuinely understanding and adjusting for) the weaknesses in the information available to us.

Scott W. Finley, MD, MPH

Physician Informaticist


Fantastic post! Only now when I eat my Thanksgiving turkey, I'll be wondering what "our" turkey thought just before he/she was beheaded...!!!!

Seriously, though, you hit upon something very, very important, and that is that it is human nature (and probably turkey nature, too, to the extent that turkeys think about anything!) to try to conceptually fit current events and developments into a narrative that is understandable based on past events and developments, and past understandings. As we all know, we're "cursed to live in interesting times" these days, and one element of that is that it's become increasingly difficult to assert understandings of new developments based on past developments and narratives.

That realization should give us all pause. I, for one, believe that we are absolutely heading into "uncharted waters" and unprecedented new waves of developments. It's one of the things that keeps me passionate about what we cover here at the magazine.

Thank you again for your terrific insights, and, I must say, your very entertaining way of articulating those insights!


Definition:  Comparative Effectiveness Research (CER)

Comparative effectiveness research is the conduct and synthesis of systematic research comparing different interventions and strategies to prevent, diagnose, treat and monitor health conditions.
The purpose of this research is to inform patients, providers, and decision-makers, responding to their expressed needs, about which interventions are most effective for which patients under specific circumstances.
To provide this information, comparative effectiveness research must assess a comprehensive array of health-related outcomes for diverse patient populations. Defined interventions compared may include medications, procedures, medical and assistive devices and technologies, behavioral change strategies, and delivery system interventions.
This research necessitates the development, expansion, and use of a variety of data sources and methods to assess comparative effectiveness.

Joe, Interesting post.
The problem with the CER definition is the emphasis on finding things that work in an uncertain world of differing interventions, when, at this stage of the game, the first thing to do is detect and eliminate things that don't work.

This goes back to the discovery in the PROs of the '80s that there is no adequate definition of quality. By defining and inspecting against a limited set of obvious clinical and administrative non-quality measures, we accomplished in two years the quality improvement we had been working to define for six years.

Cognitive medicine will continue to be practiced based on experience, a hard thing to change. So the second thing in CER is how to implement change. Change was profound and rapidly adopted in the procedure and imaging spheres. Laparoscopic, endoscopic and endovascular techniques, and even staples, have changed surgery. 2-D ultrasound, CT and MRI have irreversibly changed imaging-supported diagnosis.

The cognitive side has changed significantly based on findings in arteriosclerosis driven by the introduction of statins. However, the great problems of lifestyle, obesity, NIDDM and fitness remain. Proper antibiotic use is increasingly difficult.

The domain of CER needs to expand to attract research into areas where little adequate research exists and where the need is great.


Jack, Thanks for your observation. I'd like to re-cast your terminology of "detect and eliminate things that don't work" as "identify over-use, under-use and misuse of diagnostic and therapeutic modalities."

In my personal experience, using existing data collected for another purpose (usually billing) and trying to understand the provider intention and/or clinical context is often impossible. Key clinical findings (or their absence, e.g. "there was no cardiac murmur") are usually not in the data.

Even if we recognize the turkey problem (i.e. prior experience doesn't necessarily portend future experience), we're still left with the challenges of incomplete-for-purpose observational data. Per your comment, there is enough data to detect many classes of things that don't work (e.g. plain films don't work for diagnosing low back pain; and, in the medication space, drugs with lower efficacy or more side effects in the real world than in trials, which are rarely studied and published).

A great opportunity is what Peter Honig (former director of the FDA’s Office of Drug Safety) describes as a "Drug-Lag Relapse," where post-marketing detection of significant safety issues "brought scrutiny and criticism of the FDA," after well-known issues with SRIs, Vioxx, Bextra, and concerns with Avandia. He also makes reference to other drugs, namely Seldane, Propulsid, and Rezulin, that were withdrawn from the market because of poor demonstrated risk management by providers.

The responsibility for retrospective detection shouldn't fall solely on the FDA, or NIH research dollars as currently structured. I agree with you that a first step needs to be a better detection capability (from routinely captured data) with transparency around CER issues, starting with these existing data sources.


Thank you for the kind words.  Here is a link to a workshop that discussed Research HIE related material:

I presented, but others were more erudite.

A famous phrase from the workshop is that
"Every disease is a rare disease."

This was based on the observation that "Genetics + Environment + Timing >>>> Disease," and
that every individual instance of disease generally is unique when described across this three column matrix.

Dan Russler
- VP Clinical Informatics, Oracle Health Sciences Strategy

"Few observations and lots of rationale leads to error.

Lots of observations and little rationale leads to truth."

- Alexis Carrel

That aphorism seems to be at the heart of your post, Joe. You and your commenters have done a good job of calling out the fallacies of the nature of many observations. You've also called out that "leads to truth" is subject to the dangerous trap of applying narratives that are unsubstantiated. Very nicely told story.

I've been asked for the following elaboration:

"The Black Swan: The Impact of the Highly Improbable."

Taleb defines a 'black swan' as
(1) an outlier beyond the realm of our regular expectations,
(2) an event that carries an extreme impact, and
(3) a happening that, after the fact, our human nature enables us to accept by concocting explanations that make it seem predictable.

Events that are rare, extreme, and retrospectively predictable.

It's this retrospective predictability, which comes from data mining and is often neither useful nor predictive prospectively, that creates hazard. Although surveillance is objectively valuable, reliable learning from historical data is tricky and often overrated.

