Lesson from a Turkey
Tim was born about three years ago. Curious from birth, he was particularly good at making astute observations and had a charmingly playful approach to the world. This was very adaptive for Tim: he was well liked by his peers, naturally very happy, and a great problem solver.
At two years old, Tim started to become a lot more self-aware. He was intensely interested in his environment and his future. He was very proud of his mental and physical growth. He wondered when he would be as big as some of his older neighbors and what he might do in the future. Fireman, Policeman, Indian Chief? Typical stuff for some two-year-olds.
Tim was careful about his observations. He learned to avoid things that got him into trouble. Part of his contentment came from these reassuring observations. For example, he observed that his living environment was almost always clean, warm, dry and well lit, and that meals were very reliably available at regular times every day. Everyone was pretty friendly.
Using this evidence-based experience and sound, empirical reasoning, Tim concluded that he had a great future. Tim passed away this week. Yes, he was a turkey, and yes, he was "sacrificed" for Thanksgiving. Tim had concluded that three years of consistent experience, with food, warmth, and safety reliably available and with only narrow variation, portended more of the same. Tim was taken in by a number of fallacies, fallacies shared by most applications of secondary data use, comparative effectiveness research, and observational studies in general. He was unaware of the impact that the behaviors and plans of others had on the relevance of his observations.
As we look at retrospective data and fashion a story to explain it, it's extremely important to realize that most humans, including doctor-doctors (MDs and PhDs, as well as executives with and without advanced degrees in every field), are highly prone to these fallacies. There is the narrative fallacy, and the temptation to view data as linear (or Gaussian, with predictably low risk of falling outside of prior experience; see "Fooled by Randomness"). There is also the fallacy of failing to recognize the sampling and cognitive biases that financial incentives routinely cause in the best of us. Healthcare rarely follows relatively simple physical laws like the law of gravity.
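To make that Gaussian trap concrete, here is a minimal sketch in Python. Every number in it is invented for illustration, including the 1,000-day series and the "well-being" score: fit a normal model to a long run of uniformly pleasant observations, then ask what probability the model assigns to an event even modestly outside prior experience.

```python
import math
import random
import statistics

# A purely illustrative series: 1,000 consistently comfortable days,
# with small day-to-day jitter (all values here are invented).
random.seed(42)
observations = [100 + random.gauss(0, 1.5) for _ in range(1000)]

mu = statistics.mean(observations)
sigma = statistics.stdev(observations)

def normal_cdf(x, mu, sigma):
    """P(X <= x) under the fitted normal model, via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Day 1,001 as the turkey experiences it: a "well-being" score of zero,
# far outside anything in the observed record.
z = (0 - mu) / sigma
print(f"the event sits {abs(z):.0f} standard deviations below the mean")
print(f"probability the model assigns to it: {normal_cdf(0, mu, sigma):.3e}")
# The fitted model calls the event effectively impossible, yet it arrives
# on schedule: the decisive cause (the farmer's plan) was never in the data.
```

The point is not the arithmetic but the model class: no amount of within-sample fitting can surface a cause that never appears in the sample.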
This holiday season, as we contemplate HCIT data collection (sometimes called documentation), assessments (sometimes called BI or quality and performance reporting), and interventions (sometimes called CPOE with EBM), it's important to stop now and then to think about the implications of Tim's life. We need to think about significant, rare events called black swans, and about including the "unknown unknowns" in our planning. We need to keep numerical risk management and prediction in their place: they are not an exhaustive framework to be blindly trusted, as Tim learned too late.
Enjoy your holiday!
The temptation to treat retrospective data analysis as a substitute for hypothesis-driven research is almost overwhelming. We know there isn't world enough (or time) to answer all of the questions of interest using randomized controlled trials (RCTs), that RCTs are difficult (at best) to generalize beyond their specific selection criteria, and that retrospective data is essentially free. On the other hand, our brains are quite poorly designed for unbiased intuition. The attributes that make a narrative believable to us are typically orthogonal to the attributes that turn out to make it true, and with retrospective data those "truth attributes" often aren't even represented.
Another interesting attribute of voluminous retrospective data is that it can sometimes do a better job of representing the messy (and heterogeneous) "real world" than highly constrained or contrived RCT selection criteria. The comparisons are made with real patients exhibiting real-world compliance, comorbidities, and tolerance of side effects.
I think we have a harder question to answer, though, than whether we prefer prospective or retrospective data. We don't really get to pretend that RCTs can ever scale sufficiently to give us all the answers. Indeed, by the time the results are available, the clinical methodologies have often changed just enough to lend credibility to the claim of inapplicability; cf. the different (better?) imaging techniques for breast cancer screening than those that led to the recent changes in USPSTF screening recommendations. The hard question is whether retrospective data is better than pure guesswork. People will develop conditions that demand some sort of therapeutic behavior regardless of our state of data analysis. Shall we shrug our clinical shoulders, or should we at least look for relevant clues, however misleading they will sometimes be? Perhaps the real problem is that we do a poor job of modeling (and genuinely understanding and adjusting for) the weaknesses in the information available to us.
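One small, hypothetical illustration of what "adjusting for weaknesses" can mean in practice (the counts below are invented, and the mechanism shown is Simpson's paradox, not anything specific to the studies above): in an unadjusted retrospective comparison, a treatment can look worse overall yet better within every severity stratum, simply because sicker patients preferentially received it.

```python
# Invented counts: (successes, patients) per arm, per severity stratum.
# Within each stratum the treated arm does better, but severe cases
# flowed disproportionately to treatment.
strata = {
    "mild":   {"treated": (18, 20), "control": (70, 80)},  # 90.0% vs 87.5%
    "severe": {"treated": (32, 80), "control": (4, 20)},   # 40.0% vs 20.0%
}

def rate(successes, patients):
    return successes / patients

for severity, arms in strata.items():
    print(f"{severity:>6}: treated {rate(*arms['treated']):.1%} "
          f"vs control {rate(*arms['control']):.1%}")

# Pool the strata and the ordering reverses (Simpson's paradox):
pooled = {
    arm: tuple(sum(arms[arm][i] for arms in strata.values()) for i in (0, 1))
    for arm in ("treated", "control")
}
print(f"pooled: treated {rate(*pooled['treated']):.1%} "
      f"vs control {rate(*pooled['control']):.1%}")
# pooled: treated 50.0% vs control 74.0% -- the naive readout points the
# wrong way; stratifying on severity recovers the within-stratum picture.
```

Real adjustment is far harder than this toy, since the relevant confounders are rarely known, let alone recorded, but the direction of the lesson holds: unmodeled selection in the data can be stronger than the effect being measured.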
Scott W. Finley, MD, MPH