As medical institutions roll out one new broken quality protocol after another, I have struggled to find a way to point out the irrationality of this approach. Reading some commentary about the US attempt to achieve excellence in international soccer during the recent run-up to the World Cup gave me an idea.

Imagine the existence of a powerful Center for Masterful Soccer (CMS) in the US, charged with regulating and improving American soccer. They rank US collegiate and professional soccer programs in order of excellence, using a composite scoring system based on 5 years of data: goals scored, goals against, wins, losses, and ties (all weighted by the quality of the opponent). To improve US national soccer, they plan to offer incentives (in the form of financial support) to programs that reach certain benchmarks for excellence and to threaten sanctions (lower ticket prices) against those that do not. Pay for performance.

It quickly becomes apparent to the CMS staff that it is impossible to create actionable benchmarks based on things like coaching techniques, player mix, playing style, practice characteristics, or the full range of player skills. Quality turns out to be very hard to define, and the approaches taken by the most successful programs are surprisingly dissimilar. The deep-dive approach to measuring quality is simply too complex.

Instead, CMS looks for easy-to-measure characteristics of the best programs. They build a list: native language of the coach, number of players born in soccer-rich countries, money spent on facilities, number of shots taken per game, and arm span of goalies. They validate the list statistically by demonstrating that high scores on these measures are strongly correlated with high-performing programs. Around the US, nervous soccer aficionados concede that the published list describes attributes of good programs, but they express misgivings about how this information will be put to use. They are right to be concerned.

Most programs, including the owners and managers of the Club for Minnesota Masters of Fussball (CMMF), immediately set about maximizing their performance on the incentivized measures. They see a potential benefit in the financial incentives, and (not being players themselves) they incorrectly assume that addressing these secondary measures will improve quality. They hire a coaching staff from Spain (basketball referees with no soccer experience), recruit 15 players from Europe and Central America (the good players from these regions are already committed, but they have little trouble recruiting players not good enough to play in their home leagues), double the budget for lawn care on the pitch, offer a $2 bonus for every shot taken (regardless of shot quality, field position, or better options), fine players who take fewer than 5 shots per game (including goalies and defensive backs), and find two goalies who are over seven feet tall with enormous reach (but who are both visually impaired). They consistently score at the top of their league on the incentivized metrics and qualify for the financial rewards, but they go on to lose 30 consecutive games and become the laughingstock of the local soccer community. Their players complain and make suggestions, but the CMMF leadership says the plan is evidence based, that the complaining demonstrates they are not team players, and that, besides, the club has no choice because it is only following CMS orders.

A few programs take a very different approach. The Midwest Athletic Youth Organization (MAYO), for example, studies the systems of coaching, practice, and play used by the best-performing teams. They hire a coaching staff with experience playing for and coaching high-quality teams. They recruit players with experience in high-quality leagues. They involve their players and coaching staff in open, collaborative work on which techniques of practice and play are working, and they make frequent adjustments to their approach based on what ongoing monitoring supports. Their outcome is very different from CMMF's: they consistently finish at the top of their league and go on to do well in the national championships.

And so it is in medicine. We are increasingly incentivized on metrics based on secondary characteristics of quality. Yes, it is true that programs that successfully reduce morbidity and mortality from falls in the elderly are very likely to screen for fall risk. This does NOT mean that screening for fall risk will reduce falls. (This is the so-called fallacy of the converse.) To effectively improve outcomes for patients who need intervention for depression, one obviously needs a mechanism for finding them; but a finding mechanism without access to effective treatment is unlikely to improve depression. Intensive interventions targeted toward the obese may improve health outcomes (the evidence is weak), but a screening program that identifies both the overweight and the obese and offers only a watered-down version of the intervention will not work.

Identifying markers of quality does not generate instructions for building a quality program, any more than a list of the commonest ingredients in the best apple pies would tell you how to bake a first-rate apple pie. Markers are potentially valuable because they can help identify and address the areas where a program falls short of excellence. The problem is that true quality improvement is very hard, and this approach sounds seductively easy. Like all simplistic and lazy approaches to complicated problems, incentivizing secondary markers predictably leads to misdirected activity, distracts from true analysis and quality work, and ultimately fails.

Links to more on this topic: