Our latest JAMA paper: teaching hospitals and thinking about conflict of interest

How much does it matter which hospital you go to? Of course, it matters a lot – hospitals vary enormously on quality of care, and choosing the right hospital can mean the difference between life and death. The problem is that it’s hard for most people to know how to choose. Useful data on patient outcomes remain hard to find, and even though Medicare provides data on patient mortality for select conditions on its Hospital Compare website, those mortality rates are calculated and reported in ways that make nearly every hospital look average.

Some people choose to receive their care at teaching hospitals. Studies in the 1990s and early 2000s found that teaching hospitals performed better, but there was also evidence that they were more expensive. As “quality” metrics exploded, teaching hospitals often found themselves on the wrong end of the performance stick, with more hospital-acquired conditions and more readmissions. In nearly every national pay-for-performance scheme, they seemed to be doing worse than average, not better. In an era focused on high-value care, the narrative has increasingly become that teaching hospitals are not any better – just more expensive.

But is this true? On the one measure that matters most to patients when it comes to hospital care – whether you live or die – are teaching hospitals truly no better or possibly worse? About a year ago, that was the conversation I had with a brilliant junior colleague, Laura Burke. When we scoured the literature, we found that there had been no recent, broad-based examination of patient outcomes at teaching versus non-teaching hospitals. So we decided to take this on.

As we plotted how we might do this, we realized that to do it well, we would need funding. But who would fund a study examining outcomes at teaching versus non-teaching hospitals? We thought about NIH but knew that was not a realistic possibility – they are unlikely to fund such a study, and even if they did, it would take years to get the funding. There are also some excellent foundations, but they are small and therefore focus on specific areas. Next, we considered asking the Association of American Medical Colleges (AAMC). We know these colleagues well and knew they would be interested in the question. But we also knew that for some people – those who see the world through the “conflict of interest” lens – any finding funded by AAMC would be quickly dismissed, especially if we found that teaching hospitals were better.

Setting up the rules of the road

As we discussed funding with AAMC, we set up some basic rules of the road.  Actually, Harvard requires these rules if we receive a grant from any agency. As with all our research, we would maintain complete editorial independence. We would decide on the analytic plan and make decisions about modeling, presentation, and writing of the manuscript. We offered to share our findings with AAMC (as we do with all funders), but we were clear that if we found that teaching hospitals were in fact no better (or worse), we would publish those results. AAMC took a leap of faith knowing that they might be funding a study that casts teaching hospitals in a bad light. The AAMC leadership told me that if teaching hospitals are not providing better care, they wanted to know – they wanted an independent assessment of their performance using meaningful metrics.

Our approach

Our approach was simple. We examined 30-day mortality (the most important measure of hospital quality) and extended our analysis out to 90 days (to see if differences between teaching and non-teaching hospitals persisted over time). We built our main models, but in the back of my mind, I knew that no matter which choices we made, some people would question them as biased. Thus, we ran a lot of sensitivity analyses, looking at shorter-term outcomes (7 days), models with and without transferred patients, within various hospital size categories, and with various specifications of how one even defines teaching status. Finally, we included volume in our models to see if the volume of patients seen was driving differences in outcomes.

The one result that we found consistently across every model and using nearly every approach was that teaching hospitals were doing better. They had lower mortality rates overall, across medical and surgical conditions, and across nearly every single individual condition. And the findings held true all the way out to 90 days.

What our findings mean

This is the first broad, post-ACA study examining outcomes at teaching hospitals, and for the fans of teaching hospitals, this is good news. The mortality difference between teaching and non-teaching hospitals is clinically substantial: for every 67 to 84 patients who go to a major teaching hospital (as opposed to a non-teaching hospital), one life is saved. That is a big effect.
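For the quantitatively inclined, the 67-to-84 range is the “number needed to treat” implied by the mortality difference – the reciprocal of the absolute risk reduction. A quick back-of-envelope sketch (the risk-reduction values below are illustrative, chosen to bracket that range; they are not figures from the paper):

```python
# Number needed to treat (NNT): how many patients must go to a major
# teaching hospital (instead of a non-teaching one) for one additional
# life saved. NNT is the reciprocal of the absolute risk reduction (ARR).

def number_needed_to_treat(absolute_risk_reduction: float) -> float:
    """NNT = 1 / ARR, where ARR is the mortality difference as a fraction."""
    return 1.0 / absolute_risk_reduction

# Illustrative mortality differences of roughly 1.2 to 1.5 percentage
# points bracket the 67-84 range quoted above.
for arr in (0.0149, 0.0119):
    print(f"ARR {arr:.2%} -> NNT {number_needed_to_treat(arr):.0f}")
```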

Should patients only go to teaching hospitals, though? That is wholly unrealistic, and these are only average effects. Many community hospitals are excellent and provide care that is as good as, if not better than, care at teaching institutions. But lacking other information when deciding where to receive care, patients on average do better at teaching institutions.

Way forward

There are several lessons from our work that can help us move forward in a constructive way. First, given that most hospitals in the U.S. are non-teaching institutions, we need to think about how to help those hospitals improve. The follow-up work needs to delve into why teaching hospitals are doing better, and how we can replicate and spread that to other hospitals. This strikes me as an important next step. Second, can we work on our transparency and public reporting programs so that hospital differences are distinguishable to patients? As I have written, we are doing transparency wrong, and one of the casualties is that it is hard for a community hospital that performs very well to stand out. Finally, we need to fix our pay-for-performance programs to emphasize what matters to patients. And for most patients, avoiding death remains near the top of the list.

Final thoughts on conflict of interest

For some people, these findings will not matter because the study was funded by “industry.” That is unfortunate. The easiest and laziest way to dismiss a study is to invoke conflict of interest. This is part of the broader trend of deciding what is real versus fake news, based on the messenger (as opposed to the message). And while conflicts of interest are real, they are also complicated. I often disagree with AAMC and have publicly battled with them. Despite that, they were bold enough to support this work, and while I will continue to disagree with them on some key policy issues, I am grateful that they took a chance on us. For those who can’t see past the funders, I would ask them to go one step further – point to the flaws in our work. Explain how one might have, untainted by funding, done the work differently. And most importantly – try to replicate the study. Because beyond the “COI,” we all want the truth on whether teaching hospitals have better outcomes or not. Ultimately, the truth does not care what motivated the study or who funded it.

Correlation, Causation, and Gender Differences in Patient Outcomes

Our recent paper on differences in outcomes for Medicare patients cared for by male and female physicians has created a stir.  While the paper has gotten broad coverage and mostly positive responses, there have also been quite a few critiques. There is no doubt that the study raises questions that need to be aired and discussed openly and honestly.  Its limitations, which are highlighted in the paper itself, are important.  Given the temptation we all feel to overgeneralize, we do best when we stick with the data.  It’s worth highlighting a few of the more common critiques that have been lobbed at the study to see whether they make sense and how we might move forward.  Hopefully by addressing these more surface-level critiques we can shift our focus to the important questions raised by this paper.

Correlation is not causation

We all know that correlation is not causation. It’s epidemiology 101. People who carry matches are more likely to get lung cancer. Going to bed with your shoes on is associated with a higher likelihood of waking up with a headache. No, matches don’t cause lung cancer any more than sleeping with your shoes on causes headaches. Correlation, not causation. Seems straightforward, and it has been a consistent critique of this paper. The argument is that because we had an observational study – that is, not an experiment where we proactively, randomly assigned millions of Americans to male versus female doctors – all we have is an association study. To have a causal study, we’d need a randomized, controlled trial. In an ideal world, this would be great, but unfortunately in the real world, this is impractical…and even unnecessary. We often make causal inferences based on observational data – and here’s the kicker: sometimes, we should. Think smoking and lung cancer. Remember the RCT that assigned people to smoking (versus not) to see if it really caused lung cancer? Me neither…because it never happened. So, if you are a strict “correlation is not causation” person who thinks observational data only create hypotheses that need to be tested using RCTs, you should only feel comfortable stating that smoking is associated with lung cancer, but it’s only a hypothesis for which we await an RCT. That’s silly. Smoking causes lung cancer.

Why correlation can be causation

How can we be so certain that smoking causes lung cancer based on observational data alone? Because there are several good frameworks that help us evaluate whether a correlation is likely to be causal.  They include presence of a dose-response relationship, plausible mechanism, corroborating evidence, and absence of alternative explanations, among others. Let’s evaluate these in light of the gender paper.  Dose-response relationship? That’s a tough one – we examine self-identified gender as a binary variable…the survey did not ask physicians how manly the men were. So that doesn’t help us either way. Plausible mechanism and corroborating evidence? Actually, there is some here – there are now over a dozen studies that have examined how men and women physicians practice, with reasonable evidence that they practice a little differently. Women tend to be somewhat more evidence-based and communicate more effectively.  Given this evidence, it seems pretty reasonable to predict that women physicians may have better outcomes.

The final issue – alternative explanations – has been brought up by nearly every critic. There must be an alternative explanation! There must be confounding! But the critics have mostly failed to come up with a plausible confounder. Remember, in order to be a confounder, a variable must be correlated both with the predictor (gender) and the outcome (mortality). We spent over a year working on this paper, trying to think of confounders that might explain our findings. Every time we came up with something, we tried to account for it in our models. No, our models aren’t perfect. Of course, there could still be confounders that we missed. We are imperfect researchers. But that confounder would have to be big enough to explain about half a percentage point of mortality difference, and that’s not trivial. So I ask the critics to help us identify this missing confounder that explains better outcomes for women physicians.
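To see why a confounder must pull on both ends, here is a toy simulation – every variable and effect size is invented, none comes from our data – in which physician gender has no causal effect on mortality at all, yet a crude comparison shows a gap because patient severity is correlated with both gender and death:

```python
import random

random.seed(42)

# Toy model: a hypothetical confounder (patient severity) correlated with
# BOTH physician gender and mortality. All numbers are made up; this only
# illustrates the structure of confounding, not our study's data.
n = 200_000
deaths_by_gender = {"F": [0, 0], "M": [0, 0]}  # gender -> [deaths, patients]

for _ in range(n):
    severe = random.random() < 0.5
    # Confounder -> exposure: severe patients slightly more often see male docs.
    p_male = 0.55 if severe else 0.45
    gender = "M" if random.random() < p_male else "F"
    # Confounder -> outcome: severe patients die more often.
    # Crucially, gender itself has NO effect on mortality here.
    p_death = 0.15 if severe else 0.05
    deaths_by_gender[gender][0] += random.random() < p_death
    deaths_by_gender[gender][1] += 1

for g, (d, tot) in deaths_by_gender.items():
    print(g, f"crude mortality {d / tot:.3f}")
# Male physicians show higher crude mortality purely via the confounder.
```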

Statistical versus clinical significance

One more issue warrants a comment.  Several critics have brought up the point that statistical significance and clinical significance are not the same thing.  This too is epidemiology 101.  Something can be statistically significant but clinically irrelevant.  Is a 0.43 percentage point difference in mortality rate clinically important? This is not a scientific or a statistical question.  This is a clinical question. A policy and public health question.  And people can reasonably disagree.  From a public health point of view, a 0.43 percentage point difference in mortality for Medicare beneficiaries admitted for medical conditions translates into potentially 32,000 additional deaths. You might decide that this is not clinically important. I think it is. It’s a judgment call and we can disagree.
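The 32,000 figure is simply the mortality gap scaled up to the eligible population. A back-of-envelope version (the admissions count below is an assumed round number consistent with the 32,000 estimate, not a figure reported in the paper):

```python
# Scale a 0.43 percentage-point mortality difference to a population of
# Medicare medical admissions. The admissions figure is a hypothetical
# round number implied by the 32,000 estimate, not taken from the study.
mortality_gap = 0.0043          # 0.43 percentage points, as a fraction
annual_admissions = 7_500_000   # assumed Medicare medical hospitalizations

excess_deaths = mortality_gap * annual_admissions
print(f"{excess_deaths:,.0f} potential additional deaths")  # roughly 32,000
```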

Ours is the first big national study to look at outcome differences between male and female physicians. I’m sure there will be more. This is one study – and the arc of science is such that no study gets it 100% right. New data will emerge that will refine our estimates and of course, it’s possible that better data may even prove our study wrong. Smarter people than me – or even my very smart co-authors – will find flaws in our study and use empirical data to help us elucidate these issues further, and that will be good. That’s how science progresses. Through facts, data, and specific critiques. “Correlation is not causation” might be epidemiology 101, but if we get stuck on epidemiology 101, we’d be unsure whether smoking causes lung cancer. We can do better. We should look at the totality of the evidence. We should think about plausibility. And if we choose to reject clear results, such as that women internists have better outcomes, we should have concrete, testable, alternative hypotheses. That’s what we learn in epidemiology 102.

Do women make better doctors than men?

About a year ago, Yusuke Tsugawa – then a doctoral student in the Harvard health policy PhD program – and I were discussing the evidence around the quality of care delivered by female and male doctors. The data suggested that women practice medicine a little differently than men do. It appeared that practice patterns of female physicians were a little more evidence-based, sticking more closely to clinical guidelines. There was also some evidence that patients reported better experiences when their physician was a woman. This is certainly important, but the evidence here was limited to a few specific settings or subgroups of patients. And we had no idea whether these differences translated into what patients care most about: better outcomes. We decided to tackle this question – do female physicians achieve different outcomes than male physicians? The result of that work is out today in JAMA Internal Medicine.

Our approach

First, we examined differences in patient outcomes for female and male physicians across all medical conditions. Then, we adjusted for patient and physician characteristics. Next, we threw in a hospital “fixed-effect” – a statistical technique that ensures that we only compare male and female physicians within the same hospital. Finally, we did a series of additional analyses to check if our results held across more specific conditions.
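For readers curious about the mechanics, a hospital fixed effect boils down to comparing physicians only against colleagues at the same hospital. A minimal sketch of that within-hospital (“demeaned”) comparison, using entirely invented data:

```python
from collections import defaultdict

# Hypothetical (physician_gender, hospital, 30-day mortality rate) records.
records = [
    ("F", "A", 0.110), ("M", "A", 0.118),
    ("F", "B", 0.095), ("M", "B", 0.101),
    ("F", "C", 0.130), ("M", "C", 0.133),
]

# Step 1: subtract each hospital's mean mortality ("demeaning"). This is
# what a fixed effect does: only within-hospital variation remains, so
# hospitals with sicker patients can't drive the comparison.
by_hospital = defaultdict(list)
for _, hosp, rate in records:
    by_hospital[hosp].append(rate)
hosp_mean = {h: sum(v) / len(v) for h, v in by_hospital.items()}

# Step 2: average the demeaned rates by physician gender.
demeaned = defaultdict(list)
for gender, hosp, rate in records:
    demeaned[gender].append(rate - hosp_mean[hosp])
gap = (sum(demeaned["M"]) / len(demeaned["M"])
       - sum(demeaned["F"]) / len(demeaned["F"]))
print(f"within-hospital M-F mortality gap: {gap:.4f}")
# Positive in this toy data: male physicians' patients fare slightly worse.
```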

We found that female physicians had lower 30-day mortality rates compared to male physicians. Holding patient, physician, and hospital characteristics constant narrowed that gap a little, but not much. After throwing everything into the model that we could, we were still left with a difference of about 0.43 percentage points (see table), a modest but clinically important difference (more on this below).

Next, we focused on the 8 most common conditions (to ensure that our findings weren’t driven by differences in a few conditions only) and found that across all 8 conditions, female physicians had better outcomes. Finally, we looked at subgroups by risk. We wondered – is the advantage of having a female physician still true if we just focus on the sickest patients? The answer is yes – in fact, the biggest gap in outcomes was among the very sickest patients. The sicker you are, the bigger the benefit of having a female physician (see figure).

Additionally, we did a variety of other “sensitivity” analyses, of which the most important focused on hospitalists. The biggest threat to any study that examines differences between physicians is selection – patients can choose their doctor (or doctors can choose their patients) in ways that make the groups of patients non-comparable. However, when patients are hospitalized for an acute illness, increasingly, they receive care from a “hospitalist” – a doctor who spends all of their clinical time in the hospital caring for whoever is admitted during their shift. This allows for “pseudo-randomization.” And the results? Again, female hospitalists had lower mortality than male hospitalists.



What does this all mean?

The first question everyone will ask is whether the size of the effect matters. I am going to reiterate what I said above – the effect size is modest, but important. If we take a public health perspective, we see why it’s important: Given our results, if male physicians had the same outcomes as female physicians, we’d have 32,000 fewer deaths in the Medicare population. That’s about how many people die in motor vehicle accidents every year. Second, imagine a new treatment that lowered 30-day mortality by about half a percentage point for hospitalized patients. Would that treatment get FDA approval for effectiveness? Yup. Would it quickly become widely adopted in the hospital wards as an important treatment we should be giving our patients?  Absolutely. So while the effect size is not huge, it’s certainly not trivial.

A few things are worth noting. First, we looked at medical conditions, so we can’t tell you whether the same effects would show up if you looked at surgeons. We are working on that now. Second, with any observational study, one has to be cautious about over-calling it. The problem is that we will never have a randomized trial, so this may be about as well as we can do. Further, for those who worry about “confounding” – that we may be missing some key variable that explains the difference – I wonder what that might be? Any missing confounder would have to be big enough to explain our findings. We spent a lot of time on this – and couldn’t come up with anything that would be big enough to explain what we found.

How to make sense of it all – and next steps

Our findings suggest that there’s something about the way female physicians are practicing that is different from the way male physicians are practicing – and different in ways that impact whether a patient survives his or her hospitalization. We need to figure out what that is. Is it that female physicians are more evidence-based, as a few studies suggest? Or is it that there are differences in how female and male providers communicate with patients and other providers that allow female physicians to be more effective? We don’t know, but we need to find out and learn from it.

Another important point must be addressed. There is pretty strong evidence of a substantial gender pay gap and a gender promotion gap within medicine. Several recent studies have found that women physicians are paid less than male physicians – about 10% less after accounting for all potential confounders – and are less likely to be promoted within academic medical centers. Throw in our study about better outcomes, and those differences in salary and promotion become particularly unconscionable.

The bottom line is this: When it comes to medical conditions, women physicians seem to be outperforming male physicians. The difference is small but important. If we want this study to be more than just a source of cocktail conversation, we need to learn more about why these differences exist so all patients have better outcomes, irrespective of the gender of their physician.

ACO Winners and Losers: a quick take

Last week, CMS sent out press releases touting over $1 billion in savings from Accountable Care Organizations, and Andy Slavitt, the acting Administrator of CMS, tweeted the news.

The link in Slavitt’s tweet is to a press release. The link in the press release citing more details is to another press release. There’s little in the way of analysis or data about how ACOs did in 2015. So I decided to do a quick examination of how ACOs are doing and share the results below.

Basic background on ACOs:

Simply put, an ACO is a group of providers that is responsible for the costs of caring for a population while hitting some basic quality metrics.  This model is meant to save money by better coordinating care. As I’ve written before, I’m a pretty big fan of the idea – I think it sets up the right incentives and if an organization does a good job, they should be able to save money for Medicare and get some of those savings back themselves.

ACOs come in two main flavors:  Pioneers and Medicare Shared Savings Program (MSSP).  Pioneers were a small group of relatively large organizations that embarked on the ACO pathway early (as the name implies).  The Pioneer program started with 32 organizations and only 12 remained in 2015.  It remains a relatively small part of the ACO effort and for the purposes of this discussion, I won’t focus on it further.  The other flavor is MSSP.  As of 2016, the program has more than 400 organizations participating and as opposed to Pioneers, has been growing by leaps and bounds.  It’s the dominant ACO program – and it too comes in many sub-flavors, some of which I will touch on briefly below.

A couple more quick facts:  MSSP essentially started in 2012 so for those ACOs that have been there from the beginning, we now have 4 years of results.  Each year, the program has added more organizations (while losing a small number).  In 2015, for instance, they added an additional 89 organizations.

So last week, when CMS announced having saved more than $1B from MSSPs, it appeared to be a big deal. After struggling to find the underlying data, Aneesh Chopra (former Chief Technology Officer for the US government) tweeted the link to me.

You can download the Excel file and analyze the data on your own. I did some very simple stuff. It’s largely consistent with the CMS press release, but as you might imagine, the press release cherry-picked the findings – not a big surprise given that it’s CMS’s goal to paint the best possible picture of how ACOs are doing.

While there are dozens of interesting questions about the latest ACO results, here are 5 quick questions that I thought were worth answering:

  1. How many organizations saved money and how many organizations spent more than expected?
  2. How much money did the winners (those that saved money) actually save and how much money did the losers (those that lost money) actually lose?
  3. How much of the difference between winners and losers was due to differences in actual spending versus differences in benchmarks (the targets that CMS has set for the organization)?
  4. Given that we have to give out bonus payments to those that saved money, how did CMS (and by extension, American taxpayers) do? All in, did we come out ahead by having the ACO program in 2015 – and if yes, by how much?
  5. Are ACOs that have been in the program longer doing better? This is particularly important if you believe (as Andy Slavitt has tweeted) that it takes a while to make the changes necessary to lower spending.

There are a ton of other interesting questions about ACOs that I will explore in a future blog, including looking at issues around quality of care.  Right now, as a quick look, I just focused on those 5 questions.

Data and Approach:

I downloaded the dataset from the CMS website and ran some pretty basic frequencies. Here are data for the 392 ACOs for which CMS reported results:

Question 1:  How many ACOs came in under (or over) target?

Question 2:  How much did the winners save – and how much did the losers lose?

Table 1.

                Number (%)     Number of Beneficiaries     Total Savings (Losses)
Winners         203 (51.8%)
Losers          189 (48.2%)
Total           392 (100%)

I define winners as those organizations that spent less than their benchmark.  Losers were organizations that spent more than their benchmarks.

Takeaway – about half the organizations lost money and about half the organizations made money. If you are a pessimist, you’d say this is what we’d expect; by random chance alone, if the ACOs did nothing, you’d expect half to make money and half to lose money. However, if you are an optimist, you might argue that 51.8% is more than 48.2% – the tilt is towards more organizations saving money, and the winners saved more money than the losers lost.

Next, we go to benchmarks (or targets) versus actual performance.  Reminder that benchmarks were set based on historical spending patterns – though CMS will now include regional spending as part of their formula in the future.

Question 3:  Did the winners spend less than the losers – or did they just have higher benchmarks to compare themselves against?

Table 2.

                   Per Capita Benchmark     Per Capita Actual Spending     Per Capita Savings (Losses)
Winners (n=203)
Losers (n=189)
Total (n=392)

A few thoughts on Table 2. First, the winners actually spent more money, per capita, than the losers. They also had much higher benchmarks – maybe because they had sicker patients, or maybe because they’ve historically been high spenders. Either way, it appears that the benchmark matters a lot when it comes to saving money or losing money.

Next, we tackle the question from the perspective of the U.S. taxpayer.  Did CMS come out ahead or behind?  Well – that should be an easy question – the program seemed to net savings.  However, remember that CMS had to share some of those savings back with the provider organizations.  And because almost every organization is in a 1-sided risk sharing program (i.e. they don’t share losses, just the gains), CMS pays out when organizations save money – but doesn’t get money back when organizations lose money.  So to be fair, from the taxpayer perspective, we have to look at the cost of the program including the checks CMS wrote to ACOs to figure out what happened.  Here’s that table:

Table 3 (these numbers are rounded).

                  Total Benchmarks     Total Actual Spending     Savings to CMS     Paid Out in Shared Savings to ACOs     Net Impact to CMS
Total (n=392)     $73,298 m            $72,868 m                 $429 m             $645 m                                 -$216 m

According to this calculation, CMS actually lost $216 million in 2015.  This, of course, doesn’t take into account the cost of running the program.  Because most of the MSSP participants are in a one-sided track, CMS has to pay back some of the savings – but never shares in the losses it suffers when ACOs over-spend.  This is a bad deal for CMS – and as long as programs stay 1-sided, barring dramatic improvements in how much ACOs save — CMS will continue to lose money.
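To make the asymmetry concrete, here is a toy version of a one-sided settlement plus the Table 3 bottom line. The 50% sharing rate is an illustrative simplification – the real MSSP formula involves minimum savings rates, quality scores, and caps:

```python
# One-sided ("upside only") shared savings: CMS pays the ACO a share of any
# savings, but absorbs the entire loss when an ACO overspends its benchmark.
# The 50% sharing rate is an illustrative simplification of MSSP rules.

SHARING_RATE = 0.5

def cms_net(benchmark: float, actual: float) -> float:
    """CMS's net position for one ACO, in the same units as the inputs."""
    savings = benchmark - actual
    if savings > 0:
        return savings * (1 - SHARING_RATE)  # CMS keeps only part of the gain
    return savings                           # CMS eats the whole loss

# Symmetric performance, asymmetric payoff: one ACO saves $10m, one loses $10m.
print(cms_net(100, 90), cms_net(100, 110))  # CMS nets +5 and -10: -5 overall

# The 2015 bottom line from Table 3 (millions of dollars, rounded):
savings_to_cms = 429   # total benchmarks minus total actual spending
paid_to_acos = 645     # shared-savings checks CMS wrote to ACOs
print(f"net impact to CMS: {savings_to_cms - paid_to_acos} m")
print(f"under target by {savings_to_cms / 73_298:.1%}")
```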

Finally, we look at whether savings have varied by year of enrollment.

Question #5:  Are ACOs that have been in the program longer doing better?

Table 4.

Enrollment Year     Per Capita Benchmark     Per Capita Actual Spending     Per Capita Savings     Net Per Capita Savings (Including Bonus Payments)
2012
2013
2014
2015

These results are straightforward – almost all the savings are coming from the 2012 cohort. A few things are worth pointing out. First, the actual spending of the 2012 cohort is also the highest – they simply had even higher benchmarks. The 2013-2015 cohorts look about the same. So if you are pessimistic about ACOs, you’d say that the 2012 cohort was a self-selected group of high-spending providers who got in early and, because of their high benchmarks, are enjoying the savings. Their results are not generalizable. However, if you are optimistic about ACOs, you’d see these results differently – you might argue that it takes about 3 to 4 years to really retool healthcare services, which is why only the 2012 ACOs have done well. Give the later cohorts more time and we will see real gains.

Final Thoughts:

This is decidedly mixed news for the ACO program. I’ve been hopeful that ACOs had the right set of incentives and enough flexibility to really begin to move the needle on costs. We are now four years into the program and the results have not been a home run. For those of us who are fans of ACOs, there are three things that should sustain our hope. First, overall, the ACOs seem to be coming in under target, albeit just slightly (about 0.6% below target in 2015), and generating savings (as long as you don’t count what CMS pays back to ACOs). Second, the longer-standing ACOs are doing better, and maybe that portends good things for the future – or maybe they are just a self-selected group whose experience isn’t generalizable. And finally – and this is the most important issue of all – we have to continue to move towards getting all these organizations into a two-sided model where CMS can recoup some of the losses. Right now, we have a classic “heads – ACO wins, tails – CMS loses” situation, and it simply isn’t financially sustainable. Senior policymakers need to continue to push ACOs into a two-sided model, where they can share in savings but also have to pay back losses. Barring that, there is little reason to think that ACOs will bend the cost curve in a meaningful way.

Making Transparency Work: why we need new efforts to make data usable

Get a group of health policy experts together and you’ll find one area of near universal agreement: we need more transparency in healthcare. The notion behind transparency is straightforward: greater availability of data on provider performance helps consumers make better choices and motivates providers to improve. And there is some evidence to suggest it works. In New York State, after cardiac surgery reporting went into effect, some of the worst-performing surgeons stopped practicing or moved out of state, and overall outcomes improved. But when it comes to hospital care, the impact of transparency has been less clear-cut.

In 2005, Hospital Compare, the national website run by the Centers for Medicare and Medicaid Services (CMS), started publicly reporting hospital performance on process measures – many of which were evidence-based (e.g., using aspirin for acute MI patients). By 2008, evidence showed that public reporting had dramatically increased adherence to those process measures, but its impact on patient outcomes was unknown. A few years ago, Andrew Ryan published an excellent paper in Health Affairs examining just that, and found that more than 3 years after Hospital Compare went into effect, there had been no meaningful impact on patient outcomes. Here’s one figure from that paper:

Ryan et al

The paper was widely covered in the press — many saw it as a failure of public reporting. Others wondered if it was a failure of Hospital Compare, where the data were difficult to analyze. Some critics shot back that Ryan had only examined the time period when public reporting of process measures was in effect and it would take public reporting of outcomes (i.e. mortality) to actually move the needle on lowering mortality rates. And, in 2009, CMS started doing just that – publicly reporting mortality rates for nearly every hospital in the country.  Would it work? Would it actually lead to better outcomes? We didn’t know – and decided to find out.

Does publicly reporting hospital mortality rates improve outcomes?

In a paper released on May 30 in the Annals of Internal Medicine, we – led by the brilliant and prolific Karen Joynt – examined what happened to patient outcomes since 2009, when public reporting of hospital mortality rates began.   Surely, making this information public would spur hospitals to improve. The logic is sound, but the data tell a different story. We found that public reporting of mortality rates has had no impact on patient outcomes. We looked at every subgroup. We even examined those that were labeled as bad performers to see if they would improve more quickly. They didn’t. In fact, if you were going to be faithful to the data, you would conclude that public reporting slowed down the rate of improvement in patient outcomes.

So why is public reporting of hospital performance doing so little to improve care?  I think there are three reasons, all of which we can fix if we choose to. First, Hospital Compare has become cumbersome and now includes dozens (possibly hundreds) of metrics. As a result, consumers brave enough to navigate the website likely struggle with the massive amounts of available data.


A second, related issue is that the explosion of all that data has made it difficult to distinguish between what is important and what is not. For example – chances that you will die during your hospitalization for heart failure? Important. Chances that you will receive an evaluation of your ejection fraction during the hospitalization? Less so (partly because everyone does it – the national average is 99%). With the signal buried among the noise, it is hardly surprising that no one seems to be paying attention — and the result is little actual effect on patient outcomes.

The third issue is how the mortality measures are calculated. The CMS models are built using Bayesian “shrinkage” estimators that try to take uncertainty based on low patient volume into account. This approach has value, but it’s designed to be extremely conservative, tilting strongly towards protecting hospitals’ reputation. For instance, the website only identifies 23 out of the 4,384 hospitals that cared for heart attack patients as being worse than the national rate – about 0.5%. In fact, many small hospitals have some of the worst outcomes for heart attack care – yet the methodology is designed to ensure that most of them look about average. If a public report card gives 99.5% of hospitals a passing grade, we should not be surprised that it has little effect in motivating improvement.
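To make the shrinkage idea concrete, here is a minimal sketch of how an estimator of this general kind behaves. This is not the actual CMS model – the prior strength, rates, and volumes below are invented for illustration – but it shows the core mechanic: a hospital’s observed rate is blended with the national rate, and low-volume hospitals get pulled hardest toward average.

```python
# Illustrative empirical-Bayes-style shrinkage of hospital mortality rates.
# NOT the CMS methodology -- a toy sketch with made-up numbers.

def shrunk_rate(deaths, cases, national_rate, prior_strength=100):
    """Blend a hospital's observed mortality rate with the national rate.

    prior_strength acts like a count of "phantom" cases at the national
    rate; it is a hypothetical tuning constant chosen for illustration.
    """
    observed = deaths / cases
    weight = cases / (cases + prior_strength)  # small volume -> small weight
    return weight * observed + (1 - weight) * national_rate

national = 0.15  # assumed national 30-day mortality rate

# A tiny hospital with very poor observed mortality (8/20 = 40%)...
small = shrunk_rate(deaths=8, cases=20, national_rate=national)
# ...gets shrunk most of the way back toward the 15% national rate,
# while a large hospital with the same 40% observed rate keeps its signal.
large = shrunk_rate(deaths=400, cases=1000, national_rate=national)

print(round(small, 3))  # ~0.192: looks close to average
print(round(large, 3))  # ~0.377: still looks clearly bad
```

With a strong enough prior, even genuinely poor small hospitals land near the national average – which is exactly the dynamic that lets 99.5% of hospitals pass.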

Fixing public reporting

There are concrete things that CMS can do to make public reporting better. One is to simplify the reports. CMS is actually taking important steps towards this goal and is about to release a new version that will rate all U.S. hospitals from one to five stars based on their performance across 60 or so measures. While the simplicity of the star ratings is good, the current approach combines useful measures with less useful ones and uses weighting schemes that are not clinically intuitive. Instead of imposing a single set of values, CMS could build a tool that lets consumers create their own star ratings based on their personal values, so they can decide which metrics matter to them.
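A “build your own star rating” tool could be as simple as a weighted composite. The sketch below is hypothetical – the measure names, scores, and weights are invented, and CMS’s actual star-rating methodology is different and more complex – but it shows how two patients with different values could get different ratings for the same hospital.

```python
# Hypothetical consumer-weighted star rating. Measure names, scores,
# and weights are invented for illustration; this is not CMS's method.

def personal_stars(scores, weights):
    """Combine 0-100 measure scores into a 1-5 star rating
    using a consumer's own weights."""
    total_weight = sum(weights[m] for m in scores)
    composite = sum(scores[m] * weights[m] for m in scores) / total_weight
    # Map the 0-100 weighted composite onto 1-5 stars.
    return 1 + round(composite / 25)

hospital = {"mortality": 90, "readmissions": 60, "patient_experience": 40}

# A patient who cares mostly about survival:
survival_first = {"mortality": 5, "readmissions": 2, "patient_experience": 1}
# A patient who weighs experience heavily:
experience_first = {"mortality": 1, "readmissions": 1, "patient_experience": 5}

print(personal_stars(hospital, survival_first))    # 4 stars
print(personal_stars(hospital, experience_first))  # 3 stars
```

The point of the design is that the weights live with the consumer, not with CMS – the same underlying measure scores yield different ratings depending on what each patient values.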

Another step is to change the approach to calculating the shrunk estimates of hospital performance. The current approach gives too little weight to both a hospital’s historical performance and the broader volume-outcome relationship. There are technical, methodological issues that can be addressed in ways that identify more hospitals as likely outliers and create more of an impetus to improve. The decision to only identify a tiny fraction of hospitals as outliers is a choice – and not inherent to public reporting.

Finally, CMS needs to use both more clinical data and more timely data. The current mortality data available from CMS represent care that was delivered between July 2011 and June 2014 – so the average patient in that sample had a heart attack nearly 2 ½ years ago. It is easy for hospitals to dismiss the data as old and for patients to wonder if the data are still useful. Given that nearly all U.S. hospitals have now transitioned to electronic health records, it should not be difficult to obtain clinical data and build risk-adjusted mortality models that are both superior and current.

None of this will be easy, but it is all doable. We learned from the New York State experience, as well as the early years of Hospital Compare, that public reporting can have a big impact when there is sizeable variation in what is being reported and organizations are motivated to improve. But with nearly everyone getting a passing grade on a website that is difficult to navigate and doesn’t differentiate between measures that matter and those that don’t, improvement just isn’t happening. We are being transparent so we can say we are being transparent. So, the bottom line is this – if transparency is worth doing, why not do it right? Who knows, it might even make care better and create greater trust in the healthcare system. And wouldn’t that be worth the extra effort?