The Merit-based Incentive Payment System: Pearson’s Chi-Square and Categorical Dependent Variable Models Analyzed for Domains—Effective Clinical Care and Efficiency/Cost Reduction

Background: Following the 2015 repeal of the Sustainable Growth Rate formula, the US Centers for Medicare & Medicaid Services’ formula under which physicians were reimbursed, two payment systems were put in place to incentivize physicians, one of which was the Merit-based Incentive Payment System (MIPS). MIPS emphasizes high-quality care that is accessible, affordable, and supports a healthier population. Objectives: This research aims to measure characteristics of MIPS relevant to National Quality Strategy (NQS) domains, quality measure types, and clinical specialties; categorize MIPS with NQS domains and quality measure types by MIPS specialty types; and quantify the relationship between MIPS specialties, measure types, and two NQS domains, Effective Clinical Care (ECC) and Efficiency/Cost Reduction (E/CR), for years 2017 through 2020. Methodology: The Pearson’s chi-square test examined distributions of the analyzed categorical variables. The Categorical Dependent Variable Method examined the association between the dependent and independent variables. Results: The Pearson’s chi-square test showed statistically significant distributions between ECC and E/CR when analyzed with the types of quality measures. There were more process measures (93.81% vs 89.64% [P=.000]) in 2018 versus 2017. This changed minutely with significantly fewer process measures (93.75% vs 93.81% [P=.000]) in 2019 versus 2018. Finally, measure types changed minutely but significantly with fewer process measures (93.81% vs 93.75% [P=.000]) in 2020 versus 2019. The regression model showed that ECC was significantly associated with outcome measures through all analyzed years of this research. Conclusion: The above findings show scope for including additional outcome measures, given their importance in MIPS. There is potential to increase the percentage allocation for reporting more outcome measures in quality. This re-allotment implies reporting more outcome measures that align with priority outcome measures (PROMs). Re-allocating the incentive formula in this way shows potential to increase reporting of outcome measures under MIPS.


INTRODUCTION
Providing high-quality care is one of the foremost goals of the health care system. 1 In general, quality care that is accessible, affordable, and provides health benefits to the population has been one of the overarching goals of many US health care programs. 1 These programs include the Merit-based Incentive Payment System (MIPS), the Hospital-Acquired Condition Reduction Program, and the Hospital Readmission Reduction Program. These programs emphasize cost, quality, and access to create a healthier population.
Prior to April 2015, the US Centers for Medicare & Medicaid Services (CMS) reimbursed physicians based on the Sustainable Growth Rate formula. [...] health of the US population by supporting proven interventions to address behavioral, social, and environmental determinants of health in addition to delivering higher-quality care. 14 Affordable Care refers to reducing the cost of quality health care for individuals, families, employers, and government. 14

Quality Domains
MACRA quality domains are: (1) clinical care, (2) safety, (3) care coordination, (4) patient and caregiver experience, and (5) population health and prevention. 15 As per the CMS' Quality Measure Development Plan, the MACRA quality domains, which were mandated for use in MIPS, align with the NQS priority areas and CMS Quality Strategy goals. 15 CMS strives to clearly align quality measures with these domains to address gaps, drive quality improvement, and ensure the CMS Quality Strategy goals are achieved. 15 CMS proposed including Affordable Care as a quality domain for MIPS along with the five MACRA quality domains. 15
The six MIPS quality domains are: (1) Effective Clinical Care (ECC), (2) Efficiency/Cost Reduction (E/CR), (3) Communication and Care Coordination, (4) Person and Caregiver Experience and Outcomes, (5) Patient Safety, and (6) Community and Population Health. 16 MIPS utilizes these six domains in its Quality Performance Category Percent Score formula to compute points in its scoring system to provide incentives to EPs. 16 As per the Agency for Healthcare Research and Quality, these six MIPS domains also align with NQS priorities. 17 The NQS focuses on these six priorities to advance its initiatives in the three broad NQS aims. 17 The ECC domain aligns with measures incorporating patient preferences, shared decision-making, and outcome measures. 16 The E/CR domain aligns with decreasing overuse measures by improving efficiencies, cost-effectiveness, and, therefore, affordability of care. 16 ECC aims to improve the health of the population, whereas E/CR focuses on improving the affordability and cost-effectiveness of treatments.
The MIPS Quality Performance Category factsheet also lists seven quality measure types: (1) Structure, (2) Process, (3) Outcome, (4) Intermediate Outcome, (5) Patient Reported Outcome, (6) Efficiency, and (7) Patient Engagement and Patient Experience. 16 Structure examines the physician's capacity, systems, and scope to provide high-quality care. 16 Process looks at the physician's initiatives to maintain and improve the health of patients. 16 Outcome measures determine how a physician's line of treatment affects the patient's health status. 16 Intermediate Outcome measures assess specific metrics that contribute to an outcome. 16 Patient Reported Outcome measures are the outcomes reported directly by the consumers. 16 Efficiency measures evaluate the affordability of health care treatments and whether those can be made more cost-effective for the patient. 16 Patient Engagement and Patient Experience measures use direct feedback from patients and their caregivers about their care experience. 16 This information is usually collected through surveys. 16

OBJECTIVES
Little is known from past literature about whether MIPS has had an impact on ECC and E/CR. The broad aim, therefore, is to explore whether MIPS has any influence on the effectiveness of care and the affordability of treatments. To that end, this research examines MIPS with regard to the ECC and E/CR domains; in other words, it aims to understand how MIPS is associated with health care outcomes and affordability, given these two NQS domains. An extensive review of the literature failed to detect an evaluation of MIPS in the context of QPP measure types and NQS domains. Given this gap in the literature, the topic deserves investigation to understand how MIPS is associated with the effectiveness and affordability of care. This understanding will contribute information regarding the association of MIPS with two specific NQS domains, ECC and E/CR, and with quality measure types, given the medical and surgical specialties analyzed in this research.
This research has a three-pronged objective. The first objective is to observe the count and percentage of MIPS measures that apply to quality measure types, NQS domains, and specialties. This gives us insight into how NQS domains and quality measure types are distributed in the process of reporting MIPS measures. These results describe the distributions and percentage allocations across the categories of these variables.
The second objective is to analyze how MIPS measures are reported according to quality measure types and NQS domains by specialties. This information will present annual comparisons based on how reporting MIPS measures varied in the past according to specialties, over the years 2017 through 2020.
The third objective is to identify whether ECC and E/CR are significantly associated with quality measure types and specialties. This information will assist in quantitatively estimating how the two NQS domains associate with quality measure types and specialties in this research.
The purpose of these three objectives is to explore how MIPS' impact on care effectiveness and affordability is relevant to MIPS as a health care policy.

Database
The database was retrieved from CMS' publicly available QPP website for years 2017, 2018, 2019, and 2020. [18][19][20][21] Data pools were obtained from the dropdown menus of quality measure types, specialty measure set, and collection types. These data pools were leveraged to build annual datasets for this research. The 2017 MIPS measure list was the earliest list available.

Measure Categorization
Quality measure types, NQS domains, and medical/surgical specialties were the categorical variables of this research. Table 2 presents the list of medical and surgical (or clinical) specialties pertaining to this research. These clinical specialties were selected to cover a broad range of specialties and be reasonably inclusive of the different types of specialties. These clinical specialties were leveraged to analyze the effect MIPS would have on the NQS' ECC and E/CR domains and quality measure types.

Dependent and Independent Variables
The ECC and E/CR NQS domains were the primary dependent variables, given their bearing on clinical outcomes and affordability of health care services. QPP measure types and select MIPS clinical specialties were the independent variables.

Statistical Methods
This research utilized two statistical methods. First, Pearson's chi-square test, a nonparametric test, 22 was applied to cross-tabulations of the variables. It presents distributions 24 of the categorical variables in a contingency table and indicates whether the differences between the analyzed variables are statistically significant. 22,23 Second, the Categorical Dependent Variable Method (CDVM) 24 was used to determine the association between the dependent and independent variables specific to this research. More specifically, a binary logit regression model was used to examine the association of each of the two dependent variables, ECC and E/CR, in separate regression equations, with the independent variables.
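As an illustration of the first method, the following is a minimal sketch of Pearson's chi-square statistic computed on a cross-tabulation. The counts are hypothetical and are not the study's data, the function names are chosen for this example only, and the P value helper is valid only for a table with 1 degree of freedom (a 2 x 2 table). It is not the software or code used by the authors.

```python
import math

def pearson_chi_square(table):
    """Return (statistic, dof, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected

def p_value_one_dof(statistic):
    """Upper-tail P value of a chi-square statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))

# Hypothetical counts (assumption, not the study's data).
# Rows: process, outcome measures; columns: ECC, E/CR domains.
observed = [[10, 20], [20, 10]]
statistic, dof, expected = pearson_chi_square(observed)
p_value = p_value_one_dof(statistic)
print(f"chi-square = {statistic:.3f}, dof = {dof}, P = {p_value:.4f}")
```

For tables with more than 1 degree of freedom, the P value would instead come from the chi-square distribution with the corresponding degrees of freedom, which a statistics package normally provides.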
Pearson's chi-square test does not quantify the strength or direction of the association between variables. 22 The CDVM regression was therefore used as the second analytical step, after Pearson's chi-square test, to estimate the association between the two dependent variables, ECC and E/CR, and the independent variables. There are several coding methods in regression models, of which the most common is dummy variable coding. 25,26 The categories of quality measure types and clinical specialties were coded as binary or dummy (0/1) variables. 27
Table 3 summarizes quality measure types, NQS domains, and clinical specialties participating and reporting under MIPS.
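The dummy (0/1) coding of a categorical variable can be sketched as follows. The category labels, the function name, and the choice of "process" as the omitted reference level are assumptions made for this example; the source does not state which reference level, if any, was used.

```python
def dummy_code(values, reference):
    """Code a categorical variable as 0/1 indicators, omitting the reference level."""
    levels = sorted(set(values) - {reference})
    return [{level: int(value == level) for level in levels} for value in values]

# Hypothetical measure types for four measures (assumption, not the study's data);
# "process" is the assumed reference level.
measure_types = ["process", "outcome", "process", "structure"]
coded = dummy_code(measure_types, reference="process")
for original, row in zip(measure_types, coded):
    print(original, row)
```

Omitting one reference level keeps the indicators from being perfectly collinear with the regression intercept; each remaining coefficient is then read relative to that level.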

RESULTS
Compared with 2019, the distribution of measures across clinical specialties in 2020 differed significantly. The quality measure type also changed significantly, with fewer process measures (93.81% vs 93.75% [P=0.000]) in 2020 versus 2019. There was no statistically significant difference in the NQS domain categories in 2020 compared with 2019 (Table 4).
Tables 5A and 5B summarize MIPS measure characteristics by clinical specialties for year 2019. When quality measure types were assessed, statistical significance was observed for the analyzed specialties. Orthopedic surgery had the highest number of process measures, 15 (100%), and obstetrics/gynecology had the lowest number of intermediate outcome measures, 1 (7.14%). No statistical significance was observed between the NQS domains and analyzed specialties. Table 6 summarizes the results of the CDVM for the two NQS domains, ECC and E/CR, when regressed with quality measure types and clinical specialties, for years 2017 through 2020.
The binary logit regression model for year 2017 estimated that a one-unit increase in the outcome measure (a nested subcategory of the quality measure type independent variable) multiplied the odds of a measure belonging to the ECC domain by 0.82, a statistically significant association, when all other variables are held constant.
Similarly, the regression models for years 2018, 2019, and 2020 estimated that a one-unit increase in the outcome measure multiplied the odds of a measure belonging to the ECC domain by 0.54, 0.99, and 0.95, respectively, each a statistically significant association, when all other variables are held constant.
For E/CR, the binary logit regression models for 2017 through 2020 estimated the corresponding odds ratios for the outcome measure, but these were not statistically significant (based on the P values), when all other variables are held constant.
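To make the odds interpretation concrete, the sketch below fits a binary logit model with a single 0/1 predictor by Newton-Raphson and compares the exponentiated coefficient with the odds ratio read directly from the 2 x 2 counts. The counts, variable names, and the single-predictor setup are assumptions for illustration; the study's models include further independent variables and its estimates are not reproduced here.

```python
import math

def fit_logit_one_dummy(x, y, iterations=25):
    """Fit logit(P(y=1)) = b0 + b1*x by Newton-Raphson for one 0/1 predictor."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Gradient of the log-likelihood and entries of the information matrix.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system H * delta = g.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical counts (assumption, not the study's data):
# outcome measures (x=1): 8 in ECC, 12 not; other measures (x=0): 90 in ECC, 60 not.
x = [1] * 20 + [0] * 150
y = [1] * 8 + [0] * 12 + [1] * 90 + [0] * 60

b0, b1 = fit_logit_one_dummy(x, y)
odds_ratio = (8 * 60) / (12 * 90)  # odds of ECC when x=1 divided by odds when x=0
print(f"exp(b1) = {math.exp(b1):.4f}, odds ratio from counts = {odds_ratio:.4f}")
```

With a single binary predictor the two quantities coincide, which is why an exponentiated logit coefficient below 1 is read as the predictor multiplying the odds of domain membership by that factor.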

DISCUSSION
The analysis exhibited both growth and scope for advancement of MIPS in terms of measure selection. MIPS has expanded its reporting of quality improvement measures by including different measure types across clinical specialties.
The number of MIPS process measures increased by 19 between 2017 and 2018. Those additional 2018 process measures for MIPS specialties are listed in Table 7.
In connection to the scope for advancement, MIPS had a limited number of measure types that focused on outcomes. The numbers of priority outcome measures (PROMs) reported for MIPS Medicare Part B measures in 2017, 2018, 2019, and 2020 were 7, 3, 2, and 2, respectively. [18][19][20][21] Outcome measures may need to be included more because they "often directly measure events with the clearest importance in clinical management and treatment." 27 Most physicians and patients are interested in high-quality outcomes following treatment and surgical procedures. These outcomes include, for example, survival rates following major surgeries, decreasing hospital readmission rates, reducing preventable hospitalizations, preserving the quality of life after medical treatments, and controlling pneumonia and heart failure, to name a few. 28 Notably, the Agency for Healthcare Research and Quality has emphasized that "outcome measures reflect the impact of healthcare services or interventions on the health statuses of patients." 29 Additionally, MIPS has a "priority focus on outcome measures, [which are] measures that provide new measure options within a topped-out specialty area or measures that are relevant for specialty providers." 30 Kessell et al have importantly noted that "health outcomes are essential for assessing quality of care and include a variety of health statuses." 31 Increasing the number of outcome measures may broaden the scope for advancing health statuses.
Addressing not only outcomes but all six of the listed NQS domains is a substantial initiative. These NQS domains are only one of the many ways to address the Institute of Medicine's six aims for an optimized health care system. The Institute of Medicine's six aims (safe, efficient, effective, patient-centered, timely, and equitable) are listed in the 2001 Crossing the Quality Chasm report. 27,32 In MIPS, outcome measures are the "gold standard" in measuring quality. 33 Outcome measures show how a health care service or intervention influences the health status of patients. 33 All seven quality measure types (efficiency, intermediate outcome, outcome, patient-reported outcome, patient engagement and experience, process, and structure) include high-priority measures. 33 High-priority measures are not an additional measure type in themselves but an incorporated aspect within the seven quality measure types. 33
MIPS is the most expansive pay-for-performance program to date. 34 In a national survey analyzing MIPS' impact on value of care, 19% to 24% of survey respondents believed MIPS would, in fact, reduce the value of care. 34 In the same vein, physicians were wary of how MIPS would affect the value of care. 34 Under MIPS, provider reimbursement is determined by two primary factors: a composite score and program participation that provides fiscal incentives ranging from 4% to 9%, depending on the year of reporting. 8 The reporting of MIPS potentially creates administrative burdens given its "extensive and complex quality reporting requirements" that are intensive and time-consuming. 35 These additional reporting requirements may come at the cost of diverting time and resources from direct patient care, thereby negatively impacting patient and provider satisfaction scores. 36 A negative impact on patient and/or provider satisfaction scores may extend a similar impact on the quality of care.
Lower patient satisfaction may be the likely outcome of lower standards of care just as excessive diagnostic testing may potentially result in less satisfied patients. 36 Green et al (2017) found that provider incentive payment programs increased the number of incentivized measures included in the MIPS cohort in their simulated experiment. 36 They observed that unnecessary diagnostic testing puts the patient at a greater risk for overtreatment, likely increasing the cost of health care. 36 In their experiment, clinicians incentivized under MIPS were more likely to overtreat and be paid more resulting in additional system costs, thereby stressing an already overstretched health care system. 36

Limitations and Strengths
This research is specific to MIPS as a policy and may not extend to other pay-for-performance, incentive-based policies. Findings are limited to the clinical specialties that were analyzed for this research.
This research describes two NQS domains in relation to clinical outcomes and affordability of care. It presents the effect of MIPS as an aggregate overall policy and not at the granular domain levels that contribute to the MIPS final score.
Pearson's chi-square test is sensitive to sample size. 37 The test is not recommended if the sample size is less than 50 units. 37 This limitation does not apply to this research because the sample size was at least 193 for each of the analyzed years.
The observations of this research, namely the quality measure types, NQS domains, and clinical specialties, were distinct units and had no overlaps. Pearson's chi-square test assumes that the observations are randomly selected from the total population. 37 The observations were, in fact, selected in no specific or sequential order when analyzed for this research.
The CDVM is not suited to showing changes in levels and trends in research observations. The purpose of this research was not to observe changes in levels and trends but to understand the distributions and associations of the analyzed variables, and thereby to understand this policy with regard to effectiveness and affordability of care. The CDVM is limited in its analysis to categorical variables, which this research has employed.
Considering the strengths of this research and its statistical methods, this research analyzed each MIPS reporting year starting in 2017, the first year for reporting MIPS measures. This information may be useful in providing historical base data for comparing future values while MIPS is in practice in forthcoming years.
The association between the categorical variables of this research was analyzed by leveraging two statistical methods, Pearson's chi-square test and the binary logit regression model. Pearson's chi-square test does not give information about the strength of the associations between the analyzed variables, which is why a regression model was leveraged. The regression model made it feasible to quantitatively examine the associations among the categories of the analyzed variables.

Future Work
It will be interesting to observe the effects of MIPS using a longitudinal panel regression model. Panels could be constructed to include quality measure types, the years of analysis, and clinical specialties for regressing with the dependent variable. Using NQS domains as the dependent variable in such a longitudinal panel regression model would be another avenue for further research on this policy.

JOURNAL OF HEALTH ECONOMICS AND OUTCOMES RESEARCH
Practice and Policy Implications
MIPS may be discontinued due to its heavy administrative burdens and diversion of time and resources from direct patient care. 38 MIPS has had little positive impact on both physician and patient satisfaction. 38 Consequently, MIPS may have failed to improve the patient experience of care. 42 Moreover, MIPS may not measure the broader aspects of health care quality and may even risk worsening existing health disparities. 38
MIPS has specific reporting requirements allocated according to a defined formula. 8 This formula determines the pay-for-performance and bonus point incentive payments for specialties and hospitals reporting under MIPS. In 2021, this formula comprised four domains with the following percentage allocations: quality (40%), cost (20%), improvement activities (15%), and promoting interoperability (25%). 39 From a policy implications perspective, MIPS has the potential to increase the percentage allocation to report more quality outcome measures in its incentive-score formula. Adjusting this incentive formula to reflect increased reporting of quality outcome measures would also support reporting more metrics that align with PROMs. This formula adjustment would create and record more quality outcomes in treatment.
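The 2021 percentage allocations can be read as weights in a weighted sum. The sketch below shows only that arithmetic: the category scores are hypothetical, the function and variable names are invented for this example, and any additional scoring rules CMS applies (for example, bonus points) are deliberately left out. It should not be read as the official MIPS scoring procedure.

```python
# 2021 percentage allocations as stated in the text, expressed as fractions.
WEIGHTS_2021 = {
    "quality": 0.40,
    "cost": 0.20,
    "improvement_activities": 0.15,
    "promoting_interoperability": 0.25,
}

def weighted_score(category_scores, weights=WEIGHTS_2021):
    """Combine category scores (each between 0 and 1) into a 0-100 weighted score."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return 100.0 * sum(weights[name] * category_scores[name] for name in weights)

# Hypothetical category scores for one clinician (assumption, not real data).
example_scores = {
    "quality": 0.80,
    "cost": 0.50,
    "improvement_activities": 1.00,
    "promoting_interoperability": 0.60,
}
score = weighted_score(example_scores)
print(f"weighted score = {score:.1f}")
```

Under this reading, raising the quality weight increases the influence of quality reporting, including outcome measures, on the final score, provided the other weights are reduced so that the total remains 100%.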

CONCLUSION
This research analyzed MIPS through the lenses of health care outcomes and affordability pertaining to ECC and E/CR. MIPS has a priority focus on outcome measures. In addition to being reflective of health statuses of patients, outcome measures are the "gold standard" in measuring high-quality health care. 33 These outcome measures signify the patients' health statuses, that is, the result of the treatment, procedure, or intervention. Most patients and physicians are interested in outcome measures of health statuses following treatment, surgery, and intervention.
Policy makers and analysts may need to review the components of MIPS and its percentage requirements of reporting. It may be helpful to preserve MIPS' existing areas and expand areas where reporting more outcome measures could be showcased.