The Medicare Health Outcomes Survey (HOS) assesses a Medicare Advantage Organization's (MAO) ability to maintain or improve the physical and mental health functioning of its Medicare beneficiaries over a two-year period, using the best available science in functional status and health outcomes measurement. The survey measures how the care provided by MAOs affects the functional status of their enrollees. The Centers for Medicare & Medicaid Services (CMS) includes the HOS in its performance assessment program; for example, HOS results are included in the CMS Medicare Star Ratings. For more information, go to the HOS and the Star Ratings page.
The first cohort of baseline data was collected in 1998. Beginning in 2000, data were collected for both a baseline cohort and a two-year follow up cohort. During the most recent survey administration (2021 Round 24), Cohort 24 Baseline and Cohort 22 Follow Up data were collected. The most recently available results are from the 2020 Cohort 23 Baseline and the 2019-2021 Cohort 22 Performance Measurement, which combines data from the 2019 Cohort 22 Baseline and 2021 Cohort 22 Follow Up surveys. For further information about the survey fielding schedule, please refer to the survey administration timeline. For further information about the dissemination of survey results, please refer to the Data Dissemination page. The overall timeline for Medicare Part C Star Ratings data collection and posting of results is found on the Star Ratings page. For general status information, including response rates for the baseline and follow up cohorts administered and reported to date, view or download the table below:
After the administration of each follow up cohort, cohort-specific performance measurement results are calculated. Seniors (age 65 or older) who had a physical component summary (PCS) score or mental component summary (MCS) score that could be calculated at baseline are eligible for performance measurement. However, some of these seniors belong to MAOs that went out of business or discontinued offering managed care between the baseline and follow up samples. These seniors are classified as "involuntarily disenrolled" for purposes of performance measurement. Therefore, the Performance Measurement Analytic Sample is limited to those seniors who had physical or mental health summary scores that could be calculated at baseline and were still enrolled in the same participating MAO at the time of the follow up sampling. Additionally, some seniors voluntarily disenroll from their MAOs between baseline and follow up. These seniors are classified as "voluntarily disenrolled" for purposes of performance measurement. Seniors who die between baseline and follow up are classified as "dead" for purposes of performance measurement.
Of the seniors sampled at follow up, a certain percentage are determined to be ineligible for inclusion in the sample. These ineligible seniors meet one or more of the following criteria: not enrolled in the MAO; bad address and phone number; or language barrier. Of the seniors eligible for inclusion in follow up, those who do not return a completed survey are designated as "non-respondents" and those who return a completed survey are referred to as "respondents." For a table that depicts the distribution of the Performance Measurement Analytic Sample for the completed cohorts to date, view or download the table below:
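The classification rules described above can be sketched in code. The following Python example is a minimal illustration only, not the official HOS processing logic: the `Senior` record, its field names, and the order in which the checks are applied are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Senior:
    """Hypothetical record for one sampled beneficiary (field names are assumed)."""
    age_at_baseline: int
    has_baseline_pcs: bool
    has_baseline_mcs: bool
    mao_still_participating: bool
    died_before_follow_up: bool
    still_enrolled_in_same_mao: bool
    ineligible_reason: Optional[str] = None  # e.g. "language barrier"
    returned_completed_survey: bool = False


def follow_up_status(s: Senior) -> str:
    """Assign one status label; the order of the checks is an assumption."""
    # Eligibility: age 65 or older with a calculable PCS or MCS score at baseline.
    if s.age_at_baseline < 65 or not (s.has_baseline_pcs or s.has_baseline_mcs):
        return "not eligible for performance measurement"
    # The MAO went out of business or stopped offering managed care.
    if not s.mao_still_participating:
        return "involuntarily disenrolled"
    if s.died_before_follow_up:
        return "dead"
    if not s.still_enrolled_in_same_mao:
        return "voluntarily disenrolled"
    # Sampled at follow up but found ineligible (not enrolled, bad contact
    # information, or language barrier).
    if s.ineligible_reason is not None:
        return "ineligible at follow up"
    return "respondent" if s.returned_completed_survey else "non-respondent"
```

Each beneficiary receives exactly one label, so counting the labels across a cohort gives a distribution of the sample of the kind the table summarizes.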
A performance measurement data set is created by merging a cohort's baseline and follow up data. Additionally, death information is incorporated into the performance measurement data set for those baseline respondents who died between baseline and the two-year follow up. The HOS performance measurement results are computed using rigorous case-mix/risk-adjustment models. The two longitudinal HOS functional health outcomes are based on risk-adjusted mortality rates, changes in physical health as measured by the PCS score, and changes in mental health as measured by the MCS score for the participating MAOs. For reporting purposes, death and PCS outcomes are combined into one overall measure of change in physical health. For the Medicare Part C Star Ratings, the primary outcomes are reported as the percentage of respondents within an MAO who are “Improving or Maintaining Physical Health” and the percentage within an MAO who are “Improving or Maintaining Mental Health” over the two-year period, after adjustment for case-mix (see HOS and the Medicare Star Ratings).
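The merge step described above can be illustrated with a short sketch. The example below assumes, purely for illustration, that baseline and follow up data are held as dictionaries keyed by a hypothetical beneficiary identifier and that death information is a set of identifiers. The actual HOS file layouts and variable names are not described here and may differ.

```python
from typing import Dict, List, Optional, Set

# Hypothetical survey record: summary scores may be missing (None).
Scores = Dict[str, Optional[float]]


def build_performance_records(
    baseline: Dict[str, Scores],
    follow_up: Dict[str, Scores],
    deaths: Set[str],
) -> List[dict]:
    """Merge baseline and follow up scores and attach a death indicator."""
    records = []
    for bene_id, base in baseline.items():
        # A beneficiary without a follow up survey gets empty follow up scores.
        fu = follow_up.get(bene_id, {})
        records.append({
            "bene_id": bene_id,
            "pcs_baseline": base.get("pcs"),
            "mcs_baseline": base.get("mcs"),
            "pcs_follow_up": fu.get("pcs"),
            "mcs_follow_up": fu.get("mcs"),
            "died": bene_id in deaths,
        })
    return records
```

The merge starts from the baseline records so that beneficiaries who died or did not respond at follow up remain in the data set with missing follow up scores.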
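There are six main categories of actual health outcomes used in the performance measurement analysis: three for physical health (alive and PCS better, alive and PCS same, and dead or PCS worse) and three for mental health (MCS better, MCS same, and MCS worse).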
Each beneficiary is classified into one of the three physical health categories and one of the three mental health categories based on their actual health outcomes. Beneficiary-level results are aggregated to derive the MAO, state, and HOS national percent better, same, and worse values for actual health outcomes. In calculating expected outcomes, separate case-mix models are warranted for death, for PCS scores, and for MCS scores. The expected results are adjusted for the case-mix of beneficiaries within an MAO to control for pre-existing differences across MAOs such as baseline measures of sociodemographic characteristics, chronic medical conditions, and functional health status. The PCS results are combined with the percentage remaining alive in the MAO. A series of six different death models, three different physical health models, and three different mental health models are used, since not all beneficiaries have data for all of the independent variables that could be used to calculate an expected score. In other words, each expected outcome for a beneficiary is derived from the best-fitting model among those whose variables the beneficiary has data for. One model is used for each beneficiary, and no predictions are made with missing data.
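Two of the steps described above lend themselves to a short sketch: assigning an actual physical health category, and choosing a single model per beneficiary based on which variables are available. The Python example below rests on several assumptions that are not stated on this page: the `threshold` for a meaningful change in score is a hypothetical parameter, death is treated as falling in the "worse" physical health category, and the candidate models are assumed to be listed from most to least preferred.

```python
from typing import Optional, Sequence, Set


def physical_category(
    died: bool, pcs_change: Optional[float], threshold: float
) -> Optional[str]:
    """Combine death and PCS change into better, same, or worse.

    `threshold` is a hypothetical cut-off for a meaningful change; the
    value used in the actual analysis is not given here.
    """
    if died:
        return "worse"  # assumption: death is grouped with a worse PCS outcome
    if pcs_change is None:
        return None  # no follow up PCS score, so no category is assigned
    if pcs_change > threshold:
        return "better"
    if pcs_change < -threshold:
        return "worse"
    return "same"


def select_model(
    available: Set[str], models: Sequence[Sequence[str]]
) -> Optional[int]:
    """Return the index of the first model whose variables are all available.

    `models` is assumed to be ordered from most to least preferred, so the
    first match stands in for the best-fitting usable model.
    """
    for index, required in enumerate(models):
        if set(required) <= available:
            return index
    return None  # no model can be applied without using missing data
```

Because a model is selected only when every one of its variables is present, no prediction is ever made from missing data, which mirrors the rule stated above.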
Beneficiary-level results are aggregated to derive the MAO and state percent better, same, and worse than expected values. The MAO difference score is calculated as the actual minus the expected percentages for the combined better plus same categories for each primary outcome, since health maintenance, rather than improvement, is a realistic clinical goal for many seniors. An adjusted contract-level percentage for each of the two primary outcomes is calculated by combining the national average and the MAO difference score, using a logit transformation. Outliers are those MAOs that performed significantly better (i.e., better than expected) or significantly worse (i.e., worse than expected) when compared to the national average. The national average is based on all MAOs that participated in the performance measurement. MAOs can be outliers on the measure of physical health (which is based on death and the PCS score) or on the measure of mental health (which is based on the MCS score). Tables describing the coefficients from the most recent performance measurement analysis of the two functional health measures are available to view or download below:
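The aggregation and adjustment described above can be illustrated as follows. The difference score follows the definition given in the text. The exact formula that combines the national average with the MAO difference score is not given on this page, so the `adjusted_percentage` function below shows only one assumed form, in which the gap between actual and expected results is applied to the national average on the logit scale. All function names are hypothetical.

```python
import math
from typing import Sequence


def percent_better_or_same(categories: Sequence[str]) -> float:
    """Percentage of beneficiaries classified as better or same."""
    kept = sum(1 for c in categories if c in ("better", "same"))
    return 100.0 * kept / len(categories)


def difference_score(actual_pct: float, expected_pct: float) -> float:
    """Actual minus expected percentage for the better plus same categories."""
    return actual_pct - expected_pct


def logit(p: float) -> float:
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Inverse of the logit transformation."""
    return 1.0 / (1.0 + math.exp(-x))


def adjusted_percentage(
    national_pct: float, actual_pct: float, expected_pct: float
) -> float:
    """Assumed form: shift the national average on the logit scale.

    This is an illustrative guess at the combination step, not the
    documented HOS formula.
    """
    shift = logit(actual_pct / 100.0) - logit(expected_pct / 100.0)
    return 100.0 * inv_logit(logit(national_pct / 100.0) + shift)
```

Under this assumed form, an MAO whose actual result equals its expected result receives the national average, and working on the logit scale keeps the adjusted value between 0 and 100 percent.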
Analyses of the two-year performance measurement data have demonstrated that at the national level there is significant variation among MAOs with respect to both physical and mental health outcomes. Research has identified differences in outcomes among specific groups of beneficiaries and potential opportunities to improve care. For a table that depicts the overall performance measurement results by cohort, view or download the table below:
Summaries of the most recent performance measurement results are available in a sample performance measurement report (PDF, 1.4 MB). Included in the sample report is an overview of the methodology and design followed for sampling, data collection, scoring, and analysis. The MAO Performance Measurement Contract List specifies the participating MAOs from the recently completed cohort.
The Physical Functioning Activities of Daily Living (PFADL) measure is a longitudinal change measure derived from the HOS that is currently a display measure in the Medicare Part C Star Ratings. It measures, at the contract level, the change over two years in the physical functioning of beneficiaries enrolled in Medicare Advantage (MA) contracts and complements the measurement of physical health status. The PFADL scale combines two physical functioning (PF) questions (limitations in moderate activities and climbing stairs) with the six activities of daily living (ADL) questions to create a Likert-type scale. PFADL scale scores are created from responses to the baseline and the two-year follow up questions. The PFADL measure is included in the Performance Measurement Report. Information about the measure, the questions included in the calculation of the scale scores, and the case-mix adjustment procedures is provided in the document below.
This page was last modified on 08/17/2022