Identification of target groups and individuals for adherence interventions using tree-based prediction models
Background: In chronically ill patients, medication adherence during the implementation phase can be crucial for treatment success and can decrease health costs. In some populations, however, regression models do not show this relationship. We aim to estimate subgroup-specific and personalized adherence effects to identify target groups for interventions.
Methods: We defined three cohorts of patients with type 1 diabetes (n = 12,713), type 2 diabetes (n = 85,162) and hyperlipidemia (n = 117,485) from German claims data between 2012 and 2015. We estimated the association between adherence during implementation in the first year (measured as the proportion of days covered, PDC) and mean total costs in the following three years, controlling for sex, age, Charlson's Comorbidity Index, initial total costs, severity of the disease and surrogates for health behavior. We fitted three types of models on training data: 1) linear regression models for the overall conditional associations between adherence and costs, 2) model-based trees to identify subgroups of patients with heterogeneous adherence effects, and 3) model-based random forests to estimate personalized adherence effects. To assess the performance of the latter, we conditionally re-estimated the personalized effects using test data, the fixed structure of the forests, and fixed effect estimates of the remaining covariates.
Results: 1) Our simple linear regression model estimated a positive adherence effect, that is, an increase in total costs of 10.73 Euro per PDC point and year for type 1 diabetes, 3.92 Euro for type 2 diabetes and 1.92 Euro for hyperlipidemia (all p ≤ 0.001). 2) The model-based tree detected subgroups with negative estimated adherence effects for type 2 diabetes (-1.69 Euro, 24.4% of the cohort) and hyperlipidemia (-0.11 Euro, 36.1%, and -5.50 Euro, 5.3%). 3) Our model-based random forest estimated personalized adherence effects with a substantial proportion (4.2%–24.1%) of negative effects (up to -8.31 Euro). The precision of these estimates was high for type 2 diabetes and hyperlipidemia patients.
Discussion: Our approach shows that tree-based models can identify patients with different adherence effects and that the precision of personalized effects is measurable. The identified patients can form target groups for adherence-promotion interventions. The method can also be applied to other outcomes, such as hospitalization risk, to maximize the positive health effects of an intervention.
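
The adherence measure used above, the proportion of days covered, can be illustrated with a minimal Python sketch. It assumes the common definition (the share of days in an observation window on which the patient had medication on hand, here on a 0 to 100 scale) and counts overlapping supply only once. The function name, the fill records and the handling of overlaps are hypothetical choices for illustration, not the procedure applied to the claims data in this study.

```python
from datetime import date, timedelta


def proportion_of_days_covered(fills, window_start, window_days=365):
    """Return the PDC in percent (0-100) for one patient.

    fills: iterable of (fill_date, days_supply) tuples (hypothetical format).
    Days covered by more than one fill are counted once; early refills are
    NOT carried forward, which is a simplifying assumption.
    """
    window_end = window_start + timedelta(days=window_days)  # exclusive
    covered = set()
    for fill_date, days_supply in fills:
        for offset in range(days_supply):
            day = fill_date + timedelta(days=offset)
            if window_start <= day < window_end:
                covered.add(day)
    return 100.0 * len(covered) / window_days


# Hypothetical dispensing history: two overlapping 90-day fills and one
# later 30-day fill within a one-year window.
fills = [
    (date(2012, 1, 1), 90),
    (date(2012, 3, 1), 90),
    (date(2012, 9, 1), 30),
]
pdc = proportion_of_days_covered(fills, date(2012, 1, 1))
print(f"PDC: {pdc:.1f} points")
```

Under these assumptions, a regression coefficient per PDC point would be read as the change in yearly costs associated with one additional percentage point of covered days.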