Challenges in Conducting Quantitative Patient-Centered Benefit-Risk Assessments: A Case Study in Ph + ALL with Immature Efficacy Data

Overview of Case Application

Tyrosine kinase inhibitor (TKI)-based treatments have reshaped the treatment landscape for patients with Ph + ALL, dramatically improving survival and disease-free survival in patients who had historically faced a dismal prognosis [12]. When patients and their doctors select a frontline therapy for Ph + ALL, they face a decision between several available TKI-based treatments that offer different benefit-risk profiles [13,14,15,16,17,18]. The Phase 3 PhALLCON trial (NCT03589326) [14] is currently the only trial that has compared TKI treatments head-to-head in patients with newly diagnosed Ph + ALL. PhALLCON was conducted to assess the efficacy and safety of ponatinib + reduced-intensity chemotherapy vs. imatinib + reduced-intensity chemotherapy (henceforth, “chemotherapy”) [14], 2 treatments that had been used off-label as frontline therapy in Ph + ALL [19]. The primary endpoint of PhALLCON, minimal residual disease-negative complete remission (MRD–ve CR) at the end of induction, was reached by a significantly greater proportion of ponatinib-treated than imatinib-treated patients [14], which led to the approval of ponatinib for frontline use in the US [16, 20]. Ponatinib is currently the only TKI combination treatment approved for use in newly diagnosed Ph + ALL [16, 20].

It remains unclear how patients trade off between the benefits and risks of available treatments, including those used off-label, and which treatment they would ultimately prefer. For instance, ponatinib carries a boxed warning on its US Food and Drug Administration label for hepatic and cardiovascular (CV) events [16], but is associated with higher levels of efficacy than other TKIs [13, 14]. The goal of the patient preference study was to understand patients’ treatment preferences for frontline therapy, the relative importance that they place on treatment benefits and risks, and the benefit-risk trade-offs they are willing to make when selecting a treatment [11]. The goal of the qBRA was to understand which treatment patients would be expected to prefer based on their elicited preferences [11] and head-to-head clinical data for 2 available treatments [14]. We conducted a qBRA of frontline treatment with ponatinib + chemotherapy vs. imatinib + chemotherapy because these TKI-based treatments are actively used in clinical practice and because they are the only TKI-based treatments with head-to-head clinical data on efficacy and safety [14].

qBRA that incorporates the patient perspective can be conducted to support different stakeholders, including medical professionals, pharmaceutical companies, health technology assessors, and regulatory authorities, and to inform different stages of the drug development cycle [2, 6, 21, 22]. It is critical that the preference study informing the qBRA be centered around the targeted stakeholders and decision-making context [2]. This case study was designed to support medical professionals in shared decision-making and to guide pharmaceutical companies in future treatment development decisions, rather than to support regulatory submissions.

Treatment preferences were elicited in an online DCE [11], conducted in alignment with ISPOR good practice guidelines for preference elicitation research [23, 24] (Online Resource 1). In total, 201 people with Ph + ALL completed the DCE; participants had a mean age of 45 ± 13 years, 60% were male, and 67% were in remission [11].

Briefly, US-based adults with Ph + ALL were recruited for the DCE via physician referrals, databases, social media, and through the Leukemia and Lymphoma Society [11]. The treatment attributes and performance levels used to describe patient-relevant benefit-risk trade-offs were identified through a targeted literature review, interviews with 2 hematologists who treat patients with Ph + ALL, feedback from the Leukemia and Lymphoma Society, including a lay patient representative, and 5 pre-testing interviews with patients [11]. The following attributes were selected: 1) overall survival (OS); 2) duration of remission (DOR); 3) risk of a major CV event (i.e., myocardial infarction or stroke); and 4) risk of myelosuppression (Table 1) [11].

Table 1 Attributes, definitions, and levels included in the DCE

Operationalizing the designs of our DCE and qBRA in compliance with qBRA good practices [2] posed several challenges. Below we discuss 3 key challenges.

Challenge 1: Dominance of Survival Outcomes

To elicit preference weights for attributes within a qBRA, elicitation exercises need to stimulate trade-off behavior by ensuring that single attributes do not dominate decision-making [2, 25, 26]. If choices are dominated by 1 attribute, participants may ignore other attributes and therefore not make trade-offs. When eliciting preferences for cancer treatments, large changes in efficacy (e.g., OS) may cause participants to always prioritize that attribute.

Within the case study, OS levels ranged from 30 to 90 months and were informed by clinical trial data for TKIs and input from a key opinion leader. Adopting a common practice in DCE design and identifying 3 performance levels to cover this range would have resulted in participants being presented with treatment profiles that differed by ≥ 30 months of OS. Given the importance of OS in oncology [27], there is increased risk that OS might dominate if participants are asked to choose between alternatives with widely different levels of OS. Thus, we constrained the difference in OS presented in any choice task in 2 ways. First, we ran 2 variants of the DCE to spread the performance range across participants. Participants completing variant A were shown OS levels ranging from 30–60 months; participants completing variant B were shown levels ranging from 60–90 months. This limited the maximum difference seen across treatment alternatives to 30 months rather than 60 months and reduced any anchoring bias. Second, we limited the maximum OS difference between the pair of treatment alternatives in a task to 10 months, thus increasing the likelihood that other attributes would be considered.
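The pairing constraint described above can be illustrated with a minimal sketch. The OS levels used here are hypothetical 10-month steps within each variant (the actual levels are listed in Table 1 and are not reproduced here), and `allowed_os_pairs` is an illustrative helper rather than part of the study's design software.

```python
from itertools import combinations_with_replacement

# Hypothetical OS levels (months) per DCE variant; the study's actual levels are in Table 1.
VARIANT_LEVELS = {
    "A": [30, 40, 50, 60],
    "B": [60, 70, 80, 90],
}
MAX_OS_GAP = 10  # maximum OS difference within one choice task, as stated in the text


def allowed_os_pairs(levels, max_gap=MAX_OS_GAP):
    """Return unordered OS level pairs (equal levels included) within max_gap months."""
    return [
        (a, b)
        for a, b in combinations_with_replacement(levels, 2)
        if abs(a - b) <= max_gap
    ]


for variant, levels in VARIANT_LEVELS.items():
    print(variant, allowed_os_pairs(levels))
```

Under these assumed levels, a pair such as 30 vs. 50 months is excluded from every task, so a respondent never faces an OS gap large enough to make the remaining attributes irrelevant.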

Pilot interviews were conducted to ensure that this approach was effective. In the final DCE, decision-making was dominated by 1 attribute for only 4% (n = 8) of participants; 5 participants always prioritized DOR, and 3 always prioritized OS. Analyses confirmed that treatment preferences did not differ significantly between the 30–60 month and 60–90 month DCE variants and, thus, that the 2 variants could be combined (Online Resource 1).

Challenge 2: Preferentially Dependent and Overlapping Benefit Outcomes

Preference studies with attributes that are preferentially independent and nonoverlapping have greater statistical and response efficiency [2]. Where attributes do not display these properties, a recommended first step is to redefine the attributes to ensure compliance [2]. However, this is not always possible, in which case the study would need to be designed such that the interaction between attributes could be estimated [2].

Preference studies are often subject to preferentially dependent and overlapping attributes. In oncology, the value that participants place on improvements in OS may depend on whether and to what extent this added survival time is spent in remission. As such, changes in OS and DOR attributes may be preferentially dependent. Additionally, OS and DOR overlap, as time in remission is part of OS; this presents a risk of double counting. To estimate any interaction (i.e., second-order) effects between OS and DOR – whether being in remission impacts the value attached to improvements in OS – and avoid double counting the values of OS and DOR, we first needed to ensure that participants understood the conceptual relationship between these attributes. To this end, participants were shown DOR and OS in a timeline that depicted how DOR was part of OS (Fig. 1) [11]. In addition, we specified plausibility constraints in our experimental design to improve the realism of the choice tasks. For instance, our design ensured that OS was always longer than DOR. Testing during pilot interviews showed that patients understood the relationship between DOR and OS.
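As a rough illustration of the plausibility constraint, the sketch below filters candidate profiles so that OS always exceeds DOR. The attribute levels are hypothetical placeholders (the study's levels are in Table 1), and `plausible_profiles` is an illustrative name, not a function from any design package.

```python
from itertools import product

# Hypothetical attribute levels in months; the study's levels are listed in Table 1.
OS_LEVELS = [60, 70, 80, 90]
DOR_LEVELS = [20, 40, 60, 80]


def plausible_profiles(os_levels, dor_levels):
    """Keep only (OS, DOR) combinations in which OS strictly exceeds DOR."""
    return [
        (os_m, dor_m)
        for os_m, dor_m in product(os_levels, dor_levels)
        if os_m > dor_m
    ]


profiles = plausible_profiles(OS_LEVELS, DOR_LEVELS)
print(len(profiles), "of", len(OS_LEVELS) * len(DOR_LEVELS), "combinations retained")
```

Removing implausible combinations before generating the experimental design keeps every timeline shown to participants internally consistent, at the cost of some correlation between the OS and DOR columns of the design.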

Fig. 1

Example timelines presented in the DCE to show the overlap between DOR and OS: (a) 65 months of OS, of which 45 months are spent in remission, and (b) 60 months of OS, of which 55 months are spent in remission. Abbreviations: DCE, discrete choice experiment; DOR, duration of remission; OS, overall survival

We expected that sensitivity to marginal improvements in OS would increase with longer DOR. However, no significant interaction effect was detected (p = 0.80), suggesting that participants’ valuation of increases in OS was independent of whether an increase in OS was accompanied by an increase in DOR.
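One way to write down the hypothesis being tested is a utility function with an OS-by-DOR interaction term. The specification below is a sketch under assumptions: linear attribute coding, the coefficient symbols, and the attribute shorthand are illustrative and are not taken from the study's model, which is reported in [11] and Online Resource 1.

```latex
U_{j} = \beta_{\mathrm{OS}}\,\mathrm{OS}_{j}
      + \beta_{\mathrm{DOR}}\,\mathrm{DOR}_{j}
      + \beta_{\mathrm{OS}\times\mathrm{DOR}}\,(\mathrm{OS}_{j}\times\mathrm{DOR}_{j})
      + \beta_{\mathrm{CV}}\,\mathrm{CV}_{j}
      + \beta_{\mathrm{M}}\,\mathrm{M}_{j}
      + \varepsilon_{j},
\qquad
\frac{\partial U_{j}}{\partial\,\mathrm{OS}_{j}}
      = \beta_{\mathrm{OS}} + \beta_{\mathrm{OS}\times\mathrm{DOR}}\,\mathrm{DOR}_{j}
```

Here CV and M denote the risks of a major CV event and of myelosuppression for treatment profile j. Under this form, the expectation that sensitivity to OS increases with DOR corresponds to a positive interaction coefficient, and a nonsignificant interaction means the marginal value of OS cannot be distinguished from a constant.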

Challenge 3: Immature Efficacy Data

qBRA integrates performance data and preference estimates to generate an overall patient-centered assessment of the benefit-risk balance of a treatment [2]. Where performance outcomes such as OS are immature, as is often the case in oncology [28], it is not possible to generate an overall assessment through standard base case analyses [2]. Alternative analyses are therefore needed to explore how outcome uncertainty impacts patients’ assessment of the benefit-risk balance [2].

In our case application, median OS data were immature for both treatment arms, and median DOR data were immature for the ponatinib + chemotherapy treatment arm (Table 2) [14]. Various approaches are available to estimate the impact of uncertainty. One option would be to reflect this uncertainty in DCE attributes and levels – for instance, defining a level of OS as above a certain level, but uncertain. This would formally elicit patients’ preferences for such uncertainty. However, we adopted an alternative strategy because uncertainty in efficacy is expected to resolve when more clinical data become available. Lower ends of efficacy ranges were informed by trial data, but because medians had not been reached, the upper ends of efficacy ranges were informed by input from key opinion leaders to cover the likely outcome ranges.

Different sensitivity analyses were conducted to understand how alternative levels of OS and DOR would impact the benefit-risk balance. Each type of sensitivity analysis provided complementary insights into the predicted uptake of ponatinib relative to imatinib. The deterministic bivariate sensitivity analysis allowed us to directly manipulate the levels of additional OS and DOR to quantify predicted uptake under fixed combinations of these parameters. By showing which combinations of OS and DOR were associated with the same level of predicted uptake, this analysis revealed how participants trade off between gains in OS and DOR. In contrast, the probabilistic sensitivity analysis incorporated uncertainty in the levels of additional OS and DOR by modeling OS and DOR as distributions rather than fixed values. This approach allowed us to estimate predicted uptake under scenarios reflecting “small”, “medium”, and “large” gains in efficacy with ponatinib over imatinib. These sensitivity analyses were used to progressively build a more comprehensive picture of decision-making under different possible conditions. This structured, stepwise approach aligns with the framework described in the ISPOR good practices report for qBRA, in which analyses evolve from simple to more complex [2].

Table 2 Performance matrix comparing ponatinib + chemotherapy vs. imatinib + chemotherapy

First, we ran a deterministic bivariate sensitivity analysis using preference data from the DCE (Online Resource 1; Online Resource 1: Table S1) [11], safety data from the PhALLCON trial [14], and potential ranges for improvements in DOR and OS offered with ponatinib + chemotherapy over imatinib + chemotherapy. Longer DOR with ponatinib was supported by PhALLCON data showing that median DOR was reached for imatinib, but not ponatinib [14]. Longer OS with ponatinib was supported by significantly longer OS in a matching-adjusted indirect treatment comparison [13] and by a higher proportion of ponatinib- than imatinib-treated patients in PhALLCON achieving MRD–ve CR [14], which is strongly correlated with long-term OS [29]. In total, 427 scenarios were run, reflecting increases in OS provided by ponatinib over imatinib in 1-month increments over a range of 0–60 months and increases in DOR in 10-month increments over a range of 0–60 months.
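The structure of this scenario grid can be sketched as follows. The grid dimensions and the risk differences come from the text; everything else is an assumption made for illustration: the preference weights in `BETA` are invented placeholders (not the estimates in Online Resource 1), the risk differences are treated as percentage-point differences, and a simple binary logit stands in for whatever choice model the study used.

```python
from itertools import product
from math import exp

OS_GAINS = range(0, 61, 1)    # additional OS with ponatinib (months), 1-month steps
DOR_GAINS = range(0, 61, 10)  # additional DOR with ponatinib (months), 10-month steps

# Hypothetical preference weights (utility per unit); NOT the study's estimates.
BETA = {"os": 0.02, "dor": 0.03, "cv": -0.10, "myelo": -0.01}
# Risk differences, ponatinib minus imatinib, as reported in the text
# (assumed here to be percentage-point differences).
D_CV, D_MYELO = 0.6, -13.8


def choice_probability(d_os, d_dor, beta=BETA):
    """Binary logit probability of choosing ponatinib given attribute differences."""
    d_utility = (
        beta["os"] * d_os
        + beta["dor"] * d_dor
        + beta["cv"] * D_CV
        + beta["myelo"] * D_MYELO
    )
    return 1.0 / (1.0 + exp(-d_utility))


scenarios = list(product(OS_GAINS, DOR_GAINS))
grid = {(d_os, d_dor): choice_probability(d_os, d_dor) for d_os, d_dor in scenarios}
print(len(scenarios))  # 61 OS steps x 7 DOR steps = 427
```

Holding the safety differences fixed and varying only the two efficacy gains is what makes the analysis bivariate: each cell of `grid` answers the question of how many participants would be expected to choose ponatinib if the eventual OS and DOR gains took those particular values.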

The bivariate sensitivity analysis indicated that if ponatinib and imatinib had equivalent efficacy, 52.9% (95% confidence interval, 52.5%–53.4%) of study participants would be expected to select ponatinib over imatinib, and that the proportion preferring ponatinib would increase substantially with greater gains in OS and DOR with ponatinib relative to imatinib (Fig. 2). The modest preference for ponatinib over imatinib in the equivalent efficacy scenario was driven by the 13.8% lower risk of myelosuppression with ponatinib than with imatinib, which outweighed the 0.6% higher risk of major CV events with ponatinib (Table 2) [14]. Critically, the sensitivity analysis allowed us to explore how the likelihood of preferring ponatinib increases for different combinations of improvements in OS and/or DOR. For instance, if ponatinib offers 10 months of additional DOR and 20 months of additional OS, the likelihood of ponatinib being preferred would increase to 71.6% (95% confidence interval, 67.2%–76.0%).

Fig. 2

A bivariate sensitivity analysis of the probability that participants would prefer ponatinib + chemotherapy over imatinib + chemotherapy. Predicted probabilities of preferring ponatinib + chemotherapy over imatinib + chemotherapy under different potential levels of additional OS and DOR provided with ponatinib over imatinib. Shading indicates 95% confidence intervals for 0 (green) and 60 (brown) additional months of DOR provided with ponatinib over imatinib. Abbreviations: DOR, duration of remission; OS, overall survival

We conducted probabilistic sensitivity analyses to explore how variation in efficacy parameters impacted predicted choice probabilities (PCPs) (Figs. 3–4). To reflect potential differences in the additional OS and DOR provided by ponatinib over imatinib, we drew 10 000 values from correlated triangular distributions of these differences, with a minimum of 0 months and a maximum of 60 months (Online Resource 1). Triangular distributions were selected because they ensured that ponatinib could not have worse OS or DOR than imatinib (as was expected based on data from the PhALLCON trial [14, 29]) and that values could not exceed the maximum possible differences in OS and DOR defined in the DCE (Online Resource 1).

Fig. 3

Probabilistic sensitivity analysis: Distributions of simulated performance under different assumptions of additional OS and DOR. Distributions of simulated additional OS and additional DOR provided by ponatinib over imatinib in 10 000 simulated cases under small gain, medium gain, and large gain scenarios. The small gain scenario assumes that the values of additional OS and DOR provided by ponatinib over imatinib follow a triangular distribution peaking at 15 months. The medium gain scenario assumes a triangular distribution peaking at 30 months, and the large gain scenario assumes a triangular distribution peaking at 45 months. No constraints were imposed on the relationship between OS and DOR, except that the 2 triangular distributions were defined to be highly correlated (correlation = 0.8). Abbreviations: DOR, duration of remission; OS, overall survival

Fig. 4

Probabilistic sensitivity analysis: Probability density and cumulative distribution functions of predicted choice probabilities for ponatinib. A Probability density function and B complementary cumulative distribution function of predicted probabilities for choosing ponatinib over imatinib under scenarios where ponatinib offers a small gain, medium gain, or large gain in efficacy over imatinib. The small gain scenario assumes that the values of additional OS and DOR provided by ponatinib over imatinib follow a triangular distribution peaking at 15 months. The medium gain scenario assumes a triangular distribution peaking at 30 months, and the large gain scenario assumes a triangular distribution peaking at 45 months. No constraints were imposed on the relationship between OS and DOR, except that the 2 triangular distributions were defined to be highly correlated (correlation = 0.8). Abbreviations: DOR, duration of remission; OS, overall survival

At each iteration, we computed the probability of preferring ponatinib over imatinib. The choice probability for ponatinib, accounting for uncertainty in the additional OS and DOR provided by ponatinib over imatinib, was computed as the average over the 10 000 PCPs.
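The simulation loop can be sketched as below, using only the Python standard library. Several pieces are assumptions rather than the study's method: correlation between the two triangular draws is induced with a Gaussian copula (so the 0.8 is applied on the latent normal scale and the correlation of the resulting draws will be close to, but not exactly, 0.8), the preference weights in `BETA` are invented placeholders, and a binary logit is used as a stand-in choice model. The details actually used are in Online Resource 1.

```python
import random
from math import exp, sqrt
from statistics import NormalDist, fmean

LOW, HIGH = 0.0, 60.0  # range of additional OS and DOR (months), from the text
RHO = 0.8              # correlation on the latent normal scale (assumption)
N_DRAWS = 10_000
# Hypothetical preference weights; NOT the study's estimates.
BETA = {"os": 0.02, "dor": 0.03, "cv": -0.10, "myelo": -0.01}
D_CV, D_MYELO = 0.6, -13.8  # risk differences reported in the text


def triangular_ppf(u, low, high, mode):
    """Inverse CDF of a triangular distribution on [low, high] with the given mode."""
    cut = (mode - low) / (high - low)
    if u < cut:
        return low + sqrt(u * (high - low) * (mode - low))
    return high - sqrt((1.0 - u) * (high - low) * (high - mode))


def draw_gains(mode, n=N_DRAWS, rho=RHO, seed=0):
    """Draw n correlated (additional OS, additional DOR) pairs via a Gaussian copula."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        u1, u2 = std_normal.cdf(z1), std_normal.cdf(z2)
        pairs.append(
            (triangular_ppf(u1, LOW, HIGH, mode), triangular_ppf(u2, LOW, HIGH, mode))
        )
    return pairs


def choice_probability(d_os, d_dor, beta=BETA):
    """Binary logit probability of choosing ponatinib given attribute differences."""
    d_utility = (
        beta["os"] * d_os
        + beta["dor"] * d_dor
        + beta["cv"] * D_CV
        + beta["myelo"] * D_MYELO
    )
    return 1.0 / (1.0 + exp(-d_utility))


def mean_pcp(mode):
    """Average the predicted choice probability over all simulated draws."""
    return fmean(choice_probability(d_os, d_dor) for d_os, d_dor in draw_gains(mode))


for label, mode in {"small": 15, "medium": 30, "large": 45}.items():
    print(label, round(mean_pcp(mode), 3))
```

Because the weights are placeholders, the printed values will not match the PCPs reported in this case study; the sketch only shows how a distribution of efficacy gains is turned into a distribution of choice probabilities and then summarized by its mean.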

This analysis was performed under 3 scenarios, where each scenario corresponded to different modes of the triangular distributions (Fig. 3). The values selected as the peaks of the triangular distributions (i.e., 15, 30, and 45 months) were chosen to cover the 0 to 60 month range of differences in OS and DOR values included in the DCE. The “small gain” scenario assumed that the additional gain offered by ponatinib followed a triangular distribution with a peak at 15 months of additional OS and DOR over imatinib (i.e., 15 months was the most common additional gain with ponatinib) (Online Resource 1). The “medium gain” scenario was in between the extreme scenarios, with a triangular distribution peaking at 30 months. The “large gain” scenario assumed that the triangular distribution peaked at 45 months.

These scenarios resulted in different distributions of PCPs (Fig. 4A). The average (standard deviation) PCP for ponatinib was 77.7% (9.3) in the small gain scenario, 81.6% (8.4) in the medium gain scenario, and 84.7% (8.1) in the large gain scenario. These distributions can be used to determine the probability of achieving a minimum level of uptake for a treatment (here, ponatinib). For example, the probability of at least 70% of patients choosing ponatinib over imatinib is 77.5% in the small gain scenario, 89.1% in the medium gain scenario, and 93.0% in the large gain scenario (Fig. 4B).
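The threshold probabilities read off the complementary cumulative distribution function amount to counting the share of simulated PCPs at or above a cutoff. A minimal sketch, with `prob_uptake_at_least` as an illustrative helper and a short list of made-up PCP values in place of the 10 000 simulated ones:

```python
def prob_uptake_at_least(pcps, threshold):
    """Share of simulated PCPs at or above threshold (empirical complementary CDF)."""
    if not pcps:
        raise ValueError("pcps must not be empty")
    return sum(p >= threshold for p in pcps) / len(pcps)


# Illustrative values only, not study output.
simulated_pcps = [0.62, 0.71, 0.78, 0.80, 0.85, 0.69, 0.90, 0.74]
print(prob_uptake_at_least(simulated_pcps, 0.70))  # 6 of 8 values are >= 0.70 -> 0.75
```

Evaluating the same function over a range of thresholds traces out the full curve, which lets a decision-maker pick the uptake level that matters to them instead of relying on the mean PCP alone.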

We explored the impact of preference heterogeneity on PCPs by conducting bivariate sensitivity analyses for subgroups of participants, where subgroups were split based on individual demographic or clinical characteristics (Fig. 5) [2]. These analyses were intended to account for uncertainty in attribute valuation due to the heterogeneous preference weights observed in participants with different personal characteristics (Online Resource 1: Table S2) [11]. For each analyzed subgroup, a bivariate sensitivity analysis was run using the preference estimates for that subgroup [2]. The overall result, that most study participants would prefer ponatinib over imatinib, held for all analyzed subgroups in nearly all scenarios. Patients aged ≥ 59 years, who placed more weight on DOR and avoiding adverse events, had a small, negative, and statistically nonsignificant preference for improvements in OS. As such, marginally fewer than 50% of patients aged ≥ 59 years would prefer ponatinib in the scenarios where ponatinib, relative to imatinib, offers larger improvements in OS but either small or no additional DOR. However, most patients aged ≥ 59 years would prefer ponatinib in scenarios where the additional OS offered by ponatinib was coupled with larger improvements in DOR, as was the case in the vast majority of scenarios. Although this analysis highlights choice probabilities in different participant subgroups, the relationships between PCPs and personal characteristics are not necessarily causal.

Fig. 5

Subgroup analyses of preference heterogeneity: Probabilities that patients with different demographic and clinical characteristics would prefer ponatinib over imatinib. Predicted choice probabilities of preferring ponatinib + chemotherapy over imatinib + chemotherapy. Probabilities are estimated for different potential levels of additional OS and DOR provided with ponatinib over imatinib. Levels of DOR are represented in the height of the colored ribbon, with the bottom of each ribbon representing the minimum additional DOR (0 months) and the top of each ribbon representing the maximum additional DOR (60 months). Abbreviations: DOR, duration of remission; ER, emergency room; G, group; OS, overall survival; Ph + ALL, Philadelphia chromosome-positive acute lymphoblastic leukemia

THERAPEUTIC INNOVATION & REGULATORY SCIENCE
