Review Article

Challenges and considerations in non-inferiority trials: a narrative review from statisticians’ perspectives

Ruizhe Chen1 ORCID logo, Qian Shi2 ORCID logo

1Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA; 2Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA

Contributions: (I) Conception and design: Both authors; (II) Administrative support: Q Shi; (III) Provision of study materials or patients: Both authors; (IV) Collection and assembly of data: Both authors; (V) Data analysis and interpretation: Both authors; (VI) Manuscript writing: Both authors; (VII) Final approval of manuscript: Both authors.

Correspondence to: Qian Shi, PhD. Department of Quantitative Health Sciences, Mayo Clinic, Harwick 6-76, 200 First Street SW, Rochester, MN 55905, USA. Email: shi.qian2@mayo.edu.

Background and Objective: The non-inferiority (NI) study is a popular randomized controlled trial design that aims to demonstrate that a test treatment, in light of its auxiliary benefits, is not unacceptably worse than a standard active control treatment. There is extensive work in the literature discussing the merits and issues of NI trials and how certain clinical and statistical challenges can be addressed. Here, we aim to provide a narrative review of NI studies in terms of their design considerations, potential issues, and corresponding solutions from the perspective of biostatisticians.

Methods: We conducted a broad literature search for clinical and statistical methodology papers related to NI trials.

Key Content and Findings: In the “Fundamentals of NI study” section, we start from the formulation of the NI margin and the NI hypothesis test and then focus on the two underlying fundamental assumptions (the constancy assumption and assay sensitivity). We present experts’ and regulatory agencies’ opinions on how certain statistical issues of NI studies arise and how they can be addressed. We focus on key aspects of NI studies, including the formulation of the NI hypothesis test, the definition of NI margins, determining historical evidence for the active control drug, and checking the assay sensitivity and constancy assumptions. We also briefly touch on topics such as comparisons between the fixed-margin method and the synthesis method for NI evaluation, the analysis principle in the presence of treatment non-adherence, Bayesian designs of NI studies, and the restricted mean survival time (RMST) as a measure for designing NI studies. Figures and examples are given throughout the article to better illustrate the ideas.

Conclusions: We believe that the NI design, with its issues addressed by appropriate statistical and clinical considerations, still plays a pivotal role in clinical research by improving patients’ experience and alleviating healthcare inequalities.

Keywords: Non-inferiority design (NI design); clinical trials; constancy assumption; assay sensitivity


Submitted Aug 10, 2024. Accepted for publication Jan 16, 2025. Published online Feb 24, 2025.

doi: 10.21037/cco-24-84


Introduction

The non-inferiority (NI) study is an active-controlled trial design that aims to demonstrate the NI of an investigational treatment compared to an active comparator. The statistical term NI means that the treatment effect of the regimen under evaluation is not worse than that of the active control drug by more than an acceptable amount. An NI active-control study design is often chosen in clinical settings where it would not be ethical to use a placebo, a no-treatment control, or a very low dose of an active drug, because an effective treatment that provides an important benefit (e.g., life-saving, preventing irreversible injury or morbidity) is available to patients for the condition to be studied in the trial (1,2). In many situations, an NI trial design is motivated by the expectation that the new treatment can provide certain ancillary benefits compared to the established standard therapeutics. Such benefits include, but are not limited to, lower toxicity, less invasiveness, greater ease of administration, lower procedural risks, favorable costs, or improved convenience. In fact, some noninferiority trials have been criticized for merely studying a new marketable product (“me-too” drugs) without offering any advantages over existing products (3). Therefore, a new treatment regimen shown to be non-inferior to the standard therapy still needs to demonstrate certain ancillary benefit(s) for it to be considered a preferred treatment option (4). For example, either infusional fluorouracil plus leucovorin plus oxaliplatin (FOLFOX) or capecitabine plus oxaliplatin (CAPOX) is the standard-of-care adjuvant treatment for patients with stage III colon cancer. One main side effect of oxaliplatin-based adjuvant therapy is cumulative sensory neurotoxicity, which can affect patients’ quality of life even long after the treatment is discontinued. A critical clinical question is whether a shortened duration of adjuvant therapy could achieve outcomes as good as those with the standard 6 months of treatment, because the incidence and severity of neurotoxicity are correlated with the duration of oxaliplatin-based treatment. The international duration evaluation of adjuvant therapy (IDEA) collaboration was therefore established to investigate whether 3 months of oxaliplatin-based therapy, with reduced toxicity and health-care use, was non-inferior to 6 months, through a pre-planned, pooled analysis of six independent multinational trials (5). We use the IDEA study as an example to demonstrate key NI study concepts throughout this manuscript.

Some criticisms have been raised over the ethics of NI trials, claiming that NI trials do not intend to show that a new drug is better than the standard drug and that the new drug might even be worse than its comparator (6). Such claims are made without considering the resource scarcity of health care services; NI trials can be ethical given that health care systems are run with limited resources (7). Furthermore, NI trials can help alleviate health care inequality, especially for people in the developing world where health care is less affordable and accessible. One excellent example is the study of single-dose oral azithromycin versus intramuscular benzathine penicillin for the treatment of yaws in children in Papua New Guinea. In this open-label randomized NI study, the oral treatment was shown to be non-inferior to the injected treatment, with slightly favorable cure rates. Injections of benzathine penicillin are painful, can potentially transmit other blood-borne infections, and require trained health workers to administer safely. In comparison, azithromycin is well suited to administration by community health volunteers in resource-poor settings where yaws commonly occurs. Proving the NI and effectiveness of a single dose of azithromycin in treating yaws is a major advance in the history of the disease and could facilitate its eradication through large-scale treatment campaigns, where the availability of an orally effective treatment is key (8). However, auxiliary benefits of the test drug often come at the cost of some loss in treatment efficacy (4,9,10). In fact, the goal of an NI study is to assess whether the potential loss of treatment efficacy is statistically significant compared to a margin. In the meantime, it is equally important, if not more so, to evaluate the statistical results of an NI study in light of their clinical relevance. For example, in Rothmann et al.’s (11) view, if the new treatment offers a better toxicity profile than the active control or standard treatment, then it may even be considered beneficial when some efficacy is lost.

Owing to its many merits, the NI trial has been gaining popularity across multiple medical and surgical disciplines and diverse treatment strategies since its inception. For example, Althunian et al. (12) conducted a systematic PubMed search for randomized, double-blind NI trials published between 1966 and 2015 and found 273 articles reporting such trials. In a recent search (conducted in October 2024) on clinicaltrials.gov using “Non-inferiority Trial” as the keyword, we found records of 252 actively recruiting and 404 completed NI trials. With the increasing adoption of the NI design in clinical trials, statistical debates persist. As discussed by Fleming et al. (13), five special issues on noninferiority were published by three journals in the mid-2000s: two by Statistics in Medicine [2003, 2006], two by the Journal of Biopharmaceutical Statistics [2004 and 2007], and one by the Biometrical Journal [2005]. Since the onset of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) global pandemic in 2019, several studies investigating the NI of coronavirus disease 2019 (COVID-19) vaccines (14,15) have been published in high-impact journals.

Despite its immense growth and popularity over the decades, the NI study has faced criticisms regarding its design, conduct, and interpretation. There is a rich literature addressing issues and challenges in NI trials (4,9,10,13-29). We address several well-recognized challenges, including determining NI margins and ensuring assay sensitivity and the constancy assumption. We also evaluate the current state of identified issues of NI trials, incorporating the latest developments in both statistical methodologies and regulatory guidance. While the majority of the paper focuses on the design aspects of NI studies, the first section of the main body details the formulation and testing of the NI hypothesis. We provide recommendations on key design features of NI studies and offer perspectives on future developments in this field. We present this article in accordance with the Narrative Review reporting checklist (available at https://cco.amegroups.com/article/view/10.21037/cco-24-84/rc).


Methods

We conducted a broad literature search of PubMed, clinicaltrials.gov, and Google Scholar to find relevant materials on statistical methodologies, regulatory guidance, expert opinions, scholarly journal articles, and clinical practice regarding the design, conduct, analysis, and issues of NI trials. The search specifications are summarized in Table 1.

Table 1

Literature search strategy specifications

Items and specifications
Date of search: initial search on March 15, 2022; updated on August 10, 2024
Databases and other sources searched: PubMed, clinicaltrials.gov, Google Scholar
Search terms used: non-inferiority trial; constancy assumption; non-inferiority margin; assay sensitivity; restricted mean survival time; intention-to-treat; oncology; survival analysis; etc.
Timeframe: 1949–2024
Inclusion and exclusion criteria: inclusion criteria were non-inferiority studies only (both RCTs and observational studies) published in English; exclusion criteria were superiority trials, case reports, case series, and studies published in languages other than English
Selection process: selection was conducted through a series of steps involving initial screening, full-text review, consensus meetings, data extraction, quality assessment, and synthesis of findings; selection was performed independently by R.C. and Q.S., and consensus was reached through extensive discussions

RCT, randomized controlled trial.


Fundamentals of NI study

Hypothesis tests of an NI study

Prior to addressing the issues and challenges associated with NI trials, it is essential to establish a foundation by outlining the fundamental notations, concepts, and formulations pertinent to NI study design. The objective of an NI study is to demonstrate that the test drug has a treatment effect by showing that its effect is sufficiently close to the effect of an active control (1). The active control drug has already been established as the standard therapy for treating certain disease(s). Usually, the effectiveness of the active control was established in one or multiple historical placebo-controlled trials comparing the active control drug to placebo. The one-sided NI hypothesis test is formulated such that the differential treatment effect between the test drug and the active control drug is compared to a specified amount called the NI margin (denoted by M). The NI margin (M) is generally interpreted as the maximum threshold beyond which the loss of treatment efficacy becomes unacceptable. For alternative interpretations of M, Ng (30) provided a comprehensive list of NI margin definitions discussed in the literature.
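For concreteness, the one-sided NI hypotheses just described can be restated in LaTeX notation (our restatement, following the hazard-ratio convention used in Figure 1, with HR(T/C) denoting the hazard ratio of the test drug relative to the active control and a margin M > 1):

```latex
% One-sided NI hypotheses on the hazard-ratio scale (M > 1 is the NI margin)
H_0:\ \mathrm{HR}(T/C) \ge M \quad \text{(the test drug is inferior by M or more)}
\qquad \text{versus} \qquad
H_1:\ \mathrm{HR}(T/C) < M \quad \text{(the test drug is inferior by less than M)} .
```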

Figure 1 shows possible hypothesis test results of an NI study under the hazard ratio (HR) measure, which is commonly used in clinical trials studying time-to-event endpoints. Note that the proportional hazards assumption is applied in the discussions and examples related to time-to-event endpoints. In our example, the HR measures the relative effect of taking the test drug over the active control drug; an HR greater than 1 indicates a worse treatment outcome under the test drug. The null hypothesis of inferiority is that the test drug is inferior to the active control drug by M or more, and the alternative hypothesis of NI is that the test drug is inferior to the active control drug by less than M. The tests are conducted by comparing the upper bound of the 95% confidence interval (CI) for the HR with the NI margin (M). In scenario (a), the point estimate of the HR favors the control, but the upper bound of the 95% CI for the HR is greater than M, so NI is not concluded. In scenario (b), the point estimate of the HR suggests an equal effect of test and control, and the upper bound of the 95% CI is below M, so NI is concluded. In scenario (c), the result favors the test drug; both superiority and NI are concluded. In scenario (d), the HR point estimate equals the NI margin (M) and the upper bound of the 95% CI is above M, so NI is not concluded. Scenario (e) presents an interesting situation in which the upper bound of the 95% CI lies exactly on the NI margin (M). A similar case occurred in the IDEA study (31), which tested the NI of 3 months of adjuvant chemotherapy versus 6 months on the 5-year overall survival (OS) endpoint. The 95% CI upper bound of the HR in the overall population was estimated as 1.11, lying exactly on the prespecified NI margin; as a result, NI was not confirmed for 5-year OS. In scenario (f), the point estimate of the HR is 1.1 and the result favors the active control. The upper bound of the 95% CI for the HR is smaller than M, demonstrating NI (the entire effect of the active control has not been lost), but the 95% CI for the HR lies above one, indicating that the test drug is inferior to the control even though the NI standard is met. In scenario (g), the point estimate favors the test drug; NI is concluded, but superiority is not. In concluding NI, one should be cautious of the “bias towards the alternative” when inference is drawn using the intention-to-treat analysis in the presence of treatment non-adherence, as illustrated in Figure 2.
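As a computational illustration of the decision rule just described, the minimal Python sketch below classifies an NI trial result from a log-HR estimate, its standard error, and a prespecified margin M. All inputs are hypothetical and chosen only to mimic the scenarios in Figure 1; they are not the IDEA estimates.

```python
import math

def classify_ni_result(log_hr_hat, se_log_hr, margin, z=1.96):
    """Classify an NI result from the two-sided 95% CI of the hazard ratio.

    log_hr_hat : estimated log hazard ratio (test vs. active control)
    se_log_hr  : standard error of the log hazard ratio estimate
    margin     : NI margin M on the HR scale (M > 1)
    """
    lower = math.exp(log_hr_hat - z * se_log_hr)
    upper = math.exp(log_hr_hat + z * se_log_hr)
    return {
        "95% CI": (round(lower, 3), round(upper, 3)),
        "non-inferior": upper < margin,                 # entire CI below M
        "superior": upper < 1.0,                        # entire CI below 1
        "statistically inferior to control": lower > 1.0,  # entire CI above 1, as in scenario (f)
    }

# Hypothetical example: HR estimate 1.05 with SE(log HR) = 0.025 and margin M = 1.11
print(classify_ni_result(math.log(1.05), 0.025, 1.11))
```

With these illustrative numbers the upper CI bound falls just below the margin, so NI would be concluded while superiority would not, corresponding to scenario (b)/(g)-type results in Figure 1.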

Figure 1 Possible results of an NI study using hazard ratio measure: test drug/active control drug (point estimates and 95% confidence intervals); this figure is inspired by various scenarios of NI study results illustrated in the FDA NI guidance (1). HR, hazard ratio; h(.), hazard function; T, time-to-event variable in the test group; C, time-to-event variable in the control group; M, non-inferiority margin; NI, non-inferiority; FDA, Food and Drug Administration.
Figure 2 The ITT analysis principle; for an NI trial, quality issues could result in treatment groups appearing similar (i.e., biasing the results towards the alternative hypothesis), when the test drug may be inferior. NI, non-inferiority; ITT, intention-to-treat.

In addition to the example illustrated in Figure 1, the NI hypothesis can take various forms. These variations arise from differences in: (I) absolute versus relative measures; and (II) the direction of comparison. For example, the first variation specifies Active Control Drug − Test Drug ≥ NI margin (C−T≥M>0) as the null (1). In this version, a larger numeric value usually represents a favorable clinical outcome, e.g., a higher survival probability. A second variation involves multiplying the hypotheses shown in Figure 1 by negative one, with T−C≤−M<0 as the null (4). Such an example may be found in an NI study where the restricted mean survival time (RMST) is used as the measure (27). While the aforementioned hypotheses are all valid, researchers should exercise caution in their application, keeping two key considerations in mind: (I) one should carefully decide the direction of treatment comparison (either C−T or T−C) and understand which direction indicates a favorable outcome for the test drug given the selected endpoint and measure; and (II) consistency in format must be maintained throughout the design, analysis, and interpretation phases. Also note that the above-mentioned NI hypotheses are formulated within the fixed-margin (95%-95%) analysis framework, in which a fixed-value NI margin (M) is pre-specified before conducting the study. In comparison, the synthesis method tests the NI hypothesis without specifying the control effect or a specific fixed NI margin based on the control effect (1). A later section of this article is dedicated to comparing these two approaches, which have generated numerous significant issues and discussions in the NI statistical literature.
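The variations described in this paragraph can be summarized side by side as follows (our notation; T and C denote the effects of the test and control treatments on the chosen scale, and M > 0 is the NI margin):

```latex
% (I) Absolute measure, larger value favorable (e.g., survival probability)
H_0:\ C - T \ge M \quad \text{versus} \quad H_1:\ C - T < M
% (II) The same comparison multiplied by negative one (e.g., RMST difference)
H_0:\ T - C \le -M \quad \text{versus} \quad H_1:\ T - C > -M
% Relative measure, smaller value favorable (e.g., hazard ratio, with M > 1)
H_0:\ \mathrm{HR}(T/C) \ge M \quad \text{versus} \quad H_1:\ \mathrm{HR}(T/C) < M
```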

What constitutes an NI margin?—a two-step formulation

Establishing the NI margin is arguably the most challenging yet crucial aspect of designing an NI study. As described in the earlier section, the NI margin represents the largest treatment difference between the active control drug and the test drug that can be regarded as clinically acceptable. A valid NI margin (M) should be defined by combining historical data with clinical and statistical considerations, should reflect uncertainties in the evidence on which the choice is based, and should be suitably conservative (1,2,32,33). To convey the ideas behind the quantification of an NI margin (M), we adopt the M1, M2 margin framework described in the Food and Drug Administration (FDA) NI guidance (1). Let M1 represent the entire effect of the active control assumed to be present in the current NI study. M1 cannot be directly measured in the current NI study due to the absence of a concurrent placebo group. Instead, M1 is estimated based on historical trials of the active control drug. For example, the M1 margins for both the 3-year disease-free survival (DFS) and 5-year OS endpoints of the IDEA study were determined based on the treatment effect estimates of adjuvant chemotherapy with fluoropyrimidines and oxaliplatin demonstrated in the MOSAIC trial (29). Let M2 represent the largest clinically acceptable difference (degree of inferiority) in treatment effect between the test drug and the active control. M2 is often set as some clinically relevant fraction (λ) of M1 (M2=λ×M1), so that the remaining fraction (1−λ) of the active control drug effect, deemed important based on clinical considerations, is retained by the test drug. For example, to balance the benefits (relief from neurotoxicity) and cost (loss of 5-year OS efficacy) of a 3-month reduction in oxaliplatin plus fluoropyrimidine exposure, the IDEA study set the maximum acceptable loss of treatment efficacy to 50% of the gain in 5-year OS obtained by adding oxaliplatin to fluorouracil plus leucovorin (i.e., FOLFOX), as established in the MOSAIC trial (34).
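As a small worked illustration (with hypothetical numbers, and assuming, as is common for relative measures such as the HR, that the relation M2 = λ × M1 is applied on the log scale): if historical evidence conservatively supports a control-versus-placebo HR of 0.75, then the full control effect expressed as a test-versus-control margin is M1 = 1/0.75 ≈ 1.33, and allowing at most λ = 50% of that effect to be lost gives

```latex
\log M_2 = \lambda \,\log M_1
\quad\Longrightarrow\quad
M_2 = M_1^{\lambda} = (1/0.75)^{0.5} \approx 1.15 .
```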

The ICH E10 document (2) describes how setting the NI margin to M2 (M=M2) essentially satisfies two requirements for an NI study under the constancy assumption: (I) “The margin chosen for an NI trial cannot be greater than the smallest effect size that the active drug would be reliably expected to have compared with placebo (M2<M1) in the setting of the planned trial”; and (II) the NI margin should be smaller than that suggested by the smallest expected effect size of the active control, to ensure that some clinically acceptable effect size (or fraction of the control drug effect) is maintained (M2=λ×M1).

Determining historical evidence of the active control drug

With the fixed-margin method, the margin M1 is determined based upon the treatment effect estimate(s) of the active control drug from historical studies. These studies serve as historical evidence of sensitivity to drug effects (1), which we refer to as “historical evidence”. Historical evidence of the active control means that the active control drug was regularly shown to be superior to placebo in historical studies. For example, the findings from the MOSAIC trial of a survival advantage of six months of 5-fluorouracil and leucovorin (5-FU/LV) in combination with oxaliplatin over 5-FU/LV alone among patients with stage II and III colon cancer served as historical evidence for designing the IDEA study. One should consider the following potential issues when evaluating the validity of historical evidence. First, historical evidence can serve as the basis for estimating the effect of the active control drug only when the “constancy assumption” is satisfied. The constancy assumption holds given sufficient similarities between the historical trials and the current NI trial with respect to all important study design and conduct features (e.g., effect modifiers) (1). Such design features include the characteristics of the patient population, entry criteria, dose of the active control, etc. The European Medicines Agency (EMA) NI margin guidance (32) listed challenges in identifying relevant historical trials, such as selection bias, constancy of trial design and clinical practice over time, constancy of effects over time, and publication bias. Secondly, establishing an early-generation drug as the active control without sufficient historical evidence could exacerbate a problematic situation known as “bio-creep”. Bio-creep is a phenomenon in which an ineffective or harmful therapy may falsely be deemed efficacious after a series of NI trials, with each new test drug being a little worse than its predecessor (35,36). Fleming (36) provided a related example in the empirical use of anti-fungal therapy in febrile neutropenic patients, and further discussed the issue of subjectivity in choosing whether to include or exclude certain trials as historical evidence, which could alter the final NI study results (36).

Violation of the constancy assumption and estimation biases of M1

Violations of the constancy assumption undermine the credibility of the historical evidence used to determine M1. Fleming (36) described several scenarios that could cause this issue. In another work (13), Fleming et al. emphasized the negative impact of effect modifiers on the constancy assumption, noting that the active control’s effect can vary across observed or unobserved patient characteristics (covariates), which can have heterogeneous distributions between the previous studies and the current NI trial. An example of HER2-neu tumor levels and KRAS gene expression type as treatment effect modifiers was illustrated in (13). In addition, bias and variability are two statistical issues associated with the estimation of M1. When multiple historical randomized placebo-controlled trials are available, a random-effects (or fixed-effects) meta-analysis model can be applied to draw inference on the average effect and the study-to-study variability of the active control (1). The random-effects meta-analysis model incorporates the trial-to-trial variability of the treatment effect, along with within-study variation, into the estimate of the overall treatment effect of the active control (37). The FDA NI guidance (1) gave an example [Appendix, Example 1(A); ximelagatran versus warfarin (38)] of how M1 can be selected using both fixed-effects and random-effects models. Publication bias is a prevalent issue in the application of meta-analysis (39). Fleming et al. (13) identified additional factors contributing to bias in the estimation of the active control effect, including: (I) selecting the evidence from historical trials that will yield more favorable estimates of the active control’s effect (selection bias); and (II) “random high” bias, which arises when one selects the best outcome among many estimated outcomes, since what appears to be “best” tends to be an overestimate (40). In the absence of historical placebo-controlled trials for the active control, the FDA NI guidance (1), Section V.1., outlines specific strategies for determining M1. On the other hand, observed heterogeneities in the patient population between the historical studies and the NI study can seriously undermine the constancy assumption and hence bias the estimation of M1. For example, Fleming (36) showed how heterogeneities in the prevalence of vancomycin-resistant enterococci (VRE) between historical trials and the NI trial can result in a biased estimate of the active control efficacy. As a solution, it is crucial to identify potential effect modifiers and develop the statistical analysis plan with appropriate covariate adjustments (1). For example, Zhang et al. (41) proposed a method that deals with possible violations of the constancy assumption due to imbalances in unmeasured covariates after adjusting for the measured covariates.
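To make the estimation step concrete, the sketch below applies a standard DerSimonian-Laird random-effects meta-analysis to hypothetical log-HR estimates (control versus placebo) from several historical trials, takes the CI bound closest to no effect as the conservative control effect, and converts it into M1 and M2. The trial data, the 50% loss fraction, and the use of the log-HR scale for combining margins are illustrative assumptions, not values from any of the cited studies.

```python
import numpy as np

# Hypothetical historical trials: log HR (control vs. placebo) and standard errors
log_hr = np.array([-0.35, -0.28, -0.42, -0.31])
se = np.array([0.10, 0.12, 0.15, 0.09])

# DerSimonian-Laird random-effects meta-analysis
v = se**2
w = 1.0 / v                                    # fixed-effect weights
mu_fe = np.sum(w * log_hr) / np.sum(w)
q = np.sum(w * (log_hr - mu_fe) ** 2)          # Cochran's Q statistic
k = len(log_hr)
c = np.sum(w) - np.sum(w**2) / np.sum(w)
tau2 = max(0.0, (q - (k - 1)) / c)             # between-trial variance estimate
w_re = 1.0 / (v + tau2)                        # random-effects weights
mu_re = np.sum(w_re * log_hr) / np.sum(w_re)   # pooled log HR (control vs. placebo)
se_re = np.sqrt(1.0 / np.sum(w_re))

upper = mu_re + 1.96 * se_re                   # CI bound closest to no effect
m1 = np.exp(-upper)                            # full control effect as a test-vs-control HR margin
lam = 0.5                                      # maximum acceptable loss fraction (assumed)
m2 = m1**lam                                   # margin retaining (1 - lam) of the effect on the log scale
print(f"pooled HR(C/P) = {np.exp(mu_re):.3f}, M1 = {m1:.3f}, M2 = {m2:.3f}")
```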

Discounting, preservation, and issues in determining M2

An NI margin cannot be larger than the smallest effect size that the active control drug would be reliably expected to have compared to a placebo (2), that is, M<M1. In addition, ICH E10 (2) suggests that the determination of an NI margin should be suitably conservative. One method to ensure conservativeness is to discount the estimated M1 by a fraction (1,30), to alleviate concerns over the constancy assumption being violated due to differences in patient populations or clinical conduct. This step reduces the magnitude of the active control’s effect estimated from historical evidence to adjust for a potential loss in effect size. For example, André et al. (5) reported a 3-year DFS rate of 78.2% among patients treated with 6 months of FOLFOX, whereas André et al. (42) subsequently reported a lower 3-year DFS rate of 76% among patients treated with 6 months of FOLFOX. Hence, failure to discount M1 can result in an overestimated effect size. In addition, the chosen NI margin is usually smaller than the discounted M1 because of the interest in ensuring that a certain clinically acceptable effect size (or fraction of the control drug effect) is preserved (2), resulting in M2. An example of such a derivation applied in the IDEA study is illustrated in Figure 3 (34,43,44). It is important to distinguish the concept of discounting, introduced in (45), from that of preservation, although they are mathematically indistinguishable (due to the associative property of multiplication) (30,46). Discounting addresses the concern that the actual effect size of the active control in the NI study might be smaller than that estimated based on historical evidence. Preservation, however, is a clinical judgment that reflects the amount of effect loss that would be clinically acceptable considering both the risks and benefits associated with the test drug (1). A potential issue in deciding the preservation fraction is reliance on subjective judgment. As pointed out by Althunian et al. (47), the preserved fraction is often arbitrarily chosen based on data that experts consider to be clinically relevant, and this subjectivity is reflected in the variability of the preserved fractions used in published NI trials (0–85%) (48,49). In common practice, 50% (on a relative scale) is chosen as the preservation fraction for certain cardiovascular studies (1,4). In antibiotic trials, a 10–15% NI margin (on the risk difference scale) for the treatment difference (M2) is commonly chosen (1). Althunian et al. (47) suggest that the choice of the preserved fraction should in general depend on the effect size of the active control(s), how much of the active control effect stakeholders are willing to lose to fulfil an unmet medical need, and the feasibility of the trial (sample size) (11,12,50,51). Additionally, Section IV.D. of the FDA NI guidance (1) provides detailed recommendations on how to appropriately select M2.
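The distinction just drawn can be made explicit in symbols (a schematic only; δ and f are generic factors, not values used in any cited trial). On the chosen analysis scale, discounting shrinks the historically estimated control effect by a factor δ < 1, and preservation then allows only the fraction (1 − f) of the discounted effect to be lost:

```latex
M \;=\; (1 - f)\,\bigl(\delta \,\widehat{M}_1\bigr)
  \;=\; \bigl[(1 - f)\,\delta\bigr]\,\widehat{M}_1 ,
```

so the two adjustments collapse into a single multiplicative factor, which is why they are mathematically indistinguishable even though discounting is a statistical adjustment for possible non-constancy and preservation is a clinical judgment.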

Figure 3 Surgery plus 5-FU and LV was shown superior to surgery alone (43). The NI margin for the 3-year DFS endpoint of the IDEA study (44) was determined based on 3-year DFS rates reported in the historical MOSAIC study (34). NI, non-inferiority; 5-FU, 5-fluorouracil; LV, leucovorin; DFS, disease-free survival; IDEA, international duration evaluation of adjuvant therapy.

On assay sensitivity of an NI study

The FDA NI guidance (1) emphasizes that historical evidence, the constancy assumption, and the quality of the new trial are three key considerations in concluding that an NI trial has assay sensitivity. Assay sensitivity is a property of a clinical trial defined as the ability to distinguish an effective treatment from a less effective or ineffective treatment (2). With a slight abuse of terminology, we use “clinical significance” in place of “assay sensitivity” for the rest of this article. In an NI study setting, clinical significance is defined as the demonstration that, had a placebo been included, the difference between the control drug and placebo would have been at least M1. Clinical significance is especially crucial for an NI study, which aims to establish relative clinical evidence without directly comparing the test drug with a placebo. A successful superiority trial will have demonstrated clinical significance. However, a “successful” NI trial that shows an acceptably small treatment difference between the control and the test drugs may or may not have clinical significance and therefore may or may not support a conclusion that the test drug is effective (superior to a placebo) (1). Without a placebo arm, clinical significance cannot be proved in the NI study; rather, it must be assumed, relying on external knowledge such as historical evidence (1). If the active control drug did not have any of its expected effect in the NI study, then both the active control and the test drug are similarly ineffective even if “non-inferiority” is concluded. As pointed out by Sankoh (52), even superiority of the experimental drug to the active control does not guarantee superiority of either drug to placebo. Figure 4 summarizes the key components discussed above and their relationships, an understanding of which is necessary for a valid NI study design.

Figure 4 Key components for designing an NI study (fixed-margin method) and their relationships. NI, non-inferiority.

The fixed-margin method

The NI margin determination procedures described in the previous sections constitute the design aspect of a fixed-margin approach, where a fixed-value NI margin (M2) is determined prior to conducting the NI trial. The hypothesis test under a fixed-margin approach is conducted by comparing the CI upper bound, at a sufficient level of confidence, with the fixed margin (1). When a 95% confidence level is chosen for the NI hypothesis test and the 95% lower CI bound (LCB) from the historical data is chosen as the estimated M1 (with potential preservation adjustments), we refer to this approach as the “95%-95%” method (1). Note that the 95% CIs mentioned here are two-sided, corresponding to a one-sided significance level of 2.5% (confidence level of 97.5%). A less conservative alternative to the 95%-95% method is to choose the point estimate of the active control treatment effect as M1. James Hung et al. (16) compared the rejection regions of these two design approaches (point estimate versus the 95% LCB) and showed that the lower-bound approach is always more conservative than the one based on point estimates (type I error less than 2.5% versus potentially greater than 2.5%). One caveat about the 95%-95% method is the interpretation of the CIs. The FDA NI draft guidance incorrectly suggested that the 95% CI lower bound (used for NI margin determination) is a lower bound for the actual effect of the active control in the NI trial (1). This was corrected in the final version of the guidance (1): the 95% CI lower bound (used for NI margin determination) is a bound for the average effect of the active control in the historical trials, or for the true effect in the NI trial if the constancy assumption holds. To bound the actual effect, a prediction interval is required. For example, Brittain et al. (37) proposed a method that computes a prediction interval for the treatment effect of the active control relative to the absent placebo in the NI trial.
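For illustration, the sketch below computes a t-based random-effects prediction interval for the control effect in a new trial. This is the standard Higgins-type construction and is offered only in the spirit of the prediction-interval idea referenced above; it is not necessarily the exact formulation of Brittain et al. (37). The inputs are hypothetical values of the kind produced by the earlier meta-analysis sketch.

```python
import numpy as np
from scipy import stats

def prediction_interval(mu_re, se_re, tau2, k, level=0.95):
    """Approximate prediction interval for the effect in a new trial,
    based on a random-effects meta-analysis of k historical trials."""
    t_crit = stats.t.ppf(1 - (1 - level) / 2, df=k - 2)
    half_width = t_crit * np.sqrt(tau2 + se_re**2)
    return mu_re - half_width, mu_re + half_width

# Hypothetical pooled quantities (log HR of control vs. placebo)
lo, hi = prediction_interval(mu_re=-0.33, se_re=0.055, tau2=0.01, k=4)
print(f"95% prediction interval for log HR(C/P): ({lo:.3f}, {hi:.3f})")
```

Because the prediction interval adds the between-trial variance to the uncertainty of the pooled mean, it is wider than the CI and better reflects what the control effect might be in the new NI trial.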

The synthesis method

Another design and analysis paradigm for NI trials is the synthesis method, which is also often referred to as the imputed placebo approach, the preservation test method, or the cross-trial comparison approach. It was first proposed by Holmgren (53) as a procedure for establishing one-sided equivalence (NI) that determines whether a specified percentage of the treatment effect of a known active agent over placebo is maintained. For testing effect retention, the synthesis/preservation test method does not require setting the value of the NI margin (neither M1 nor M2 directly) but needs only the specification of the fraction of the control effect to be preserved, equivalently expressed through the ratio M2/M1 (1,16). Figure 5 shows the key design procedures and analysis components of an NI study planned to use either the fixed-margin method or the synthesis method.
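A common form of the synthesis test statistic for log HRs, written here in this article's notation with λ = M2/M1 as the maximum acceptable loss fraction (so that 1 − λ is the preserved fraction), is

```latex
Z \;=\; \frac{\widehat{\beta}_{T/C} + \lambda\,\widehat{\beta}_{C/P}}
             {\sqrt{\widehat{\operatorname{Var}}\bigl(\widehat{\beta}_{T/C}\bigr)
                    + \lambda^{2}\,\widehat{\operatorname{Var}}\bigl(\widehat{\beta}_{C/P}\bigr)}} ,
```

where β̂(T/C) is the log HR of test versus control estimated in the NI trial and β̂(C/P) (negative when the control is effective) is the log HR of control versus placebo estimated from the historical trials; NI with the required effect retention is concluded when Z falls below the one-sided critical value (e.g., −1.96 for a one-sided 2.5% test). This is a generic sketch of the synthesis-type statistic, not the exact formulation of any single cited reference.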

Figure 5 Comparison of the fixed-margin method and the synthesis method in terms of key NI study design features. H0, null hypothesis; M1, the entire effect of the active control assumed to be present in the current NI study; M2, the largest clinically acceptable difference (degree of inferiority) in treatment effect between the test drug and the active control; λ, clinically relevant portion; NI, non-inferiority.

Comparing two approaches: pros and cons

There have been heated debates, from the 2000s to the 2010s, over the merits and demerits of both approaches. The fixed-margin NI analysis method is often criticized for two reasons. First, the NI hypothesis formulated under the fixed-margin method seemingly violates basic frequentist statistical principles (16,17,25). Second, the fixed-margin method appears too conservative and therefore less efficient. Holmgren (53) showed that the probability of establishing one-sided equivalence (NI) under the fixed-margin approach is less than that under the synthesis approach. James Hung et al. (16) further showed that the type I error rate of the fixed-margin approach is always less than or equal to 2.5%, while that of the synthesis approach can be controlled at exactly 2.5%. On the other hand, Fleming (36) argued that the synthesis method falls short of addressing the requirement that the test drug have an acceptable benefit-to-risk profile relative to that of the active control, which can be addressed by the “setting the margin” approach. The FDA NI guidance (1) describes the disadvantage of the synthesis approach as the inability to use clinical judgment to choose M2, based on the magnitude of M1, in planning the NI study. Such an M2, however, can provide critical clinical relevance in the decision-making process of an NI trial design. Hung et al. (17) also pointed out that the synthesis method, which controls the unconditional type I error rate at a fixed level, does not provide a fixed NI margin for assessing clinical relevance. Attempting to use a “non-fixed margin” under the synthesis framework to plan the NI trial can cause problems, for example when a larger NI trial is required to rule out a smaller NI margin (17). More importantly, the synthesis method is more susceptible to violations of the constancy assumption. As shown by Wang et al. (29), when the effect of the active control is smaller in the noninferiority trial than it would have been in the historical trials, the unconditional type I error rate inflates rapidly. In comparison, the “95%-95%” method may be less sensitive to the constancy condition, depending on the level of preservation specified in defining M2 (29).

The FDA guidance (1) regards the two separate processes of the fixed-margin approach (margin determination and NI hypothesis testing) as advantageous from two perspectives: (I) an NI margin is clinically understandable and serves as a basis for planning the sample size of the NI trial to achieve the desired alpha level and power; and (II) the approach is easily interpretable and flexible in design adjustment (less or more conservative). For the synthesis method, its disadvantage is that “it is not possible to use clinical judgment to choose M2, based on the magnitude of M1, in advance of the NI trial” (1). However, it “can lead to a more efficiently designed study (e.g., by allowing for a reduction in sample size or achieving greater power for a given sample size) than the fixed-margin approach, provided the constancy assumption holds”. The EMA NI margin guideline (32) shares a preference for the fixed-margin approach over the synthesis method: “In order to demonstrate NI, the recommended approach is to pre-specify a margin of NI in the protocol…”. On the other hand, the extension of the CONSORT 2010 Statement (33) expressed a preference for the synthesis (putative placebo) method, stating that the fixed-margin method “might show an ineffective new treatment as noninferior if the margin is too large in relation to the effect of the reference treatment compared with placebo”, and that the synthesis method “should be used if possible if the noninferiority trial is aimed for drug approval”.

RMST as an absolute metric for NI trials

The RMST (54) measure has gained much interest for designing and analyzing oncology trials with time-to-event endpoints due to its intuitive clinical interpretation and potentially high statistical power (55-59). An NI margin measured as an RMST difference (DRMST) can be interpreted as the maximally acceptable loss in treatment effect in terms of the difference in expected survival time over a restricted period. For example, when evaluating a 3-year OS endpoint, a DRMST NI margin of 1 month means that, by year 3, patients treated with the test drug may lose at most 1 month of expected survival time compared with those treated with the active control. As a relative measure, the HR does not necessarily carry the same absolute effect size under different distributional assumptions, which can confound the interpretation of treatment effects. In comparison, DRMST can exhibit such differences, as illustrated in Figure 6. Recent theoretical and numerical work has shown that the statistical power of RMST-based hypothesis tests, depending on the validity of the underlying proportional hazards (PH) assumption, is as good as or better than that of HR-based tests (60-62). Recently, some discussions have focused on the merits of using RMST over HR in designing NI studies. In particular, Freidlin et al. (62) found that in scenarios with large NI margins, low event rates, and limited follow-up past the clinical cut-off time, conclusions are reversed so that DRMST has a power advantage over HR, assuming the PH assumption holds. Quartagno et al. (58) further investigated this power difference through the concept of NI frontiers (63), where a frontier is a curve giving the most appropriate NI margin for each possible value of the control event risk. They explained that, even when the NI margins match (60,64), the power difference exists because the null hypotheses are curves in space (NI frontiers), rather than single points, which are simply used as assumptions for the purpose of designing a frequentist trial (58). Therefore, it is crucial to understand the different null hypotheses implied by different population-level summary measures when interpreting the power differences. Another difficulty in implementing the RMST measure for the design of NI studies is the requirement to decide a clinical cut-off time (denoted τ), which may differ from the duration of accrual and follow-up, for choosing the NI margin and conducting statistical inference. The study team needs to predetermine the τ at which the NI of the test treatment is evaluated, combining both clinical and statistical considerations. Making such a decision is challenging and can be subjective due to a lack of principles and guidelines. Nonetheless, RMST does serve as an alternative for designing NI trials where the PH assumption is presumably unreasonable, e.g., with the delayed treatment effects observed in immunotherapy trials. Yet, choosing an appropriate τ is still challenging. For example, if the two survival curves presumably cross at a certain time point and then separate, a τ chosen well after the crossing point can reverse the conclusion compared to one chosen shortly after or before the crossing point. We see great promise in future developments of statistical methodologies for designing and analyzing NI trials under the RMST measure, and hope to see more discussions and solutions to the issues discussed above.
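The point that a single HR can correspond to very different absolute losses in survival time (the message of Figure 6) can be reproduced with a small sketch. Assuming exponential survival in both arms, for which RMST(τ) = (1 − e^{−hτ})/h has a closed form with hazard rate h, the event rates below are hypothetical and chosen only for illustration; they are not the values underlying Figure 6.

```python
import math

def rmst_exponential(rate, tau):
    """Restricted mean survival time up to tau for an exponential survival curve."""
    return (1.0 - math.exp(-rate * tau)) / rate

tau = 5.0   # clinical cut-off time in years (assumed)
hr = 1.10   # the same hazard ratio (test vs. control) in both scenarios

for label, control_rate in [("low event rate", 0.05), ("high event rate", 0.30)]:
    rmst_c = rmst_exponential(control_rate, tau)
    rmst_t = rmst_exponential(control_rate * hr, tau)  # proportional hazards
    print(f"{label}: HR = {hr}, DRMST (control - test) = {rmst_c - rmst_t:.3f} years")
```

Although the HR is identical in the two scenarios, the absolute loss in expected survival time within 5 years differs by roughly a factor of three, which is why an NI margin expressed as a DRMST can carry a different clinical meaning than one expressed as an HR.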

Figure 6 For a time-to-event endpoint, the RMSTD measure provides more clinical relevance for interpretation than the HR; a huge difference in survival characteristics (average life expectancy within 5 years) is hidden by the two equivalent HRs. RMSTD, restricted mean survival time difference; HR, hazard ratio; T, time-to-event variable of the test drug group; C, time-to-event variable of the control drug group; T/C, ratio of the time-to-event variable of the test drug group over that of the control drug group.

Conclusions

In this article, we reviewed important issues and discussions in the design, conduct, and analysis of NI trials, and we summarized our interpretation of the key design features and considerations for an NI study in a relationship diagram (Figure 4). Over decades of research and application of NI studies in a broad range of disease settings, critical methodologies have been developed and comprehensive discussions of issues, with corresponding solutions, have been presented in the literature. For example, there are topics related to safety NI trials (65), alternative margin determination and NI analysis regimens (16,17,25,66), placebo-creep (58), relative versus absolute measures (3), surrogate endpoints (36,67,68), etc. Bayesian designs and Bayesian approaches to determining NI margins provide impactful and innovative alternatives (69), including recent contributions on borrowing historical data to implement NI studies and compute NI margins (70), as well as on sample size determination for an NI trial with more than two treatment groups (71). We encourage clinicians to consider the NI design favorably in achieving their research goals, given appropriate statistical and clinical considerations.


Acknowledgments

None.


Footnote

Reporting Checklist: The authors have completed the Narrative Review reporting checklist. Available at https://cco.amegroups.com/article/view/10.21037/cco-24-84/rc

Peer Review File: Available at https://cco.amegroups.com/article/view/10.21037/cco-24-84/prf

Funding: None.

Conflicts of Interest: Both authors have completed the ICMJE uniform disclosure form (available at https://cco.amegroups.com/article/view/10.21037/cco-24-84/coif). Q.S. serves as an unpaid editorial board member of Chinese Clinical Oncology from January 2023 to December 2024. The other author has no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. U.S. Department of Health and Human Services. Non-Inferiority Clinical Trials to Establish Effectiveness. Guidance for Industry. Silver Spring, MD: Food and Drug Administration; 2016.
  2. ICH Harmonised Tripartite Guideline. Choice of Control Group and Related Issues in Clinical Trials. ICH-secretariat; Geneva; 2000.
  3. Palmas W. The CONSORT guidelines for noninferiority trials should be updated to go beyond the absolute risk difference. J Clin Epidemiol 2017;83:6-7. [Crossref] [PubMed]
  4. Head SJ, Kaul S, Bogers AJ, et al. Non-inferiority study design: lessons to be learned from cardiovascular trials. Eur Heart J 2012;33:1318-24. [Crossref] [PubMed]
  5. André T, Boni C, Navarro M, et al. Improved overall survival with oxaliplatin, fluorouracil, and leucovorin as adjuvant treatment in stage II or III colon cancer in the MOSAIC trial. J Clin Oncol 2009;27:3109-16. [Crossref] [PubMed]
  6. Garattini S, Bertele' V. Non-inferiority trials are unethical because they disregard patients' interests. Lancet 2007;370:1875-7. [Crossref] [PubMed]
  7. Chuang-Stein C, Beltangady M, Dunne M, et al. The ethics of non-inferiority trials. Lancet 2008;371:895-6; author reply 896-7. [Crossref] [PubMed]
  8. Kwakye-Maclean C, Agana N, Gyapong J, et al. A Single Dose Oral Azithromycin versus Intramuscular Benzathine Penicillin for the Treatment of Yaws-A Randomized Non Inferiority Trial in Ghana. PLoS Negl Trop Dis 2017;11:e0005154. [Crossref] [PubMed]
  9. Mauri L, D'Agostino RB Sr. Challenges in the Design and Interpretation of Noninferiority Trials. N Engl J Med 2017;377:1357-67. [Crossref] [PubMed]
  10. Macaya F, Ryan N, Salinas P, et al. Challenges in the Design and Interpretation of Noninferiority Trials: Insights From Recent Stent Trials. J Am Coll Cardiol 2017;70:894-903. [Crossref] [PubMed]
  11. Rothmann M, Li N, Chen G, et al. Design and analysis of non-inferiority mortality trials in oncology. Stat Med 2003;22:239-64. [Crossref] [PubMed]
  12. Althunian TA, de Boer A, Klungel OH, et al. Methods of defining the non-inferiority margin in randomized, double-blind controlled trials: a systematic review. Trials 2017;18:107. [Crossref] [PubMed]
  13. Fleming TR, Odem-Davis K, Rothmann MD, et al. Some essential considerations in the design and conduct of non-inferiority trials. Clin Trials 2011;8:432-9. [Crossref] [PubMed]
  14. Durkalski V, Silbergleit R, Lowenstein D. Challenges in the design and analysis of non-inferiority trials: a case study. Clin Trials 2011;8:601-8. [Crossref] [PubMed]
  15. Blackwelder WC. Current issues in clinical equivalence trials. J Dent Res 2004;83 Spec No C:C113-5.
  16. James Hung HM, Wang SJ, Tsong Y, et al. Some fundamental issues with non-inferiority testing in active controlled trials. Stat Med 2003;22:213-25. [Crossref] [PubMed]
  17. Hung HM, Wang SJ, O'Neill R. Issues with statistical risks for testing methods in noninferiority trial without a placebo ARM. J Biopharm Stat 2007;17:201-13. [Crossref] [PubMed]
  18. Pocock SJ, Clayton TC, Stone GW. Challenging Issues in Clinical Trial Design: Part 4 of a 4-Part Series on Statistics for Clinical Trials. J Am Coll Cardiol 2015;66:2886-98. [Crossref] [PubMed]
  19. Fleming TR, Powers JH. Issues in noninferiority trials: the evidence in community-acquired pneumonia. Clin Infect Dis 2008;47:S108-20. [Crossref] [PubMed]
  20. Burger HU, Beyer U, Abt M. Issues in the assessment of non-inferiority: perspectives drawn from case studies. Pharm Stat 2011;10:433-9. [Crossref] [PubMed]
  21. Morland LA, Greene CJ, Rosen C, et al. Issues in the design of a randomized noninferiority clinical trial of telemental health psychotherapy for rural combat veterans with PTSD. Contemp Clin Trials 2009;30:513-22. [Crossref] [PubMed]
  22. Gøtzsche PC. Lessons from and cautions about noninferiority and equivalence randomized trials. JAMA 2006;295:1172-4. [Crossref] [PubMed]
  23. Greene CJ, Morland LA, Durkalski VL, et al. Noninferiority and equivalence designs: issues and implications for mental health research. J Trauma Stress 2008;21:433-9. [Crossref] [PubMed]
  24. D'Agostino RB Sr, Massaro JM, Sullivan LM. Non-inferiority trials: design concepts and issues - the encounters of academic consultants in statistics. Stat Med 2003;22:169-86. [Crossref] [PubMed]
  25. Snapinn S, Jiang Q. Remaining Challenges in Assessing Non-Inferiority. Ther Innov Regul Sci 2014;48:62-7. [Crossref] [PubMed]
  26. Xie X, Wang M, Ng V, et al. Some issues for the evaluation of noninferiority trials. J Comp Eff Res 2018;7:835-43. [Crossref] [PubMed]
  27. Tsong Y, Wang SJ, Hung HM, et al. Statistical issues on objective, design, and analysis of noninferiority active-controlled clinical trial. J Biopharm Stat 2003;13:29-41. [Crossref] [PubMed]
  28. Munk A, Trampisch HJ. Therapeutic equivalence--clinical issues and statistical methodology in noninferiority trials. Biom J 2005;47:7-9; discussion 99-107. [Crossref] [PubMed]
  29. Wang SJ, Hung HM, Tsong Y. Utility and pitfalls of some statistical methods in active controlled clinical trials. Control Clin Trials 2002;23:15-28. [Crossref] [PubMed]
  30. Ng TH. Noninferiority testing in clinical trials: issues and challenges. 1st Edition. New York: ImprintChapman and Hall/CRC; 2014.
  31. André T, Meyerhardt J, Iveson T, et al. Effect of duration of adjuvant chemotherapy for patients with stage III colon cancer (IDEA collaboration): final results from a prospective, pooled analysis of six randomised, phase 3 trials. Lancet Oncol 2020;21:1620-9. [Crossref] [PubMed]
  32. European Medicines Agency's (EMA) Committee for Medicinal Products for Human Use (CHMP): Guideline on the Choice of the Non-inferiority Margin. London; 2005.
  33. Piaggio G, Elbourne DR, Altman DG, et al. Reporting of noninferiority and equivalence randomized trials: an extension of the CONSORT statement. JAMA 2006;295:1152-60. [Crossref] [PubMed]
  34. André T, Boni C, Mounedji-Boudiaf L, et al. Oxaliplatin, fluorouracil, and leucovorin as adjuvant treatment for colon cancer. N Engl J Med 2004;350:2343-51. [Crossref] [PubMed]
  35. Everson-Stewart S, Emerson SS. Bio-creep in non-inferiority clinical trials. Stat Med 2010;29:2769-80. [Crossref] [PubMed]
  36. Fleming TR. Current issues in non-inferiority trials. Stat Med 2008;27:317-32. [Crossref] [PubMed]
  37. Brittain EH, Fay MP, Follmann DA. A valid formulation of the analysis of noninferiority trials under random effects meta-analysis. Biostatistics 2012;13:637-49. [Crossref] [PubMed]
  38. Kaul S, Diamond GA, Weintraub WS. Trials and tribulations of non-inferiority: the ximelagatran experience. J Am Coll Cardiol 2005;46:1986-95. [Crossref] [PubMed]
  39. Peters JL, Sutton AJ, Jones DR, et al. Comparison of two methods to detect publication bias in meta-analysis. JAMA 2006;295:676-80. [Crossref] [PubMed]
  40. Fleming TR. Clinical trials: discerning hype from substance. Ann Intern Med 2010;153:400-6. [Crossref] [PubMed]
  41. Zhang Z, Nie L, Soon G, et al. Sensitivity Analysis in Non-Inferiority Trials with Residual Inconstancy After Covariate Adjustment. Journal of the Royal Statistical Society Series C: Applied Statistics 2014;63:515-38. [Crossref]
  42. André T, Vernerey D, Mineur L, et al. Three Versus 6 Months of Oxaliplatin-Based Adjuvant Chemotherapy for Patients With Stage III Colon Cancer: Disease-Free Survival Results From a Randomized, Open-Label, International Duration Evaluation of Adjuvant (IDEA) France, Phase III Trial. J Clin Oncol 2018;36:1469-77. [Crossref] [PubMed]
  43. Moertel CG, Fleming TR, Macdonald JS, et al. Levamisole and fluorouracil for adjuvant therapy of resected colon carcinoma. N Engl J Med 1990;322:352-8. [Crossref] [PubMed]
  44. André T, Iveson T, Labianca R, et al. The IDEA (International Duration Evaluation of Adjuvant Chemotherapy) Collaboration: Prospective Combined Analysis of Phase III Trials Investigating Duration of Adjuvant Therapy with the FOLFOX (FOLFOX4 or Modified FOLFOX6) or XELOX (3 versus 6 months) Regimen for Patients with Stage III Colon Cancer: Trial Design and Current Status. Curr Colorectal Cancer Rep 2013;9:261-9. [Crossref] [PubMed]
  45. Snapinn SM. Alternatives for discounting in the analysis of noninferiority trials. J Biopharm Stat 2004;14:263-73. [Crossref] [PubMed]
  46. Snapinn S, Jiang Q. Preservation of effect and the regulatory approval of new treatments on the basis of non-inferiority trials. Stat Med 2008;27:382-91. [Crossref] [PubMed]
  47. Althunian TA, de Boer A, Mantel-Teeuwisse AK, et al. Assessment of the Regulatory Dialogue Between Pharmaceutical Companies and the European Medicines Agency on the Choice of Noninferiority Margins. Clin Ther 2020;42:1588-94. [Crossref] [PubMed]
  48. Rehal S, Morris TP, Fielding K, et al. Non-inferiority trials: are they inferior? A systematic review of reporting in major medical journals. BMJ Open 2016;6:e012594. [Crossref] [PubMed]
  49. Hernandez AV, Pasupuleti V, Deshpande A, et al. Deficient reporting and interpretation of non-inferiority randomized clinical trials in HIV patients: a systematic review. PLoS One 2013;8:e63272. [Crossref] [PubMed]
  50. Althunian TA, de Boer A, Groenwold RHH, et al. Using a single noninferiority margin or preserved fraction for an entire pharmacological class was found to be inappropriate. J Clin Epidemiol 2018;104:15-23. [Crossref] [PubMed]
  51. Althunian TA, de Boer A, Groenwold RHH, et al. Defining the noninferiority margin and analysing noninferiority: An overview. Br J Clin Pharmacol 2017;83:1636-42. [Crossref] [PubMed]
  52. Sankoh AJ. A note on the conservativeness of the confidence interval approach for the selection of non-inferiority margin in the two-arm active-control trial. Stat Med 2008;27:3732-42. [Crossref] [PubMed]
  53. Holmgren EB. Establishing equivalence by showing that a specified percentage of the effect of the active control over placebo is maintained. J Biopharm Stat 1999;9:651-9. [Crossref] [PubMed]
  54. IRWIN JO. The standard error of an estimate of expectation of life, with special reference to expectation of tumourless life in experiments with mice. J Hyg (Lond) 1949;47:188. [Crossref] [PubMed]
  55. Trinquart L, Jacot J, Conner SC, et al. Comparison of Treatment Effects Measured by the Hazard Ratio and by the Ratio of Restricted Mean Survival Times in Oncology Randomized Controlled Trials. J Clin Oncol 2016;34:1813-9. [Crossref] [PubMed]
  56. Abulizi X, Ribaudo HJ, Flandre P. The Use of the Restricted Mean Survival Time as a Treatment Measure in HIV/AIDS Clinical Trial: Reanalysis of the ACTG A5257 Trial. J Acquir Immune Defic Syndr 2019;81:44-51. [Crossref] [PubMed]
  57. Pak K, Uno H, Kim DH, et al. Interpretability of Cancer Clinical Trial Results Using Restricted Mean Survival Time as an Alternative to the Hazard Ratio. JAMA Oncol 2017;3:1692-6. [Crossref] [PubMed]
  58. Quartagno M, Morris TP, White IR. Why restricted mean survival time methods are especially useful for non-inferiority trials. Clin Trials 2021;18:743-5. [Crossref] [PubMed]
  59. A'Hern RP. Restricted Mean Survival Time: An Obligatory End Point for Time-to-Event Analysis in Cancer Trials? J Clin Oncol 2016;34:3474-6. [Crossref] [PubMed]
  60. Weir IR, Trinquart L. Design of non-inferiority randomized trials using the difference in restricted mean survival times. Clin Trials 2018;15:499-508. [Crossref] [PubMed]
  61. Tian L, Fu H, Ruberg SJ, et al. Efficiency of two sample tests via the restricted mean survival time for analyzing event time observations. Biometrics 2018;74:694-702. [Crossref] [PubMed]
  62. Freidlin B, Hu C, Korn EL. Are restricted mean survival time methods especially useful for noninferiority trials? Clin Trials 2021;18:188-96. [Crossref] [PubMed]
  63. Quartagno M, Walker AS, Babiker AG, et al. Handling an uncertain control group event risk in non-inferiority trials: non-inferiority frontiers and the power-stabilising transformation. Trials 2020;21:145. [Crossref] [PubMed]
  64. Chen R, Basu S, Meyers JP, et al. Conversion of non-inferiority margin from hazard ratio to restricted mean survival time difference using data from multiple historical trials. Stat Methods Med Res 2022;31:1819-44. [Crossref] [PubMed]
  65. Gaffney M. Statistical issues in the design, conduct and analysis of two large safety studies. Clin Trials 2016;13:513-8. [Crossref] [PubMed]
  66. Soon G, Zhang Z, Tsong Y, et al. Assessing overall evidence from noninferiority trials with shared historical data. Stat Med 2013;32:2349-63. [Crossref] [PubMed]
  67. Gentile I, Borgia G. Surrogate endpoints and non-inferiority trials in chronic viral hepatitis. J Hepatol 2010;52:778. [Crossref] [PubMed]
  68. Bikdeli B, Caraballo C, Welsh J, et al. Non-inferiority trials using a surrogate marker as the primary endpoint: An increasing phenotype in cardiovascular trials. Clin Trials 2020;17:723-8. [Crossref] [PubMed]
  69. Gamalo MA, Wu R, Tiwari RC. Bayesian approach to non-inferiority trials for normal means. Stat Methods Med Res 2016;25:221-40. [Crossref] [PubMed]
  70. Mariani F, De Santis F, Gubbiotti S. A dynamic power prior approach to non-inferiority trials for normal means. Pharm Stat 2024;23:242-56. [Crossref] [PubMed]
  71. Tang N, Yu B. Bayesian sample size determination in a three-arm non-inferiority trial with binary endpoints. J Biopharm Stat 2022;32:768-88. [Crossref] [PubMed]
Cite this article as: Chen R, Shi Q. Challenges and considerations in non-inferiority trials: a narrative review from statisticians’ perspectives. Chin Clin Oncol 2025;14(1):8. doi: 10.21037/cco-24-84
