The pros and cons of noninferiority trials
Medical Statistics Unit, Department of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT UK
Noninferiority trials comparing new treatment with an active standard control are
becoming increasingly common. This article discusses relevant issues regarding their
need, design, analysis and interpretation: the appropriate choice of control group,
types of noninferiority trial, ethical considerations, sample size determination and
Received 12 January 2001; revised 1 March 2001; accepted 13 January 2003
Correspondence and reprints: [email protected]
The term ‘noninferiority trial’ is commonly used to refer to a randomized clinical trial in which a new test treatment is compared with a standard active treatment rather than a placebo or untreated control group. A prior judgement is made that, for the new treatment to be of merit, it only needs to be as good as the active control regarding appropriate outcome measure(s) of response. While the superiority of the new treatment over active control would be an added (perhaps unrealistic) advantage, the clear demonstration of noninferiority in one or more specific criteria of patient response is the desirable goal which motivates such a trial. The term ‘equivalence trial’ is sometimes used in this context but does not reflect so well the (usually) one-sided nature of this noninferiority question, and is implicitly dismissive of the desirable option that the new treatment could actually be superior to the active control treatment. So the more appropriate term ‘noninferiority’ is used hereon.

The aim of this article is to present a balanced view of the role of noninferiority trials in the development of safe and effective new treatments. This involves a mix of ethical, scientific, statistical and practical considerations. This article elucidates some of the pros and cons of noninferiority trials and offers some pointers on how to enhance their public health value. The desirability (or not) of a noninferiority trial strategy will depend on the particular circumstances. While general guidance can be given, the relative merits of noninferiority active control trials or placebo-controlled trials aimed at demonstrating superiority for evaluating any specific new treatment rest on a complex of issues requiring wise judgements and continued open debate.

CHOICE OF CONTROL GROUP: […]

One starting principle is that no patient is denied a known effective treatment by entering a clinical trial. An equally important principle is that the degree of scientific rigour adopted in the evaluation of a new treatment is sufficient to prevent any ineffective, unsafe or inferior treatments obtaining regulatory approval or gaining widespread use. Both principles highlight the ethical responsibility of a society and its medical researchers to facilitate the best possible health care at present and also in the future.

The first principle is more easily grasped because it relates immediately to the individual rights of the next patient. Of importance here is the distinction between a treatment known to be effective, and one thought to be effective, hoped to be more effective, believed to be effective or in widespread use without evidence of
effectiveness. Arguments against the use of placebo controls are put forward because treatment practice involves other active treatments. The question is: how convincing is the evidence that such active treatments are better than placebo in aspects that genuinely benefit patient welfare? While one needs to consider the understandable wish to do something positive for every patient, one needs to draw a clear distinction between desire for benefit in a supposedly active potential control treatment and hard evidence of benefit from previous trials.

So, what is the extent of evidence? Is it ‘proof beyond reasonable doubt’ of patient benefit derived from several large studies generalizable to the relevant patient population or is it just one or two statistically significant results, perhaps on short-term studies of limited size studying surrogate end points rather than overall patient benefit? P < 0.05 for a treatment difference in a clinical trial does not equate with proof of effect.

Even the relevance of P < 0.0001 for a treatment difference can be questioned on several grounds: might the trial have been biased in some aspect of its design or analysis; were the patients, the delivery of treatment, the outcome measure used and the length of follow-up sufficiently relevant to normal clinical practice and patient benefit; was the absolute magnitude of benefit sufficiently large taking account of any adverse side-effects of a treatment; how many trials were performed and on how many patients. That is, in sizing up the evidence that a pre-existing active treatment is superior to placebo one needs to exercise one’s constructive critical faculties when appraising its clinical trial evidence for internal validity, external validity, overall patient benefit and extent of research.

Should the overall evidence for patient benefit on a pre-existing active treatment be less than totally convincing, then the dangers of exclusively adopting that treatment as an active control group (instead of a placebo control group) for the evaluation of other new treatments are substantial. In certain areas, this problem is tackled by having both an active control group and a placebo control group, so that noninferiority compared with the former and superiority over the latter can be evaluated within the same trial, or set of trials.

It is important to recognize that approval of a drug by regulatory authorities as being safe and effective for a specific condition does not in itself imply that the use of that drug as an active control without a placebo group would provide a reliable basis for a noninferiority trial of a new drug (see Temple and Ellenberg 2000).

This ongoing dilemma for clinical trials research can be summarized by the wish to avoid two types of error: a type I error would be the acceptance of a useless treatment into widespread use, and one needs to consider the increased risk of this error occurring by not using placebo controls and instead pursuing a noninferiority (equivalence) trial design with an active control group. The consequences of such an error will depend on the nature of the treatment and disease. If the ineffective treatment has substantial side-effects then great harm could ensue, if it is expensive then it detracts from more fruitful use of health care costs, if it is a safe, useless, cheap ‘placebo’ for a minor condition then […]

Even if there is an effective active control treatment, there can still be problems in the design, conduct, analysis and interpretation of a noninferiority trial that could lead to such a type I error. Such problems are outlined in the rest of this article.

A type II error is the failure to use an effective active control treatment by adopting a placebo control group instead. As expressed above, the degree of certainty with which such an error occurs depends on the extent of prior knowledge that the active control is truly effective. In addition, the severity of this error will depend on particular circumstances. At one extreme it would be absolutely intolerable to deny a known effective agent that reduces mortality in a rapidly progressing life-threatening condition. However, in a more minor ailment in which recovery often happens on placebo or no treatment, the denial of a known active agent in a short-term placebo-controlled trial, after which all patients can go on to receive active treatment, has much less serious consequences for patient welfare.

In many trials, having a placebo control group does not mean that such patients receive no active intervention. Often, all randomized patients undergo normal accepted care, including other active drugs as appropriate, but the addition of a new treatment is compared with addition of a placebo. It is often debated whether such ancillary care and supplementary drugs should be according to a fixed protocol or pragmatically left to individual clinical judgement as in routine clinical practice. It is also relevant to ask, might the patient have got the active control treatment if they were not included in a clinical trial? In some circumstances the answer is ‘no they would not’, in which case there appears a perverse twist in the ethical argument whereby trialists are required to adopt more stringent ethical standards than regular treating physicians.
© 2003 Blackwell Publishing Fundamental & Clinical Pharmacology 17 (2003) 483–490
Thus, every time one chooses between active controls or placebo controls in planning a randomized trial, one has to consider the risks of type I and type II error, taking account of the likelihood of them occurring and the severity of the consequences. The consequent decisions are not easily taken: neither passionate one-sided ethical arguments against placebo controls in general nor scientific pleas for mandatory up front demonstration of new treatment superiority over placebo should dominate this thinking and planning. Rather one aims for an ethical balance of the genuine needs of the next patient in a trial to receive good care and the longer term public health need to only allow marketing approval and widespread use of treatments that actually work.

TYPES OF NONINFERIORITY TRIAL

There are many different circumstances that may lead to undertaking a noninferiority trial design. The simplest case is where one wishes to demonstrate (if true) that the efficacy of a new drug is the same as an existing active drug, and one is not anticipating any other differences. This will be particularly plausible if the drugs are of the same class, in which case such a ‘me too’ drug development could lead to an additional marketable product but not a substantial improvement in therapeutic care. The merits of such a narrow diversity of products within a specific drug class may appear rather small, except as regards company profits. However, even if the average benefits and safety profiles of two drugs in a class appear identical, it is possible that individual patients may benefit more from one drug than the other.

Either before or after marketing approval, there is sometimes a need for a large randomized controlled safety study to evaluate a concern that may have arisen from observational adverse event reporting. For instance, the European Post-Operative NSAID Study Group evaluated in 11 302 patients undergoing surgery the safety of one nonsteroidal anti-inflammatory drug (NSAID) pain relief drug, ketorolac, compared with two others, diclofenac and ketoprofen. This noninferiority safety study was motivated by concerns in the European regulatory authority, the Committee on Proprietary Medicinal Products.

A more complex and interesting scenario arises when the aim is to demonstrate equivalence (noninferiority) of a new treatment to an active control, while knowing or suspecting that the two treatments will differ in some other important respects. Possibilities here are:

(1) The new treatment has fewer side-effects. For instance, low dose aspirin vs. anticoagulation following thrombolysis after a myocardial infarction may be equivalent as regards recurrence of infarct or cardiac death, but the former produces fewer bleeding complications. Aspirin is hardly a new drug! However, in the context where anticoagulation had become the norm it was a new treatment.

(2) The new treatment is less invasive. Carotid endarterectomy is a surgical procedure for patients at high risk of a stroke. If carotid stenting could be demonstrated as equally efficacious, for many patients it might be the treatment of choice, being less invasive. This is the motivation behind a proposed National Institute of Health-funded trial comparing these two intervention strategies.

(3) The new treatment is cheaper. An analogous situation concerns the relative merits of bypass surgery and coronary angioplasty for patients with angina. Trials have shown that the prognosis (death and/or myocardial infarction) appears similar for both intervention strategies; the former provides better symptomatic relief initially but is more invasive. However, any health care strategy must take costs and cost-effectiveness into account, and much interest in these trials has focussed on the reduced initial costs of an angioplasty and whether that gain is maintained over several years […]

The complexity of these situations arises from the fact that one is looking at a trade-off between efficacy and other issues regarding side-effects, patient acceptability and costs. Although such trials are often presented as noninferiority trials, it is possible that some modest reduction in efficacy may be acceptable alongside the other benefits of a new treatment. One such example of this may be the acellular pertussis vaccines, which are used in preference to whole cell pertussis vaccines because of fewer adverse events, although their efficacy […]

Another issue is whether any particular trial comparing a new treatment with an active standard treatment should be formulated as a noninferiority trial or not. Given the relative state of ignorance with which one starts any new study, one is often unsure whether to optimistically pursue the prospect of demonstrating a new treatment’s superiority or whether to settle for demonstrating noninferiority on the basis that it will be still good enough to make the new treatment of some value. Although the statistical power calculations differ somewhat for these two scenarios (i.e. the latter
reverses the roles of null and alternative hypotheses), the underlying statistical and scientific intent is unaltered. What one wants is a sufficiently large and unbiased study that the true magnitude of treatment difference is estimated precisely. That is, when the trial is completed the point estimate and confidence interval for the appropriate measure(s) of treatment difference contain all the relevant evidence on which to hang claims of […]

Rigorous adherence to a single prespecified criterion of noninferiority, except for the convenience of planning the size of a trial, may not necessarily be the most sensible way of interpreting a trial’s results. Nevertheless, formulating one’s realistic goals, and hence the required number of patients, is an important feature of any noninferiority trial’s planning and the next section deals with this.

APPROPRIATE GOALS AND SAMPLE SIZES FOR NONINFERIORITY TRIALS

First it is essential to realize that failure to demonstrate a statistically significant difference between two treatments does not allow one to assert that the two treatments are equivalent, or even similar, in their efficacy. Obviously, the fewer patients there are in a trial the less power to detect any meaningful difference so that nonsignificance in a conventional test of a null hypothesis is a hopeless criterion for inferring noninferiority. It would actually encourage the pursuit of smaller trials!

Instead, the most widely accepted approach to determine the required size of a noninferiority trial is to first define the smallest true magnitude of inferiority that would be regarded as unacceptable, assuming that one has already chosen a single primary outcome measure of response for this purpose. Anything truly that bad or even worse needs to be detected reliably so that any claim of noninferiority for the new treatment can then be ruled out. However, one is prepared to accept more minor differences from true equivalence as being ‘good enough’.

The logical basis here is that even if one carries out an extremely large clinical trial, one never fully proves that two treatments are truly identical in their efficacy. The confidence interval for the treatment difference gets smaller and smaller as the sample size increases, but proof of equivalence would require a confidence interval centred on zero and with zero width. An impossible task!

In a spirit of achievable compromise one sets out to arbitrarily choose this minimum clinically relevant difference, commonly called delta, which if true would deny any claim of a new treatment’s noninferiority. We then choose a sample size sufficiently large such that if there is true equivalence of new and control treatments there is a high probability that the confidence interval for treatment difference will be wholly to one side (the good side) of delta.

The simplest case to quantify is for a binary response (success or failure). Let p be the anticipated percentage of success on each treatment if true equivalence exists. Let δ be the ‘minimum clinically relevant difference’. Suppose that results will be expressed as an estimated percentage of treatment difference with a 95% confidence interval around it. Also, suppose one wants to be 90% sure that if treatments are truly identical then the confidence interval will exclude δ, in a more favourable direction.

A simple commonly used formula in this instance is

$$2n = \frac{4 \times 10.5 \times p(100 - p)}{\delta^2}$$

where 2n is the total number of patients required.

More complex refinements exist and alternative but similar formulae exist for other types of outcome data, such as comparison of risk ratios or means of a quantitative measure, but this formula will adequately illustrate the problems of choosing the size of a noninferiority trial.

The difficulty lies in choosing appropriate values for p and δ, especially the latter. For example, consider a noninferiority trial comparing a new drug with omeprazole for treatment of Helicobacter pylori infection. The binary response is eradication of infection (yes or no). From past experience with omeprazole, p = 85% was the anticipated eradication rate. For trial planning δ was set at 15%. This means that the new drug would be regarded as noninferior provided that the possibility of its eradication rate being 15% worse than omeprazole could be ruled out (in the sense that the 95% confidence interval for the treatment difference in eradication rates would not include a 15% inferiority relative to omeprazole).

Hence the trial required 2n = (4 × 10.5 × 85 × 15)/15² = 238 patients.

Leaping ahead to the actual results of this trial, the observed eradication rates on new drug and omeprazole were 109 of 126 (86.5%) and 110 of 129 (85.3%), respectively.

The 95% confidence interval for the treatment difference was −7.5% to +9.8%. A difference of −15% is clearly ruled out of consideration, and on that basis the
trial data support the new drug’s noninferiority relative to omeprazole. Of course, one still cannot claim with certainty that the new drug is identical in efficacy to omeprazole. After all, the confidence interval for treatment difference does go beyond 5% in both favourable and unfavourable direction. But according to the predefined goal, adequate evidence of noninferiority is […]

Now just suppose that the new drug had had an eradication rate of 99 of 126 (78.6%) instead of the above 109 of 126. In that case, the 95% confidence interval for the treatment difference would have been from −16.1% to +2.7%. This would have been a most unhelpful result as one could neither claim noninferiority (as the confidence interval would include −15%) nor could one rule out equivalence of the two treatments (because the confidence interval includes no difference). Fortunately, this did not happen in reality, but it illustrates the inconclusiveness that can easily arise if noninferiority trials are conducted with fairly modest sample sizes.

It seems quite fashionable to choose δ = 15% in noninferiority trials. Indeed the regulatory authority did approve such a choice for the above trial, and there are more general regulatory guidelines for choice of delta in such anti-infective trials. But it is hard to come up with an objective reasoning behind this apparently arbitrary, often used choice. Why not δ = 10% instead? That would require 2n = 535 patients, more than twice as many. δ = 5% might seem a plausibly tight safety margin, on the basis that only the slightest possible inferiority of the new drug should be allowable, but that requires nine times as many patients, a staggering 2142.

These calculations are all based on 95% confidence and being 90% sure of ruling out inferiority. More generally if one requires 100(1 − α)% confidence and wants 100(1 − β)% surety, then one requires

$$2n = \frac{4\,p(100 - p)\,(Z_{\alpha/2} + Z_{\beta})^2}{\delta^2}$$

where $Z_{\alpha/2}$ and $Z_{\beta}$ are standardized normal deviates associated with one-tail probabilities α/2 and β, respectively.

For instance, with a 90% confidence interval and only 80% surety, each of the above sample sizes is reduced by 40%. However, with the tougher demands set by a 99% confidence interval and being 95% sure of rejecting a difference δ and claiming noninferiority when equivalence truly exists, each of the above sample sizes increases by about 70%.

For those more used to power calculations for trials aimed at detecting differences, it is worth noting that the above formulae have reversed the usual concepts of null and alternative hypothesis, and type I and II errors α, β. For a noninferiority trial, α/2 is the probability that the 100(1 − α)% confidence interval excludes δ when the null hypothesis, treatment difference = δ, is in fact true. β is the probability that the confidence interval includes δ (or worse) when the alternative hypothesis of no treatment difference is in fact true.

In practice, there seems to be a pragmatic acceptance by trialists and regulatory authorities that fairly generous choices of δ, α and β are allowed in order not to demand inordinately large numbers of patients in noninferiority trials. Should we be concerned therefore that the adoption of noninferiority designs with generously large choices of δ may be permitting treatments with more modest but important extents of inferiority to be falsely accepted as noninferior? This is an inherent weakness of noninferiority trials as currently performed. We do take a sizeable risk that some truly inferior treatments will slip through the net.

Perhaps one could draw a distinction between (1) trials comparing two drugs in the same class where there may exist a high prior belief that treatments truly should be equally efficacious and (2) trials comparing quite contrasting treatments, e.g. drugs of differing types, or radically differing intervention strategies where there are no firm grounds on which to anticipate noninferiority. A generous δ leading to a smaller required sample size seems more permissible in the first instance (as was the case from the above H. pylori example). The more contrasting the treatments the harder it often is to recruit patients, but that is just the instance when large sample sizes are needed in order to be confident of true noninferiority.

The appropriate choice of δ is particularly important when a noninferiority trial vs. active control is taking place because it is considered unethical to proceed with a placebo control group. The worst that could happen is that the noninferiority margin is set so wide that a new treatment not much (if at all) better than placebo gets accepted as ‘noninferior’ on such an unduly loose criterion. Hence, it is useful to infer, preferably from past placebo-controlled trials, the magnitude of superiority of the active control over placebo. The choice of δ both for planning trial size and interpreting results of the new trial needs to be substantially smaller than this for several reasons:

(1) The magnitude of superiority of active control over placebo may well be an overestimate. The placebo-controlled evidence may be limited, past trials may have
some biases present and to carry forward one (perhaps lucky) possibly exaggerated effect of active treatments into future planning, reflects a lack of scientific caution.

(2) Without a direct comparison with a placebo-control group, one is using an indirect argument via a comparison with active control to infer that a new treatment is worthwhile. For a whole variety of reasons discussed in the next section (e.g. patient selection, noncompliance), the circumstances of the noninferiority trial may be sufficiently different from the placebo-controlled trials to cast doubt on the appropriateness of the active treatment’s apparent magnitude of superiority over placebo. A safety margin of less ambitious efficacy may be in order.

(3) Any new treatment needs to have a certain minimum magnitude of efficacy compared with placebo in order to be of worth, especially if other considerations (side-effects, costs, inconvenience) come into play. Thus, even supposing issues (1) and (2) above did not apply (they usually do though!), one would still want δ to be much smaller than the active control’s superiority over placebo.

These issues are linked to the earlier arguments concerning the choice between placebo and active controls. The ideal circumstance for a noninferiority trial is when the superiority of active control compared with placebo is irrefutable and well-documented, the noninferiority trial can be conducted in very similar conditions and the choice of delta is small enough to convince one that any new treatment passing such a noninferiority test is truly of therapeutic value.

Fundamentally, many noninferiority trials are not large enough to satisfy these requirements, meaning that the risk of a type I error (as discussed in section 2 above), false acceptance of a useless treatment, is often greater […]

THE POTENTIAL INFERIORITY OF […]

For conventional clinical trials aimed at exploring the potential superiority of a new treatment over standard treatment (whether placebo or active), many of the pitfalls that can arise operate in the direction of making it harder to detect a genuine treatment difference. Such conservatism leads to the observed treatment difference being a dilution of the true effects under ideal conditions. This is often seen as an appropriate pragmatism, whereby the attempted unbiased comparison of new and standard treatment policies in a practical setting, where strict adherence to protocol inevitably does not always happen, means that the real-life benefit of a new treatment is seen for what it really is. That is, a new treatment has to fight its way through the hiccups, failings, frailties and unpredictability of human beings (both trialists and patients), in order to demonstrate its worth.

The great difficulty with noninferiority trials is that their very motivation is to demonstrate the similarity of new and standard treatments, so that all these same problems work towards achieving this goal even if it is not true. The anti-conservatism of a poorly designed and poorly conducted noninferiority trial can greatly enhance the risk of a type I error, the adoption of a useless treatment whose inadequacies could not be detected.

One could argue that the unscrupulous investigator has every intention to undertake a sloppy noninferiority trial. For instance, with selection of inappropriate patients, poor compliance with intended treatments, use of nondiscriminatory outcome measures, inconsistencies between observers, too short a follow-up and a substantial amount of missing data, it would not be surprising if the results showed closely comparable results even if the real treatments properly given to the right patients were substantially different in real patient […]

So in noninferiority trials it is especially important to adhere to a well-defined relevant study protocol, and also to document that such adherence is successfully achieved. Some of the principal difficulties to bear in mind are:

(1) Selection of patients. It is important to select the type of patient for whom the efficacy of the active control treatment has been clearly established. For instance, were one to deviate, even in part, from the patient population in whom superiority over placebo had previously been demonstrated, then any claim regarding a new treatment’s merits could not well distinguish between genuine noninferiority and inappropriate selection of patients. Informative generalizability depends on a representative patient sample of the same kind as had previously demonstrated efficacy for the active control.

(2) Treatment compliance. The first requirement is that one chooses a genuinely efficacious active control treatment, and that it be given in the same form, dose and quality as was previously used to demonstrate that efficacy. One then requires that for both new and active treatment groups a satisfactorily high level of patient compliance is achieved, and that appropriate measures
of such compliance are recorded. Any reasons for alteration or discontinuation of treatments need documenting. Also, use of concomitant nonrandomized treatments needs documenting (and possibly standardizing) as any differential use of other efficacious treatments could conceivably mask the inferiority of a new treatment.

(3) Outcome measures. One needs to choose outcome measures that reflect genuine patient benefit (i.e. surrogate markers may well not suffice), and which were previously used to demonstrate the efficacy of the active control treatment. Each such measure (or end point) needs consistent well-defined criteria, with appropriate steps to reduce observer variation or bias. In addition, rigorous, objective reporting of adverse events is an important issue, as any noninferiority needs to concern […]

(4) Duration of treatment and evaluations. In any noninferiority trial, the randomized treatments need to be given for long enough and the patient response evaluated over a long enough period so that any potential treatment differences have a realistic opportunity to reveal themselves. Due attention needs to be given to the durations of treatment and follow-up in previous trials demonstrating efficacy of the active control treatment, and also the intended duration of treatment in future clinical practice.

(5) Statistical analysis issues. Any noninferiority trial requires a well-documented statistical analysis plan. There will often be a single primary outcome measure with a predefined noninferiority criterion and method of analysis, but this should not preclude appropriate secondary analyses of other outcome measures, which could become important if they exhibit any signs of the new treatment’s inferiority or if interpretation of the primary outcome findings is not clear-cut.

In major phase III trials aimed at detecting treatment differences, analysis by intention to treat is routinely highlighted. That is, one analyses the complete follow-up results for all randomized patients regardless of their compliance with intended treatment in a spirit of comparing the treatment policies as actually given. Although this may dilute any idealized treatment differences under (unrealistic) circumstances of 100% compliance, that is generally considered wise pragmatism compared with any potential exaggerations of efficacy that could arise from focussing on treatment compliers only. The dilemma for noninferiority trials is that, faced with non-negligible noncompliance, analysis by intention to treat could artificially enhance the claim of noninferiority by diluting some real treatment difference. More weight may instead be attached to per protocol analyses (which focus on patient outcome amongst compliers only or up until compliance ceases in each patient) in the hope that they may reveal undesirable treatment differences. But this in turn has problems as compliers are a select group of patients who may give a favourably biased view (e.g. if the treatment is not helping, you drop out). These difficulties become particularly problematic if compliance […]

Thus, for noninferiority trials there is no single ideal analysis strategy in the face of substantial noncompliance or missing data, and both analysis by intention to treat and well-defined per protocol analyses would seem warranted. Such noncompliance will inevitably lead to a degree of concern over any affirmative claims of noninferiority, which feeds back to the need to minimize any […]

In general, statistical considerations in the design, monitoring and analysis of noninferiority trials are less well established than for superiority trials, making it a fruitful area for further methodological research.

The most important advantage of a noninferiority trial is that, faced with clear evidence of efficacy for an existing standard treatment, it would be ethically unacceptable to proceed with a placebo or inactive control group in the evaluation of a new treatment for the same condition. In any particular circumstance an important reservation before jumping to that conclusion is that such evidence of efficacy really is strong enough to warrant exclusion of placebo controls. Too lax an acceptance of noninferiority trials, with a less than convincing active control treatment, could potentially lead to the adoption of more and more ineffective treatments, and this would be a misguided over-reaction to the ethical concerns in conducting randomized […]

So when one has made the right judgement to undertake a noninferiority trial with an active control treatment, all necessary steps need to be taken to ensure that any failings in the trial design, conduct or analysis could not artificially dilute out any real treatment differences. That is, false claims of noninferiority need to be avoided.

Inevitably one can never prove that two treatments are identical, and hence some degree of compromise
is required so that realistically achievable but adequately large numbers of patients are randomized in a noninferiority trial. Thus, any clinically important treatment difference can be demonstrated not to exist with reasonable confidence. One suspects that in too many instances sample size determination for noninferiority trials is based on too generous a criterion of what constitutes a minimum clinically important treatment difference, and this increases the risk that some inferior treatments may gain regulatory approval and [...]
So placebo controls may rightly need to be ruled out in certain areas of clinical research, but it would be wrong to rush too enthusiastically into more widespread use of noninferiority (equivalence) trials without full consideration [...]
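The concern about an over-generous noninferiority criterion can be made concrete with a small numerical sketch. It is not taken from the article: it assumes a binary response, equal true response rates in both arms, a one-sided test and the usual normal approximation to the binomial. The function names `patients_per_arm` and `is_noninferior`, the 80% response rate and the margins of 10 and 5 percentage points are all illustrative assumptions.

```python
import math
from statistics import NormalDist


def patients_per_arm(p, margin, alpha=0.025, power=0.90):
    """Approximate patients per arm for a noninferiority trial on a risk difference.

    Assumes (hypothetically) that both treatments truly share the response
    rate p, and uses a normal approximation with one-sided level alpha.
    """
    if not 0 < p < 1 or margin <= 0:
        raise ValueError("need 0 < p < 1 and margin > 0")
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * p * (1 - p) * (z_alpha + z_beta) ** 2 / margin ** 2)


def is_noninferior(x_new, n_new, x_ctl, n_ctl, margin, alpha=0.025):
    """Confidence-interval check on the difference in response rates (new - control).

    Noninferiority is claimed only if the one-sided lower confidence bound
    for the difference lies above -margin.
    """
    p_new, p_ctl = x_new / n_new, x_ctl / n_ctl
    se = math.sqrt(p_new * (1 - p_new) / n_new + p_ctl * (1 - p_ctl) / n_ctl)
    lower = (p_new - p_ctl) - NormalDist().inv_cdf(1 - alpha) * se
    return lower > -margin, lower


# Halving the margin roughly quadruples the number of patients needed.
print(patients_per_arm(0.80, 0.10))  # 337
print(patients_per_arm(0.80, 0.05))  # 1345

# The same made-up data pass a generous margin but fail a stricter one.
print(is_noninferior(160, 200, 164, 200, margin=0.10)[0])  # True
print(is_noninferior(160, 200, 164, 200, margin=0.05)[0])  # False
```

Under these assumptions the choice of margin dominates both the required sample size and the eventual conclusion, which is why a margin chosen for convenience rather than clinical relevance makes a claim of noninferiority easier to obtain and less convincing.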
© 2003 Blackwell Publishing Fundamental & Clinical Pharmacology 17 (2003) 483–490