ATLA 35, 641–659, 2007
Systematic Reviews of Animal Experiments Demonstrate Poor Human Clinical and Toxicological Utility

Animal Consultants International, London, UK

Summary: The assumption that animal models are reasonably predictive of human outcomes provides the basis for their widespread use in toxicity testing and in biomedical research aimed at developing cures for human diseases. To investigate the validity of this assumption, the comprehensive Scopus biomedical bibliographic databases were searched for published systematic reviews of the human clinical or toxicological utility of animal experiments. In 20 reviews in which clinical utility was examined, the authors concluded that animal models were either significantly useful in contributing to the development of clinical interventions, or were substantially consistent with clinical outcomes, in only two cases, one of which was contentious. These included reviews of the clinical utility of experiments expected by ethics committees to lead to medical advances, of highly-cited experiments published in major journals, and of chimpanzee experiments, which involve the species considered most likely to be predictive of human outcomes. Seven additional reviews failed to clearly demonstrate utility in predicting human toxicological outcomes, such as carcinogenicity and teratogenicity. Consequently, animal data may not generally be assumed to be substantially useful for these purposes. Possible causes include interspecies differences, the distortion of outcomes arising from experimental environments and protocols, and the poor methodological quality of many animal experiments, which was evident in at least 11 reviews. No reviews existed in which the majority of animal experiments were of good methodological quality. Whilst the effects of some of these problems might be minimised with concerted effort (given their widespread prevalence), the limitations resulting from interspecies differences are likely to be technically and theoretically impossible to overcome. Non-animal models are generally required to pass formal scientific validation prior to their regulatory acceptance. In contrast, animal models are simply assumed to be predictive of human outcomes. These results demonstrate the invalidity of such assumptions. The consistent application of formal validation studies to all test models is clearly warranted, regardless of their animal, non-animal, historical, contemporary or possible future status. Likely benefits would include the greater selection of models truly predictive of human outcomes, increased safety of people exposed to chemicals that have passed toxicity tests, increased efficiency during the development of human pharmaceuticals and other therapeutic interventions, and decreased wastage of animal, personnel and financial resources. The poor human clinical and toxicological utility of most animal models for which data exist, in conjunction with their generally substantial animal welfare and economic costs, justifies a ban on animal models lacking scientific data clearly establishing their human predictivity or utility.
Key words: animal experiment, animal study, clinical trial, human outcome, systematic review.
Address for correspondence: Andrew Knight, Animal Consultants International, 91 Vanbrugh Court,
Wincott Street, London SE11 4NR, UK.
E-mail: [email protected]
Trends in laboratory animal use

Standards for the reporting of laboratory animal use vary internationally, with many countries failing to record or publicise statistics on animal use at all. Of those that do, most record only live animal use, and fail to record the substantial numbers of animals that may be killed prior to certain procedures, such as dissection or the collection of organs, tissues or cells. Hence, making realistic annual estimates of animal use within biomedical research and toxicity testing is difficult. Despite these limitations, it remains clear from consideration of the European Union (EU) and United States alone, that many millions of animals are used worldwide, and that certain trends are resulting in an increase in laboratory animal use.

European Commission (EC) statistics on laboratory animal use in 25 EU Member States, revealed that 12,117,583 animals were used in 2005, the latest reporting period (except for France, which provided figures for 2004). The majority of these were mice (53.1%), rats (19.3%), cold-blooded animals (15.1%, consisting of fish [primarily], amphibians and reptiles), and birds (5.4%). As in previous years, France, Germany and the UK reported the greatest

In the USA, laboratory animal use is federally regulated by the Animal Welfare Act 1966 (amended in 1985), which excludes laboratory-bred mice and rats, as well as non-mammals, from consideration or protection (2, 3), despite the fact that mice and rats comprise the overwhelming majority of all laboratory subjects. This impedes the accurate estimation of laboratory animal use in the USA. For example, although 1,012,713 regulated animals were used in the Fiscal Year 2006 (4), the latest reporting period, Carbone (5) estimated that in excess of 100 million mice are used annually. This represents a dramatic increase from the 17–22 million vertebrates used in the mid-1980s (6).

In recent years, the previous steady decreases in laboratory animal use have been reversed, in some countries, mostly as a result of dramatic increases in the use of genetically-modified (GM) animals. The production of these GM animals requires substantial breeding, which serves to further increase the numbers of animals used. Within the UK, for example, a steady and significant reduction since 1976 stabilised during the early 1990s, and then reversed. 3,012,032 procedures on living, regulated animals (vertebrates and one species of octopus, Octopus vulgaris) were conducted in 2006, the highest number for around 15 years (7). Greater breeding and use of GM animals have contributed to these increasing numbers (8, 9). In 1995, GM animals were used in 8% of regulated procedures. In 2004, the total was 32%, and in 2006 it was 34% (1,035,343 regulated procedures; 7). Increased GM animal use has also been recorded in Germany (10) and Switzerland (11), where total animal use is also

Recently-initiated, large-scale chemical testing programmes are also important drivers of the recent and probable substantial future increases in laboratory animal use (13, 14). These programmes are intended to rectify existing knowledge gaps with regard to the toxicity of chemicals that are produced or imported into the EU and the USA in particularly high quantities (or that otherwise give rise to special concerns), and are likely to result in the use of unprecedented numbers of animals in toxicity testing. Included are three programmes initiated by the US Environmental Protection Agency (EPA) since 1998: the High Production Volume (HPV) Challenge Program, the Endocrine Disruptor Screening Program, and the Voluntary Children’s Chemical Evaluation Program. The 2003 EC proposal for the Registration, Evaluation and Authorisation of Chemicals (REACH), similarly aims to assess the toxicity of chemicals produced or imported in high quantities (15–20). It is reported that the HPV programme, for example, has already subjected over 150,000 animals to chemical tests (21).

Claims supporting laboratory animal use

Biomedical research using laboratory animals is highly controversial. Advocates frequently claim that such research is vital for preventing, curing or alleviating human diseases (e.g. 22, 23), that the greatest achievements of medicine have been possible only due to the use of animals (e.g. 24), and that the complexity of humans requires nothing less than the complexity of laboratory animals to serve as an effective model during biomedical investigations (e.g. 25). They even claim that medical progress would be “severely maimed by prohibition or severe curtailing of animal experiments,” and that “catastrophic consequences would ensue” (26).

However, such claims are hotly contested (e.g. 27), and the right of humans to experiment on animals has also been strongly contested philosophically (e.g. 28, 29). A growing body of empirical evidence also casts doubt upon the scientific utility of animals as experimental models of humans.

Clinical utility of animal models: case studies

Within the field of pharmaceutical development, case studies exemplifying differing human and animal outcomes (sometimes with severe adverse consequences for human patients) are sufficiently numerous to fill entire book chapters (e.g. 30, 31).

A recent notorious example was TGN1412 (also known as CD28-SuperMAB), a fully-humanised monoclonal antibody (i.e. a product developed in a non-human species and protein-engineered to possess specifically-human characteristics) that was undergoing development for the treatment of inflammatory conditions, such as leukaemia and rheumatoid arthritis (32, 33). During a Phase I clinical trial in the UK in 2006, TGN1412 caused severe adverse reactions, culminating in organ failure requiring intensive care, in all six volunteers given the drug, with one volunteer suffering permanent damage. These effects occurred despite the administration of an expected sub-clinical dose of 0.1 mg/kg, 500 times lower than the 50 mg/kg dose found not to cause adverse effects in cynomolgus monkeys. Tests on rhesus macaques, rats and mice also failed to reveal adverse effects (34, 35).

Another recent notorious example was the arthritis drug, Vioxx, which appeared to be safe, and even beneficial to the heart, in animal studies. However, Vioxx was withdrawn from the global market in 2004, after causing as many as 140,000 heart attacks and strokes, and over 60,000 deaths, in the

Since their commercial introduction in the early 1980s, non-steroidal anti-inflammatory drugs (NSAIDs) have also had a problematic clinical history. Although apparently safe in year-long studies in rhesus monkeys, benoxaprofen (Oraflex) produced thousands of serious adverse reactions in humans, including dozens of deaths, within three months of its initial marketing (37). Fenclofenac (Flenac) revealed no toxicity in ten animal species, yet produced severe liver toxicity in humans, and was subsequently withdrawn (38). Similar fates befell some other NSAIDs, including zomepirac (Zomax; 39), bromfenac (Duract; 40), and phenylbutazone (Butazolidin; 41), which produced adverse human effects undetected in animal studies.

Numerous other pharmaceuticals have also been marketed after passing limited clinical trials and more rigorous animal testing, only to subsequently be found to cause serious side-effects or death in human patients. Examples include various antibiotics (e.g. chloramphenicol, clindamycin, temafloxacin), antidepressants (e.g. nomifensine), antivirals (e.g. idoxuridine), cardiovascular medications (e.g. amrinone, cerivastatin, mibefradil, ticrynafen), and

Although 92% of new drugs that pass preclinical testing, which routinely includes animal tests, fail to reach the market because of safety or efficacy failures in human clinical trials (45), adverse drug reactions detected after drugs have been approved for clinical use, nevertheless remain common. They are, indeed, sufficiently common to have been recently recorded as the 4th–6th leading cause of death in US hospitals (based on a 95% confidence interval; 46), a rate considered by investigators to

There are also cases of safe and efficacious human pharmaceuticals that would not pass rigorous animal testing, because of severe or lethal toxicity in some laboratory animal species. Notable examples include penicillin (e.g. 47), paracetamol (acetaminophen; e.g. 48), and aspirin (acetylsalicylic acid; e.g. 49). More rigorous animal testing may well have delayed or prevented the use of these highly beneficial drugs in human patients.

The large number of examples of apparent differences between outcomes in laboratory animals and in human patients may be the result of several factors.

Flaws may occur during the pharmaceutical development and testing process, in which the design, conduct or interpretation of experiments may fail to adequately highlight the risks to human patients.

Such flaws are more likely in animal studies than in human clinical trials, because the experimental quality of the former is usually significantly lower (see Results and Discussion). True discordance in results may also arise from interspecies differences.

Finally, the limited predictivity for wider human outcomes of human clinical trials may result from their focus on small groups of healthy young men, or from insufficient study durations. Particularly in Phases I–II, small cohorts of young men (20–300) are typically used, to minimise experimental variability and to avoid the possibility of endocrinological disruption or other risks to women of reproductive age. Although 1,000–3,000 volunteers may be used in Phase III trials, the final phase before marketing (50), it is nevertheless clear that cohort numbers, study durations or other aspects of protocol design, conduct or interpretation, are inadequate to detect the adverse side-effects of the large number of pharmaceuticals that are found to harm patients after marketing. Longer studies of more-broadly representative human populations would be more predictive, but would increase the time and cost of pharmaceutical development, and are resis-

The necessity of systematic reviews

The premise that laboratory animal models are generally predictive of human outcomes is the basis for their widespread use in human toxicity testing, and in the safety and efficacy testing of putative chemotherapeutic agents and other clinical interventions. However, the numerous cases of discordance between laboratory animal and human outcomes suggest that this premise may well be incorrect, and that the utility of animal experiments for these purposes may not be assured. On the other hand, only small numbers of experiments are normally reviewed in case studies, and their selection may be subject to bias. To provide more-definitive conclusions, systematic reviews of the human clinical or toxicological utility of large numbers of animal experiments are necessary.

Experiments included in such reviews should be selected without bias, via randomisation, or similarly methodical and impartial means.

In support of this concept, Pound and colleagues (51) commented that clinicians and the public often consider it axiomatic that animal research has contributed to human clinical knowledge, on the basis of anecdotal evidence or unsupported claims. These constitute an inadequate form of evidence, they asserted, for such a controversial area of research, particularly given increasing competition for scarce research resources. Hence, they called for systematic reviews to examine the human clinical utility of animal experiments, and commenced by examining six existing reviews, which did not demonstrate the clinical utility expected of the experiments in question.

Soon afterwards, the UK Nuffield Council on Bioethics stated that, “It would… be desirable to undertake further systematic reviews and meta-analyses to evaluate more fully the predictability and transferability of animal models.” They called for these to be undertaken by the UK Home Office, in collaboration with the major funders of research, industry associations and animal protection groups (52).

Since then, several such reviews and meta-analyses have been published, which collectively provide important insights into the human clinical and toxicological utility of animal models. Their identification and examination was the purpose of this

The Scopus biomedical bibliographic databases were searched for systematic reviews of the human clinical or toxicological utility of animal experiments published in the peer-reviewed biomedical literature. Among the world’s most comprehensive databases, they include over 12,850 academic journals, 500 open access journals, 700 conference proceedings, and a total of 29 million abstracts (53). The Life Sciences database includes over 3,400 titles, and the Health Sciences database includes over 5,300 titles, including all of Medline, the leading medical and allied health profession database, which itself contains over 15 million citations from the mid-1950s onwards, sourced from more than 5,000 biomedical journals from over 80 countries

All abstracts, titles and key words were searched for (animal experiment OR animal model OR animal study OR animal trial) AND (clinical trial OR human outcome OR human relevance OR human result). The results were limited to articles or reviews, but no chronological, language or other limitations were applied. Titles and, where necessary, abstracts or complete papers, were examined, in order to locate relevant papers. Additional relevant studies were obtained by examination of the reference lists of the papers retrieved, and by consultation with colleagues working in this field.

To minimise bias, reviews were included only when they had been conducted systematically, by using randomisation or similarly methodical and impartial means to select animal studies. For example, in some cases, all the animal studies within relevant subsets of toxic chemical databases were

The examination covered only reviews which considered the human toxicological predictivity or utility of animal experiments, their contributions toward the development of prophylactic, diagnostic or therapeutic interventions with clear potential for combating human diseases or injuries, or their consistency with human clinical outcomes. Reviews which focused, for example, only on the contributions of animal experiments toward increased understanding of the aetiological, pathogenesic or other aspects of human diseases, or on the clinical utility of animal experiments in non-human species, were excluded from consideration.

Bibliographic databases are constantly updated. 2,274 articles or reviews were retrieved, by using the specified search terms, on 1 March 2007. In total, 27 systematic reviews which examined the utility of animal experiments during the development of human clinical interventions (20), or in deriving human toxicity classifications (seven), were located. Three different approaches that sought to determine the maximum human clinical utility that may be achieved by animal experiments,

Clinical utility of experiments expected to lead to medical advances

Lindl and colleagues (55, 56) examined animal experiments conducted at three German universities between 1991 and 1993, that had been approved by animal ethics committees, at least partly on the basis of claims by researchers that the experiments might lead to concrete advances toward the cure of human diseases. Experiments were only included where previous studies had shown that the applications of related animal research had confirmed the hypotheses of the researchers, and where the experiments had achieved publication in biomedical journals.

For 17 experiments meeting these inclusion criteria, citations were analysed over at least 12 years. Citation frequencies and types of citing papers were recorded: whether they were reviews or animal-based, in vitro, or clinical studies. 1,183 citations were evident, but only 8.2% (97 citations) were in clinical publications, and only 0.3% (4 citations) demonstrated a direct correlation between the results of animal experiments and human outcomes. However, even in these four cases, the hypotheses that had been verified successfully in the animal experiment failed in every respect when applied to humans. None of these 17 experiments led to any new therapies, or had any beneficial clinical impact during the period examined.

As a result of their analysis, Lindl and colleagues called for serious, rather than cursory, evaluations of the likely benefits of animal experiments by animal ethics committees and related authorities, and for a reversal of the current paradigm, in which animal experiments are routinely approved. Instead of approving experiments because of the possibility that benefits might accrue, Lindl and colleagues suggested that where significant doubt exists, laboratory animals should receive the benefit of that doubt, and that such experiments should not, in

Clinical utility of highly-cited animal experiments

Hackam and Redelmeier (57) also used a citation analysis, but without geographical limitations. Based on the assumption that findings from highly-cited animal experiments would be most likely to be subsequently tested in clinical trials, they searched for experiments with more than 500 citations and published in the seven leading scientific journals, as

Of 76 animal studies located, with a median citation count of 889 (range: 639–2,233), only 36.8% (28/76) were replicated in randomised human trials. 18.4% (14/76) were contradicted by randomised trials, and 44.7% (34/76) had not translated to clinical trials. Ultimately, only 10.5% (8/76) of these medical interventions were subsequently approved for use in patients, and, as stated previously, even in these cases, human benefit cannot be assumed, because adverse reactions to approved interventions are common, and a leading cause of death (46).

A low rate of translation to clinical trials of even these highly-cited animal experiments was apparent, despite 1992 being the median publication year, allowing a median of 14 years for potential translation. For studies that did translate to clinical trials, the median time for translation was seven years (range 1–15). The frequency of translation was not affected by the laboratory animal species used, the type of disease or therapy under examination, the journal, year of publication, methodological quality, and even, surprisingly, the citation rate. However, animal studies incorporating dose–response gradients were more likely to be translated to clinical trials (odds ratio [OR] = 3.3; 95% confidence interval [CI] = 1.1–10.1).

Although the rate of translation of these animal studies to clinical trials was low, as Hackam and Redelmeier stated, it is nevertheless higher than that of most published animal experiments, which are considerably less likely to be translated than these highly-cited animal studies published in leading journals. Furthermore, the selective focus on positive animal data, whilst ignoring negative results (optimism bias), was one of several factors proposed that may have increased the likelihood of translation beyond that which was scientifically merited. As Hackam (58) stated, the rigorous meta-analysis of all relevant animal experimental data would probably significantly decrease the transla-

In addition, only 48.7% (37/76) of these highly-cited animal studies were considered to be of good methodological quality. Despite their publication in leading scientific journals, few included the random allocation of animals to test groups, any adjustment for multiple hypothesis testing, or the blinded assessment of outcomes. Accordingly, Hackam and Redelmeier cautioned patients and physicians about the extrapolation of the findings of even highly-cited animal research to cases of human disease.

Clinical utility of chimpanzee experiments

Chimpanzees are the species most closely related to humans, and consequently, are considered to be the laboratory animals most likely to provide results which are predictive of human outcomes. Hence, in 2005, I conducted a citation analysis of the human clinical utility of chimpanzee experiments (59).

I searched three major biomedical bibliographic databases, and located 749 papers published between 1995 and 2004, which described experiments on captive chimpanzees or their tissues. Although published in the international scientific literature, the vast majority of these experiments were conducted within the USA (60). To obtain 95% CIs with an accuracy of at least plus or minus 10%, when estimating the proportion of chimpanzee studies subsequently cited by other published papers, a subset of at least 86 chimpanzee studies

Of 95 published randomly-selected studies on chimpanzees, 49.5% (47/95) were not cited by any subsequent papers, demonstrating minimal contributions toward the advancement of biomedical knowledge. This is of particular concern, because it can be assumed that research judged to be of lesser value was not published. Hence, it appears that the majority of chimpanzee research generates data of questionable value, which make little obvious contribution toward the advancement of biomedical

35.8% (34/95) of the 95 published chimpanzee studies were cited by 116 papers that clearly did not describe well-developed methods for combating human diseases. Only 14.7% (14/95) of them were cited by 27 papers that had abstracts which indicated well-developed prophylactic, diagnostic or therapeutic methods for combating human diseases. However, a detailed examination of these 27 medically-oriented papers revealed that in vitro studies, human clinical and epidemiological studies, molecular assays and methods, and genomic studies, contributed most to their development. 63.0% (17/27) were wide-ranging reviews of 26–300 (median 104) references, to which these cited chimpanzee studies made very small contributions. Duplication of human outcomes, inconsistency with other human or primate data, and other causes, resulted in the absence of any chimpanzee study able to demonstrate an essential contribution, or, in most cases, a significant contribution of any kind, toward the development of the medical method

Despite the low utility of chimpanzee experiments in advancing human health which was indicated by these results, it remains true that chimpanzees are the species most closely related to human beings. Hence, it is highly likely other laboratory species are even less useful as experimental models of humans in biomedical research and toxi-

Clinical utility of stroke and head injury

Despite the existence of literature on the efficacy of more than 700 drugs in treating experimental models of stroke (artificially-induced focal cerebral ischaemias; 64), only recombinant tissue plasminogen activator (rt-PA) and aspirin have convincingly demonstrated efficacy in human clinical trials of treatments for acute ischaemic stroke (65–67). Hence, Macleod and colleagues (64) stated that, “This failure of putative neuroprotective drugs in clinical trials represents a major challenge to the doctrine that animals provide a scientifically-valid model for human stroke.” At least 10 published systematic reviews have described the poor human clinical utility of animal experimental models of stroke and head injuries (64, 68–76).

In some cases, clinical trials proceeded, despite equivocal evidence of efficacy in animal studies. For example, Horn and colleagues (68) systematically reviewed 20 animal studies on the efficacy of nimodipine, of which only 50% showed beneficial effects following treatment. They concluded that, “…the results of this review did not show convincing evidence to substantiate the decision to perform trials with nimodipine in large numbers of patients.” These clinical trials also demonstrated equivocal evidence of efficacy, and furthermore, proceeded concurrently with the animal studies, despite the fact that the latter are intended to be conducted prior to clinical trials, to facilitate the detection of

O’Collins and colleagues (69) conducted a very large review of 1,026 experimental drugs for acute stroke that had been tested in animal models. They found that the effectiveness in animals of 114 drugs chosen for human clinical use was no greater than that of the remaining 912 drugs not chosen for clinical use, thereby demonstrating that effectiveness in animal models had no measurable effect on whether or not these drugs were selected for human clinical use. Accordingly, O’Collins and colleagues questioned whether the most efficacious drugs are, in fact, being selected for clinical trials, and called for greater rigour in the conduct, reporting, and

In many cases, animal models did indicate efficacy, but this did not translate to humans. In a few reviews, the authors speculated on the possible causes. For example, Jonas and colleagues (70) hypothesised that the poor clinical efficacy of neuroprotectants which had been found to be successful in animal models, was due to differences in the timing of the initiation of treatment. Curry (71) hypothesised that the human clinical failure of fourteen neuroprotective agents which were successful in animal models, was due to the antagonism of glutamate (which may be associated with neuroprotection) by drug treatment in clinically-normal individuals. He therefore proposed that clinical trials should be restricted to real stroke patients, who experience elevated plasma glutamate levels. However, such speculation has not resulted in improvements in the poor clinical record of neuroprotectants which were previously found to be successful in animal models.

The utility of the majority of these animal studies also appears to have been impeded by their poor methodological quality. Examples include: animal studies on the efficacy of melatonin (64); 20 animal studies on the efficacy of nimodipine (68); 29 animal studies on the efficacy of FK506 (72); 45 animal studies on five compounds from different classes of alleged neuroprotective agents, namely clomethiazole, gavestinel, lubeluzole, selfotel, and tirilazad mesylate (73); 25 animal studies on the efficacy of nitric oxide (NO) donors and L-arginine (74); and 73 animal studies of the efficacy of NO synthase inhibitors (75).

The methodological quality of animal studies was typically scored on the basis of the presence of characteristics such as: appropriate animal models (aged, diabetic or hypertensive animals are considered to more-closely model human stroke patients); power calculations of sample sizes; random allocation to treatment and control groups; use of a clinically-relevant time window for commencement of treatment; blinded drug administration; use of anaesthetics without significant intrinsic neuroprotective activity (ketamine, for example, may alter neuroprotective activity); blinded induction of ischaemia (given that the severity of induced infarcts may be subtly affected by knowledge of treatment allocation); blinded outcome assessment; assessment of both infarct volume and functional outcome; adequate monitoring of physiological parameters; assessment during both the acute (e.g. one to six days) and chronic (e.g. seven to 30 days) phases; statement of temperature control; compliance with animal welfare regulations; peer-reviewed publication; and conflict of interest statements. Typically, one point was given for the presence of each characteristic. For example, the Stroke Therapy Academic Industry Roundtable recommendations for standards with regard to preclinical and restorative drug development involve an eight-point scale (68, 77).

Median quality scores were: four out of 10 (13 studies; range zero to six [64]); four out of 10 (29 studies; range zero to seven [72]); three out of 10 (45 studies [73]); and three out of 8 (73 studies; range one to six [75]). Common deficiencies included lack of: sample size calculations, aged animals or those with appropriate co-morbidities, randomised treatment allocation, blinded drug administration, blinded induction of ischaemia, blinded outcome assessment, and conflict of interest statements. Some studies also used ketamine anaesthesia, and there was also substantial varia-

van der Worp and colleagues (73), for example, concluded that the collective evidence for neuroprotective efficacy which formed the basis for 21 clinical trials, was obtained in animal studies with a methodological quality that would not, in retrospect, justify such a decision. Wilmot and colleagues (74) also found considerable variations in animal experiment protocols, which concerned: animal species; physiological parameters (such as blood pressure); drug administration (timing, dosage, and route); surgical methodology; and duration of ischaemia. Statistical analysis (Egger’s test) also revealed the likely existence of publication bias (an increased tendency to publish studies in which a treatment effect is apparent, or a decreased tendency to so publish, e.g. resulting from commercial pressures, particularly in the case of patented drugs under development). Macleod and colleagues (64) commented that, “These deficiencies apply to most, if not all, of the animal literature.” This is of particular concern, because Macleod and colleagues (72) reported that efficacy was apparently lower in higher quality studies, which raised concerns that the apparent efficacy may have been artificially elevated by factors such as poor methodological qual-

A related review, not limited solely to stroke, exemplified some of these issues. Perel and colleagues (76)

review (78–85, of which 79 and 80 described a single review), in only two cases (one of which was contentious) did the animal models appear to be clearly useful in the development of human clinical interventions, or substantially consistent with

As in the case of stroke, some clinical trials proceeded, despite equivocal evidence of efficacy in animal studies. Upon systematically reviewing the effects of Low Level Laser Therapy (LLLT) on wound healing in 36 cell or animal studies, Lucas and colleagues (78) found that an in-depth analysis of studies with the highest methodological quality showed no significant pooled treatment effect. Despite this, the clinical trials proceeded. Furthermore, almost from the beginning of LLLT investigations, animal experiments and clinical studies occurred simultaneously, rather than sequentially. The human trials also failed to demonstrate signif-

Roberts and colleagues (79), and Mapstone and colleagues (80), all systematically reviewed a group of 44 randomised, controlled animal studies on the efficacy of fluid resuscitation in bleeding animals. A previous systematic review by some of these investigators of clinical trials of fluid resuscitation had found no evidence that the practice improved outcomes, and had even identified the possibility that it might be harmful (86). In this later review (79–80), they found that fluid resuscitation reduced mortality in animal models of severe haemorrhage, but increased mortality in those with less severe

After clinical trials in humans failed to provide evidence of benefit, Lee and colleagues (81) conducted a systematic review and meta-analysis of controlled trials of endothelin receptor blockade in animal models of heart failure.
Meta-analysis failed examined therapeutic interventions with unambigu- to provide evidence of overall benefit, and indicated ous evidence of a treatment effect (benefit or harm), increased mortality with early administration. in clinical trials related to the following: corticos- In their investigation of the contributions of teroidal treatment for head injury; anti-fibrinolytics human clinical trial results and analogous experi- for the treatment of haemorrhage; thrombolysis, and mental studies to asthma research — one of the also tirilazad, for the treatment of acute ischaemic most common and heavily-investigated of modern stroke; antenatal corticosteroids in the prevention of diseases — Corry and Kheradmand (82) demon- neonatal respiratory distress syndrome; and bisphos- strated that failure to conduct and analyse the phonates in the treatment of osteoporosis. They results of animal studies before proceeding to clini- found that three interventions had similar outcomesin animal models, whilst three did not, suggesting cal trials is not uncommon: Research along two that the animal studies did not reliably predict the fronts, involving experimental models of asthma human outcomes. Perel and colleagues reported that and human clinical trials, proceeds in parallel, the animal studies varied in methodological quality often with investigators unaware of their counter- and sample sizes, that randomisation and blinding were rarely reported, and that publication bias was The clinical utility of animal models is clearly questionable in such cases, in which clinical trialsproceed concurrently with, or prior to, animal stud-ies, or continue, despite equivocal evidence of effi- Clinical utility of other animal experiments
As in the case of stroke, the clinical utility of the Of seven systematic reviews on the utility of animal majority of these animal studies also appears to models in other clinical fields identified by this have been limited by their poor methodological quality. Examples include: 36 cell or animal studies humans with aspirin, but discordant results were on the effects of LLLT on wound healing (78); 44 obtained with calcium and wheat bran (the equiva- studies on the efficacy of fluid resuscitation in lent β-carotene results were not available). Corpet bleeding animals (79–80); and studies on the effi- and Pierre concluded that these results suggest that cacy of endothelin receptor blockade in animal mod- the use of the rodent models can roughly predict els of heart failure (81). Common flaws included treatment effects in humans, but that the predic- inadequate sample sizes, leaving studies underpow- tion is not accurate for all agents, and that the car- ered, and lack of randomisation and blinding.
cinogen-induced rat model is more predictive than In some cases, obvious deficiencies within the the Min mouse model. However, relatively few animal models were identified. In commenting on agents were tested, and two of the three agents the clinical relevance of animal models for testing tested in mice produced different outcomes in the effects of LLLT on wound healing, Lucas and humans, so the conclusion that rodents are predic- colleagues (78) noted that the animal models tive of human treatment effects, albeit only excluded common problems associated with wound healing in humans, such as ischaemia, infection,and necrotic debris. Difficulties were also apparent, in translating Toxicological utility: carcinogenicity
animal outcomes to human clinical protocols, in atleast one case. Lazzarini and colleagues (83) Due to the limited availability of data on human reviewed experimental studies on osteomyelitis, to exposure, the identification and regulation of expo- ascertain their impacts on the systemic antibiotic sure to potential human toxins has traditionally treatment of human osteomyelitis. Although they relied heavily on animal studies. However, system- found that most of the animal models reviewed atic reviews have indicated that the utility of ani- were reproducible and dependable, they also found mal studies for these purposes is lacking in the that the human predictivity of these studies was fields of carcinogenicity (at least five reviews: unclear, and was possibly undermined by difficul- 87–91) and teratology (one review: 92). No system- ties in establishing the right dose regimen in the atic review demonstrated a contrary result. The animals. Although they considered that the use of sensitivities of animal models to a range of human antibiotic combinations was associated with better toxicities (i.e. the ability to identify them) high- outcomes in the majority of animal studies, and lighted by one review (93) generally appears to be that these studies did provide indications of appro- accompanied by poor human specificity (i.e. the priate minimum treatment durations, they con- ability to correctly identify human non-toxins), cluded that these studies had limited relevance to resulting in a high incidence of false-positive In two cases, reviewers reported that animal and human outcomes were substantially consistent,although in one case this conclusion was con- tentious. 
While reviewing therapeutic approachesto streptococcal endocarditis, Scheld (84) reported The regulation of human exposure to potentially good overall correlations among results obtained by carcinogenic chemicals constitutes society’s most in vitro susceptibility testing (especially killing important use of animal carcinogenicity data. In kinetics in broth), in animal experiments, and in 2004, to examine the utility of animal carcinogenic- clinical trials on different antimicrobial regimens in ity data in protecting public health, I surveyed the humans with streptococcal endocarditis.
EPA’s Integrated Risk Information System (IRIS) To investigate the efficacy of rodent models of chemicals database. This database contains the carcinogenesis in predicting treatment outcomes in environmental contaminants of greatest concern in humans, Corpet and Pierre (85) conducted a sys- the USA, together with their animal, and, in a small tematic review and meta-analysis of colon cancer minority of cases, human toxicity data, along with chemoprevention studies involving the use of the human toxicity assessments based on this aspirin, β-carotene, calcium, and wheat bran, in pooled data. However, of the 160 IRIS chemicals rats, mice and humans. Controlled intervention lacking even limited human exposure data, but pos- studies on the recurrence of adenomas in human sessing animal data, for which human toxicity volunteers were compared with chemoprevention assessments existed, the EPA considered the ani- studies of carcinogen-induced tumours in rats, and mal carcinogenicity data to be inadequate to sup- of polyps in Min (Apc[+/–]) mice. 6,714 humans, port a classification of probable human carcinogen 3,911 rats and 458 mice were included in the meta- or non-carcinogen in the majority of cases (58.1%, analyses. Corpet and Pierre found that comparable results were achieved in rats and humans with aspirin, calcium, β-carotene, and wheat bran.
Organisation’s International Agency for Research Comparable results were found in Min mice and on Cancer (IARC) indicated that the true utility of Poor human clinical and toxicological utility of animal experiments 649 animal carcinogenicity data for deriving human car- been added to the 1993 number, yielding a total of cinogenicity assessments is actually substantially 885 agents or exposure circumstances listed in the lower than that indicated solely by EPA assess- IARC Monographs (95). The proportion of definite ments. Of 128 chemicals with human or animal or probable human carcinogens had increased only data assessed by both the EPA and the IARC, slightly, from 13.3% in 1993 to 17.1% in 2004. human carcinogenicity classifications were consis-tent between the two agencies only for the 17 chem-icals for which at least limited human data were available. For those 111 chemicals for which theclassification was primarily reliant on animal data, Surveys by other investigators have also demon- the EPA was much more likely than the IARC to strated the poor human predictivity of animal car- assign carcinogenicity classifications indicative of cinogenicity data. After examining the studies on greater human risk (p < 0.0001; 87).
471 substances contained within the US National The IARC is a leading international authority on Toxicology Program (NTP) carcinogenicity data- carcinogenicity assessments, and the significant dif- base as of 1 July 1998, Haseman (89) concluded ferences between its human carcinogenicity classifi- that, although 250 (53.1%) produced carcinogenic cations and those of the EPA, for identical effects in at least one sex–species group, the actual chemicals, indicate that: i) in the absence of signifi- proportion which posed a significant carcinogenic cant human data, the EPA is over-reliant on animal risk to humans was probably far lower, for reasons carcinogenicity data; ii) as a result, the EPA tends such as interspecies differences in mechanisms of to over-predict carcinogenic risk; and iii) the true predictivity for human carcinogenicity of animal Similarly, around half of all chemicals tested on data is even poorer than that indicated by EPA fig- animals and included in the comprehensive ures alone. EPA policy erroneously assuming that Berkeley-based carcinogenic potency database, tumours in animals are indicative of human car- whether natural or synthetic, gave positive results cinogenicity, was implicated as a primary cause of (89). Rall (96) estimated that only around 10% of these errors, which have substantial US public chemicals are truly carcinogenic to humans. Ashby health implications concerning the regulation of and Purchase (97) speculated that all chemicals human exposures to environmental contaminants would eventually display some carcinogenic activ- ity, if tested in sufficient rodent strains. Even com-mon table salt has been classified as a tumourpromoter in rats (98). 
Fung and colleagues (99) estimated that, if all the 75,000 chemicals in use were tested for carcino- The poor human predictivity of animal carcino- genicity via the standard NTP bioassay, signifi- genicity studies was also demonstrated in 1993 by cantly less than 50% would prove carcinogenic in Tomatis and Wilbourn (88), who surveyed the 780 animals, and less than 5–10% would warrant fur- chemical agents or exposure circumstances evalu- ther investigation. They suggested that the higher ated and listed within Volumes 1–55 of the IARC positivity rate recorded is due to chemical selection Monographs series (94). Of these, 502 (64.4%) had based on a priori suspicion of carcinogenicity.
definite or limited evidence of animal carcinogenic- However, examination of the carcinogenicity litera- ity, and 104 (13.3%) were assessed as definite or ture reveals that chemicals are selected for study probable human carcinogens. Virtually all of the for many reasons other than a priori suspicion, latter group would, of course, have been members including production volumes, occupational and of the former; so at least 398 animal carcinogens environmental exposure risks, and investigations of were assessed and considered not to be definite or mechanisms of carcinogenesis (100). Despite this, the positivity rate of the carcinogenicity bioassay in The positive predictivity of a test is the propor- the general literature remains around 50% (101).
tion of positive outcomes that are truly positive for Huff (90) demonstrated a significant variation in the characteristic being tested for, while the false- carcinogenicity test results between two major car- positive rate refers to the proportion that are not.
cinogenicity testing programmes, at the NTP Hence, based on these IARC figures, the positive (Research Triangle Park, NC, USA) and the Rama - predictivity of the animal bioassay for definite or zzini Foundation (RF; Bentivoglio, Italy). Both lab- probable human carcinogens was, at best, only oratories had carried out several hundred chemical 20.7% (104/502), while the false-positive rate was at carcinogenesis bioassays: around 500 at the NTP, and 200 at the RF. Of these, 21 chemicals were eval- More-recent IARC classifications indicate little uated by both laboratories, of which published improvement in the positive predictivity of the ani- results were available for 14. The results were mal bioassay for human carcinogens. By 1 January inconsistent for 3 of these 14 chemicals (21.4%), 2004, a decade later, only 105 additional agents had which had been declared carcinogenic by one labo- ratory but not the other, questioning the reliability Toxicological utility: various
of these assays. Of the remaining 11 chemicals,both laboratories found nine to be carcinogenic, and Under the auspices of the International Life Sciences Institute’s Health and Environmental Possible causes for such different toxicity results Sciences Institute, Olsen and colleagues (93) sought between laboratories include differences in: the test to determine the extent to which various types of species, strain, age or gender; the quantity, dura- human toxicities evident during clinical trials could tion and consistency of dosing; the route and be predicted from standard toxicology studies.
method of administration; diet and laboratory envi- Based on a multi-company database of 131 pharma- ronmental conditions; and the criteria used for the ceutical agents with one or more human toxicities identified during clinical trials, they reported a Ennever and Lave (91) demonstrated that nei- true-positive prediction rate of animal models for ther of the two commonly-used interpretations of human toxicity of 69%, and also that study results rodent carcinogenicity data provide valid conclu- from non-rodent (dog, primate) species have good sions about human carcinogenicity. If a risk avoid- potential to identify human toxicities from many ance interpretation is used, in which any positive result in male or female mice or rats is considered These results concur with those of the other tox- positive, then nine of the 10 known human carcino- icity reviews described. Animal studies are often gens among the hundreds of chemicals tested by the reasonably sensitive for human toxins. However, NTP are positive (102), but so are an implausible their human predictivity and toxicological utility 22% of all chemicals tested (99). If a less risk-sensi- are limited by their poor human specificity, which tive interpretation is used, whereby only chemicals results in high false-positive rates.
positive in both mice and rats are considered posi-tive, then only three of the six known human car-cinogens tested in both species are positive (102).
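The arithmetic behind the predictivity figures quoted in this section can be made explicit. What follows is a minimal illustrative sketch only, and not part of any of the surveys cited. The function names are assumptions introduced here. The counts are those reported above: 104 definite or probable human carcinogens, all assumed to fall among the 502 agents with definite or limited evidence of animal carcinogenicity (88), and nine of 10 and three of six known human carcinogens detected under the two interpretations described by Ennever and Lave (91). The false-positive proportion is computed using the definition given in the text, i.e. the proportion of positive outcomes that are not truly positive.

```python
def positive_predictivity(true_positives: int, all_positives: int) -> float:
    """Proportion of positive test outcomes that are truly positive."""
    if all_positives <= 0:
        raise ValueError("all_positives must be greater than zero")
    if not 0 <= true_positives <= all_positives:
        raise ValueError("true_positives must lie between 0 and all_positives")
    return true_positives / all_positives


def false_positive_proportion(true_positives: int, all_positives: int) -> float:
    """Proportion of positive outcomes that are NOT truly positive.

    This follows the definition used in the text, so it is simply the
    complement of the positive predictivity.
    """
    return 1.0 - positive_predictivity(true_positives, all_positives)


def sensitivity(detected: int, known_positives: int) -> float:
    """Proportion of known positives (e.g. human carcinogens) that test positive."""
    if known_positives <= 0:
        raise ValueError("known_positives must be greater than zero")
    if not 0 <= detected <= known_positives:
        raise ValueError("detected must lie between 0 and known_positives")
    return detected / known_positives


if __name__ == "__main__":
    # IARC Monographs, Volumes 1-55: best case, assuming all 104 human
    # carcinogens are among the 502 animal carcinogens.
    print(f"positive predictivity:     {positive_predictivity(104, 502):.1%}")
    print(f"false-positive proportion: {false_positive_proportion(104, 502):.1%}")
    # Ennever and Lave: any-positive versus both-species interpretations.
    print(f"sensitivity (any positive result): {sensitivity(9, 10):.1%}")
    print(f"sensitivity (mice and rats both):  {sensitivity(3, 6):.1%}")
```

Run as a script, this prints 20.7% and 79.3% for the IARC figures, and 90.0% and 50.0% for the two interpretations. The contrast illustrates the point made above: an interpretation that detects most known human carcinogens can still leave roughly four in five positive results unconfirmed as human carcinogens.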
Of the two interpretations described by Ennever and Lave (91), the former, risk avoidance interpretation could result in the needless denial of potentially useful chemicals to society, while the latter, less risk-sensitive interpretation could result in widespread exposure […]

Toxicological utility: teratogenicity

In 2005, my colleagues and I published an extensive survey examining the human predictivity of animal teratogenicity testing (92). We examined nearly every putative teratogen tested in more than one species, including 1,396 studies. Data for 11 groups of known human teratogens tested in 12 animal species were analysed. Discordance between species was apparent in just under 30% of these 1,396 reports. Almost a quarter of all the outcomes in the six main species used (mouse, rat, rabbit, hamster, primate and dog) were equivocal. For known human teratogens, there was high variability in positive predictivity between species, the mean of which was only 51%, hardly better than tossing a coin. Some species exhibited a high false-negative rate. Only around half of these known human teratogens were teratogenic in more than one primate species. Fewer than one in 40 of the substances designated as potential teratogens from animal studies were conclusively linked to human birth defects.

We concluded that the poor human predictivity of animal-based teratology warrants the cessation of animal testing, and that resources should be reallocated into the further development and implementation of quicker, cheaper and more reliable, scientifically validated alternatives, such as the […]

Causes of the poor human utility of animal […]

When evaluated overall, these 27 systematic reviews clearly do not support the widely-held assumptions of animal ethics committees and the opinions of advocates of animal experimentation, that laboratory animal use is generally beneficial in the development of human therapeutic interventions and the assessment of human toxicity. On the contrary, they frequently demonstrate that animal experiments are of low utility for these purposes. This appears to result both from limitations of the animal models themselves, and also from the poor methodological quality and statistical design of many animal experiments.

Chimpanzees are our closest living relatives, but despite great similarities between the structural regions of chimpanzee DNA and human DNA, important differences between the regulatory regions exert an “avalanche” effect on large numbers of structural genes (103). Despite a nucleotide difference between chimpanzees and humans of only 1–2%, this effect results in differences of around 20%, in terms of protein expression (104), representing marked phenotypic differences between the species. These differences manifest as: altered susceptibility to the aetiology and progression of various diseases; differences in the absorption, tissue distribution, metabolism, and excretion of chemotherapeutic agents; and differences in the toxicity and efficacy of pharmaceuticals and other agents (59, 103). Such effects appear to be responsible for the demonstrated inability of most chimpanzee research to contribute substantially to the development of methods which are efficacious in combating human diseases (59).

Other laboratory animal species are much less similar to humans, both genetically and phenotypically, and are therefore less likely to be useful for accurately modelling the progression of human diseases or of human responses to chemicals and putative […]

Rodents are by far the most common laboratory animal species used in toxicity studies. Several factors contribute to the demonstrated inability of rodent bioassays to reliably predict human toxicity. The stresses incurred during handling, restraint, other routine laboratory procedures, and particularly, the stressful routes of dose administration common to toxicity tests, alter immune status and disease predisposition in ways which are very difficult to accurately predict, and which distort the progression of diseases and responses to chemicals and putative chemotherapeutic agents (105, 106).

In addition, animals have a broad range of physiological defences against general toxic insults, such as epithelial shedding and inducible enzymes, which commonly prove effective at environmentally relevant doses, but which may be overwhelmed at the high doses commonly applied in routine toxicity testing (101). Carcinogenicity assays, in particular, involve chronic, high level dosing. This may result, inter alia, in insufficient rest intervals between doses for the effective operation of DNA and tissue repair mechanisms, which, with the unnatural elevation of cell division rates during ad libitum feeding, may predispose the animals to mutagenesis and carcinogenesis. Lower doses, greater intervals between exposures, shorter total periods of exposure, and intermittent feeding, which represent a more realistic approach to the environmental exposure of humans to most potential toxins, might not result in toxic changes at all (106).

Finally, differences in rates of absorption and transport mechanisms between test routes of administration and other important human routes of exposure, and the considerable variability of organ systems in response to toxic insults, between and within species, strains and genders, render profoundly difficult any attempt to accurately predict human hazard on the basis of animal toxicity data […]

At least 11 systematic reviews (57, 64, 68, 72–76, 78–81 [of which, 79 and 80 described a single review]) demonstrated the poor methodological quality of many of the animal studies examined, and none of the reviews demonstrated good methodological quality in a majority of studies. While the omission of study details due to publication space constraints may artificially lower apparent quality, the prevalence of such deficiencies exceeds that which might reasonably be expected, and is, accordingly, grounds for considerable concern […]

Common deficiencies included lack of: sample size calculations, sufficient sample sizes, appropriate animal models (e.g. aged animals or those with appropriate comorbidities), randomised treatment allocation, blinded drug administration, blinded induction of ischaemia in the case of stroke models, blinded outcome assessment, and conflict of interest statements. Some studies also used anaesthetics that may have altered the experimental outcomes, and substantial variation was evident in the parameters […]

These deficiencies limited the clinical utility of these studies in various significant ways. For example, it is well established that studies lacking randomisation or blinding often over-estimate the magnitude of the effects of treatments (107–109). Bebarta and colleagues (110) described the impacts of lack of randomisation or blinding on estimations of the significance of treatment effects in 389 animal studies and in 2,203 cell line studies. They found that studies lacking randomisation or blinding, but not both, were more likely to report a treatment response than studies that used these measures (OR = 3.4; 95% CI = 1.7 to 6.9, and OR = 3.2; 95% CI = 1.3 to 7.7, respectively), and that studies lacking both randomisation and blinding were even more likely to report a treatment response (OR = 5.2; 95% CI = 2.0 to 13.5).

Insufficient sample sizes left many studies underpowered, limiting the statistical validity of the study conclusions. Animal lives and other resources may also be wasted, if experiments subsequently require repetition as a result. As stated by the UK Medical Research Council (111), “The number of animals used… must be the minimum sufficient to create adequate statistical power to answer the question […]”

According to Balls and colleagues (112), however, “…surveys of published papers, as well as more anecdotal information, suggest that more than half of the published papers in biomedical research have statistical mistakes, many seem to use excessive numbers of animals, and a proportion are poorly designed.” Festing (113) similarly stated that, “Surveys of published papers show that there are many errors, both in the design of the experiments and in the statistical analysis of the resulting data. This must result in a waste of animals and scientific resources, and it is surely unethical.” De Boo and Hendriksen (114) noted the tendency to alter animal numbers based on scientifically irrelevant issues, such as availability […]

Factors that should be considered when calculating appropriate sample sizes include: detectability threshold (the size of the difference between treatment groups considered significant); known or expected data variation; the required significance of the test (‘p’ or ‘α’: the probability of a Type I error, i.e. assuming a difference where none exists); the acceptable probability of assuming no difference where one does exist (‘β’, a Type II error; the ‘power’ of an experiment = 1–β, and 0.8 is the usual choice); and the type of statistical analysis to which the data will be subjected. Smaller thresholds, greater data variation, smaller acceptable error probabilities (greater power), and certain statistical tests for differences, all require larger samples.

No universal rule for calculating correct sample sizes exists (114). Festing (115), for example, describes two methods, the preferred ‘power calculation,’ and the ‘resource equation.’ Power calculations use formulae which are available in interactive computer programmes (e.g. 116, 117), and calculate the minimum sample sizes required to detect treatment effects with specified degrees of certainty. Mead’s ‘resource equation’ (118) calculates sample sizes by using degrees of freedom, and incorporates statistical parameters, such as treatment effects, block effects and error degrees of freedom […]

Strategies should also be considered for minimising animal numbers without unacceptably compromising statistical power. Several of these strategies aim to decrease data variability by minimising heterogeneity in experimental environments and protocols. This can be achieved by: i) the appropriate use of environmental enrichment, aimed at decreasing physiological variation resulting from barren laboratory housing and stressful procedures; ii) choosing, where possible, to measure variables with relatively low inherent variability; iii) the use of genetically homogeneous (isogenic or inbred) or specified pathogen-free animal strains; and iv) screening raw data for obvious errors or outliers […]

Meta-analysis involves the aggregation and statistical analysis of suitable data from multiple experiments. For some purposes, treatment and control groups can be combined, permitting group numbers to be minimised. Although new information can be derived through meta-analysis, more frequently, the results allow the refinement of existing knowledge. By designing experiments and reporting protocols to maximise their utility for later meta-analyses, the benefit of individual randomised controlled experiments can be maximised (123). Strategies such as these, aimed at maximising the statistical power of small samples, are particularly appropriate when marked ethical, cost or practical constraints limit the number of animals that may be used (e.g. in experiments involving […]

Finally, the appropriate statistical analysis of the resultant data should be closely linked to the experimental design, and to the type of data produced (124). The relatively poor statistical knowledge of many animal researchers may be the cause of the high prevalence of poor sample size choices in animal studies. Solutions could include the training of researchers in statistics, and the direct input of statisticians in experimental design and data analysis […]

Raising standards: evidence-based medicine

Evidence-based medicine (EBM) bases clinical decisions on methodologically-sound, prospective, randomised, blinded, and controlled clinical trials. The gold standard for EBM is large prospective epidemiological studies, or meta-analyses of randomised and blinded, controlled clinical trials (126). The application to animal experiments of the EBM standards which are currently applied to human clinical trials would make the results more robust and would increase their applicability (76, 127–130). However, mechanisms would be needed to ensure compliance with such standards. Compliance could, for example, be made a prerequisite for research funding, ethics committee approval, and the publication of results. These measures would require the education and co-operation of funding agencies, ethics committees and […]

[…] researchers who are planning clinical trials, to reference systematic reviews of related previous work before they are permitted to proceed (51). To facilitate the detection of toxicity and of potentially efficacious drugs, such reviews should also include all relevant animal research (76). A similar requirement to reference, or where necessary, conduct, systematic reviews of relevant animal studies, prior to the commencement of further animal studies, would encourage a more complete and impartial assessment of the existing evidence (51).

Mechanisms are also needed to encourage the reporting of negative results. The negative results of preclinical studies are much more likely to remain unpublished than are the negative results of clinical trials (131). In a systematic review of studies on the efficacy of nicotinamide in combating experimentally-induced stroke, comparisons published only in abstract form gave a significantly lower estimate of effect size than those published in full, demonstrating publication bias (132). van der Worp and colleagues (73) commented on the pressure to obtain and publish positive results: “It is therefore conceivable that the career of a preclinical investigator is more dependent on obtaining positive results, than that of a clinical trialist.”

process should be utilised to improve the efficiency of the formal validation process, by ensuring satisfactory protocol refinement and transferability, and test performance (138).
Fundamental constraints on the human utility of animal models

Strategies designed to increase the full and impartial examination of existing data before conducting animal studies, to improve their methodological quality, and to decrease bias during the publication of results, would minimise the consumption of animal, financial and other resources within studies of questionable merit and quality, and would increase the potential utility of animal data in addressing human situations and problems. However, the poor human clinical or toxicological utility of many animal experiments is unlikely to result solely from their poor methodological quality, or from publication bias. As stated by Perel et al. (76), the failure of animal models to adequately represent human disease may be another fundamental cause, which, in contrast, could be technically and theoretically […]

The genetic modification of animal models through the addition of foreign genes (transgenic animals) or the inactivation or deletion of genes (knockout animals) is being attempted, to make them more-closely model humans. However, as well as being technically very difficult to achieve, such modification may not permit clear conclusions, due to a large number of factors, including those reflecting the intrinsic complexity of living organisms, such as the variable redundancy of some metabolic pathways between species (133). Furthermore, the

However, it is not always scientifically necessary, or even logistically possible, to conduct multi-centre practical studies. Hence weight-of-evidence validation, also known as validation by retrospective analysis (139, 140), may be conducted, based on the assessment of existing data in a structured, systematic and transparent manner, provided that data of sufficient quantity and quality are available (141).

Regardless of the approach taken, the criteria required for formal validation are comprehensive (136, 141). Key objectives include: establishing the role and necessity of the test model; ensuring clarity of the defined goals; defining a prediction model, i.e. an algorithm for converting the test data into meaningful predictions of in vivo toxicity; examining the mechanistic relevance and credibility of the model with respect to those goals; and providing a description of the limitations of the model.

Where practical validation studies do occur, these should adhere to best practice standards, designed to ensure good methodological quality, including, for example, statistical justifications of sample sizes, randomised allocation to test groups, and blinded treatment and assessment of results. Where possible, inter-laboratory reproducibility should be demonstrated […]

Whether validation studies are conducted by practical or weight-of-evidence approaches, experience has shown that transparency and independence from commercial, political or other interests should be maximised through the use of independent experts and the peer-reviewed publication of outcomes (136).
animal welfare burdens incurred during the cre- Scientific validation should lead to the reasoned ation and use of GM animals are particularly high overall assessment that sufficient evidence exists to demonstrate that a model is, or is not, relevant andreliable for the specified purpose, or that insuffi-cient evidence exists to be reasonably certain either Implications for scientific validation of
way. In some cases, an interim assessment can be experimental models
made, until further evidence becomes available(141).
Proposed non-animal test models are generally The European Centre for the Validation of required to pass formal scientific validation before Alternative Methods (ECVAM) was created by the their use is widely or officially accepted.
EC in 1991, to fulfil the requirements of Directive Pharmaceutical licensing agencies, for example, are 86/609/EEC on the protection of animals used for generally unwilling to accept non-animal test data experimental and other scientific purposes. These as evidence of the human safety of proposed new requirements state that the EC and its Member pharmaceuticals, until the test models used have States should actively support the development, validation and acceptance of methods which could Scientific validation has traditionally involved replace, refine or reduce the use of laboratory ani- the demonstration, in multiple independent labora- mals (142). The US equivalent is the Interagency tories, that the test in question is relevant and reli- Coordinating Committee on the Validation of able for its specified purpose (practical validation; Alternative Methods (ICCVAM), which has similar 135), such as the prediction of a certain in vivo out- goals. Despite the high standards required for suc- come. It should also be preceded by an evaluation of cessful validation, between 1998 and 2007, 21 dis- the necessity for the test and of the adequacy of its tinct tests or categories of test methods that could development (136, 137). 
A three-stage prevalidation replace, reduce or refine laboratory animal use, had been validated and registered with ECVAM, animal data can be generally assumed not to be sub- and nine had achieved regulatory acceptance Likely causes of this inadequacy include inherent However, unlike non-animal models, animal mod- genotypic and phenotypic differences between els are generally assumed to be reasonably predictive human and non-human species, the distortion of of human outcomes in preclinical drug development, experimental outcomes arising from experimental toxicity testing, and other fields of biomedical environments and protocols, and the poor method- research, without the need to undergo formal valida- ological quality of many animal experiments, as was tion studies. Yet the 27 systematic reviews examined apparent in at least 11 reviews. There were no in this study, demonstrate that it is insufficient to reviews in which a majority of animal experiments assume that animal models are reliably predictive of were of good methodological quality. Some of these human outcomes, even those in use for long periods, problems might be minimised with concerted effort without subjecting them to critical assessment.
(given their widespread prevalence), but the limita- Clearly, formal validation should be consistently tions resulting from interspecies differences are applied to all proposed experimental models, likely to be technically and theoretically impossible regardless of their animal, non-animal, historical, contemporary or possible future status, and models Despite the fact that they have not passed and, should be chosen on the basis of critical scientific indeed, could not pass, the formal scientific validation review, with appropriate consideration also given to process required of non-animal models prior to regu- animal welfare, ethical, legal, economic, and any latory acceptance, most animal models are incorrectly assumed to be predictive of human outcomes. The consistent application of formal validation studies to Chemicals Bureau, the EC agencies responsible for all test models is clearly warranted, regardless of technical aspects of validation and for EU chemicals their animal, non-animal, historical, contemporary or regulations, respectively, at that time, made a simi- lar call in 1995, in which they urged that prevalida- should be based on such critical scientific review, tion and independent assessment be applied with with appropriate cons ideration also given to animal equal force to all new or modified animal and non- welfare, ethical, legal, economic and other relevant Likely benefits would include greater selection of models truly predictive for human outcomes, increased safety of people exposed to chemicals thathave passed toxicity tests, increased efficiency during The historical and contemporary paradigm, that ani- the development of human pharmaceuticals and mal models are generally reasonably predictive of other therapeutic interventions, and decreased human outcomes, provides the basis for their wide- wastage of animal, personnel and financial resources. 
spread use in toxicity testing and biomedical In addition, the poor human clinical and toxicolog- research aimed at preventing or developing cures for ical utility of most animal models for which data human diseases. However, their use persists for his- exists, in conjunction with their generally substantial torical and cultural reasons, rather than because animal welfare and economic costs, justify a ban on they have been demonstrated to be scientifically the use of animal models lacking scientific data valid. For example, many regulatory officials “feel clearly establishing their human predictivity or util- more comfortable” with animal data (145), and some even believe that animal tests are inherently valid,simply because they are conducted in animals (146).
However, most existing systematic reviews have Received 02.03.07; received in final form 10.07.07;accepted for publication 11.07.07. demonstrated that animal experiments are insuffi-ciently predictive of human outcomes to providesubstantial benefits during the development of human clinical interventions, or in deriving humantoxicity assessments. In only two of 20 reviews in Anon. (2007). Annex to the Fifth Report on the Stat - which clinical utility was examined, did the authors istics on the Number of Animals Used for Experi conclude that the animal models were either signif- mental and other Scientific Purposes in the Member icantly useful in contributing to the development of States of the European Union (COM(2007)675 final), clinical interventions, or were substantially consis- 277pp. Brussels, Belgium: European Commission.
tent with clinical outcomes (84, 85), and one of Goldberg, A.M. (2002). Use of animals in research:a science–society controversy? The American per- these conclusions was contentious. Seven additional spective: animal welfare issues. ALTEX 19,
reviews also failed to clearly demonstrate utility in predicting human toxicological outcomes, such as Stephens, M.L., Alvino, G.M. & Branson, J.B. (2002).
carcinogenicity and teratogenicity. Consequently, Animal pain and distress in vaccine testing in the Poor human clinical and toxicological utility of animal experiments 655 United States. Developments in Biologicals 111,
4. Anon. (2007). FY 2006 AWA Inspections, 11pp. Riverdale, MD, USA: United States Department of Agriculture Animal and Plant Health Inspection Service (USDA APHIS). Available at: http://www.aphis.usda.gov/animal_welfare/downloads/awreports/awreport2006.pdf (Accessed 12.12.07).
5. Carbone, L. (2004). What Animals Want: Expertise and Advocacy in Laboratory Animal Welfare Policy, 291pp. Oxford, UK: Oxford University Press.
6. Office of Technology Assessment, US Congress (1986). Alternatives to Animal Use in Research, Testing and Education, OTA-BA-273, 437pp. Washington, DC, USA: US Government Printing Office.
7. Home Office (2007). Statistics of Scientific Procedures on Living Animals: Great Britain 2006, 49pp.
8. O’Shea, D. (2000). Johns Hopkins enters suit over lab animal regulations. Press Release, 22 September, 2000. Baltimore, MD, USA: Johns Hopkins
9. Fishbein, E.A. (2001). What price mice? Journal of the American Medical Association 235, 939–941.
10. Sauer, U.G., Kolar, R. & Rusche, B. (2005). The use of transgenic animals in biomedical research in Germany. Part 1: Status Report 2001–2003. [Die Verwendung transgener Tiere in der biomedischen Forschung in Deutschland. Teil 1: Sachstandsbericht 2001–2003.] ALTEX 22, 233–246.
11. Anon. (2007). Swiss animal use statistics for 2005. Pain & Distress Report 7, 2. Available at: http://www. (Accessed 12.12.07).
12. Rusche, B. (2003). The 3Rs and animal welfare – conflict or the way forward? ALTEX 20 Suppl. 1,
13. Combes, R.D., Balls, M., Bansil, L., Barratt, M., Bell, D., Botham, P., Broadhead, C., Clothier, R., George, E., Fentem, J., Jackson, M., Indans, I., Loizou, G., Navaratnam, V., Pentreath, V., Phillips, B., Stemplewski, H. & Stewart, J. (2004). The Third FRAME Toxicity Committee: Working toward greater implementation of alternatives in toxicity testing. ATLA 32 Suppl. 1B, 635–642.
14. Green, S. & Goldberg, A.M. (2004). TestSmart and toxic ignorance. ATLA 32 Suppl. 1A, 359–363.
15. Fenner-Crisp, P.A., Maciorowski, A.F. & Timm, G.E. (2000). The endocrine disruptor screening program developed by the US Environmental Protection Agency. Ecotoxicology 9, 85–91.
16. Green, S., Goldberg, A.M. & Zurlo, J. (2001). The TestSmart-HPV program – Development of an integrated approach for testing high production volume chemicals. Regulatory Toxicology & Pharmacology 33, 105–109.
17. Armstrong, T.W., Zaleski, R.T., Konkel, W.J. & Parkerton, T.J. (2002). A tiered approach to assessing children’s exposure: a review of methods and data. Toxicology Letters 127, 111–119.
18. Charles, G.D. (2004). In vitro models in endocrine disruptor screening. ILAR Journal 45, 494–501.
19. Stokes, W.S. (2004). Selecting appropriate animal models and experimental designs for endocrine disruptor research and testing studies. ILAR Journal 45, 387–393.
20. Louekari, K., Sihvonen, K., Kuittinen, M. & Sømnes, V. (2006). In vitro tests within the REACH information strategies. ATLA 34, 377–386.
21. Sandusky, C., Even, M., Stoick, K. & Sandler, J. (2006). Strategies to reduce animal testing in US EPA’s HPV program. ALTEX 23 Special Issue,
22. Brom, F.W. (2002). Science and society: different bioethical approaches towards animal experimentation. ALTEX 19, 78–82.
23. Festing, M.F.W. (2004). Is the use of animals in biomedical research still necessary in 2002? Unfortunately, “Yes”. ATLA 32 Suppl. 1B, 733–739.
24. Pawlik, W.W. (1998). The significance of animals in biomedical research. [Znaczenie zwierzat w badaniach biomedycznych.] Folia Medica Cracoviensia 39,
25. Kjellmer, I. (2002). Animal experiments are necessary. Coordinated control functions are difficult to study without the use of nature’s most complex systems: mammals and human beings. [Djurförsök är nödvändiga. Samordnade kontrollfunktioner låter sig svårligen studeras utan tillgång till naturens mest komplexa system: däggdjur och människa.] Lakartidningen 99, 1172–1173.
26. Osswald, W. (1992). Ethics of animal research and application to humans. [Etica da investigação no animal e aplicação ao homem.] Acta Medica Portuguesa 5, 222–225.
27. Greek, C.R. & Greek, J.S. (2002). 4th World Congress Point/Counterpoint: Is Animal Research Necessary in 2002?, 54pp. Los Angeles, CA, US: Americans for
28. Singer, P. (1990). Animal Liberation: A New Ethics for our Treatment of Animals, 2nd edn, 320pp. New York, NY, USA: New York Review/Random House.
29. La Follette, H. & Shanks, N. (1994). Animal experimentation: the legacy of Claude Bernard. International Studies in the Philosophy of Science 8,
30. Greek, C.R. & Greek, J.S. (2000). Sacred Cows and Golden Geese, 242pp. New York, NY, USA: Continuum.
31. Greek, C.R. & Greek, J.S. (2002). Specious Science, 288pp. New York, NY, USA: Continuum.
32. Anon. (2006). Statement re: TGN1412. Available at:
33. Anon. (2006). Frequently asked questions regarding TGN1412. Available at: news/faqs_re_tgn1412/index.php (Accessed 18.04.06).
34. Bhogal, N. & Combes, R. (2006). TGN1412: time to change the paradigm for the testing of new pharmaceuticals. ATLA 34, 225–229.
35. Coghlan, A. (2006). Mystery over drug trial debacle deepens. news service, 14 August, 2006. Available at: article.ns?id=dn9734 (Accessed 12.12.07).
36. Graham, D.J., Campen, D., Hui, R., Spence, M., Cheetham, C., Levy, G., Shoor, S. & Ray, W.A. (2005). Risk of acute myocardial infarction and sudden cardiac death in patients treated with cyclo-oxygenase 2 selective and non-selective non-steroidal anti-inflammatory drugs: nested case-control study. Lancet 365, 475–481.
37. Dahl, S.L. & Ward, J.R. (1982). Pharmacology, clinical efficacy, and adverse effects of the nonsteroidal anti-inflammatory agent benoxaprofen. Pharmacotherapy 2, 354–366.
38. Gad, S.C. (1990). Model selection in toxicology: principles and practice. Journal of the American College of Toxicology 9, 291–302.
39. Ross-Degnan, D., Soumerai, S.B., Fortess, E.E. & Gurwitz, J.H. (1993). Examining product risk in context. Market withdrawal of zomepirac as a case study. Journal of the American Medical Association 270, 1937–1942.
40. Peters, T.S. (2005). Do preclinical testing strategies help predict human hepatotoxic potentials? Toxicologic Pathology 33, 146–154.
41. Venning, G.R. (1983). Identification of adverse reactions to new drugs. I: What have been the important adverse reactions since thalidomide? British Medical Journal 286, 199–202.
42. Wallenstein, L. & Snyder, J. (1952). Neurotoxic reaction to chloromycetin. Annals of Internal Medicine 36, 1526–1528.
43. Blum, M.D., Graham, D.J. & McCloskey, C.A. (1994). Temafloxacin syndrome: review of 95 cases. Clinical Infectious Diseases 18, 946–950.
44. Mulder, P., Richard, V. & Thuillez, C. (1998). Different effects of calcium antagonists in a rat model of heart failure. Cardiology 89 Suppl. 1, 33–37.
45. Food and Drug Administration, US Department of Health and Human Services (2004). Innovation or Stagnation: Challenge and Opportunity on the Critical Path to New Medical Products, 31pp. Available at:
46. Lazarou, J. & Pomeranz, B. (1998). Incidence of adverse drug reactions in hospitalized patients: a meta-analysis of prospective studies. Journal of the American Medical Association 279, 1200–1205.
47. Koppanyi, T. & Avery, M.A. (1966). Species differences and the clinical trial of new drugs: a review. Clinical Pharmacology & Therapeutics 7, 250–270.
48. Villar, D., Buck, W.B. & Gonzalez, J.M. (1998). Ibuprofen, aspirin and acetaminophen toxicosis and treatment in dogs and cats. Veterinary & Human Toxicology 40, 156–162.
49. Wilson, J.G., Ritter, E.J., Scott, W.J. & Fradkin, R. (1977). Comparative distribution and embryotoxicity of acetylsalicylic acid in pregnant rats and rhesus monkeys. Toxicology & Applied Pharmacology 41, 67–78.
50. National Institutes of Health (2006). Information on Clinical Trials and Human Research Studies. Available at:; jsessionid=B9D601AD55432DBDD59314931CA8385
51. Pound, P., Ebrahim, S., Sandercock, P., Bracken, M. & Roberts, I. (2004). Where is the evidence that animal research benefits humans? British Medical Journal 328, 514–517.
52. Nuffield Council on Bioethics (2005). The Ethics of Research Involving Animals, 376pp. London, UK:
53. Anon. (2006). Scopus in detail: what does it cover? Available at:
54. National Center for Biotechnology Information (2006). PubMed overview. Available at: http://www.
55. Lindl, T., Völkel, M. & Kolar, R. (2005). [Animal experiments in biomedical research. An evaluation of the clinical relevance of approved animal experimental projects.] [German.] ALTEX 22, 143–151.
56. Lindl, T., Völkel, M. & Kolar, R. (2006). Animal experiments in biomedical research. An evaluation of the clinical relevance of approved animal experimental projects: No evident implementation in human medicine within more than 10 years. [Lecture abstract.] ALTEX 23, 111.
57. Hackam, D.G. & Redelmeier, D.A. (2006). Translation of research evidence from animals to humans. Journal of the American Medical Association 296,
58. Hackam, D.G. (2007). Translating animal research into clinical benefit: poor methodological standards in animal studies mean that positive results may not translate to the clinical domain. British Medical Journal 334, 163–164.
59. Knight, A. (2007). The poor contribution of chimpanzee experiments to biomedical progress. Journal of Applied Animal Welfare Science 10, 281–308.
60. Conlee, K.M., Hoffeld, E.H. & Stephens, M.L. (2004). A demographic analysis of primate research in the United States. ATLA 32 Suppl. 1A, 315–322.
61. Morris, E. (Undated). Sampling from Small Populations. Available at: Sociology/Sampling%20from%20small%20populations.htm (Accessed 12.12.07).
62. Guenther, W.C. (1973). A sample size formula for the hypergeometric. Journal of Quality Technology 5, 167–170.
63. Green, J. (1982). Asymptotic sample size for given confidence interval length. Applied Statistics 31,
64. Macleod, M.R., O’Collins, T., Horky, L.L., Howells, D.W. & Donnan, G.A. (2005). Systematic review and meta-analysis of the efficacy of melatonin in experimental stroke. Journal of Pineal Research 38,
65. The National Institute of Neurological Disorders and Stroke rt-PA Stroke Study Group (1995). Tissue plasminogen activator for acute ischemic stroke. New England Journal of Medicine 333, 1581–1588.
66. Chinese Acute Stroke Trial (CAST) Collaborative Group (1997). Randomised placebo-controlled trial of early aspirin use in 20,000 patients with acute ischaemic stroke. Lancet 349, 1641–1649.
67. International Stroke Trial Collaborative Group (1997). The International Stroke Trial (IST): a randomised trial of aspirin, subcutaneous heparin, or both, or neither, among 19,435 patients with acute ischaemic stroke. Lancet 349, 1569–1581.
68. Horn, J., de Haan, R.J., Vermeulen, M., Luiten, P.G.M. & Limburg, M. (2001). Nimodipine in animal model experiments of focal cerebral ischemia: a systematic review. Stroke 32, 2433–2438.
69. O’Collins, V.E., Macleod, M.R., Donnan, G.A., Horky, L.L., van der Worp, B.H. & Howells, D.W. (2006). 1026 experimental treatments in acute stroke. Annals of Neurology 59, 467–477.
70. Jonas, S., Aiyagari, V., Vieira, D. & Figueroa, M. (2001). The failure of neuronal protective agents versus the success of thrombolysis in the treatment of ischemic stroke: the predictive value of animal models. Annals of the New York Academy of Sciences 939,
71. Curry, S.H. (2003). Why have so many drugs with stellar results in laboratory stroke models failed in clinical trials? A theory based on allometric relationships. Annals of the New York Academy of Sciences 993, 69–74.
72. Macleod, M.R., O’Collins, T., Horky, L.L., Howells, D.W. & Donnan, G.A. (2005). Systematic review and meta-analysis of the efficacy of FK506 in experimental stroke. Journal of Cerebral Blood Flow & Metabolism 25, 1–9.
73. van der Worp, H.B., de Haan, P., Morrema, E. & Kalkman, C.J. (2005). Methodological quality of animal studies on neuroprotection in focal cerebral ischaemia. Journal of Neurology 252, 1108–1114.
74. Willmot, M., Gray, L., Gibson, C., Murphy, S. & Bath, P.M. (2005). A systematic review of nitric oxide donors and L-arginine in experimental stroke; effects on infarct size and cerebral blood flow. Nitric Oxide 12, 141–149.
75. Willmot, M., Gibson, C., Gray, L., Murphy, S. & Bath, P. (2005). Nitric oxide synthase inhibitors in experimental ischemic stroke and their effects on infarct size and cerebral blood flow: a systematic review. Free Radical Biology & Medicine 39,
76. Perel, P., Roberts, I., Sena, E., Wheble, P., Briscoe, C., Sandercock, P., Macleod, M., Mignini, L.E., Jayaram, P. & Khan, K.S. (2007). Comparison of treatment effects between animal experiments and clinical trials: systematic review. British Medical Journal 334, 197–200.
77. Stroke Therapy Academic Industry Roundtable (1999). Recommendations for standards regarding preclinical neuroprotective and restorative drug development. Stroke 30, 2752–2758.
78. Lucas, C., Criens-Poublon, L.J., Cockrell, C.T. & De Haan, R.J. (2002). Wound healing in cell studies and animal model experiments by Low Level Laser Therapy; were clinical studies justified? A systematic review. Lasers in Medical Science 17, 110–134.
79. Roberts, I., Kwan, I., Evans, P. & Haig, S. (2002). Does animal experimentation inform human healthcare? Observations from a systematic review of international animal experiments on fluid resuscitation. British Medical Journal 324, 474–476.
80. Mapstone, J., Roberts, I. & Evans, P. (2003). Fluid resuscitation strategies: a systematic review of animal trials. Journal of Trauma 55, 571–589.
81. Lee, D.S., Nguyen, Q.T., Lapointe, N., Austin, P.C., Ohlsson, A., Tu, J.V., Stewart, D.J. & Rouleau, J.L. (2003). Meta-analysis of the effects of endothelin receptor blockade on survival in experimental heart failure. Journal of Cardiac Failure 9, 368–374.
82. Corry, D.B. & Kheradmand, F. (2005). The future of asthma therapy: integrating clinical and experimental studies. Immunologic Research 33, 35–51.
83. Lazzarini, L., Overgaard, K.A., Conti, E. & Shirtliff, M.E. (2006). Experimental osteomyelitis: What have we learned from animal studies about the systemic treatment of osteomyelitis? Journal of Chemotherapy 18, 451–460.
84. Scheld, W.M. (1987). Therapy of streptococcal endocarditis: correlation of animal model and clinical studies. Journal of Antimicrobial Chemotherapy 20
85. Corpet, D.E. & Pierre, F. (2005). How good are rodent models of carcinogenesis in predicting efficacy in humans? A systematic review and meta-analysis of colon chemoprevention in rats, mice and men. European Journal of Cancer 41, 1911–1922.
86. Roberts, I., Evans, A., Bunn, F., Kwan, I. & Crowhurst, E. (2001). Normalising the blood pressure in bleeding trauma patients may be harmful. Lancet 357, 385–387.
87. Knight, A., Bailey, J. & Balcombe, J. (2006). Animal carcinogenicity studies: 1. Poor human predictivity. ATLA 34, 19–27.
88. Tomatis, L. & Wilbourn, J. (1993). Evaluation of carcinogenic risk to humans: the experience of IARC. In New Frontiers in Cancer Causation (ed. O. Iversen), pp. 371–387. Washington, DC, USA: Taylor and
89. Haseman, K. (2000). Using the NTP database to assess the value of rodent carcinogenicity studies for determining human cancer risk. Drug Metabolism Reviews 32, 169–186.
90. Huff, J. (2002). Chemicals studied and evaluated in long-term carcinogenesis bioassays by both the Ramazzini Foundation and the National Toxicology Program. Annals of the New York Academy of Sciences 982, 208–230.
91. Ennever, F.K. & Lave, L.B. (2003). Implications of the lack of accuracy of the lifetime rodent bioassay for predicting human carcinogenicity. Regulatory Toxicology & Pharmacology 38, 52–57.
92. Bailey, J., Knight, A. & Balcombe, J. (2005). The future of teratology research is in vitro. Biogenic Amines 19, 97–145.
93. Olson, H., Betton, G., Stritar, J. & Robinson, D. (1998). The predictivity of the toxicity of pharmaceuticals in humans from animal data – an interim assessment. Toxicology Letters 102–103, 535–538.
94. International Agency for Research on Cancer (IARC) (1972–1992). IARC Monographs on the Evaluation of Carcinogenic Risks to Humans, Volumes 1–55. Lyon,
95. International Agency for Research on Cancer (IARC) (undated). IARC Monographs Programme on the Evaluation of Carcinogenic Risks to Humans. Available at: (Accessed
96. Rall, D.P. (2000). Laboratory animal tests and human cancer. Drug Metabolism Reviews 2, 119–128.
97. Ashby, J. & Purchase, I.F.H. (1993). Will all chemicals be carcinogenic to rodents when adequately evaluated? Carcinogenesis 8, 489–495.
98. Shirai, T., Fukushima, S., Ohshima, M. & Ito, N. (1984). Effects of butylated hydroxyanisole, butylated hydroxytoluene, and NaCl on gastric carcinogenesis initiated with N-methyl-N-nitro-N-nitrosoguanidine in F344 rats. Journal of the National Cancer Institute 72, 1189–1198.
99. Fung, V., Barrett, J. & Huff, J. (1995). The carcinogenesis bioassay in perspective: application in identifying human hazards. Environmental Health Perspectives 103, 680–683.
100. Gold, L.S., Bernstein, L., Magaw, R. & Slone, T.H. (1989). Interspecies extrapolation in carcinogenesis: prediction between rats and mice. Environmental Health Perspectives 81, 211–219.
101. Gold, L.S., Slone, T.H. & Ames, B.N. (1998). What do animal cancer tests tell us about human cancer risk? Overview of analyses of the carcinogenic potency database. Drug Metabolism Reviews 30,
102. Johnson, F.M. (2001). Response to Tennant et al.: Attempts to replace the NTP rodent bioassay with transgenic alternatives are unlikely to succeed. Environmental Molecular Mutagenesis 37, 89–92.
103. Bailey, J. (2005). Non-human primates in medical research and drug development: a critical review. Biogenic Amines 19, 235–255.
104. Glazko, G., Veeramachaneni, V., Nei, M. & Makalowski, W. (2005). Eighty percent of proteins are different between humans and chimpanzees. Gene 346, 215–219.
105. Balcombe, J., Barnard, N. & Sandusky, C. (2004). Laboratory routines cause animal stress. Contemporary Topics in Laboratory Animal Science 43,
106. Knight, A., Bailey, J. & Balcombe, J. (2006). Animal carcinogenicity studies: 2. Obstacles to extrapolation of data to humans. ATLA 34, 29–38.
107. Poignet, H., Nowicki, J.P. & Scatton, B. (1992). Lack of neuroprotective effect of some sigma ligands in a model of focal cerebral ischemia in the mouse. Brain Research 596, 320–324.
108. Aronowski, J., Strong, R. & Grotta, J.C. (1996). Treatment of experimental focal ischemia in rats with lubeluzole. Neuropharmacology 35, 689–693.
109. Marshall, J.W., Cross, A.J., Jackson, D.M., Green, A.R., Baker, H.F. & Ridley, R.M. (2000). Clomethiazole protects against hemineglect in a primate model of stroke. Brain Research Bulletin 52, 21–29.
110. Bebarta, V., Luyten, D. & Heard, K. (2003). Emergency medicine animal research: does use of randomisation and blinding affect the results? Academic Emergency Medicine 10, 684–687.
111. Medical Research Council (MRC) (1993). Responsibility in the Use of Animals in Medical Research,
112. Balls, M., Festing, M.F.W. & Vaughan, S. (eds) (2004). Reducing the use of experimental animals where no replacement is yet available. ATLA 32
113. Festing, M.F.W. (2004). Good experimental design and statistics can save animals, but how can it be promoted? ATLA 32 Suppl. 1A, 133–135.
114. De Boo, J. & Hendriksen, C. (2005). Reduction strategies in animal research: a review of scientific approaches at the intra-experimental, supra-experimental and extra-experimental levels. ATLA 33,
115. Festing, M.F.W. (1997). Experimental design and husbandry. Experimental Gerontology 32, 39–47.
116. van Wilgenburg, H., van Schaick Zillesen, P.G. & Krulichova, I. (2003). Sample power and ExpDesign: tools for improving design of animal experiments. Laboratory Animals 32, 39–43.
117. van Wilgenburg, H., van Schaick Zillesen, P.G. & Krulichova, I. (2004). Experimental design: com-
124. Moore, G.J., Overend, P. & Wilson, M.S. (1998). Reducing the use of laboratory animals in biomedical research: problems and possible solutions. ATLA 26,
125. Balls, M., Goldberg, A.M., Fentem, J.H., Broadhead, C.L., Burch, R.L., Festing, M.F.W., Frazier, J.M., Hendriksen, C.F., Jennings, M., van der Kamp, M.D., Morton, D.B., Rowan, A.N., Russell, C., Russell, W.M.S., Spielmann, H., Stephens, M.L., Stokes, W.S., Straughan, D.W., Yager, J.D., Zurlo, J. & Van Zutphen, B.F. (1995). The Three Rs: the way forward: Workshop 11. ATLA 23, 838–866.
126. Evidence-Based Medicine Working Group (1992). Evidence-based medicine. A new approach to teaching the practice of medicine. Journal of the American Medical Association 286, 2420–2425.
127. Watters, M.P.R. & Goodman, N.W. (1999). Comparison of basic methods in clinical studies and in vitro tissue and cell culture studies in three anaesthesia journals. British Journal of Anaesthesia 82,
128. Moher, D., Schulz, K.F. & Altman, D.G. (2001). The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomised trials. Lancet 357, 1191–1194.
129. Arlt, S. & Heuwieser, W. (2005). [Evidence based veterinary medicine.] [German.] Deutsche Tierärztliche Wochenschrift 112, 146–148.
130. Schulz, K.F. (2005). Assessing allocation concealment and blinding in randomised controlled trials: why bother? Equine Veterinary Journal 37, 394–395.
131. Brown, C.M., Calder, C., Linton, C., Small, C., Kenny, B.A., Spedding, M. & Patmore, L. (1995). Neuroprotective properties of lifarizine compared with those of other agents in a mouse model of focal cerebral ischaemia. British Journal of Pharmacology 115, 1425–1432.
132. Oktem, I.S., Menku, A., Akdemir, H., Kontas, O., Kurtsoy, A. & Koc, R.K. (2000). Therapeutic effect of tirilazad mesylate (U-74006F), mannitol, and their combination, on experimental ischemia. Research in Experimental Medicine 199, 231–242.


