Thursday, August 10, 2017

THE CHANGING FACE OF CLINICAL TRIALS

REVIEW ARTICLE
Jeffrey M. Drazen, M.D., David P. Harrington, Ph.D., John J.V. McMurray, M.D., James H. Ware, Ph.D., Janet Woodcock, M.D., Editors
Evidence for Health Decision Making — Beyond Randomized, Controlled Trials
Thomas R. Frieden, M.D., M.P.H.
N Engl J Med 2017; 377:465-475. August 3, 2017. DOI: 10.1056/NEJMra1614394




A core principle of good public health practice is to base all policy decisions on the highest-quality scientific data, openly and objectively derived.1
Determining whether data meet these conditions is difficult; uncertainty can lead to inaction by clinicians and public health decision makers.
Although randomized, controlled trials (RCTs) have long been presumed to be the ideal source for data on the effects of treatment, other methods of obtaining evidence for decisive action are receiving increased interest, prompting new approaches to leverage the strengths and overcome the limitations of different data sources.2-8 In this article, I describe the use of RCTs and alternative (and sometimes superior) data sources from the vantage point of public health, illustrate key limitations of RCTs, and suggest ways to improve the use of multiple data sources for health decision making.

In large, well-designed trials, randomization evenly distributes known and unknown factors among control and intervention groups, reducing the potential for confounding.
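The balancing effect of randomization, and its dependence on trial size, can be illustrated with a small simulation. This is a minimal sketch and not an analysis from the article: the covariate (a simulated risk factor with mean 60 and standard deviation 10), the trial sizes, and the function names are all assumptions chosen for illustration.

```python
import random
import statistics


def covariate_imbalance(n_participants, seed):
    """Randomize participants 1:1 and return the absolute difference
    in the mean of a simulated covariate between the two arms."""
    rng = random.Random(seed)
    # Hypothetical risk factor (e.g., age in years); values are simulated, not real data.
    covariate = [rng.gauss(60, 10) for _ in range(n_participants)]
    # Random allocation: shuffle, then split into two equal arms.
    rng.shuffle(covariate)
    half = n_participants // 2
    intervention, control = covariate[:half], covariate[half:]
    return abs(statistics.fmean(intervention) - statistics.fmean(control))


def average_imbalance(n_participants, n_trials=200):
    """Average the imbalance over many simulated trials of the same size."""
    return statistics.fmean(
        [covariate_imbalance(n_participants, seed) for seed in range(n_trials)]
    )


small = average_imbalance(20)     # small trial
large = average_imbalance(2000)   # large trial
print(f"average imbalance, n=20:   {small:.2f}")
print(f"average imbalance, n=2000: {large:.2f}")
```

In this sketch the average imbalance shrinks as the trial grows, which is why the statement above is qualified by "large, well-designed": in a small trial, chance alone can leave the arms noticeably different.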
Despite their strengths, RCTs have substantial limitations. Although they can have strong internal validity, RCTs sometimes lack external validity; generalizations of findings outside the study population may be invalid.2,4,6 RCTs usually do not have sufficient study periods or population sizes to assess duration of treatment effect (e.g., waning immunity of vaccines) or to identify rare but serious adverse effects of treatment, which often become evident during postmarketing surveillance and long-term follow-up but could not be practically assessed in an RCT.
The increasingly high costs and time constraints of RCTs can also lead to reliance on surrogate markers that may not correlate well with the outcome of interest.

Selection of high-risk groups increases the likelihood of having adequate numbers of end points, but these groups may not be relevant to the broader target populations.
These limitations and the fact that RCTs often take years to plan, implement, and analyze reduce the ability of RCTs to keep pace with clinical innovations; new products and standards of care are often developed before earlier models complete evaluation. These limitations also affect the use of RCTs for urgent health issues, such as infectious disease outbreaks, for which public health decisions must be made quickly on the basis of limited and often imperfect available data.
RCTs are also limited in their ability to assess the individualized effect of treatment, as can result from differences in surgical techniques, and are generally impractical for rare diseases.

Many other data sources can provide valid evidence for clinical and public health action. Observational studies, including assessments of results from the implementation of new programs and policies, remain the foremost source, but other examples include analysis of aggregate clinical or epidemiologic data. In the late 1980s, the high rate of the sudden infant death syndrome (SIDS) in New Zealand led to a case–control study comparing information on 128 infants who died from SIDS and 503 control infants.9 The results identified several risk factors for SIDS, including prone sleeping position, and led to the implementation of a program to educate parents to avoid putting their infants to sleep on their stomachs — well before back-sleeping was definitively known to reduce the incidence of SIDS.
The substantial reduction in the incidence of SIDS that resulted from this program became strong evidence of efficacy; implementation of an RCT for SIDS would have presented ethical and logistic difficulties.
Similarly, the evidence base for tobacco-control interventions has depended heavily on analysis of the results of policies, such as taxes, smoke-free laws, and advertising campaigns that have generated robust evidence of effectiveness — that is, practice-based evidence.

Current evidence-grading systems are biased toward RCTs, which may lead to inadequate consideration of non-RCT data.10 Objections to observational studies include the potential for bias from unrecognized factors along with the belief that these studies overestimate treatment effects.11 Although overestimation bias has been shown in some observational studies (e.g., overestimation of the effect of influenza vaccination on reducing mortality among older persons as a result of bias from healthy vaccine recipients12), comparisons of validity between observational studies and RCTs have dispelled many misperceptions.4,6,13,14 A widely cited example involves the cardiovascular health risks associated with the use of menopausal hormone therapy.
Data from an observational study suggested that menopausal hormone therapy would reduce the risk of heart disease15; results from a subsequent RCT showed increased cardiovascular risks.16 Although initially these differences were thought to indicate weaknesses in the observational study, further analyses determined that both studies had valid results for their patient populations and that discrepancies were probably due to the timing of initiation of hormone therapy in relation to the onset of menopause.17-21 If so, then the RCT and observational study showed similar findings. However, a broad recommendation to use hormone therapy was made prematurely.
Determining when data are sufficient for action is difficult, but the bar should be much higher when recommending that millions of persons with no disease take medications. This line of reasoning does not suggest that the Food and Drug Administration should be less stringent in its review of drug safety and efficacy, but rather that there should be rigorous review of all potentially valid data sources.
No study design is flawless, and conflicting findings can emerge from all types of studies. The following examples show the importance of recognizing the strengths and limitations in all data sources and finding ways to obtain the most useful data for health decision making.

VALIDITY OF ALTERNATIVE DATA SOURCES — THE LIVE ATTENUATED INFLUENZA VACCINE
Rigorous analyses after the implementation of a public health program can provide critically important information, such as data on vaccine effectiveness. Analyses of influenza vaccination efforts are a prime example, because, unlike other vaccines, influenza vaccines are given and evaluated for effectiveness yearly.
The ability of an influenza vaccine to prevent influenza-related illness is affected by many factors, including genetic changes in the virus as well as host factors including age, underlying medical conditions, and previous infections and vaccinations. In the United States, the effectiveness of the influenza vaccine is monitored through the Influenza Vaccine Effectiveness Network. These data are used to derive estimates of the number of influenza-related illnesses, hospitalizations, and deaths prevented each year through vaccination, which, in turn, provide critical information to help measure, evaluate, and guide public health interventions.

First licensed in 2003, the live attenuated influenza vaccine, known as the “nasal spray” influenza vaccine, has been approved for use in healthy children and adults 2 to 49 years of age since 2007.22 The vaccine showed good protection for both adults and children in postlicensure RCTs, and, in June 2014, on the basis of results from several RCTs showing superior efficacy of the live attenuated vaccine over the inactivated influenza vaccine in children,23-25 the Advisory Committee on Immunization Practices (ACIP) issued a preference for its use in healthy children 2 to 8 years of age for the 2014–2015 influenza season.26 A subsequent observational study of the effectiveness of the live attenuated and inactivated influenza vaccines, however, showed worse performance for the live attenuated vaccine than was shown in the RCTs,27 and the ACIP did not renew its preference for the live attenuated vaccine over the inactivated vaccine in healthy children for the 2015–2016 season.
More recently, on the basis of an observed vaccine efficacy for the live attenuated vaccine that was at or near zero, especially against the 2009 H1N1 pandemic influenza virus,27-29 the ACIP recommended that the nasal spray vaccine not be used during the 2016–2017 influenza season.30 In this example, changes in vaccine formulation (from trivalent to quadrivalent), the population vaccinated (e.g., natural immunity resulting in neutralization of live vaccine), or another factor or factors caused the RCT data to lack external validity and be misleading, as compared with prospectively collected vaccine-efficacy data.
Future studies may provide clarification regarding the reasons for these differences, but both RCTs and observational data may be needed.

RELEVANCE TO PROGRAM CONDITIONS — DIRECTLY OBSERVED TREATMENT FOR TUBERCULOSIS
Although the use of a single drug in the 1946 RCT of streptomycin for the treatment of tuberculosis31 rapidly led to resistance, the success of the trial spurred a series of long-term RCTs for tuberculosis treatment conducted over four decades by the British Medical Research Council with collaborators throughout the world.32,33 Each trial built on previous findings, with the effect of refining drug regimens and minimizing the duration of antituberculosis treatment. The importance of directly observed treatment was realized as treatment moved from sanatoriums to homes.34,35 The approach, implemented from 1958 forward,33 evolved to directly observed treatment, short-course (DOTS), with standard, first-line regimens, and, for persons infected with multidrug-resistant strains, “DOTS-plus,” involving second-line, reserve drugs.36

Studies have purported to show that directly observed treatment offers no advantage over self-administered treatment.37,38 A limitation of these studies has been lack of evaluation of the health, epidemiologic, and societal costs of relapse or of the rare but devastating progression to drug-resistant tuberculosis.
Although these studies have been conducted with intensive oversight, they have not established a method of treatment that can be consistently applied to a large program in which thousands or millions of patients are treated.
In addition, an RCT for a tuberculosis treatment method would be unable to predict or account for the harms from the rare but catastrophic secondary, population-wide effects of development and spread of multidrug resistance.

Examples of non-RCT efforts to evaluate the effect of DOTS and DOTS-plus on multidrug-resistant tuberculosis include decision analyses of program effect,39 genotyping of isolates from patients in communities with different directly observed treatment practices,40 and reviews of medical and public health records along with epidemiologic and laboratory analyses of multidrug-resistant tuberculosis outbreaks.41
These non-RCT studies have contributed to continued refinements in treatment and follow-up and reduced risks of resistance.
For these and other reasons, the American Thoracic Society, World Health Organization, and Centers for Disease Control and Prevention continue to recommend directly observed treatment as the standard of practice.

POPULATION-WIDE ANALYSIS — THE EFFECT OF SODIUM INTAKE ON CARDIOVASCULAR HEALTH
Cardiovascular disease remains the leading cause of death in the United States.42 A major risk factor for cardiovascular disease is hypertension, which currently affects approximately 29% of U.S. adults.43 An important strategy for lowering blood pressure is reducing excess sodium intake, particularly through changes to the food supply.44 A robust body of evidence, including an analysis of more than 100 randomized trials, shows that reducing sodium intake reduces blood pressure among adults.45 There is also evidence, based on trends at the population level, that reducing sodium intake prevents cardiovascular disease.46 Meta-analysis of sodium-reduction trials of at least 6 months’ duration in which moderate reductions in intake were achieved, as well as well-designed, long-term cohort studies, have provided strong evidence that lower sodium intake is associated with a reduced incidence of cardiovascular events.47,48

The benefits of sodium reduction have been questioned by some researchers on the basis of several studies that report a J-shaped relationship between sodium intake and cardiovascular outcomes.49-51 These studies, however, have been shown to have methodologic flaws, including those related to the assessment of usual sodium intake, the potential for reverse causality, inadequate follow-up, residual confounding, and insufficient power.52
Accurate assessment of long-term, usual sodium intake is critical in cohort studies that relate individual sodium intake to long-term outcomes and requires multiple 24-hour urine collections over a period of time.52-54 Spot or single 24-hour urine collections have a high degree of intraindividual variation that may not be overcome by correction or large sample size.54-56 Because of challenges in accurately measuring usual sodium intake and excretion and the potential for misclassification of exposure, cohort studies must use multiple 24-hour urine collections48 to be valid, and study designs that use population means, which are subject to less variation than measurements of individual intake, often provide more reliable information.57
This may be why studies that assess sodium intake and cardiovascular events on a population level have shown beneficial effects of sodium-intake reduction,46 whereas studies with less accurate measures of individual intake have not.54,57
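The effect of intraindividual variation on an estimated association can be sketched with a simulation. This is an illustration under stated assumptions, not a reanalysis of any cited study: the standardized intake scale, the within-person standard deviation of 1.5, the true slope of 1.0, and the function names are all hypothetical.

```python
import random
import statistics


def least_squares_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def estimated_slope(n_collections, n_people=5000, seed=1):
    """Estimate the intake-outcome slope when each person's usual intake is
    measured as the mean of `n_collections` noisy 24-hour collections."""
    rng = random.Random(seed)
    true_slope = 1.0  # assumed true effect per unit of usual intake
    # Usual (long-term) intake, standardized; between-person SD of 1 (assumed).
    usual = [rng.gauss(0, 1) for _ in range(n_people)]
    outcome = [true_slope * u + rng.gauss(0, 1) for u in usual]
    # Each collection equals usual intake plus day-to-day noise (SD 1.5, assumed).
    measured = [
        statistics.fmean([u + rng.gauss(0, 1.5) for _ in range(n_collections)])
        for u in usual
    ]
    return least_squares_slope(measured, outcome)


single = estimated_slope(1)    # one 24-hour collection per person
multiple = estimated_slope(7)  # seven collections per person
print(f"true slope 1.0, single collection: {single:.2f}, seven collections: {multiple:.2f}")
```

With a single collection, the estimated slope in this sketch is pulled strongly toward zero; averaging several collections recovers more of the true association. The simulation covers only random day-to-day variation and does not model the other problems listed above, such as reverse causality or residual confounding.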

Even for established risk factors, RCTs can yield answers that are simply wrong. A well-known example is the large Multiple Risk Factor Intervention Trial (MRFIT) on cardiovascular disease, which showed insufficient differences in health outcomes resulting from interventions such as smoking cessation and exercise.58 Although longer follow-up showed that the trial may have accurately identified benefits from smoking cessation and improvements in nutrition, the study highlighted problems in implementing and measuring the effects of substantial lifestyle changes — in particular, insufficient follow-up duration, possible adoption of interventions by participants in the comparison group, and inadequate adherence to recommended interventions by participants in the study population.

Although some researchers have called for large, long-term RCTs examining the effects of sodium-intake reduction on clinical outcomes to inform population-wide sodium-reduction efforts, this approach is similarly not feasible. Such trials would require tens of thousands of participants undergoing randomization to a high-sodium or low-sodium diet, with adherence to the intervention and follow-up of at least 5 years.47 This study design is impractical, particularly given the challenges with adherence to a low-sodium diet in our current food environment. As with many other topics in public health, conflicting findings from studies that use different methods are to be expected. Critical analysis of study methods and measurement and examination of the totality of the evidence are essential in order to interpret results correctly and make appropriate recommendations for action.59
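The scale of such a trial can be made concrete with the standard normal-approximation formula for comparing two proportions. The inputs below are assumptions for illustration only and do not come from the article: a 5% risk of a cardiovascular event over the follow-up period in the comparison group, a 10% relative reduction in the low-sodium group, a two-sided alpha of 0.05, and 80% power.

```python
import math
from statistics import NormalDist


def participants_per_arm(p_control, p_intervention, alpha=0.05, power=0.80):
    """Approximate sample size per arm for detecting a difference between
    two event proportions (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = (p_control * (1 - p_control)
                    + p_intervention * (1 - p_intervention))
    difference = p_control - p_intervention
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / difference ** 2)


# Hypothetical inputs, not taken from the article.
assumed_control_risk = 0.05
assumed_relative_reduction = 0.10

n_per_arm = participants_per_arm(
    assumed_control_risk,
    assumed_control_risk * (1 - assumed_relative_reduction),
)
total = 2 * n_per_arm
print(f"per arm: {n_per_arm}, total: {total}")
```

Under these assumptions the formula calls for tens of thousands of participants, before any allowance for nonadherence or loss to follow-up, both of which would increase the required size further.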

RARE DISEASES — THE IMPORTANCE OF DISEASE REGISTRIES AND OTHER METHODS
Approximately 5000 to 7000 conditions fit the definition of a rare disease, with more than 50 million people affected throughout the world.60,61 Because of small sample sizes and logistic constraints, it is unlikely that RCTs will be performed for most of these conditions; actionable information may be most likely to be obtained from meticulous analysis of the treatment of different patients by different methods.
Such an approach was used to determine that isoniazid, injectable medications, and fluoroquinolone antibiotic agents were most likely to lead to successful treatment for common strains of multidrug-resistant tuberculosis.41 Despite the Orphan Drug Act, which was passed in 1983 to provide industry incentives for the development of clinical treatments for rare diseases, the options for most patients are limited. A movement to create a global rare-disease patient registry along with a centralized database of biorepositories for rare biospecimens followed from a 2010 workshop, sponsored by the National Institutes of Health, involving researchers, advocacy groups, and stakeholders.62 The Rare Diseases Human Biospecimens/Biorepositories (RD-HuB) makes rare-disease specimens available to researchers and informs patients of ongoing studies.
Although such registries could potentially lead to RCTs, attaining sufficient study-population sizes could remain an impediment. Alternatively, these registries could be used to collect detailed case studies, including standardized information on individual treatment and clinical status, which could be used to enhance understanding of a particular disease and its treatment and improve the health of affected patients. For example, standardizing and aggregating data on clinical features, treatment, and outcomes from case reports and case series may reveal ways to improve diagnosis and treatment.

COSTS AND INFRASTRUCTURE — RELIABLE RESULTS FROM MORE FEASIBLE STUDY DESIGNS
Large observational studies, with longer follow-up, can be tailored to minimize bias in a manner analogous to the way bias is minimized in RCTs. In one such study, data from the Veterans Health Administration (VA) and Medicare were used to examine outcomes of treatment with sulfonylureas and thiazolidinediones — two second-line drugs for type 2 diabetes.63 The study used physician-prescribing patterns to approximate an RCT: determinations were made for patients to receive a sulfonylurea or thiazolidinedione on the basis of how often their physician had prescribed the drugs during the previous year (i.e., patients of physicians who usually prescribed sulfonylureas were assigned to receive a sulfonylurea, and those whose physicians usually prescribed thiazolidinediones were assigned to receive a thiazolidinedione).
With more than 80,000 patients monitored for up to 10 years, the study was 20 times larger and had a much longer follow-up than previous RCTs comparing the effectiveness of second-line diabetes drugs. The results showed a 68% higher risk of avoidable hospitalization and a 50% higher risk of death associated with treatment with sulfonylureas, as compared with thiazolidinediones, providing strong evidence-based information for clinical decision making while also avoiding many of the limitations of RCTs.
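One common way to formalize the use of prescribing patterns as a stand-in for randomization is an instrumental-variable (Wald) estimator, in which the physician's usual preference serves as the instrument. The simulation below is a sketch of that general idea and is not a reproduction of the cited study's methods or data: the drug labels, the unmeasured "frailty" confounder, all probabilities, and the true effect of 0.10 are assumptions.

```python
import random


def simulate_patients(n=200_000, true_effect=0.10, seed=2):
    """Return (physician_prefers_a, received_a, had_event) for simulated patients."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        prefers_a = rng.random() < 0.5   # instrument: physician usually prescribes drug A
        frailty = rng.random()           # unmeasured confounder, uniform on [0, 1]
        # Frailer patients and patients of A-preferring physicians get drug A more often.
        received_a = rng.random() < 0.1 + 0.4 * prefers_a + 0.4 * frailty
        # Event risk rises with frailty and, by assumption, with drug A.
        had_event = rng.random() < 0.05 + 0.30 * frailty + true_effect * received_a
        rows.append((prefers_a, received_a, had_event))
    return rows


def proportion(rows, column, given_column, given_value):
    """Proportion with rows[column] true among rows where rows[given_column] == given_value."""
    selected = [r[column] for r in rows if r[given_column] == given_value]
    return sum(selected) / len(selected)


rows = simulate_patients()

# Naive comparison by drug received: confounded by frailty.
naive = proportion(rows, 2, 1, True) - proportion(rows, 2, 1, False)

# Wald estimator: difference in outcome by physician preference,
# scaled by the difference in treatment received by physician preference.
outcome_difference = proportion(rows, 2, 0, True) - proportion(rows, 2, 0, False)
treatment_difference = proportion(rows, 1, 0, True) - proportion(rows, 1, 0, False)
wald = outcome_difference / treatment_difference

print(f"true effect 0.10, naive estimate {naive:.3f}, Wald estimate {wald:.3f}")
```

In this sketch the naive comparison overstates the harm of drug A because frailer patients receive it more often, whereas the preference-based estimate lands close to the assumed true effect. That result depends on the assumptions built into the simulation: physician preference must be unrelated to patient frailty and must affect outcomes only through the drug prescribed.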

The VA is also undertaking a new type of randomized trial to compare the use of chlorthalidone versus hydrochlorothiazide for the treatment of hypertension.64 Both medications, which are diuretics, have been used for more than 50 years, but more than 95% of the million or more veterans who are prescribed this type of diuretic receive hydrochlorothiazide, as compared with the 2.5% receiving chlorthalidone.65 However, there is evidence that chlorthalidone, the older of the two drugs, is more effective in preventing cardiovascular events66 and reducing mortality.67,68
Using data from electronic medical records, with reliance on the patients’ primary care physicians instead of additional study personnel, the trial plans to enroll approximately 13,500 veterans older than 65 years of age who are currently receiving hydrochlorothiazide. These patients will then be randomly assigned to receive hydrochlorothiazide or chlorthalidone over a 3-year study period. This study design simplifies the infrastructure and greatly reduces the costs involved in a traditional, large RCT.64 With approximately 50 million prescriptions for hydrochlorothiazide filled each year in the United States, even small reductions in cardiovascular events associated with chlorthalidone use that may be identified through this study would have a substantial effect in the prevention of cardiovascular disease.

MOVING FORWARD — OVERCOMING THE “DARK MATTER” OF CLINICAL MEDICINE

For much, and perhaps most, of modern medical practice, RCT-based data are lacking and no RCT is being planned or is likely to be completed to provide evidence for action.
This “dark matter” of clinical medicine leaves practitioners with large information gaps for most conditions and increases reliance on past practices and clinical lore.4,69,70
Elevating RCTs at the expense of other potentially highly valuable sources of data is counterproductive. A better approach is to clarify the health outcome being sought and determine whether existing data are available that can be rigorously and objectively evaluated, independently of or in comparison with data from RCTs, or whether new studies (RCT or otherwise) are needed.

New ways of obtaining valuable health data continue to emerge. “Big data,” including information from electronic health records and expanded patient registries, along with increased willingness of patients to participate and share health information, are generating useful data for large interventional studies and providing new opportunities for complementary use of multiple data sources to gain stronger evidence for action.71
For example, although an RCT may show the benefit of a drug, large observational studies can be conducted to refine dosages and identify rare adverse events. In addition, new strategies have been undertaken to increase the efficacy and efficiency of RCTs, including collaborative and adaptive trials to increase enrollment, reduce costs and time to completion, and better identify populations that benefit from treatments.72-74
Advances in genomic science may allow for better understanding of unique characteristics in patients that can affect outcomes of RCTs and other studies and be used to improve the validity of study findings.
There is no single, best approach to the study of health interventions; clinical and public health decisions are almost always made with imperfect data (Table 1: Selected Strengths and Weaknesses of Various Study Designs, along with Examples of Studies with Effects on Policy or Practice). Promoting transparency in study methods, ensuring standardized data collection for key outcomes, and using new approaches to improve data synthesis are critical steps in the interpretation of findings and in the identification of data for action, and it must be recognized that conclusions may change over time. There will always be an argument for more research and for better data, but waiting for more data is often an implicit decision not to act or to act on the basis of past practice rather than best available evidence. The goal must be actionable data — data that are sufficient for clinical and public health action that have been derived openly and objectively and that enable us to say, “Here’s what we recommend and why.”

Disclosure forms provided by the author are available with the full text of this article at NEJM.org.
The views expressed in this article are those of the author and do not necessarily represent the views of the Centers for Disease Control and Prevention or the Department of Health and Human Services.
I thank Kathryn Foti, M.P.H., Drew Blakeman, M.S., and Robin Moseley, M.A.T., for assistance with preparation of an earlier version of the manuscript and review of relevant literature, and Joanna Taliano, M.L.S., for assistance with literature searches.

SOURCE INFORMATION
From Atlanta, GA. The author is the former director of the Centers for Disease Control and Prevention.
Address reprint requests to Dr. Frieden at tfrieden@gmail.com.
