Postdoctoral Fellow, University of Pennsylvania School of Medicine
I develop statistical frameworks to jointly study mortal and nonmortal endpoints in critical care
My research examines randomized trials conducted in intensive care units (ICUs), which have very rarely identified new critical care therapies in recent decades. For instance, of more than 200 trials that sought to decrease mortality from sepsis or respiratory failure in the past 20 years, only 9 have been successful. The results have been so abysmal that some thought leaders have suggested abandoning randomized trials in critical care entirely.
In this presentation, I offer an alternative set of methodological explanations for this limited success rate.
First, I found that, with few exceptions, the majority of clinical trials that sought to identify approaches to reduce ICU mortality were doomed to fail. That is, they were designed, and thus financed, to enroll so few patients that even if a new intervention improved patients’ care, the trial could never demonstrate this with statistical confidence. This issue also cuts the other way: the evidence suggests that many trials were too small to identify interventions that may have been harmful.
I also found that many ICU researchers seek to modify what I call “duration endpoints,” such as how long patients require ICU care. In these studies, I found that the way these endpoints are measured and compared statistically is highly prone to false conclusions about the impact, positive or negative, of interventions on patient outcomes.
The final arm of this research provides practical solutions (that is, new statistical frameworks) and recommendations for the issues I identified in my empirical work.
Abstract: Mortality is the most widely accepted outcome measure in randomized controlled trials of therapies for critically ill adults, but most of these trials fail to show a statistically significant mortality benefit. The reasons for this are unknown. We searched five high-impact journals (Annals of Internal Medicine, British Medical Journal, JAMA, The Lancet, New England Journal of Medicine) for randomized controlled trials comparing mortality of therapies for critically ill adults over a ten-year period. We abstracted data on the statistical design and results of these trials to compare the predicted delta (the effect size of the therapy compared to control, expressed as an absolute mortality reduction) to the observed delta and determine whether there is a systematic overestimation of predicted delta that might explain the high prevalence of negative results in these trials. We found 38 trials meeting our inclusion criteria. Only 5/38 (13.2%) of the trials provided justification for the predicted delta. The mean predicted delta among the 38 trials was 10.1% and the mean observed delta was 1.4% (P < 0.0001), resulting in a delta-gap of 8.7%. In only 2/38 (5.3%) of the trials did the observed delta exceed the predicted delta and only 7/38 (18.4%) of the trials demonstrated statistically significant results in the hypothesized direction; these trials had smaller delta-gaps than the remainder of the trials (delta-gap 0.9% versus 10.5%; P < 0.0001). For trials showing non-significant trends toward benefit greater than 3%, large increases in sample size (380% to 1100%) would be required if repeat trials use the observed delta from the index trial as the predicted delta for a follow-up study. Investigators of therapies for critical illness systematically overestimate treatment effect size (delta) during the design of randomized controlled trials. This bias, which we refer to as "delta inflation", is a potential reason that these trials have a high rate of negative results. "Absence of evidence is not evidence of absence."
Pub.: 01 May '10, Pinned: 18 Aug '17
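The link between predicted delta and trial size in the abstract above can be illustrated with a minimal sketch. It assumes the standard normal-approximation sample-size formula for comparing two proportions, a two-sided alpha of 0.05, and 80% power; the 40% control-arm mortality and the function name `n_per_arm` are hypothetical choices for illustration and are not taken from any of the trials reviewed.

```python
import math
from statistics import NormalDist


def n_per_arm(p_control, delta, alpha=0.05, power=0.80):
    """Approximate patients per arm needed to detect an absolute mortality
    reduction `delta` with a two-sided test of two proportions."""
    p_treated = p_control - delta
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


CONTROL_MORTALITY = 0.40  # hypothetical value, not from any trial above

for delta in (0.101, 0.05, 0.03):
    n = n_per_arm(CONTROL_MORTALITY, delta)
    print(f"delta = {delta:.1%}: about {n} patients per arm")
```

Because the required sample size scales roughly with the inverse square of delta, designing around an optimistic delta yields a trial far too small to detect a more modest true effect.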
Abstract: To determine how many multicenter, randomized controlled trials have been published that assess mortality as a primary outcome in the adult intensive care unit population, and to evaluate their methodologic quality. A sensitive search strategy for randomized controlled trials was conducted in the Cochrane Central Register of Controlled Trials and in MedLine using the PubMed interface. All publications of adult, multicenter randomized controlled trials carried out in the intensive care unit, with mortality as a primary outcome, and including >50 patients were selected. Seventy-two randomized controlled trials were retrieved and were classified according to their effect on mortality: beneficial, detrimental, or neutral. Ten of the studies reported a positive impact of the studied intervention on mortality, seven studies reported a detrimental effect of the intervention, and 55 studies showed no effect on mortality. This literature search demonstrates that relatively few of the randomized controlled trials conducted in intensive care units and using mortality as a primary outcome show a beneficial impact of the intervention on the survival of critically ill patients. Methodological limitations of some of the randomized controlled trials may have prevented positive results. Other forms of evidence and end points other than mortality need to be considered when evaluating interventions in critically ill patients.
Pub.: 02 Apr '08, Pinned: 18 Aug '17
Abstract: Increasing numbers of intensive care units (ICUs) are adopting the practice of nighttime intensivist staffing despite the lack of experimental evidence of its effectiveness. We conducted a 1-year randomized trial in an academic medical ICU of the effects of nighttime staffing with in-hospital intensivists (intervention) as compared with nighttime coverage by daytime intensivists who were available for consultation by telephone (control). We randomly assigned blocks of 7 consecutive nights to the intervention or the control strategy. The primary outcome was patients' length of stay in the ICU. Secondary outcomes were patients' length of stay in the hospital, ICU and in-hospital mortality, discharge disposition, and rates of readmission to the ICU. For length-of-stay outcomes, we performed time-to-event analyses, with data censored at the time of a patient's death or transfer to another ICU. A total of 1598 patients were included in the analyses. The median Acute Physiology and Chronic Health Evaluation (APACHE) III score (in which scores range from 0 to 299, with higher scores indicating more severe illness) was 67 (interquartile range, 47 to 91), the median length of stay in the ICU was 52.7 hours (interquartile range, 29.0 to 113.4), and mortality in the ICU was 18%. Patients who were admitted on intervention days were exposed to nighttime intensivists on more nights than were patients admitted on control days (median, 100% of nights [interquartile range, 67 to 100] vs. median, 0% [interquartile range, 0 to 33]; P<0.001). Nonetheless, intensivist staffing on the night of admission did not have a significant effect on the length of stay in the ICU (rate ratio for the time to ICU discharge, 0.98; 95% confidence interval [CI], 0.88 to 1.09; P=0.72), ICU mortality (relative risk, 1.07; 95% CI, 0.90 to 1.28), or any other end point. Analyses restricted to patients who were admitted at night showed similar results, as did sensitivity analyses that used different definitions of exposure and outcome. In an academic medical ICU in the United States, nighttime in-hospital intensivist staffing did not improve patient outcomes. (Funded by University of Pennsylvania Health System and others; ClinicalTrials.gov number, NCT01434823.)
Pub.: 22 May '13, Pinned: 17 Aug '17
Abstract: Intensive care unit (ICU)-based randomized clinical trials (RCTs) among adult critically ill patients commonly fail to detect treatment benefits. Appraise the rates of success, outcomes used, statistical power, and design characteristics of published trials. One hundred forty-six ICU-based RCTs of diagnostic, therapeutic, or process/systems interventions published from January 2007 to May 2013 in 16 high-impact general or critical care journals were studied. Of 146 RCTs, 54 (37%) were positive (i.e., the a priori hypothesis was found to be statistically significant). The most common primary outcomes were mortality (n = 40 trials), infection-related outcomes (n = 33), and ventilation-related outcomes (n = 30), with positive results found in 10, 58, and 43%, respectively. Statistical power was discussed in 135 RCTs (92%); 92 cited a rationale for their power parameters. Twenty trials failed to achieve at least 95% of their reported target sample size, including 11 that were stopped early due to insufficient accrual/logistical issues. Of 34 superiority RCTs comparing mortality between treatment arms, 13 (38%) accrued a sample size large enough to find an absolute mortality reduction of 10% or less. In 22 of these trials the observed control-arm mortality rate differed from the predicted rate by at least 7.5%. ICU-based RCTs are commonly negative and powered to identify what appear to be unrealistic treatment effects, particularly when using mortality as the primary outcome. Additional concerns include a lack of standardized methods for assessing common outcomes, unclear justifications for statistical power calculations, insufficient patient accrual, and incorrect predictions of baseline event rates.
Pub.: 03 May '14, Pinned: 17 Aug '17
Abstract: Clinical endpoints measured in terms of duration, such as intensive care unit (ICU) length of stay (LOS), are widely used in randomized clinical trials (RCTs) and observational research. In analyses of patient-level data from a recent RCT, in which ICU LOS was the primary endpoint, and in administrative data, we show that additional ICU time is often accrued by patients after they are deemed ready for discharge. This "immutable" time (which cannot plausibly be altered by interventions under study) varies by day, week, and year, adding on average one-third of a day to total LOS. We then use statistical simulations, informed by the administrative data and RCT, to assess the impact of immutable time on the measurement and statistical comparison of patients' ICU LOS. These simulations demonstrate that immutable time combines with clinically necessary ICU time (neither of which is likely to be normally distributed) to produce overall LOS distributions that may either mask true treatment effects or suggest false treatment effects relative to analyses of time-to-discharge-readiness. The extent and direction of bias are complex functions of the statistical method used, mortality rates and distributions, and the magnitude of immutable time relative to intervention-associated reductions in LOS.
Pub.: 13 Jun '17, Pinned: 17 Aug '17
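The idea of "immutable" time in the abstract above can be sketched with a toy simulation. This is not the simulation used in the paper: it assumes, purely for illustration, that time to discharge readiness is log-normally distributed, that admissions occur uniformly around the clock, and that patients can only leave during a hypothetical 08:00 to 18:00 discharge window. The medians, the window, and the function name `simulate_arm` are all invented for this sketch.

```python
import math
import random
from statistics import mean


def simulate_arm(n, median_ready_hours, rng, window=(8, 18)):
    """Return (hours until discharge-ready, observed LOS in hours) per patient.

    Patients who become ready outside the discharge window wait until it
    next opens; that wait is the immutable time in this toy model.
    """
    start, end = window
    ready, los = [], []
    for _ in range(n):
        admit_clock = rng.uniform(0, 24)  # hour of day at admission
        t_ready = rng.lognormvariate(math.log(median_ready_hours), 0.8)
        clock = (admit_clock + t_ready) % 24  # hour of day when ready
        if start <= clock < end:
            wait = 0.0
        elif clock < start:
            wait = start - clock
        else:
            wait = 24 - clock + start
        ready.append(t_ready)
        los.append(t_ready + wait)
    return ready, los


rng = random.Random(2017)  # fixed seed so the run is repeatable
control_ready, control_los = simulate_arm(2000, 50.0, rng)
treated_ready, treated_los = simulate_arm(2000, 46.0, rng)

print("mean immutable time (control):", mean(control_los) - mean(control_ready))
print("difference in readiness time: ", mean(control_ready) - mean(treated_ready))
print("difference in observed LOS:   ", mean(control_los) - mean(treated_los))
```

Comparing the last two printed differences across seeds, sample sizes, or window definitions gives a rough feel for how the added waiting time can distort a comparison based on total LOS rather than time to discharge readiness.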