PhD student, University of California, Los Angeles
Using the latest optimization techniques to find the best experimental designs
Efficient experimental designs for data collection are critically important for making meaningful statistical inferences and saving costs. Experiments that use optimal design theory can help scientists produce the most reliable results, which are more likely to translate into useful biomedical applications. My research focuses on developing methodologies and applying them to build powerful statistical tools that scientists and clinicians can use to save time and costs in their studies and to make more informed and timely medical decisions. Using a class of cutting-edge algorithms called metaheuristics, I try to solve high-dimensional, complex design problems in bioscience with much faster computation and better search capabilities than conventional algorithms. I also build statistical computation tools that are easier to use in practice, and I am trying to make them widely available to promote the use of optimal designs in scientific studies.
Abstract: Identifying optimal designs for generalized linear models with a binary response can be a challenging task, especially when there are both continuous and discrete independent factors in the model. Theoretical results rarely exist for such models, and the handful that do exist come with restrictive assumptions. This paper investigates the use of particle swarm optimization (PSO) to search for locally $D$-optimal designs for generalized linear models with discrete and continuous factors and a binary outcome and demonstrates that PSO can be an effective method. We provide two real applications using PSO to identify designs for experiments with mixed factors: one to redesign an odor removal study and the second to find an optimal design for an electrostatic discharge study. In both cases we show that the $D$-efficiencies of the designs found by PSO are much better than the implemented designs. In addition, we show PSO can efficiently find $D$-optimal designs on a prototype or an irregularly shaped design space, provide insights on the existence of minimally supported optimal designs, and evaluate sensitivity of the $D$-optimal design to mis-specifications in the link function.
Pub.: 05 Feb '16, Pinned: 28 Jun '17
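The idea in the abstract above can be illustrated with a minimal sketch, which is not the paper's implementation. It assumes a logistic link, one continuous factor x in [-1, 1], one binary factor z, two continuous support points per level of z, and nominal parameter values chosen purely for illustration. All function names and PSO tuning constants are hypothetical. The swarm is seeded with a simple equal-weight corner design, so the returned design is never worse than that baseline.

```python
import numpy as np

Z_LEVELS = np.array([0.0, 1.0])  # assumed levels of the discrete factor
N_PER_LEVEL = 2                  # assumed continuous support points per level
K = len(Z_LEVELS) * N_PER_LEVEL  # total number of support points

def decode(particle):
    """Split a particle into continuous settings x, discrete settings z, weights w."""
    x = particle[:K]
    w = particle[K:] / particle[K:].sum()  # normalise weights to sum to one
    z = np.repeat(Z_LEVELS, N_PER_LEVEL)
    return x, z, w

def log_det_info(particle, beta):
    """Log-determinant of the Fisher information for logit(p) = b0 + b1*x + b2*z."""
    x, z, w = decode(particle)
    F = np.column_stack([np.ones(K), x, z])
    p = 1.0 / (1.0 + np.exp(-(F @ beta)))
    M = (F * (w * p * (1.0 - p))[:, None]).T @ F  # sum_i w_i p_i (1 - p_i) f_i f_i^T
    sign, value = np.linalg.slogdet(M)
    return value if sign > 0 else -np.inf

def baseline_design():
    """Equal weights at equally spaced x values within each level of z."""
    x = np.tile(np.linspace(-1.0, 1.0, N_PER_LEVEL), len(Z_LEVELS))
    return np.concatenate([x, np.ones(K)])

def d_efficiency(design_a, design_b, beta):
    """D-efficiency of design_a relative to design_b (three model parameters)."""
    return np.exp((log_det_info(design_a, beta) - log_det_info(design_b, beta)) / 3.0)

def pso_d_optimal(beta, n_particles=40, n_iter=300, seed=0):
    """Basic global-best PSO that maximises the log-determinant criterion."""
    rng = np.random.default_rng(seed)
    lo = np.concatenate([-np.ones(K), np.full(K, 1e-6)])  # bounds on x and raw weights
    hi = np.ones(2 * K)
    pos = rng.uniform(lo, hi, size=(n_particles, 2 * K))
    pos[0] = baseline_design()  # seed the swarm with the baseline design
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([log_det_info(p, beta) for p in pos])
    for _ in range(n_iter):
        gbest = pbest[np.argmax(pbest_val)]
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)  # keep particles inside the design space
        val = np.array([log_det_info(p, beta) for p in pos])
        better = val > pbest_val
        pbest[better] = pos[better]
        pbest_val[better] = val[better]
    best = np.argmax(pbest_val)
    return pbest[best], pbest_val[best]

beta = np.array([0.5, 2.0, -1.0])  # hypothetical nominal parameter values
design, value = pso_d_optimal(beta)
x, z, w = decode(design)
print("support points (x, z):", list(zip(np.round(x, 3), z)))
print("weights:", np.round(w, 3))
print("D-efficiency of baseline:", d_efficiency(baseline_design(), design, beta))
```

Because the design is locally optimal, it depends on the nominal values in beta; different guesses generally lead to different support points and weights.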
Abstract: We consider design issues for cluster randomized trials (CRTs) with a binary outcome where both unit costs and intraclass correlation coefficients (ICCs) in the two arms may be unequal. We first propose a design that maximizes cost efficiency (CE), defined as the ratio of the precision of the efficacy measure to the study cost. Because such designs can be highly sensitive to the unknown ICCs and the anticipated success rates in the two arms, a local strategy based on a single set of best guesses for the ICCs and success rates can be risky. To mitigate this issue, we propose a maximin optimal design that permits ranges of values to be specified for the success rate and the ICC in each arm. We derive maximin optimal designs for three common measures of the efficacy of the intervention, risk difference, relative risk and odds ratio, and study their properties. Using a real cancer control and prevention trial example, we ascertain the efficiency of the widely used balanced design relative to the maximin optimal design and show that the former can be quite inefficient and less robust to mis-specifications of the ICCs and the success rates in the two arms.
Pub.: 09 Feb '17, Pinned: 28 Jun '17
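A rough sketch of the cost-efficiency idea for the risk difference follows. It rests on assumptions that are not taken from the paper: a fixed common cluster size m, a cost per cluster c1 and c2 in each arm, the usual design-effect variance p(1 - p)(1 + (m - 1)ICC)/m per cluster, and a maximin criterion defined as the worst-case efficiency relative to the locally optimal allocation over a grid of parameter guesses. Under this cost model, cost efficiency depends only on the fraction pi of clusters assigned to arm 1. All function names are hypothetical.

```python
import itertools
import numpy as np

def arm_variance(p, icc, m):
    """Variance contribution of one cluster of size m (design-effect formula)."""
    return p * (1.0 - p) * (1.0 + (m - 1.0) * icc) / m

def cost_efficiency(pi, v1, v2, c1, c2):
    """Precision of the risk difference per unit cost, for allocation fraction pi."""
    variance = v1 / pi + v2 / (1.0 - pi)
    cost = c1 * pi + c2 * (1.0 - pi)
    return 1.0 / (variance * cost)

def locally_optimal_pi(v1, v2, c1, c2):
    """Allocation maximising cost efficiency for one set of best guesses.

    By the Cauchy-Schwarz inequality the optimum has pi_i proportional to sqrt(v_i / c_i).
    """
    a, b = np.sqrt(v1 / c1), np.sqrt(v2 / c2)
    return a / (a + b)

def worst_case_efficiency(pis, p1s, p2s, icc1s, icc2s, m, c1, c2):
    """Minimum relative cost efficiency of each allocation over the parameter grid."""
    pis = np.asarray(pis, dtype=float)
    worst = np.full(pis.shape, np.inf)
    for p1, p2, r1, r2 in itertools.product(p1s, p2s, icc1s, icc2s):
        v1, v2 = arm_variance(p1, r1, m), arm_variance(p2, r2, m)
        best = cost_efficiency(locally_optimal_pi(v1, v2, c1, c2), v1, v2, c1, c2)
        worst = np.minimum(worst, cost_efficiency(pis, v1, v2, c1, c2) / best)
    return worst

def maximin_pi(p1s, p2s, icc1s, icc2s, m, c1, c2):
    """Grid search for the allocation with the best worst-case relative efficiency."""
    pis = np.linspace(0.01, 0.99, 99)
    worst = worst_case_efficiency(pis, p1s, p2s, icc1s, icc2s, m, c1, c2)
    i = int(np.argmax(worst))
    return pis[i], worst[i]

# Hypothetical ranges for the success rates and ICCs in the two arms.
p1s, p2s = [0.3, 0.4, 0.5], [0.5, 0.6, 0.7]
icc1s, icc2s = [0.01, 0.05, 0.10], [0.02, 0.10, 0.20]
m, c1, c2 = 20, 1.0, 4.0  # assumed cluster size and per-cluster costs

pi_mm, eff_mm = maximin_pi(p1s, p2s, icc1s, icc2s, m, c1, c2)
eff_balanced = worst_case_efficiency([0.5], p1s, p2s, icc1s, icc2s, m, c1, c2)[0]
print("maximin allocation to arm 1:", pi_mm)
print("worst-case efficiency, maximin vs balanced:", eff_mm, eff_balanced)
```

Comparing the two worst-case efficiencies gives a sense of how much a balanced allocation can lose when costs and ICCs differ between arms, under the stated assumptions.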
Abstract: We use mathematical programming tools, such as Semidefinite Programming (SDP) and Nonlinear Programming (NLP)-based formulations, to find optimal designs for models used in chemistry and chemical engineering. In particular, we employ local design-based setups in linear models and a Bayesian setup in nonlinear models to find optimal designs. In the latter case, Gaussian Quadrature Formulas (GQFs) are used to evaluate the optimality criterion averaged over the prior distribution for the model parameters. Mathematical programming techniques are then applied to solve the optimization problems. Because such methods require that the design space be discretized, we also evaluate the impact of the discretization scheme on the generated design. We demonstrate the techniques for finding D-, A- and E-optimal designs using design problems in biochemical engineering and show that the method can also be directly applied to tackle additional issues, such as heteroscedasticity in the model. Our results show that the NLP formulation produces highly efficient D-optimal designs but is computationally less efficient than the SDP formulation. The efficiencies of the designs generated by the two methods are generally very close, and so we recommend the SDP formulation in practice.
Pub.: 08 Mar '16, Pinned: 28 Jun '17