Homogeneity test in finite mixture models using EM-test

  • Author / Creator
    Niu, Xiaoqing
  • The class of finite mixture models is widely used in many areas, including science, the humanities, medicine, and engineering. Testing homogeneity is one of the important and challenging problems in the application of finite mixture models. It has been investigated by many researchers, and most existing work has focused on univariate mixture models, normal mixture models in the mean parameters only, and normal mixture models in both the mean and variance parameters. This thesis concentrates on testing homogeneity in multivariate mixture models, scale mixtures of normal distributions, and a class of contaminated normal models. We first propose the use of the EM-test (Li, Chen, & Marriott, 2009) to test homogeneity in multivariate mixture models. We show that the EM-test statistic has asymptotically the same distribution as the likelihood ratio test for testing the restricted mean of a multivariate normal distribution given one observation. Based on this result, we suggest a resampling procedure to approximate the p-value of the EM-test. Scale mixtures of normal distributions, i.e., mixtures of normal distributions in the variance parameters, have wide applications. However, an effective procedure designed specifically for testing homogeneity in this class of mixture models is not available. We retool the EM-test (Chen & Li, 2009) for testing homogeneity in scale mixtures of normal distributions. We show that the retooled EM-test has the simple limiting distribution $\frac{1}{2} \chi^2_0 + \frac{1}{2}\chi^2_1$. Large-scale hypothesis testing problems appear in many areas, such as microarray studies. We propose a new class of contaminated normal models, which are two-component normal mixture models with one component mean equal to zero and different component variances, and which can be used in large-scale hypothesis testing. We further design a new EM-test for testing homogeneity in this class of mixture models. It is shown that the new EM-test statistic has a simple shifted $\frac{1}{2} \chi^2_1 + \frac{1}{2}\chi^2_2$ limiting distribution. In all three scenarios, extensive simulation studies are conducted to examine whether the limiting distributions approximate the finite-sample distributions reasonably well and whether the EM-tests have appropriate power to detect heterogeneity in the alternative models. To demonstrate the application of the proposed methods, several real-data examples are analyzed.
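    As an illustration of how a limiting distribution of the form $\frac{1}{2} \chi^2_0 + \frac{1}{2}\chi^2_1$ is used in practice, the sketch below converts an observed test statistic into an asymptotic p-value. It is a minimal example and not code from the thesis: the function name `chibar_pvalue` and its argument are hypothetical, and it assumes only that $\chi^2_0$ denotes a point mass at zero, so the upper-tail probability for a positive statistic is half the $\chi^2_1$ tail. The $\chi^2_1$ tail is computed from the standard normal identity $P(\chi^2_1 > x) = \mathrm{erfc}(\sqrt{x/2})$.

    ```python
    import math


    def chibar_pvalue(stat: float) -> float:
        """Asymptotic p-value under the mixture 0.5*chi2_0 + 0.5*chi2_1.

        `stat` is an observed test statistic (hypothetical input; how the
        statistic itself is computed is not shown here).
        """
        # chi2_0 is a point mass at zero, so a non-positive statistic
        # carries no evidence against homogeneity.
        if stat <= 0:
            return 1.0
        # For stat > 0 only the chi2_1 half of the mixture contributes:
        # P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
        return 0.5 * math.erfc(math.sqrt(stat / 2.0))


    if __name__ == "__main__":
        for observed in (0.0, 1.0, 2.705543, 5.0):
            print(f"statistic = {observed:.4f}, p-value = {chibar_pvalue(observed):.4f}")
    ```

    Under this mixture the 5% critical value is the 90% quantile of $\chi^2_1$ (about 2.706) rather than the usual 95% quantile (about 3.841), because half of the null probability mass sits at zero.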

  • Subjects / Keywords
  • Graduation date
    Spring 2014
  • Type of Item
  • Degree
    Doctor of Philosophy
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.