Missing data imputation in multivariate t distribution with unknown degrees of freedom using expectation maximization algorithm and its stochastic variants

Kinyanjui, Paul Kimani; Tamba, Cox Lwaka; Orawo, Luke Akong’o; Okenye, Justin Obwoge

dc.contributor.author	Kinyanjui, Paul Kimani
dc.contributor.author	Tamba, Cox Lwaka
dc.contributor.author	Orawo, Luke Akong’o
dc.contributor.author	Okenye, Justin Obwoge
dc.date.issued	2020-10
dc.date.accessioned	2024-01-26T08:01:46Z
dc.date.available	2024-01-26T08:01:46Z
dc.identifier.uri	https://www.researchgate.net/publication/346213616_Missing_data_imputation_in_multivariate_t_distribution_with_unknown_degrees_of_freedom_using_expectation_maximization_algorithm_and_its_stochastic_variants
dc.identifier.uri	http://41.89.96.81:8080/xmlui/handle/123456789/3242
dc.description.abstract	Many researchers encounter the missing data problem. It may arise from data omission, nonresponse, death of respondents, or recording errors, among other causes. It is important to find an appropriate imputation technique to fill in the missing positions. In this study, the Expectation Maximization (EM) algorithm and two of its stochastic variants, stochastic EM (SEM) and Monte Carlo EM (MCEM), are employed for missing data imputation and parameter estimation in the multivariate t distribution with unknown degrees of freedom. The imputation efficiencies of the three methods are then compared using the mean square error (MSE) criterion. SEM yields the lowest MSE, making it the most efficient method for data imputation when the data follow the multivariate t distribution. The algorithm’s stochastic nature enables it to avoid saddle points and attain the global maximum, ultimately increasing its efficiency. The EM and MCEM techniques yield nearly identical results: with large sample draws in its E-step, MCEM produces essentially the same results as the deterministic EM. In parameter estimation, the EM and MCEM estimates are relatively close to the maximum likelihood (ML) estimates for the simulated data. This is not the case for SEM, owing to the random nature of the algorithm. Keywords: Expectation maximization (EM), stochastic EM, Monte Carlo EM, unknown degrees of freedom	en_US
dc.language.iso	en	en_US
dc.publisher	Model Assisted Statistics and Applications	en_US
dc.subject	Missing data imputation in multivariate t distribution	en_US
dc.title	Missing data imputation in multivariate t distribution with unknown degrees of freedom using expectation maximization algorithm and its stochastic variants	en_US
dc.type	Article	en_US
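
The abstract describes EM-based imputation under a multivariate t model, scored by MSE on the missing positions. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the degrees of freedom `nu` are fixed and known (the paper estimates them), that every row has at least one observed value, and that missing entries are coded as NaN. The names `em_impute_t` and `imputation_mse` are hypothetical. SEM and MCEM would replace the conditional means below with random draws, which this sketch does not attempt.

```python
import numpy as np

def em_impute_t(X, nu=5.0, n_iter=50):
    """EM imputation under a multivariate t model with FIXED nu (an assumption)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    miss = np.isnan(X)
    # Start from column-mean imputation.
    Xhat = np.where(miss, np.nanmean(X, axis=0), X)
    mu = Xhat.mean(axis=0)
    sigma = np.cov(Xhat, rowvar=False) + 1e-6 * np.eye(p)
    for _ in range(n_iter):
        w = np.empty(n)
        C = np.zeros((p, p))  # accumulated conditional covariance of missing blocks
        for i in range(n):
            m = miss[i]
            o = ~m
            d = X[i, o] - mu[o]
            S_oo = sigma[np.ix_(o, o)]
            sol = np.linalg.solve(S_oo, d)
            # E-step weight: (nu + number observed) / (nu + Mahalanobis distance)
            w[i] = (nu + o.sum()) / (nu + d @ sol)
            if m.any():
                S_mo = sigma[np.ix_(m, o)]
                # Conditional mean of the missing part given the observed part.
                Xhat[i, m] = mu[m] + S_mo @ sol
                C[np.ix_(m, m)] += sigma[np.ix_(m, m)] - S_mo @ np.linalg.solve(S_oo, S_mo.T)
        # M-step: weighted mean and scatter, plus the covariance correction.
        mu = (w[:, None] * Xhat).sum(axis=0) / w.sum()
        R = Xhat - mu
        sigma = ((w[:, None] * R).T @ R + C) / n
    return Xhat, mu, sigma

def imputation_mse(X_true, X_imputed, miss):
    """Mean square error over the positions that were missing."""
    return float(np.mean((X_true[miss] - X_imputed[miss]) ** 2))

# Simulated example: multivariate t data as a scale mixture of normals.
rng = np.random.default_rng(0)
n, nu_true = 300, 5.0
scale = np.array([[1.0, 0.8, 0.5], [0.8, 1.0, 0.6], [0.5, 0.6, 1.0]])
z = rng.multivariate_normal(np.zeros(3), scale, size=n)
u = rng.chisquare(nu_true, size=n) / nu_true
X_true = z / np.sqrt(u)[:, None]

# Delete about 20% of the first two columns; the third stays fully observed.
miss = np.zeros_like(X_true, dtype=bool)
miss[:, :2] = rng.random((n, 2)) < 0.2
X_obs = np.where(miss, np.nan, X_true)

X_imp, mu_hat, sigma_hat = em_impute_t(X_obs, nu=nu_true)
mse = imputation_mse(X_true, X_imp, miss)
```

Keeping the third column fully observed guarantees each row has an observed block to condition on, which keeps the linear solves well defined in this sketch.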