Department of Statistics
University of Manitoba
 
   
   
 
Methodology Research

Currently my theoretical and methodological research focuses on three areas in applied probability, statistical inference and computation.

The first area is estimation and inference in nonlinear semiparametric systems with covariate measurement error. These models are widely used in biostatistics to analyze data from epidemiology and the environmental, medical and health sciences. Measurement error models are also called errors-in-variables models in econometrics, and latent variable models in psychology and other social sciences.

The second area is boundary crossing probabilities (BCP), also called first passage time distributions, for diffusion or Markov processes. The computation of BCP plays an important role in modern finance, for example in barrier option pricing and credit risk modeling. It also arises in biology, computational genetics, engineering reliability, epidemiology, physics, seismology, and statistics.

The third area is Monte Carlo simulation methods for statistical computation. This includes multivariate random sample generation, stochastic optimization, and Bayesian inference. The main effort here is to develop practical and efficient methods and algorithms.

Besides these areas, I am also interested in applied interdisciplinary research in econometrics, environmetrics, epidemiology, medical and health sciences, as well as engineering design and optimization. Some of these research projects are carried out in collaboration with researchers from the subject-matter fields.

More details about my research are given below.


1. Nonlinear Inference with Covariate Measurement Error

The problem of measurement error (ME) arises when a regression analysis involves predictor variables that either cannot be measured directly (latent variables) or are measured with substantial error (imprecise measurements). Examples of such variables include long-term systolic blood pressure, cholesterol level, drug concentration in a patient's blood, exposure to air pollutants or radioactive substances, social ability and wealth. It is well known that statistical methods that ignore ME lead to biased and inconsistent estimates.
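The bias is easiest to see in the simplest case. The following sketch is an illustration added here, not one of the methods in [1]-[6]: it simulates a linear regression with classical additive ME, where the naive slope is known to shrink toward zero by the reliability ratio var(x) / (var(x) + var(u)). The function name, sample size and parameter values are all assumptions chosen for illustration.

```python
import random

def attenuation_demo(n=50_000, beta=2.0, sigma_x=1.0, sigma_u=1.0, seed=0):
    """Compare regression slopes using the true covariate x and its
    error-prone surrogate w = x + u (classical measurement error)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, sigma_x) for _ in range(n)]      # true covariate
    w = [xi + rng.gauss(0.0, sigma_u) for xi in x]       # observed surrogate
    y = [beta * xi + rng.gauss(0.0, 0.5) for xi in x]    # response

    def ols_slope(pred, resp):
        # Ordinary least squares slope of resp on pred.
        mp = sum(pred) / len(pred)
        mr = sum(resp) / len(resp)
        sxy = sum((p - mp) * (r - mr) for p, r in zip(pred, resp))
        sxx = sum((p - mp) ** 2 for p in pred)
        return sxy / sxx

    # Large-sample limit of the naive slope under classical ME.
    reliability = sigma_x ** 2 / (sigma_x ** 2 + sigma_u ** 2)
    return ols_slope(x, y), ols_slope(w, y), beta * reliability

slope_true, slope_naive, slope_limit = attenuation_demo()
```

With equal variances for x and u, the naive slope settles near half of beta, and collecting more data does not remove the gap.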

In statistics, the widely used estimation and inference methods are only approximately consistent and are therefore applicable only when the ME is small. On the other hand, most consistent estimation methods rely on restrictive mathematical assumptions which are difficult or impossible to check in practice. Another challenging problem in nonlinear inference with ME is that the objective function to be minimized or maximized typically involves multiple integrals with no closed form, so that the resulting numerical optimization is difficult or intractable.

My research in this area focuses on consistent estimation approaches which have wider applicability. In particular, [1, 2] developed a second-order least squares (SLS) and a simulation-based estimation method for general nonlinear models with Berkson-type ME. To deal with classical ME in limited dependent variable models including censored linear and categorical response variable models, [3] proposed a two-stage instrumental variable (IV) approach, whereas [4, 5] derived the maximum likelihood and the method of moments estimators. Currently, these approaches are being combined to solve harder problems in general nonlinear models with classical ME.
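To give a feel for the simulation-based idea, here is a minimal sketch (an illustration added here, not the estimator of [1, 2]). Under Berkson-type ME the true covariate is x = z + delta, where z is observed, so fitting a nonlinear model g requires moments such as E[g(z + delta) | z], which usually have no closed form. The sketch replaces that integral with an average over simulated errors. The exponential g, the normal error distribution and all names are assumptions, chosen because this particular case has a closed form to compare against.

```python
import math
import random

def simulated_mean(z, beta, sigma_delta, n_draws=100_000, seed=0):
    """Monte Carlo approximation of E[g(z + delta)] with g(x) = exp(beta * x)
    and delta ~ N(0, sigma_delta^2) (Berkson error: x = z + delta)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        x = z + rng.gauss(0.0, sigma_delta)   # simulated true covariate
        total += math.exp(beta * x)
    return total / n_draws

def exact_mean(z, beta, sigma_delta):
    """Closed form of the same moment (a lognormal mean), used as a check."""
    return math.exp(beta * z + 0.5 * (beta * sigma_delta) ** 2)

approx = simulated_mean(z=1.0, beta=0.5, sigma_delta=1.0)
exact = exact_mean(z=1.0, beta=0.5, sigma_delta=1.0)
```

In a simulation-based estimator, a moment of this kind would typically be recomputed inside the objective function at each trial value of the parameters.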

Recently, [6] extended the second-order least squares and simulation-based methods to mixed effects models for panel data (longitudinal data) and thus provided a unified framework for estimation of these models and Berkson ME models. Historically, these two classes of models have very different origins and have therefore been treated separately in the literature.

2. Boundary Crossing Probabilities for Diffusion Processes

This research is concerned with the statistical distribution of the time at which a random process first reaches a threshold, for example, the time when the population of an endangered species reaches a certain critical level, or the time when the number of individuals infected with a disease reaches a limit. The computation of such first passage time (FPT) distributions, or boundary crossing probabilities (BCP), is crucial in many scientific investigations.

However, the computation of BCP for nonlinear boundaries is a long-standing and challenging problem, and explicit analytic solutions exist only in a few special cases. Traditionally, mainstream research has focused on solving certain integral or differential equations for the FPT density. These methods are usually limited to one-sided, smooth boundaries, and the accuracy of the numerical solutions is difficult to assess.

In contrast, [7] proposed a new direction of research by calculating the boundary crossing probabilities directly. This BCP approach yields an explicit integral representation of the BCP for Brownian motion crossing any piecewise linear boundary. The derived formula can be used to obtain approximations of the BCP for general nonlinear boundaries. This approach was subsequently extended to two-sided boundary crossing problems in [8], where an approximation error rate was also derived. To date, this error rate remains the best obtained in the literature. Recently, [9] extended this approach further to a fairly large class of diffusion processes which are widely used in applications.
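The flavour of a direct calculation for a piecewise linear boundary can be sketched as follows. This code is not taken from [7]; it relies on the standard Brownian bridge result that, conditional on its endpoint values w_s and w_t, Brownian motion stays below a straight segment from (s, b_s) to (t, b_t) with probability 1 - exp(-2 (b_s - w_s)(b_t - w_t) / (t - s)), provided both endpoints lie below the boundary. Multiplying these factors over the segments and averaging over simulated node values gives a one-sided non-crossing probability. The function names, the Monte Carlo averaging and the test boundary are assumptions made for illustration.

```python
import math
import random

def noncrossing_prob(times, bounds, n_sims=50_000, seed=0):
    """Estimate P(W(t) < b(t) for all t in [0, T]) for standard Brownian
    motion W and a piecewise linear boundary b.

    times  : node times t_0 = 0 < t_1 < ... < t_n = T
    bounds : boundary values b(t_0), ..., b(t_n), with b(0) > 0
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        w_prev, weight = 0.0, 1.0
        for i in range(1, len(times)):
            dt = times[i] - times[i - 1]
            w = w_prev + rng.gauss(0.0, math.sqrt(dt))   # W at the next node
            if w >= bounds[i]:                           # crossed at a node
                weight = 0.0
                break
            # Probability that the bridge between the two nodes stays below
            # the linear segment joining the boundary values.
            gap_prev = bounds[i - 1] - w_prev
            gap = bounds[i] - w
            weight *= 1.0 - math.exp(-2.0 * gap_prev * gap / dt)
            w_prev = w
        total += weight
    return total / n_sims

def linear_noncrossing_exact(a, b, T):
    """Known closed form of P(W(t) < a + b t for all t <= T), a > 0."""
    phi = lambda v: 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))
    root_t = math.sqrt(T)
    return phi((a + b * T) / root_t) - math.exp(-2.0 * a * b) * phi((b * T - a) / root_t)

# Check on a boundary that is linear overall but split into two segments.
estimate = noncrossing_prob([0.0, 0.5, 1.0], [1.0, 1.25, 1.5])
reference = linear_noncrossing_exact(1.0, 0.5, 1.0)
```

A general nonlinear boundary could be handled in the same way by first replacing it with a piecewise linear approximation on a grid of node times; the quality of that approximation then determines the error.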

3. Monte Carlo Methods for Statistical Computation

Modern research activities in science and engineering often involve numerical optimization and integration. Moreover, complex data structures require highly sophisticated statistical modeling and inference procedures. Consequently, practical and efficient computational methods and algorithms are crucial. The task is particularly challenging when the problem is high-dimensional.

Article [10] developed a discretization-based multivariate sampling algorithm, which is fairly efficient in generating a large sample of independent points from a relatively high-dimensional distribution without knowing the normalizing constant. This method overcomes many typical drawbacks of Markov chain Monte Carlo (MCMC) methods, such as problems associated with ill-shaped or disconnected sample spaces. This algorithm has been successfully applied to a large class of global optimization problems in engineering design and reliability assessment, including [11, 12]. It has also been applied to Bayesian finite mixture modeling in the analysis of genetic and environmental data in [13, 14].
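The general idea of sampling through a random discretization can be sketched in a few lines. This is a simplified illustration and should not be read as the algorithm of [10]: it scatters uniform base points over a box, weights each by the unnormalized density, and then draws from the resulting discrete distribution. The two-bump target, the box limits, the sample sizes and all function names are assumptions. Because the final draw is made with replacement, output points can repeat.

```python
import math
import random

def unnormalized_density(x, y):
    """Two well-separated bumps; the normalizing constant is never used."""
    a = math.exp(-0.5 * ((x - 3.0) ** 2 + (y - 3.0) ** 2))
    b = math.exp(-0.5 * ((x + 3.0) ** 2 + (y + 3.0) ** 2))
    return a + b

def discretized_sample(density, lo, hi, n_base=40_000, n_out=5_000, seed=0):
    """Draw n_out points approximately distributed as density on the
    square [lo, hi] x [lo, hi]."""
    rng = random.Random(seed)
    # Step 1: random discretization of the square.
    base = [(rng.uniform(lo, hi), rng.uniform(lo, hi)) for _ in range(n_base)]
    # Step 2: weight each base point by the unnormalized density.
    weights = [density(x, y) for x, y in base]
    # Step 3: draw from the discrete distribution on the base points.
    return rng.choices(base, weights=weights, k=n_out)

sample = discretized_sample(unnormalized_density, -6.0, 6.0)
frac_upper = sum(1 for x, y in sample if x > 0) / len(sample)
```

Both bumps receive about half of the sample, even though they are separated by a region of negligible density; a random-walk MCMC chain started in one bump would rarely move to the other.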

References

  1. Wang, L. (2003). Estimation of nonlinear Berkson-type measurement error models. Statistica Sinica, 13, 1201-1210.
  2. Wang, L. (2004). Estimation of nonlinear models with Berkson measurement errors. Annals of Statistics, 32, 2559-2579.
  3. Wang, L. and Hsiao, C. (2007). Two-stage estimation of limited dependent variable models with errors-in-variables. Econometrics Journal, 10, 426-438.
  4. Wang, L. (1998). Estimation of censored linear errors-in-variables models. Journal of Econometrics, 84, 383-400.
  5. Wang, L. (2002). A simple adjustment for measurement errors in some limited dependent variable models. Statistics & Probability Letters, 58, 427-433.
  6. Wang, L. (2007). A unified approach to estimation of nonlinear mixed effects and Berkson measurement error models. Canadian Journal of Statistics, 35, 233-248.
  7. Wang, L. and Pötzelberger, K. (1997). Boundary crossing probability for Brownian motion and general boundaries. Journal of Applied Probability, 34, 54-65.
  8. Pötzelberger, K. and Wang, L. (2001). Boundary crossing probability for Brownian motion. Journal of Applied Probability, 38, 152-164.
  9. Wang, L. and Pötzelberger, K. (2007). Crossing probability for some diffusion processes with piecewise continuous boundaries. Methodology and Computing in Applied Probability, 9, 21-40.
  10. Fu, J. C. and Wang, L. (2002). A random-discretization based Monte Carlo sampling method and its application. Methodology and Computing in Applied Probability, 4, 5-25.
  11. Wang, L., Shan, S. and Wang, G. G. (2004). Mode-pursuing sampling method for global optimization on expensive black-box functions. Engineering Optimization, 36, 419-438.
  12. Wang, G. G., Wang, L. and Shan, S. (2005). Reliability assessment using discriminative sampling and metamodeling. SAE Transactions Journal of Passenger Cars: Mechanical Systems, 114, 291-300.
  13. Wang, L. and Fu, J. C. (2007). A practical sampling approach for a Bayesian mixture model with unknown number of components. Statistical Papers, 48, 631-653.
  14. Xue, L., Fu, J. C., Wang, F. and Wang, L. (2005). A mixture model approach to analyzing major element chemistry data of the Changjiang (Yangtze River). Environmetrics, 16, 305-318.

liqun.wang@umanitoba.ca | 332 Machray Hall, Department of Statistics, University of Manitoba | 204-474-6270