ellipsoidhull             package:cluster             R Documentation

Compute the Ellipsoid Hull or Spanning Ellipsoid of a Point Set

Description:

     Compute the "ellipsoid hull" or "spanning ellipsoid", i.e. the
     ellipsoid of minimal volume ('area' in 2D) such that all given
     points lie just inside or on the boundary of the ellipsoid.

Usage:

     ellipsoidhull(x, tol=0.01, maxit=5000,
                   ret.wt = FALSE, ret.sqdist = FALSE, ret.pr = FALSE)
     ## S3 method for class 'ellipsoid':
     print(x, digits = max(1, getOption("digits") - 2), ...)

Arguments:

       x: the n p-dimensional points as a numeric n x p matrix.

     tol: convergence tolerance for Titterington's algorithm. Setting
          this to much smaller values may drastically increase the
          number of iterations needed, and you may want to increase
          'maxit' as well.

   maxit: integer giving the maximal number of iteration steps for the
          algorithm.

ret.wt, ret.sqdist, ret.pr: logicals indicating if additional
          information should be returned, 'ret.wt' specifying the
          _weights_, 'ret.sqdist' the _*sq*uared *dist*ances_ and
          'ret.pr' the final *pr*obabilities in the algorithm.

digits,...: the usual arguments to 'print' methods.

Details:

     The "spanning ellipsoid" algorithm is said to stem from
     Titterington (1976), in Pison et al. (1999) who use it for
     'clusplot.default'.
     The problem can be seen as a special case of the "Min.Vol."
     ellipsoid of which a more flexible and general implementation is
     'cov.mve' in the 'MASS' package.

Value:

     an object of class '"ellipsoid"', basically a 'list' with several
     components, comprising at least

     cov: p x p _covariance_ matrix describing the ellipsoid.

     loc: p-dimensional location of the ellipsoid center.

      d2: average squared radius.
          Further, d2 = t^2, where t is "the value of a t-statistic on
          the ellipse boundary" (from 'ellipse' in the 'ellipse'
          package), and hence, more usefully, 'd2 = qchisq(alpha, df =
          p)', where 'alpha' is the confidence level for p-variate
          normally distributed data with location and covariance 'loc'
          and 'cov' to lie inside the ellipsoid.

      wt: the vector of weights iff 'ret.wt' was true.

  sqdist: the vector of squared distances iff 'ret.sqdist' was true.

    prob: the vector of algorithm probabilities iff 'ret.pr' was true.

      it: number of iterations used.

tol, maxit: just the input arguments, see above.

     eps: the achieved tolerance which is the maximal squared radius
          minus p.

    ierr: error code as from the algorithm; '0' means _ok_.

    conv: logical indicating if the algorithm converged.  This is
          defined as 'it < maxit && ierr == 0'.

Author(s):

     Martin Maechler did the present class implementation; Rousseeuw
     et al did the underlying code.

References:

     Pison, G., Struyf, A. and Rousseeuw, P.J. (1999) Displaying a
     Clustering with CLUSPLOT, _Computational Statistics and Data
     Analysis_, *30*, 381-392.
     A version of this is available as technical report from

     D.M. Titterington (1976) Algorithms for computing D-optimal
     design on finite design spaces.  In _Proc. of the 1976 Conf. on
     Information Science and Systems_, 213-216; Johns Hopkins
     University.

See Also:

     'predict.ellipsoid' which is also the 'predict' method for
     'ellipsoid' objects.  'volume.ellipsoid' for an example of
     'manual' 'ellipsoid' object construction;
     further 'ellipse' from package 'ellipse' and 'ellipsePoints' from
     package 'sfsmisc'.

     'chull' for the convex hull, 'clusplot' which makes use of this;
     'cov.mve'.
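
     The relation 'd2 = qchisq(alpha, df = p)' given under Value can
     be illustrated with base R alone.  The following is only a
     sketch: the values of 'alpha', 'loc' and 'Sigma', the seed and
     the sample size are assumptions made for illustration and do not
     come from 'ellipsoidhull' itself.

```r
# Assumed illustrative values (not taken from the help page):
p     <- 2
alpha <- 0.95                          # desired confidence level
loc   <- c(0, 10)                      # ellipsoid center
Sigma <- matrix(c(1, 2, 2, 5), p, p)   # positive definite covariance

# Squared radius matching the confidence level, as stated under Value
d2 <- qchisq(alpha, df = p)

# Simulate p-variate normal data with the given location and covariance
set.seed(42)                           # assumed seed, for reproducibility
n <- 1e5
Z <- matrix(rnorm(n * p), n, p)
X <- sweep(Z %*% chol(Sigma), 2, loc, "+")

# A point is inside the ellipsoid when its squared Mahalanobis
# distance from 'loc' (with respect to 'Sigma') is at most d2
inside   <- mahalanobis(X, center = loc, cov = Sigma) <= d2
coverage <- mean(inside)
print(c(alpha = alpha, coverage = coverage))
```

     The empirical coverage should come out close to 'alpha'; the
     agreement is a property of the normal model, not of any
     particular data set.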
Examples:

     x <- rnorm(100)
     xy <- unname(cbind(x, rnorm(100) + 2*x + 10))
     exy <- ellipsoidhull(xy)
     exy # >> calling print.ellipsoid()

     plot(xy)
     lines(predict(exy))
     points(rbind(exy$loc), col = "red", cex = 3, pch = 13)

     exy <- ellipsoidhull(xy, tol = 1e-7, ret.wt = TRUE, ret.sqdist = TRUE)
     str(exy) # had small `tol', hence many iterations
     (ii <- which(zapsmall(exy $ wt) > 1e-6)) # only about 4 to 6 points
     round(exy$wt[ii],3); sum(exy$wt[ii]) # sum to 1
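
     The diagnostic components described under Value ('conv', 'eps',
     'wt') can be recomputed by hand from the other returned
     components.  The following is a minimal sketch; the seed is an
     assumption added for reproducibility, and it is assumed that the
     "maximal squared radius" in the description of 'eps' refers to
     the largest entry of 'sqdist'.

```r
library(cluster)  # provides ellipsoidhull()

set.seed(1)       # assumed seed, only so the sketch is reproducible
x  <- rnorm(100)
xy <- unname(cbind(x, rnorm(100) + 2*x + 10))

fit <- ellipsoidhull(xy, ret.wt = TRUE, ret.sqdist = TRUE)

# 'conv' as documented: it < maxit && ierr == 0
conv_by_hand <- fit$it < fit$maxit && fit$ierr == 0

# 'eps' as documented: maximal squared radius minus p
# (assumption: the squared radii are the entries of 'sqdist')
p <- ncol(xy)
eps_by_hand <- max(fit$sqdist) - p

print(c(conv = fit$conv, by_hand = conv_by_hand))
print(c(eps = fit$eps, by_hand = eps_by_hand))
print(sum(fit$wt))  # the weights over all n points
```

     When 'conv' is FALSE after lowering 'tol', rerunning with a
     larger 'maxit' is the remedy suggested under Arguments.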