quantile                package:stats                R Documentation

_S_a_m_p_l_e _Q_u_a_n_t_i_l_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     The generic function 'quantile' produces sample quantiles
     corresponding to the given probabilities. The smallest observation
     corresponds to a probability of 0 and the largest to a probability
     of 1.

_U_s_a_g_e:

     quantile(x, ...)

     ## Default S3 method:
     quantile(x, probs = seq(0, 1, 0.25), na.rm = FALSE,
              names = TRUE, type = 7, ...)

_A_r_g_u_m_e_n_t_s:

       x: numeric vector whose sample quantiles are wanted.   'NA' and
          'NaN' values are not allowed unless 'na.rm' is 'TRUE'.

   probs: numeric vector of probabilities with values in [0,1].  (As
          of R 2.8.0, values up to 2e-14 outside that range are
          accepted and moved to the nearby endpoint.)

   na.rm: logical; if true, any 'NA' and 'NaN' values are removed from
          'x' before the quantiles are computed.

   names: logical; if true, the result has a 'names' attribute.  Set to
          'FALSE' for speedup with many 'probs'.

    type: an integer between 1 and 9 selecting one of the nine quantile
          algorithms detailed below to be used.

     ...: further arguments passed to or from other methods.
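
     As a minimal sketch of how 'na.rm' and 'names' behave, assuming
     the default 'type = 7' (the vector 'x' is made-up illustrative
     data, not taken from this page):

```r
# Made-up data containing one missing value
x <- c(1, 2, NA, 4, 5)

# With the default na.rm = FALSE the call signals an error
res <- try(quantile(x), silent = TRUE)
failed <- inherits(res, "try-error")

# Drop the NA first; names = FALSE returns a bare numeric vector
q <- quantile(x, probs = c(0.25, 0.5), na.rm = TRUE, names = FALSE)
q  # 1.75 3.00
```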

_D_e_t_a_i_l_s:

     A vector of length 'length(probs)' is returned; if 'names = TRUE',
     it has a 'names' attribute.

     'NA' and 'NaN' values in 'probs' are propagated to the result.
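
     The two statements above can be checked with a short sketch
     (the probabilities are illustrative and the default 'type = 7'
     is assumed):

```r
# 'probs' of length 3, one of them NA
q <- quantile(1:10, probs = c(0.25, NA, 0.5))

length(q)   # same as length(probs)
is.na(q)    # only the NA probability yields an NA quantile
```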

_T_y_p_e_s:

     'quantile' returns estimates of underlying distribution quantiles
     based on one or two order statistics from the supplied elements in
     'x' at probabilities in 'probs'.  One of the nine quantile
     algorithms discussed in Hyndman and Fan (1996), selected by
     'type', is employed.

     Sample quantiles of type i are defined by

              Q[i](p) = (1 - gamma) x[j] + gamma x[j+1],

     where 1 <= i <= 9, (j-m)/n <= p < (j-m+1)/n, x[j] is the jth
     order statistic, n is the sample size, and m is a constant
     determined by the sample quantile type. Here gamma depends on the
     fractional part of g = np+m-j.

     For the continuous sample quantile types (4 through 9), the sample
     quantiles can be obtained by linear interpolation between the
     points (p(k), x[k]), where x[k] is the kth order statistic and
     p(k) is given by:

             p(k) = (k - alpha) / (n - alpha - beta + 1),

     where alpha and beta are constants determined by the type.
     Further, m = alpha + p * (1 - alpha - beta), and gamma = g.
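
     The definitions above can be turned into a small sketch for the
     continuous types. The helper 'hf_quantile' is hypothetical (it is
     not part of 'stats'), the (alpha, beta) pairs are read off the
     p(k) formulas listed for types 4 through 9, and the small fuzz
     term guarding against rounding error is an assumption of this
     sketch:

```r
# Hypothetical helper, continuous types 4-9 only
hf_quantile <- function(x, p, type = 7) {
  # (alpha, beta) for types 4, 5, 6, 7, 8, 9, in that order
  ab <- rbind(c(0, 1), c(0.5, 0.5), c(0, 0),
              c(1, 1), c(1/3, 1/3), c(3/8, 3/8))[type - 3, ]
  alpha <- ab[1]
  beta  <- ab[2]
  xs <- sort(x)
  n  <- length(xs)
  m  <- alpha + p * (1 - alpha - beta)
  j  <- floor(n * p + m + 4 * .Machine$double.eps)  # fuzz is an assumption
  gamma <- n * p + m - j                # gamma = g for continuous types
  lo <- xs[pmin(pmax(j, 1), n)]         # x[j], clamped to the sample range
  hi <- xs[pmin(pmax(j + 1, 1), n)]     # x[j+1], clamped likewise
  (1 - gamma) * lo + gamma * hi
}

x <- c(2.1, 7.4, 3.3, 9.8, 5.0, 1.2, 8.6)   # made-up data
p <- c(0, 0.1, 0.25, 0.5, 0.9, 1)
hf_quantile(x, p, type = 7)
quantile(x, p, type = 7, names = FALSE)     # for comparison
```

     Clamping the indices to 1..n reproduces the convention stated in
     the Description that probability 0 maps to the smallest and
     probability 1 to the largest observation.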

     *Discontinuous sample quantile types 1, 2, and 3*


     _T_y_p_e _1 Inverse of empirical distribution function.

     _T_y_p_e _2 Similar to type 1 but with averaging at discontinuities.

     _T_y_p_e _3 SAS definition: nearest even order statistic.

     *Continuous sample quantile types 4 through 9*


     _T_y_p_e _4 p(k) = k / n. That is, linear interpolation of the
          empirical cdf.


     _T_y_p_e _5 p(k) = (k - 0.5) / n. That is a piecewise linear function
          where the knots are the values midway through the steps of
          the empirical cdf. This is popular amongst hydrologists.


     _T_y_p_e _6 p(k) = k / (n + 1). Thus p(k) = E[F(x[k])]. This is used by
          Minitab and by SPSS.


     _T_y_p_e _7 p(k) = (k - 1) / (n - 1). In this case, p(k) =
          mode[F(x[k])]. This is used by S.


     _T_y_p_e _8 p(k) = (k - 1/3) / (n + 1/3). Then p(k) =~ median[F(x[k])].
          The resulting quantile estimates are approximately
          median-unbiased regardless of the distribution of 'x'.


     _T_y_p_e _9 p(k) = (k - 3/8) / (n + 1/4). The resulting quantile
          estimates are approximately unbiased for the expected order
          statistics if 'x' is normally distributed.


     Hyndman and Fan (1996) recommend type 8. The default method is
     type 7, as used by S and by R < 2.0.0.
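
     One way to check the p(k) formulas for types 4 through 9 is to
     evaluate each type at its own p(k): the result should be the kth
     order statistic. The sketch below assumes the (alpha, beta) pairs
     implied by those formulas; the data are made up:

```r
x <- c(2.1, 7.4, 3.3, 9.8, 5.0)   # made-up data
n <- length(x)
k <- seq_len(n)

# (alpha, beta) per type, read off the p(k) formulas for types 4-9
ab <- list(`4` = c(0, 1),     `5` = c(0.5, 0.5), `6` = c(0, 0),
           `7` = c(1, 1),     `8` = c(1/3, 1/3), `9` = c(3/8, 3/8))

# One column per type: quantiles evaluated at that type's p(k)
recovered <- sapply(names(ab), function(t) {
  a <- ab[[t]][1]
  b <- ab[[t]][2]
  pk <- (k - a) / (n - a - b + 1)
  quantile(x, pk, type = as.integer(t), names = FALSE)
})

recovered   # every column should match sort(x) up to rounding error
```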

_A_u_t_h_o_r(_s):

     of the version used in R >= 2.0.0, Ivan Frohne and Rob J Hyndman.

_R_e_f_e_r_e_n_c_e_s:

     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S
     Language_. Wadsworth & Brooks/Cole.

     Hyndman, R. J. and Fan, Y. (1996) Sample quantiles in statistical
     packages, _American Statistician_, *50*, 361-365.

_S_e_e _A_l_s_o:

     'ecdf' for empirical distributions of which 'quantile' is an
     inverse; 'boxplot.stats' and 'fivenum' for computing other
     versions of quartiles, etc.
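
     A brief sketch of the relationships mentioned above, using
     made-up data: type 1 quantiles invert the step function returned
     by 'ecdf', and 'fivenum' can give quartiles that differ from the
     default 'quantile' result.

```r
x <- c(3, 1, 4, 1, 5, 9, 2, 6)   # made-up data
Fn <- ecdf(x)
p  <- c(0.1, 0.3, 0.5, 0.9)

# Type 1 returns the smallest observation whose ecdf value reaches p
q <- quantile(x, p, type = 1, names = FALSE)
reaches_p <- Fn(q) >= p           # TRUE for every p

# Hinges from fivenum versus default (type 7) quartiles
y  <- c(1, 2, 3, 4)
f5 <- fivenum(y)                     # 1.0 1.5 2.5 3.5 4.0
q7 <- quantile(y, names = FALSE)     # 1.00 1.75 2.50 3.25 4.00
```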

_E_x_a_m_p_l_e_s:

     x <- rnorm(1001)
     quantile(x)  # extremes and quartiles by default
     # probabilities given as percentages; the NA propagates to the result
     quantile(x, probs = c(0.1, 0.5, 1, 2, 5, 10, 50, NA) / 100)

     ### Compare different types
     p <- c(0.1, 0.5, 1, 2, 5, 10, 50) / 100
     # one row per type, one column per probability
     res <- matrix(NA_real_, 9, 7)
     for (type in 1:9) res[type, ] <- y <- quantile(x, p, type = type)
     dimnames(res) <- list(1:9, names(y))
     round(res, 3)