smooth.f                package:boot                R Documentation

_S_m_o_o_t_h _D_i_s_t_r_i_b_u_t_i_o_n_s _o_n _D_a_t_a _P_o_i_n_t_s

_D_e_s_c_r_i_p_t_i_o_n:

     This function uses the method of frequency smoothing to find a
     distribution on a data set which has a required value, 'theta',
     of the statistic of interest.  The method results in
     distributions which vary smoothly with 'theta'.

_U_s_a_g_e:

     smooth.f(theta, boot.out, index=1, t=boot.out$t[, index], width=0.5)

_A_r_g_u_m_e_n_t_s:

   theta: The required value for the statistic of interest.  If 'theta'
          is a vector, a separate distribution will be found for each
          element of 'theta'. 

boot.out: A bootstrap output object returned by a call to 'boot'.   

   index: The index of the variable of interest in the output of
          'boot.out$statistic'. This argument is ignored if 't' is
          supplied.  'index' must be a scalar. 

       t: The bootstrap values of the statistic of interest.  This must
          be a vector of length 'boot.out$R' and the values must be in
          the same order as the bootstrap replicates in 'boot.out'. 

   width: The standardized width for the kernel smoothing.  The
          smoothing uses a value of 'width*s' for epsilon, where 's' is
          the bootstrap estimate of the standard error of the
          statistic of interest.  'width' should take a value in the
          range (0.2, 1) to produce a reasonable smoothed distribution.
          If 'width' is too large then the distribution becomes closer
          to uniform.

_D_e_t_a_i_l_s:

     The new distributional weights are found by applying a normal
     kernel smoother to the observed values of 't' weighted by the
     observed frequencies in the bootstrap simulation.  The resulting
     distribution may not have a parameter value exactly equal to the
     required value 'theta', but it will typically have a value which
     is close to 'theta'.  The details of how this method works can be
     found in Davison, Hinkley and Worton (1995) and Section 3.9.2 of
     Davison and Hinkley (1997).
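
     The smoothing step described above can be sketched in base R as
     follows.  This is only an illustration under stated assumptions,
     not the code used by 'smooth.f': the helper name
     'smooth_freq_sketch' is hypothetical, strata are ignored, and
     'sd(t)' stands in for the bootstrap standard error estimate 's'.
     The frequency matrix 'freq' (one row per replicate, one column per
     data point) is simulated here rather than taken from a 'boot'
     object.

```r
# Hypothetical helper, not part of package 'boot': unstratified
# frequency smoothing with a normal kernel.
smooth_freq_sketch <- function(theta, freq, t, width = 0.5) {
    eps <- width * sd(t)                # bandwidth epsilon = width * s (assumed s = sd(t))
    k <- dnorm((theta - t) / eps)       # kernel weight for each bootstrap replicate
    p <- as.vector(crossprod(freq, k))  # kernel-weighted frequency of each data point
    p / sum(p)                          # normalise to a probability vector
}

set.seed(1)
x <- rexp(20)                           # illustrative data set
R <- 200                                # number of simulated replicates
# R x n matrix of resampling frequencies under uniform resampling
freq <- t(rmultinom(R, size = length(x), prob = rep(1, length(x))))
t <- as.vector(freq %*% x) / length(x)  # bootstrap replicates of the sample mean

p <- smooth_freq_sketch(theta = 1.2 * mean(x), freq = freq, t = t)
sum(p)       # total probability of the smoothed distribution
sum(p * x)   # mean under the smoothed distribution, to compare with theta
```

     Comparing 'sum(p * x)' with the requested 'theta' shows how
     closely the smoothed distribution matches the target for a given
     'width'.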

_V_a_l_u_e:

     If 'length(theta)' is 1 then a vector with the same length as the
     data set 'boot.out$data' is returned.  The value in position 'i'
     is the probability to be given to the data point in position 'i'
     so that the distribution has parameter value approximately equal
     to 'theta'.  If 'length(theta)' is greater than 1 then the
     returned value is a matrix with 'length(theta)' rows, each of
     which corresponds to a distribution with parameter value
     approximately equal to the corresponding value of 'theta'.
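
     The two return shapes can be checked with a small run such as the
     one below.  It is a sketch only: the weighted-mean statistic
     'mean.w', the use of the 'city' data shipped with 'boot', and the
     offsets of 10 around the observed value are choices made for this
     illustration.

```r
library(boot)

# Illustrative weighted mean, written for stype = "w"
mean.w <- function(d, w) sum(d * w) / sum(w)

set.seed(42)
b <- boot(city$x, mean.w, R = 199, stype = "w")

p1 <- smooth.f(b$t0, b)                      # scalar theta
pm <- smooth.f(c(b$t0 - 10, b$t0 + 10), b)   # vector theta of length 2

length(p1)   # one probability per data point
dim(pm)      # one row per element of theta
```

     Each row of 'pm' can be passed as the 'weights' argument of a
     further call to 'boot', in the same way as the single vector 'p1'.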

_R_e_f_e_r_e_n_c_e_s:

     Davison, A.C. and Hinkley, D.V. (1997) _Bootstrap Methods and
     Their Application_. Cambridge University Press.

     Davison, A.C., Hinkley, D.V. and Worton, B.J. (1995) Accurate and
     efficient construction of bootstrap likelihoods. _Statistics and
     Computing_, *5*, 257-264.

_S_e_e _A_l_s_o:

     'boot', 'exp.tilt', 'tilt.boot'

_E_x_a_m_p_l_e_s:

     # Example 9.8 of Davison and Hinkley (1997) requires tilting the resampling
     # distribution of the studentized statistic to be centred at the observed
     # value of the test statistic 1.84.  In the book exponential tilting was used
     # but it is also possible to use smooth.f.
     library(boot)  # provides boot, smooth.f, imp.prob and the gravity data
     grav1 <- gravity[as.numeric(gravity[,2])>=7,]
     grav.fun <- function(dat, w, orig)
     {    strata <- tapply(dat[, 2], as.numeric(dat[, 2]))
          d <- dat[, 1]
          ns <- tabulate(strata)
          w <- w/tapply(w, strata, sum)[strata]
          mns <- tapply(d * w, strata, sum)
          mn2 <- tapply(d * d * w, strata, sum)
          s2hat <- sum((mn2 - mns^2)/ns)
          c(mns[2]-mns[1], s2hat, (mns[2]-mns[1]-orig)/sqrt(s2hat))
     }
     grav.z0 <- grav.fun(grav1,rep(1,nrow(grav1)),0)
     grav.boot <- boot(grav1, grav.fun, R=499, stype="w", 
                       strata=grav1[,2], orig=grav.z0[1])
     grav.sm <- smooth.f(grav.z0[3], grav.boot, index=3)

     # Now we can run another bootstrap using these weights
     grav.boot2 <- boot(grav1, grav.fun, R=499, stype="w", 
                        strata=grav1[,2], orig=grav.z0[1],
                        weights=grav.sm)

     # Estimated p-values can be found from these as follows
     mean(grav.boot$t[,3] >= grav.z0[3])
     imp.prob(grav.boot2,t0=-grav.z0[3],t=-grav.boot2$t[,3])

     # Note that for the importance sampling probability we must 
     # multiply everything by -1 to ensure that we find the correct
     # probability.  Raw resampling is not reliable for probabilities
     # greater than 0.5. Thus
     1-imp.prob(grav.boot2,index=3,t0=grav.z0[3])$raw
     # can give very strange results (negative probabilities).