dpill                package:KernSmooth                R Documentation

Select a Bandwidth for Local Linear Regression

Description:

     Use direct plug-in methodology to select the bandwidth of a local
     linear Gaussian kernel regression estimate, as described by
     Ruppert, Sheather and Wand (1995).

Usage:

     dpill(x, y, blockmax = 5, divisor = 20, trim = 0.01, proptrun = 0.05,
           gridsize = 401L, range.x, truncate = TRUE)

Arguments:

       x: vector of x data. Missing values are not accepted.

       y: vector of y data. This must be the same length as 'x', and
          missing values are not accepted.

blockmax: the maximum number of blocks of the data for construction of
          an initial parametric estimate.

 divisor: the value that the sample size is divided by to determine a
          lower limit on the number of blocks of the data for
          construction of an initial parametric estimate.

    trim: the proportion of the sample trimmed from each end in the 'x'
          direction before application of the plug-in methodology.

proptrun: the proportion of the range of 'x' at each end truncated in
          the functional estimates.

gridsize: number of equally-spaced grid points over which the function
          is to be estimated.

 range.x: vector containing the minimum and maximum values of 'x' at
          which to compute the estimate. For density estimation the
          default is the minimum and maximum data values with 5% of the
          range added to each end. For regression estimation the
          default is the minimum and maximum data values.

truncate: logical flag: if 'TRUE', data with 'x' values outside the
          range specified by 'range.x' are ignored.

Details:

     The direct plug-in approach is used: unknown functionals that
     appear in expressions for the asymptotically optimal bandwidths
     are replaced by kernel estimates. The kernel is the standard
     normal density. Least squares quartic fits over blocks of data
     are used to obtain an initial estimate. Mallows' Cp is used to
     select the number of blocks.

Value:

     the selected bandwidth.

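     As an illustration of the 'trim' argument, the following sketch
     shows one way that trimming a proportion of the sample from each
     end in the 'x' direction could be carried out. It is an assumption
     made for illustration and not a transcription of the internals of
     'dpill'; 'trim_x' is a hypothetical helper that is not part of
     KernSmooth, and the data are simulated.

```r
# Hypothetical helper (not part of KernSmooth): drop a proportion 'trim'
# of the observations from each end of the sample, ordered by x.
trim_x <- function(x, y, trim = 0.01) {
  stopifnot(length(x) == length(y), !anyNA(x), !anyNA(y),
            trim >= 0, trim < 0.5)
  ord <- order(x)                       # indices that sort the data by x
  k <- floor(trim * length(x))          # number of points removed at each end
  keep <- ord[seq.int(k + 1, length(x) - k)]
  list(x = x[keep], y = y[keep])
}

# Simulated data, for illustration only.
set.seed(1)
x <- runif(200)
y <- sin(2 * pi * x) + rnorm(200, sd = 0.2)

trimmed <- trim_x(x, y, trim = 0.01)
length(trimmed$x)   # 196: two points dropped from each end
```

     With the default 'trim = 0.01' and 200 observations, two points
     are dropped at each end, so extreme 'x' values have less
     influence on the subsequent steps.
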
Warning:

     If there are severe irregularities (i.e. outliers, sparse regions)
     in the 'x' values then the local polynomial smooths required for
     the bandwidth selection algorithm may become degenerate and the
     function will crash. Outliers in the 'y' direction may lead to
     deterioration of the quality of the selected bandwidth.

References:

     Ruppert, D., Sheather, S. J. and Wand, M. P. (1995). An effective
     bandwidth selector for local least squares regression. _Journal of
     the American Statistical Association_, *90*, 1257-1270.

     Wand, M. P. and Jones, M. C. (1995). _Kernel Smoothing._ Chapman
     and Hall, London.

See Also:

     'ksmooth', 'locpoly'.

Examples:

     library(KernSmooth)
     data(geyser, package = "MASS")
     x <- geyser$duration
     y <- geyser$waiting
     plot(x, y)
     # Select the bandwidth, then use it for a local polynomial fit
     h <- dpill(x, y)
     fit <- locpoly(x, y, bandwidth = h)
     lines(fit)
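
     Because missing values are not accepted and the selector may fail
     on irregular data, a call can be prepared and guarded as in the
     following sketch. The simulated data, the variable names and the
     choice to fall back to 'NA' on error are assumptions made for
     illustration; only 'dpill' and 'locpoly' come from KernSmooth.

```r
library(KernSmooth)

# Simulated data for illustration only; the NA values are inserted on purpose.
set.seed(42)
n <- 300
x <- runif(n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)
x[c(5, 50)] <- NA
y[c(7, 70)] <- NA

# dpill() does not accept missing values, so keep complete (x, y) pairs only.
ok <- complete.cases(x, y)
x_ok <- x[ok]
y_ok <- y[ok]

# Guard the call: yield NA instead of stopping if the selector signals an error.
h <- tryCatch(dpill(x_ok, y_ok), error = function(e) NA_real_)

# Fit only when a usable bandwidth was obtained.
if (is.finite(h)) {
  fit <- locpoly(x_ok, y_ok, bandwidth = h)
  str(fit)   # list with components x (grid points) and y (fitted values)
}
```

     Checking 'is.finite(h)' before calling 'locpoly' also covers the
     case where the selector returns a non-finite value rather than
     signalling an error.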