faithful               package:datasets                R Documentation(latin1)

Old Faithful Geyser Data

Description:

     Waiting time between eruptions and the duration of the eruption
     for the Old Faithful geyser in Yellowstone National Park, Wyoming,
     USA.

Usage:

     faithful

Format:

     A data frame with 272 observations on 2 variables.

       [,1]  eruptions  numeric  Eruption time in mins
       [,2]  waiting    numeric  Waiting time to next eruption (in mins)

Details:

     A closer look at 'faithful$eruptions' reveals that these are
     heavily rounded times originally in seconds, where multiples of 5
     are more frequent than expected under non-human measurement.  For
     a better version of the eruption times, see the example below.

     There are many versions of this dataset around: Azzalini and
     Bowman (1990) use a more complete version.

Source:

     W. Härdle.

References:

     Haerdle, W. (1991) _Smoothing Techniques with Implementation in
     S_.  New York: Springer.

     Azzalini, A. and Bowman, A. W. (1990).  A look at some data on the
     Old Faithful geyser.  _Applied Statistics_ *39*, 357-365.

See Also:

     'geyser' in package 'MASS' for the Azzalini-Bowman version.

Examples:

     require(stats); require(graphics)
     f.tit <-  "faithful data: Eruptions of Old Faithful"

     ne60 <- round(e60 <- 60 * faithful$eruptions)
     all.equal(e60, ne60)             # relative diff. ~ 1/10000
     table(zapsmall(abs(e60 - ne60))) # 0, 0.02 or 0.04
     faithful$better.eruptions <- ne60 / 60
     te <- table(ne60)
     te[te >= 4]                      # (too) many multiples of 5 !
     plot(names(te), te, type="h", main = f.tit,
          xlab = "Eruption time (sec)")

     plot(faithful[, -3], main = f.tit,
          xlab = "Eruption time (min)",
          ylab = "Waiting time to next eruption (min)")
     lines(lowess(faithful$eruptions, faithful$waiting, f = 2/3, iter = 3),
           col = "red")
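
The rounding pattern described in Details can also be checked numerically. The following is a minimal sketch, not part of the original help page: it computes the share of eruption durations (in whole seconds) that are multiples of 5 and compares it with a baseline of 1/5. The names `secs`, `is5`, `share5` and `bt` are illustrative, and the 1/5 baseline is an assumption (it supposes that unrounded durations would be spread evenly across final digits).

```r
# Sketch: how over-represented are multiples of 5 seconds?
# Assumption: without human rounding, about 1 in 5 whole-second
# durations would be a multiple of 5.
secs <- round(60 * datasets::faithful$eruptions)  # durations in whole seconds
is5 <- secs %% 5 == 0                             # TRUE for multiples of 5

share5 <- mean(is5)   # observed share of multiples of 5
share5

# One-sided exact binomial test against the assumed 1/5 baseline
bt <- stats::binom.test(sum(is5), length(secs), p = 1/5,
                        alternative = "greater")
bt$p.value
```

A share well above 0.2, together with a small p-value, would be consistent with the human rounding noted in Details, although the uniform baseline is only a rough reference point.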