Wilcoxon                package:stats                R Documentation

Distribution of the Wilcoxon Rank Sum Statistic

Description:

     Density, distribution function, quantile function and random
     generation for the distribution of the Wilcoxon rank sum statistic
     obtained from samples with size 'm' and 'n', respectively.

Usage:

     dwilcox(x, m, n, log = FALSE)
     pwilcox(q, m, n, lower.tail = TRUE, log.p = FALSE)
     qwilcox(p, m, n, lower.tail = TRUE, log.p = FALSE)
     rwilcox(nn, m, n)

Arguments:

    x, q: vector of quantiles.

       p: vector of probabilities.

      nn: number of observations.  If 'length(nn) > 1', the length is
          taken to be the number required.

    m, n: numbers of observations in the first and second sample,
          respectively.  Can be vectors of positive integers.

log, log.p: logical; if TRUE, probabilities p are given as log(p).

lower.tail: logical; if TRUE (default), probabilities are P[X <= x],
          otherwise, P[X > x].

Details:

     This distribution is obtained as follows.  Let 'x' and 'y' be two
     random, independent samples of size 'm' and 'n'.  Then the
     Wilcoxon rank sum statistic is the number of all pairs '(x[i],
     y[j])' for which 'y[j]' is not greater than 'x[i]'.  This
     statistic takes values between '0' and 'm * n', and its mean and
     variance are 'm * n / 2' and 'm * n * (m + n + 1) / 12',
     respectively.

     If any of the first three arguments are vectors, the recycling
     rule is used to do the calculations for all combinations of the
     three up to the length of the longest vector.

Value:

     'dwilcox' gives the density, 'pwilcox' gives the distribution
     function, 'qwilcox' gives the quantile function, and 'rwilcox'
     generates random deviates.

Warning:

     These functions can use large amounts of memory and stack (and
     even crash R if the stack limit is exceeded and stack-checking is
     not in place) if one sample is large (several thousands or more).
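
The definition and moments given under Details can be checked numerically. The following is a minimal sketch, not part of the package: 'rank_sum_stat' is a hypothetical helper name, and the check against ranks assumes continuous data without ties.

```r
# Hypothetical helper (not part of 'stats'): count the pairs (x[i], y[j])
# for which y[j] is not greater than x[i], as described under Details.
rank_sum_stat <- function(x, y) {
  sum(outer(x, y, ">="))
}

m <- 4; n <- 6
k  <- 0:(m * n)            # support of the statistic
pk <- dwilcox(k, m, n)     # exact probabilities on the support

mu     <- sum(k * pk)            # mean computed from the distribution
sigma2 <- sum((k - mu)^2 * pk)   # variance computed from the distribution

# Compare with the closed forms quoted under Details.
stopifnot(isTRUE(all.equal(sum(pk), 1)))
stopifnot(isTRUE(all.equal(mu, m * n / 2)))
stopifnot(isTRUE(all.equal(sigma2, m * n * (m + n + 1) / 12)))

# Without ties, the pair count equals the rank sum of 'x' in the pooled
# sample minus its smallest possible value m * (m + 1) / 2.
set.seed(1)
x <- rnorm(m); y <- rnorm(n)
W <- rank_sum_stat(x, y)
stopifnot(W == sum(rank(c(x, y))[seq_len(m)]) - m * (m + 1) / 2)
```

The pair-count form is used here because it matches the wording of Details directly; for real data, 'wilcox.test' is the supported way to obtain the statistic.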

Note:

     S-PLUS uses a different (but equivalent) definition of the
     Wilcoxon statistic: see 'wilcox.test' for details.

Author(s):

     Kurt Hornik

Source:

     These are calculated via recursion, based on 'cwilcox(k, m, n)',
     the number of choices with statistic 'k' from samples of size 'm'
     and 'n', which is itself calculated recursively and the results
     cached.  Then 'dwilcox' and 'pwilcox' sum appropriate values of
     'cwilcox', and 'qwilcox' is based on inversion.

     'rwilcox' generates a random permutation of ranks and evaluates
     the statistic.

See Also:

     'wilcox.test' to calculate the statistic from data, find p values
     and so on.

     'dsignrank' etc, for the distribution of the _one-sample_ Wilcoxon
     signed rank statistic.

Examples:

     require(graphics)

     x <- -1:(4*6 + 1)
     fx <- dwilcox(x, 4, 6)
     Fx <- pwilcox(x, 4, 6)

     layout(rbind(1,2), widths=1, heights=c(3,2))
     plot(x, fx,type='h', col="violet",
          main= "Probabilities (density) of Wilcoxon-Statist.(n=6,m=4)")
     plot(x, Fx,type="s", col="blue",
          main= "Distribution of Wilcoxon-Statist.(n=6,m=4)")
     abline(h=0:1, col="gray20",lty=2)
     layout(1)# set back

     N <- 200
     hist(U <- rwilcox(N, m=4,n=6), breaks=0:25 - 1/2,
          border="red", col="pink", sub = paste("N =",N))
     mtext("N * f(x),  f() = true \"density\"", side=3, col="blue")
      lines(x, N*fx, type='h', col='blue', lwd=2)
     points(x, N*fx, cex=2)

     ## Better is a Quantile-Quantile Plot
     qqplot(U, qw <- qwilcox((1:N - 1/2)/N, m=4,n=6),
            main = paste("Q-Q-Plot of empirical and theoretical quantiles",
                         "Wilcoxon Statistic,  (m=4, n=6)",sep="\n"))
     n <- as.numeric(names(print(tU <- table(U))))
     text(n+.2, n+.5, labels=tU, col="red")
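
The counting recursion described under Source can be illustrated in plain R. This is a sketch under stated assumptions, not the internal C routine: 'count_wilcox' is a hypothetical name, the recurrence conditions on whether the largest pooled observation belongs to the first or second sample, and the cache is a simple environment keyed by '(k, m, n)'.

```r
# Hypothetical R sketch of a cached counting recursion; the name
# 'count_wilcox' is illustrative and is not R's internal 'cwilcox'.
count_wilcox <- local({
  cache <- new.env()
  count <- function(k, m, n) {
    if (k < 0 || k > m * n) return(0)                 # outside the support
    if (m == 0 || n == 0) return(as.numeric(k == 0))  # one empty sample
    key <- paste(k, m, n, sep = ",")
    val <- cache[[key]]                               # NULL if not cached
    if (is.null(val)) {
      # Largest pooled value is an x (adds n pairs) or a y (adds none).
      val <- count(k - n, m - 1, n) + count(k, m, n - 1)
      cache[[key]] <- val
    }
    val
  }
  count
})

m <- 4; n <- 6
k <- 0:(m * n)
counts <- vapply(k, count_wilcox, numeric(1), m = m, n = n)
total  <- choose(m + n, n)   # number of equally likely arrangements

# Density and distribution function as sums of counts.
stopifnot(isTRUE(all.equal(counts / total, dwilcox(k, m, n))))
stopifnot(isTRUE(all.equal(cumsum(counts) / total, pwilcox(k, m, n))))

# One random draw in the spirit of 'rwilcox': pick the ranks of the first
# sample at random and shift the rank sum so that it starts at 0.
u <- sum(sample(m + n, m)) - m * (m + 1) / 2
```

The string-keyed environment keeps the sketch short; it is not meant to match the memory layout or performance of the compiled code, and the Warning about large samples applies even more strongly to a recursive R version.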