unique                  package:base                   R Documentation

Extract Unique Elements

Description:

     'unique' returns a vector, data frame or array like 'x' but with
     duplicate elements/rows removed.

Usage:

     unique(x, incomparables = FALSE, ...)

     ## Default S3 method:
     unique(x, incomparables = FALSE, fromLast = FALSE, ...)

     ## S3 method for class 'matrix':
     unique(x, incomparables = FALSE, MARGIN = 1, fromLast = FALSE, ...)

     ## S3 method for class 'array':
     unique(x, incomparables = FALSE, MARGIN = 1, fromLast = FALSE, ...)

Arguments:

       x: a vector or a data frame or an array or 'NULL'.

incomparables: a vector of values that cannot be compared.  'FALSE' is
          a special value, meaning that all values can be compared, and
          may be the only value accepted for methods other than the
          default.  It will be coerced internally to the same type as
          'x'.

fromLast: logical indicating if duplication should be considered from
          the last, i.e., the last (or rightmost) of identical elements
          will be kept.  This only matters for 'names' or 'dimnames'.

     ...: arguments for particular methods.

  MARGIN: the array margin to be held fixed: a single integer.

Details:

     This is a generic function with methods for vectors, data frames
     and arrays (including matrices).

     The array method calculates for each element of the dimension
     specified by 'MARGIN' if the remaining dimensions are identical to
     those for an earlier element (in row-major order).  This would
     most commonly be used for matrices to find unique rows (the
     default) or columns (with 'MARGIN = 2').

     Note that unlike the Unix command 'uniq' this omits _duplicated_
     and not just _repeated_ elements/rows.  That is, an element is
     omitted if it is identical to any previous element and not just if
     it is the same as the immediately previous one.  (For the latter,
     see 'rle').

     Missing values are regarded as equal, but 'NaN' is not equal to
     'NA_real_'.

     Values in 'incomparables' will never be marked as duplicated.
     This argument is intended to be used for a fairly small set of
     values and will not be efficient for a very large set.

Value:

     For a vector, an object of the same type as 'x', but with only one
     copy of each duplicated element.  No attributes are copied (so the
     result has no names).

     For a data frame, a data frame is returned with the same columns
     but possibly fewer rows (and with row names from the first
     occurrences of the unique rows).

     A matrix or array is subsetted by '[, drop = FALSE]', so
     dimensions and dimnames are copied appropriately, and the result
     always has the same number of dimensions as 'x'.

Warning:

     Using this for lists is potentially slow, especially if the
     elements are not atomic vectors (see 'vector') or differ only in
     their attributes.  In the worst case it is O(n^2).

References:

     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S
     Language_.  Wadsworth & Brooks/Cole.

See Also:

     'duplicated' which gives the indices of duplicated elements.

     'rle' which is the equivalent of the Unix 'uniq -c' command.

Examples:

     x <- c(3:5, 11:8, 8 + 0:5)
     (ux <- unique(x))
     (u2 <- unique(x, fromLast = TRUE)) # different order
     stopifnot(identical(sort(ux), sort(u2)))

     length(unique(sample(100, 100, replace=TRUE)))
     ## approximately 100(1 - 1/e) = 63.21

     unique(iris)
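
     The following sketch is not part of the original examples.  It
     illustrates several points made under Details: duplicated versus
     merely repeated elements, the treatment of 'NA' and 'NaN', the
     'MARGIN' and 'fromLast' arguments of the matrix method, and
     'incomparables'.  The object names ('v', 'm', 'm2' and so on) are
     made up for illustration only.

```r
## Duplicated vs. repeated: unique() drops every later copy, whereas
## rle() (like Unix 'uniq') only collapses adjacent runs.
v <- c(1, 1, 2, 1, NA, NA, NaN)
u <- unique(v)                    # 1 2 NA NaN: NAs are equal, NaN is distinct
r <- rle(c(1, 1, 2, 1))$values    # 1 2 1: the final 1 is not adjacent to the first run

## Matrix method: rows by default; dimnames are carried along.
m  <- rbind(a = c(1, 2), b = c(3, 4), c = c(1, 2))
um <- unique(m)                   # keeps rows "a" and "b"
ul <- unique(m, fromLast = TRUE)  # keeps rows "b" and "c"

## MARGIN = 2 compares columns instead of rows.
m2 <- cbind(m, m[, 1])            # third column repeats the first
uc <- unique(m2, MARGIN = 2)      # 3 rows, 2 columns

## Values listed in 'incomparables' are never treated as duplicates.
ui <- unique(c(1, 1, 2, 2), incomparables = 2)   # 1 2 2
```

     With the default method 'fromLast' changes only which copy is
     kept, and hence the order of the result; in the matrix case it
     also determines which row names survive.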