sample {dplyr} | R Documentation |
This is a wrapper around sample.int() to make it easy to select random rows from a table. It currently only works for local tbls.
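As a rough illustration of what "a wrapper around sample.int()" can mean, the sketch below draws row indices with base R's sample.int() and subsets a data frame with them. The helper name sample_rows is hypothetical and this is not dplyr's actual implementation; it only shows the underlying idea under that assumption.

```r
# Hypothetical helper (not part of dplyr): select `size` random rows from a
# data frame by sampling row indices with sample.int().
sample_rows <- function(df, size, replace = FALSE, weight = NULL) {
  idx <- sample.int(nrow(df), size = size, replace = replace, prob = weight)
  df[idx, , drop = FALSE]
}

set.seed(1)  # only to make the draw reproducible
picked <- sample_rows(mtcars, 10)
nrow(picked)
```

Because the indices are drawn without replacement by default, no row appears twice in the result.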
sample_n(tbl, size, replace = FALSE, weight = NULL, .env = NULL)
sample_frac(tbl, size = 1, replace = FALSE, weight = NULL, .env = NULL)
tbl |
tbl of data. |
size |
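For sample_n(), the number of rows to select. For sample_frac(), the fraction of rows to select. If tbl is grouped, size applies to each group. |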
replace |
Sample with or without replacement? |
weight |
Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1. This argument is automatically quoted and later evaluated in the context of the data frame. It supports unquoting. See |
.env |
This variable is deprecated and no longer has any effect. To evaluate |
by_cyl <- mtcars %>% group_by(cyl)

# Sample fixed number per group
sample_n(mtcars, 10)
sample_n(mtcars, 50, replace = TRUE)
sample_n(mtcars, 10, weight = mpg)

sample_n(by_cyl, 3)
sample_n(by_cyl, 10, replace = TRUE)
sample_n(by_cyl, 3, weight = mpg / mean(mpg))

# Sample fixed fraction per group
# Default is to sample all data = randomly resample rows
sample_frac(mtcars)
sample_frac(mtcars, 0.1)
sample_frac(mtcars, 1.5, replace = TRUE)
sample_frac(mtcars, 0.1, weight = 1 / mpg)

sample_frac(by_cyl, 0.2)
sample_frac(by_cyl, 1, replace = TRUE)
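
The weight argument is described as a vector of non-negative numbers, one per row, standardised to sum to 1. The base R sketch below spells out those steps by hand for the equivalent of weight = mpg. The variable names w, p, idx and weighted_rows are illustrative assumptions, and the code is not taken from dplyr's internals.

```r
# What `weight = mpg` evaluates to in the context of the data frame
w <- mtcars$mpg

# The documented requirements: one non-negative weight per input row
stopifnot(length(w) == nrow(mtcars), all(w >= 0))

# Standardise the weights so that they sum to 1
p <- w / sum(w)

set.seed(42)  # only to make the draw reproducible
idx <- sample.int(nrow(mtcars), size = 10, replace = FALSE, prob = p)
weighted_rows <- mtcars[idx, , drop = FALSE]
```

Since only the relative sizes of the weights matter after standardisation, expressions such as mpg / mean(mpg) in the examples above describe the same sampling probabilities as mpg itself.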