substr_ctl {fansi} | R Documentation |
substr_ctl is a drop-in replacement for substr. Performance is
slightly slower than substr.
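As a rough illustration of what "drop-in replacement" means here, the sketch below calls base substr and substr_ctl with the same arguments. It assumes the fansi package is installed (the fansi calls are skipped otherwise), and the variable names are illustrative only.

```r
# A string with an SGR sequence (green background) wrapped around "hello".
x <- "\033[42mhello\033[m world"

# Base substr counts the characters of the escape sequence itself, so the
# first nine characters include the five characters of "\033[42m".
base.res <- substr(x, 1, 9)

# substr_ctl accepts the same x/start/stop arguments as substr.
if (requireNamespace("fansi", quietly = TRUE)) {
  ctl.res <- fansi::substr_ctl(x, 1, 9)
  # On input without Control Sequences the two functions should agree.
  plain.same <-
    fansi::substr_ctl("hello world", 1, 5) == substr("hello world", 1, 5)
}
```

Printing base.res and ctl.res side by side in a terminal that renders SGR codes is a quick way to see the difference.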
substr_ctl(x, start, stop, warn = getOption("fansi.warn"),
  term.cap = getOption("fansi.term.cap"))

substr2_ctl(x, start, stop, type = "chars", round = "start",
  tabs.as.spaces = getOption("fansi.tabs.as.spaces"),
  tab.stops = getOption("fansi.tab.stops"),
  warn = getOption("fansi.warn"),
  term.cap = getOption("fansi.term.cap"))
x | a character vector or object that can be coerced to character. |
start | integer. The first element to be extracted. |
stop | integer. The last element to be extracted. |
warn | TRUE (default) or FALSE, whether to warn when potentially problematic Control Sequences are encountered. These could cause the assumptions |
term.cap | character, a vector of the capabilities of the terminal; can be any combination of "bright" (SGR codes 90-97, 100-107), "256" (SGR codes starting with "38;5" or "48;5"), and "truecolor" (SGR codes starting with "38;2" or "48;2"). Changing this parameter changes how |
type | character(1L) partial matching |
round | character(1L) partial matching |
tabs.as.spaces | FALSE (default) or TRUE, whether to convert tabs to spaces. This can only be set to TRUE if |
tab.stops | integer(1:n) indicating position of tab stops to use when converting tabs to spaces. If there are more tabs in a line than defined tab stops, the last tab stop is re-used. For the purposes of applying tab stops, each input line is considered a line and the character count begins from the beginning of the input line. |
substr2_ctl adds the ability to retrieve substrings based on display width
and byte width, in addition to the normal character width. substr2_ctl also
provides the option to convert tabs to spaces with tabs_as_spaces prior to
taking substrings.
Because exact substrings on anything other than character width cannot be
guaranteed (e.g. because of multi-byte encodings or double display-width
characters), substr2_ctl must make assumptions about how to resolve provided
start/stop values that are infeasible, and does so via the round parameter.
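To see why character, byte, and display-width positions can disagree, the following base R sketch measures a single ideogram (the same "\u4E00" used in the examples) in each unit with nchar. It only illustrates the units involved; it is not a description of how substr2_ctl computes them.

```r
ideogram <- "\u4E00"

n.chars <- nchar(ideogram, type = "chars")  # one character
n.bytes <- nchar(ideogram, type = "bytes")  # three bytes in UTF-8
n.width <- nchar(ideogram, type = "width")  # display columns occupied

# A start/stop value that falls strictly inside the byte or width span of
# this character cannot be honored exactly, which is the situation the
# round parameter is meant to resolve.
c(chars = n.chars, bytes = n.bytes, width = n.width)
```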
If we use "start" as the round value, then any time the start value
corresponds to the middle of a multi-byte or wide character, that
character is included in the substring, while any character similarly cut
by the stop value is left out. The converse is true if we use "stop" as the
round value. "neither" causes all partial characters to be dropped
irrespective of whether they correspond to start or stop, and "both" causes
all of them to be included.
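The rounding rules above can be sketched in base R. The helper select_by_width below is hypothetical and is not part of fansi or a description of its implementation; it assumes every character has a display width of at least 1, and it returns the indices of the characters that would be kept for a given start, stop, and round value.

```r
# Hypothetical helper: which characters survive a width-based substring?
# `widths` holds the display width of each character (assumed >= 1).
select_by_width <- function(widths, start, stop, round = "start") {
  round <- match.arg(round, c("start", "stop", "both", "neither"))
  last <- cumsum(widths)        # last display column of each character
  first <- last - widths + 1L   # first display column of each character

  overlaps <- last >= start & first <= stop
  cut.at.start <- overlaps & first < start  # start lands mid-character
  cut.at.stop <- overlaps & last > stop     # stop lands mid-character

  keep.start <- round %in% c("start", "both")
  keep.stop <- round %in% c("stop", "both")

  keep <- overlaps &
    (!cut.at.start | keep.start) &
    (!cut.at.stop | keep.stop)
  which(keep)
}

# Three ideograms, each two columns wide; columns 2 and 3 both fall in
# the middle of a character.
widths <- c(2L, 2L, 2L)
select_by_width(widths, 2, 3, "start")    # 1
select_by_width(widths, 2, 3, "stop")     # 2
select_by_width(widths, 2, 3, "both")     # 1 2
select_by_width(widths, 2, 3, "neither")  # integer(0)
```

Under these assumptions, "start" keeps only the character cut by the start position, "stop" keeps only the one cut by the stop position, and the other two values keep both or none.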
Non-ASCII strings are converted to and returned in UTF-8 encoding.
See fansi for details on how Control Sequences are interpreted, particularly if you are getting unexpected results.
substr_ctl("\033[42mhello\033[m world", 1, 9)
substr_ctl("\033[42mhello\033[m world", 3, 9)

## Width 2 and 3 are in the middle of an ideogram as
## start and stop positions respectively, so we control
## what we get with `round`

cn.string <- paste0("\033[42m", "\u4E00\u4E01\u4E03", "\033[m")

substr2_ctl(cn.string, 2, 3, type='width')
substr2_ctl(cn.string, 2, 3, type='width', round='both')
substr2_ctl(cn.string, 2, 3, type='width', round='start')
substr2_ctl(cn.string, 2, 3, type='width', round='stop')