diff --git a/.Rbuildignore b/.Rbuildignore
index 438500dd3..dfad9c5ea 100644
--- a/.Rbuildignore
+++ b/.Rbuildignore
@@ -38,3 +38,5 @@ kate-swp$
^Makefile
^stringi_.*\.tar\.gz$
^CODE_OF_CONDUCT
+TODO
+^CITATION\.cff$
diff --git a/.github/workflows/r-icu-bundle.yml b/.github/workflows/r-icu-bundle.yml
index c79c2dd7d..36648001e 100644
--- a/.github/workflows/r-icu-bundle.yml
+++ b/.github/workflows/r-icu-bundle.yml
@@ -29,4 +29,4 @@ jobs:
sudo make tinytest
- name: Check stringi
run: |
- sudo make check
+ sudo make check-cran
diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md
index e6ee8e912..5b126b08b 100644
--- a/CODE_OF_CONDUCT.md
+++ b/CODE_OF_CONDUCT.md
@@ -1,15 +1,9 @@
Code of Conduct
===============
-Come in and make yourself at home!
-
This is a project conveyed in the authors' free time. It is their little
act of charity to make this world an (even) better place.
It will most likely pass unnoticed, but if you happen to find it useful,
informative, amusing, or stimulating, we're happy for you.
-Please be civilised, well-mannered, and courteous. Primum non nocere.
-Let us all strive to be better versions of ourselves, exercise forgiveness
-and generosity, and assume good faith in others.
-
We are looking forward to your contributions and ideas.
diff --git a/DESCRIPTION b/DESCRIPTION
index bea323166..b5ade09dc 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,7 +1,7 @@
Package: stringi
-Version: 1.7.6.9002
-Date: 2022-06-28
-Title: Character String Processing Facilities
+Version: 1.7.7
+Date: 2022-07-02
+Title: Fast and Portable Character String Processing Facilities
Description: A collection of character string/text/natural language
processing tools for pattern searching (e.g., with 'Java'-like regular
expressions or the 'Unicode' collation algorithm), random string generation,
@@ -10,6 +10,9 @@ Description: A collection of character string/text/natural language
and many more. They are fast, consistent, convenient, and -
thanks to 'ICU' (International Components for Unicode) -
portable across all locales and platforms.
+ Documentation about 'stringi' is provided
+ via its website at <https://stringi.gagolewski.com/> and
+ the paper by Gagolewski (2022, <doi:10.18637/jss.v103.i02>).
URL:
https://stringi.gagolewski.com/,
https://github.com/gagolews/stringi,
diff --git a/INSTALL b/INSTALL
index 53bcb0a48..0fefbdf0a 100644
--- a/INSTALL
+++ b/INSTALL
@@ -14,10 +14,10 @@ install.packages("stringi")
However, due to the overwhelming complexity of the ICU4C library,
upon which *stringi* is based, and the colourful diversity of operating systems,
-their flavors, and particular setups, some users may still experience
+their flavours, and particular setups, some users may still experience
a few issues that hopefully can be resolved with the help of this short manual.
-Also, some additional build tweaks are possible in case we require a more
+Also, some additional build tweaks are possible if we require a more
customised installation.
@@ -32,25 +32,25 @@ If we install the package from sources and either:
the `libicu-devel` rpm on Fedora/CentOS/OpenSUSE,
`libicu-dev` on Ubuntu/Debian, etc.),
-* `pkg-config` is fails to find appropriate build settings
+* `pkg-config` fails to find appropriate build settings
for ICU-based projects, or
* `R CMD INSTALL` is called with the `--configure-args='--disable-pkg-config'`
- argument or environment variable `STRINGI_DISABLE_PKG_CONFIG` is
+ argument, or environment variable `STRINGI_DISABLE_PKG_CONFIG` is
set to non-zero or
`install.packages("stringi", configure.args="--disable-pkg-config")`
is executed,
then ICU will be built together with stringi.
A custom subset of ICU4C 69.1 is shipped with the package.
-We also include ICU4C 55.1 that can be used as a fallback version
+We also include ICU4C 55.1 which can be used as a fallback version
(e.g., on older Solaris boxes).
> To get the most out of stringi, you are strongly encouraged to rely on our
> ICU4C package bundle. This ensures maximum portability across all platforms
> (Windows and macOS users by default fetch the pre-compiled binaries
-> from CRAN built exactly this way).
+> from CRAN built precisely this way).
@@ -59,7 +59,7 @@ We also include ICU4C 55.1 that can be used as a fallback version
Note that if you choose to use our ICU4C bundle, then -- by default -- the
ICU data library will be downloaded from one of our mirror servers.
However, if you have already downloaded a version of `icudt*.zip` suitable
-for your platform (big/little endian), you may wish to install the
+for your platform (big/little-endian), you may wish to install the
package by calling:
```r
@@ -115,8 +115,8 @@ amongst others, `/etc/Makeconf` (e.g., are you using
for more details.
-There is an option of using the fallback version of ICU4C 55.1
-which however requires the support of the `long long` type in a few functions,
+There is an option of using the fallback version of ICU4C 55.1.
+However, it requires the support of the `long long` type in a few functions,
(this is not part of the C++98 standard; works on Solaris, though). Try:
```r
@@ -155,7 +155,7 @@ Some influential environment variables:
path relative to `/src`; defaults to `icuXX/data`.
* `PKG_CONFIG_PATH`: An optional list of directories to search for
- `pkg-config`s `.pc` files.
+ `pkg-config`'s `.pc` files.
* `R_HOME`: Override the R directory, e.g.,
`/usr/lib64/R`. Note that `$R_HOME/bin/R` point to the R executable.
@@ -163,15 +163,15 @@ Some influential environment variables:
* `CAT`: The `cat` command used to generate the list of source files to compile.
* `PKG_CONFIG`:The `pkg-config` command used to fetch the necessary compiler
- flags to link to and existing `libicu` installation.
+ flags to link to the existing `libicu` installation.
-* `STRINGI_DISABLE_CXX11`: Disable C++11,
+* `STRINGI_DISABLE_CXX11`: Disable C++11;
see also `--disable-cxx11`.
-* `STRINGI_DISABLE_PKG_CONFIG`: Compile ICU from sources,
+* `STRINGI_DISABLE_PKG_CONFIG`: Compile ICU from sources;
see also `--disable-pkg-config`.
-* `STRINGI_DISABLE_ICU_BUNDLE`: Enforce system ICU,
+* `STRINGI_DISABLE_ICU_BUNDLE`: Enforce system ICU;
see also `--disable-icu-bundle`.
* `STRINGI_CFLAGS`: see `--with-extra-cflags`.
@@ -191,7 +191,7 @@ Some influential environment variables:
We expect that with a correctly configured C++11 compiler and properly
installed system ICU4C distribution, you should face no problems
-with installing the package, especially if you use our ICU4C bundle and you
+installing the package, especially if you use our ICU4C bundle and
have a working internet access.
If you do not manage to set up a successful stringi build, do not
diff --git a/NEWS b/NEWS
index 8a79efd35..b12b54446 100644
--- a/NEWS
+++ b/NEWS
@@ -1,47 +1,21 @@
# What Is New in *stringi*
-## 1.7.6.9xxx (under development)
+## 1.7.7 (2022-07-02)
-* [DOCUMENTATION] ...Paper on *stringi* has been published in
- the *Journal of Statistical Software*....
+* [DOCUMENTATION] Paper on *stringi* has been published in
+ the *Journal of Statistical Software*, see <https://dx.doi.org/10.18637/jss.v103.i02>.
* [BUGFIX] #473, #397: Fixed buffer overflow in `stri_dup`.
`stri_dup`, `stri_paste`, ... fail more graciously on attempts to
generate strings of length >= 2^31 each.
-* [BUGFIX] #480: Using `Rf_isNull` instead of isNull`.
+* [BUILD TIME] #480: Using `Rf_isNull` instead of `isNull`.
* [DOCUMENTATION] #462: That the `numeric=TRUE` collator
does not handle negative numbers correctly is now mentioned in the manual.
-... checkRd: (-1) stri_trans_nf.Rd:74: Escaped LaTeX specials: \#
-... checkRd: (-1) stri_trans_nf.Rd:92: Escaped LaTeX specials: \#
-... icu69/common/cstring.h:43:70: warning: 'char* strncpy(char*, const char*, size_t)' output may be truncated copying 156 bytes from a string of length 156 [-Wstringop-truncation]
-
-
-
-* [NEW FEATURE] TODO.... #469: `stri_datetime_parse` .. new argument -
-`default_time`
- a Calendar set on input to the date and time to be used for missing values in the date/time string being parsed
-
-* [BUGFIX] TODO.... #469: `stri_datetime_parse` did not reset the `Calendar` object
- when parsing multiple dates.
-
-* [NEW FEATURE] TODO... #476 U_USING_DEFAULT_ERROR on unknown locales
-
-* [NEW FEATURE] TODO... #81 number format
-
-* [NEW FEATURE] TODO... #477 sprintf localised number format
-
-* [NEW FEATURE] TODO... #471: split into overlapping or non-overlapping chunks,
- possibly of different lengths
-
-
-
-
-
## 1.7.6 (2021-11-29)
* [BUILD TIME] #463: Added loongarch support in ICU's double conversion
diff --git a/R/stringi-package.R b/R/stringi-package.R
index 8b466d2e2..938a9eb76 100644
--- a/R/stringi-package.R
+++ b/R/stringi-package.R
@@ -1,7 +1,7 @@
# kate: default-dictionary en_US
## This file is part of the 'stringi' package for R.
-## Copyright (c) 2013-2021, Marek Gagolewski
+## Copyright (c) 2013-2022, Marek Gagolewski
## All rights reserved.
##
## Redistribution and use in source and binary forms, with or without
@@ -31,7 +31,7 @@
## EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-#' @title THE String Processing Package
+#' @title Fast and Portable Character String Processing in R
#'
#' @description
#' \pkg{stringi} is THE R package for fast, correct, consistent,
@@ -115,7 +115,7 @@
#' i.e., conversion to lower, UPPER, or Title Case,
#' \code{\link{stri_trans_nfc}} (among others) for Unicode normalization,
#' \code{\link{stri_trans_char}} for translating individual code points,
-#' and \code{\link{stri_trans_general}} for other universal yet powerful
+#' and \code{\link{stri_trans_general}} for other universal
#' text transforms, including transliteration.
#'
#' \item \code{\link{stri_cmp}}, \code{\link{\%s<\%}}, \code{\link{stri_order}},
@@ -150,9 +150,13 @@
#' ICU4C was developed by IBM, Unicode, Inc., and others.
#'
#' @references
-#' \emph{\pkg{stringi} Package homepage},
+#' \emph{\pkg{stringi} Package Homepage},
#' \url{https://stringi.gagolewski.com/}
#'
+#' Gagolewski M., \pkg{stringi}: Fast and portable character string
+#' processing in R, \emph{Journal of Statistical Software} 103(2), 2022, 1-59,
+#' doi:\url{https://dx.doi.org/10.18637/jss.v103.i02}
+#'
#' \emph{ICU -- International Components for Unicode},
#' \url{https://icu.unicode.org/}
#'
@@ -162,7 +166,7 @@
#' \emph{The Unicode Consortium},
#' \url{https://home.unicode.org/}
#'
-#' \emph{UTF-8, a transformation format of ISO 10646} -- RFC 3629,
+#' \emph{UTF-8, A Transformation Format of ISO 10646} -- RFC 3629,
#' \url{https://tools.ietf.org/html/rfc3629}
#'
#' @family stringi_general_topics
diff --git a/R/trans_normalization.R b/R/trans_normalization.R
index bc890499e..6d86cbde3 100644
--- a/R/trans_normalization.R
+++ b/R/trans_normalization.R
@@ -1,7 +1,7 @@
# kate: default-dictionary en_US
## This file is part of the 'stringi' package for R.
-## Copyright (c) 2013-2021, Marek Gagolewski
+## Copyright (c) 2013-2022, Marek Gagolewski
## All rights reserved.
##
## Redistribution and use in source and binary forms, with or without
@@ -63,7 +63,7 @@
#' character sequences in document formats on the Web.
#' Thus, you will rather not use these functions in typical
#' string processing activities. Most often you may assume
-#' that a string is in NFC, see RFC\#5198.
+#' that a string is in NFC, see RFC5198.
#'
#' As usual in \pkg{stringi},
#' if the input character vector is in the native encoding,
@@ -84,7 +84,7 @@
#' \url{https://unicode.org/reports/tr15/}
#'
#' \emph{Unicode Format for Network Interchange}
-#' -- RFC\#5198, \url{https://tools.ietf.org/rfc/rfc5198.txt}
+#' -- RFC5198, \url{https://tools.ietf.org/rfc/rfc5198.txt}
#'
#' \emph{Character Model for the World Wide Web 1.0: Normalization}
#' -- W3C Working Draft, \url{https://www.w3.org/TR/charmod-norm/}
diff --git a/README.md b/README.md
index 5fb122b49..a5fe79c25 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
# [**stringi**](https://stringi.gagolewski.com/)
-### THE String Processing Package for *R*
+### Fast and Portable Character String Processing in R (with the Unicode ICU)
![Build Status](https://github.com/gagolews/stringi/workflows/stringi%20for%20R/badge.svg)
![RStudio CRAN mirror downloads](http://cranlogs.r-pkg.org/badges/grand-total/stringi)
diff --git a/configure b/configure
index 3e0203b1d..3b1bdbea5 100755
--- a/configure
+++ b/configure
@@ -1376,18 +1376,18 @@ Optional Packages:
--with-PACKAGE[=ARG] use PACKAGE [ARG=yes]
--without-PACKAGE do not use PACKAGE (same as --with-PACKAGE=no)
--with-extra-cflags=FLAGS
- Additional C compiler flags, see also the
+ Additional C compiler flags; see also the
STRINGI_CFLAGS environment variable
--with-extra-cppflags=FLAGS
- Additional C/C++ preprocessor flags, see also the
+ Additional C/C++ preprocessor flags; see also the
STRINGI_CPPFLAGS environment variable
--with-extra-cxxflags=FLAGS
- Additional C++ compiler flags, see also the
+ Additional C++ compiler flags; see also the
STRINGI_CXXFLAGS environment variable
--with-extra-ldflags=FLAGS
- Additional linker flags, see also the
+ Additional linker flags; see also the
STRINGI_LDFLAGS environment variable
- --with-extra-libs=FLAGS Additional libraries to link against, see also the
+ --with-extra-libs=FLAGS Additional libraries to link against; see also the
STRINGI_LIBS environment variable
Some influential environment variables:
@@ -1413,22 +1413,22 @@ Some influential environment variables:
LDFLAGS Purposely ignored.
LIBS Purposely ignored.
STRINGI_DISABLE_CXX11
- Disable C++11, see also --disable-cxx11.
+ Disable C++11; see also --disable-cxx11.
STRINGI_DISABLE_ICU_BUNDLE
- Enforce system ICU, see also --disable-icu-bundle.
+ Enforce system ICU; see also --disable-icu-bundle.
STRINGI_DISABLE_PKG_CONFIG
- Enforce our ICU source bundle, see also --disable-pkg-config.
+ Enforce our ICU source bundle; see also --disable-pkg-config.
STRINGI_CFLAGS
- Additional C compiler flags, see also --with-extra-cflags.
+ Additional C compiler flags; see also --with-extra-cflags.
STRINGI_CPPFLAGS
- Additional C/C++ preprocessor flags, see also
+ Additional C/C++ preprocessor flags; see also
--with-extra-cppflags.
STRINGI_CXXFLAGS
- Additional C++ compiler flags, see also --with-extra-cxxflags.
+ Additional C++ compiler flags; see also --with-extra-cxxflags.
STRINGI_LDFLAGS
- Additional linker flags, see also --with-extra-ldflags.
+ Additional linker flags; see also --with-extra-ldflags.
STRINGI_LIBS
- Additional libraries to link against, see also
+ Additional libraries to link against; see also
--with-extra-libs.
Use these variables to override the choices made by `configure' or to help
diff --git a/configure.ac b/configure.ac
index c7a86c77b..e49b4501f 100644
--- a/configure.ac
+++ b/configure.ac
@@ -97,7 +97,7 @@ AC_ARG_ENABLE([cxx11],
If C++11 is disabled, the older ICU4C 55.1 bundle will be used.]))
AC_ARG_VAR([STRINGI_DISABLE_CXX11],
- [Disable C++11, see also --disable-cxx11.])
+ [Disable C++11; see also --disable-cxx11.])
if test "x$enable_cxx11" != "xno" -a -z "${STRINGI_DISABLE_CXX11}"; then
@@ -113,7 +113,7 @@ AC_ARG_ENABLE([icu_bundle],
[Enforce system ICU.]))
AC_ARG_VAR([STRINGI_DISABLE_ICU_BUNDLE],
- [Enforce system ICU, see also --disable-icu-bundle.])
+ [Enforce system ICU; see also --disable-icu-bundle.])
if test "x$enable_icu_bundle" != "xno" -a -z "${STRINGI_DISABLE_ICU_BUNDLE}"; then
enable_icu_bundle="yes"
@@ -133,7 +133,7 @@ AC_ARG_ENABLE([pkg_config],
(strongly recommended for portability across platforms).]))
AC_ARG_VAR([STRINGI_DISABLE_PKG_CONFIG],
- [Enforce our ICU source bundle, see also --disable-pkg-config.])
+ [Enforce our ICU source bundle; see also --disable-pkg-config.])
if test "x$enable_pkg_config" != "xno" -a -z "${STRINGI_DISABLE_PKG_CONFIG}"; then
enable_pkg_config="yes"
@@ -167,38 +167,38 @@ fi
AC_ARG_WITH([extra_cflags],
AS_HELP_STRING([--with-extra-cflags=FLAGS],
- [Additional C compiler flags, see also the STRINGI_CFLAGS environment variable]))
+ [Additional C compiler flags; see also the STRINGI_CFLAGS environment variable]))
AC_ARG_WITH([extra_cppflags],
AS_HELP_STRING([--with-extra-cppflags=FLAGS],
- [Additional C/C++ preprocessor flags, see also the STRINGI_CPPFLAGS environment variable]))
+ [Additional C/C++ preprocessor flags; see also the STRINGI_CPPFLAGS environment variable]))
AC_ARG_WITH([extra_cxxflags],
AS_HELP_STRING([--with-extra-cxxflags=FLAGS],
- [Additional C++ compiler flags, see also the STRINGI_CXXFLAGS environment variable]))
+ [Additional C++ compiler flags; see also the STRINGI_CXXFLAGS environment variable]))
AC_ARG_WITH([extra_ldflags],
AS_HELP_STRING([--with-extra-ldflags=FLAGS],
- [Additional linker flags, see also the STRINGI_LDFLAGS environment variable]))
+ [Additional linker flags; see also the STRINGI_LDFLAGS environment variable]))
AC_ARG_WITH([extra_libs],
AS_HELP_STRING([--with-extra-libs=FLAGS],
- [Additional libraries to link against, see also the STRINGI_LIBS environment variable]))
+ [Additional libraries to link against; see also the STRINGI_LIBS environment variable]))
AC_ARG_VAR([STRINGI_CFLAGS],
- [Additional C compiler flags, see also --with-extra-cflags.])
+ [Additional C compiler flags; see also --with-extra-cflags.])
AC_ARG_VAR([STRINGI_CPPFLAGS],
- [Additional C/C++ preprocessor flags, see also --with-extra-cppflags.])
+ [Additional C/C++ preprocessor flags; see also --with-extra-cppflags.])
AC_ARG_VAR([STRINGI_CXXFLAGS],
- [Additional C++ compiler flags, see also --with-extra-cxxflags.])
+ [Additional C++ compiler flags; see also --with-extra-cxxflags.])
AC_ARG_VAR([STRINGI_LDFLAGS],
- [Additional linker flags, see also --with-extra-ldflags.])
+ [Additional linker flags; see also --with-extra-ldflags.])
AC_ARG_VAR([STRINGI_LIBS],
- [Additional libraries to link against, see also --with-extra-libs.])
+ [Additional libraries to link against; see also --with-extra-libs.])
with_extra_cflags="${with_extra_cflags} ${STRINGI_CFLAGS}"
diff --git a/devel/roxygen2-patch.R b/devel/roxygen2-patch.R
index c74f46f6c..ac5d4f443 100644
--- a/devel/roxygen2-patch.R
+++ b/devel/roxygen2-patch.R
@@ -22,21 +22,29 @@ postprocess_contents <- function(contents)
# There is one and only one official manual. Ad- and tracker-free.
# Enjoy the free internet.
+ seealso <- stringi::stri_paste(
+ "The official online manual of \\pkg{stringi} at ",
+ "\\url{https://stringi.gagolewski.com/}\n",
+ "\n",
+ "Gagolewski M., ",
+ "\\pkg{stringi}: Fast and portable character string processing in R, ",
+ "\\emph{Journal of Statistical Software} 103(2), 2022, 1-59, ",
+ "doi:\\url{https://dx.doi.org/10.18637/jss.v103.i02}\n",
+ "\n"
+ )
+
if (!stringi::stri_detect_fixed(contents, "\\seealso{\n")) {
- contents <- stringi::stri_paste(contents,
- "\\seealso{\n",
- "The official online manual of \\pkg{stringi} at ",
- "\\url{https://stringi.gagolewski.com/}\n",
- "}\n")
+ contents <- stringi::stri_paste(
+ contents,
+ stringi::stri_paste("\\seealso{\n", seealso, "}\n")
+ )
}
else {
- contents <- stringi::stri_replace_first_fixed(contents, "\\seealso{\n",
- stringi::stri_paste(
- "\\seealso{\n",
- "The official online manual of \\pkg{stringi} at ",
- "\\url{https://stringi.gagolewski.com/}\n",
- "\n"
- ))
+ contents <- stringi::stri_replace_first_fixed(
+ contents,
+ "\\seealso{\n",
+ stringi::stri_paste("\\seealso{\n", seealso)
+ )
}
contents
diff --git a/devel/sphinx/_static/vignette/stringi.Rnw b/devel/sphinx/_static/vignette/stringi.Rnw
index c792ad400..f20c5e622 100644
--- a/devel/sphinx/_static/vignette/stringi.Rnw
+++ b/devel/sphinx/_static/vignette/stringi.Rnw
@@ -92,13 +92,16 @@ expressions, data cleansing, natural language processing, R}
\begin{document}
{\color{blue}
-This is a pre-print version of the paper on \pkg{stringi} (last updated on \today).
+This is an older pre-print version of the paper on \pkg{stringi}.
-Please cite as:
-Gagolewski M (2021).
+Please cite it as:
+Gagolewski M (2022).
\pkg{stringi}: Fast and Portable Character String Processing in \proglang{R}.
-\textit{Journal of Statistical Software}, to appear.
-URL \url{https://stringi.gagolewski.com}.
+\textit{Journal of Statistical Software} 103(2):1--59, 2022.
+DOI \url{https://dx.doi.org/10.18637/jss.v103.i02}.
+
+The most recent, Web browser-friendly version thereof
+is available at \url{https://stringi.gagolewski.com}.
}
diff --git a/devel/sphinx/_static/vignette/stringi.pdf b/devel/sphinx/_static/vignette/stringi.pdf
index 3833601f4..18fcc3c24 100644
Binary files a/devel/sphinx/_static/vignette/stringi.pdf and b/devel/sphinx/_static/vignette/stringi.pdf differ
diff --git a/devel/sphinx/_static/vignette/stringi.tex b/devel/sphinx/_static/vignette/stringi.tex
index 125151fd0..f3c4d58f2 100644
--- a/devel/sphinx/_static/vignette/stringi.tex
+++ b/devel/sphinx/_static/vignette/stringi.tex
@@ -7,7 +7,7 @@
% \documentclass[nojss]{jss}
-\documentclass[nojss]{jss}\usepackage[]{graphicx}\usepackage[]{color}
+\documentclass[nojss]{jss}\usepackage[]{graphicx}\usepackage[]{xcolor}
% maxwidth is the original width if it is less than linewidth
% otherwise use linewidth (to make sure the graphics do not exceed the margin)
\makeatletter
@@ -142,13 +142,16 @@
\begin{document}
{\color{blue}
-This is a pre-print version of the paper on \pkg{stringi} (last updated on \today).
+This is an older pre-print version of the paper on \pkg{stringi}.
-Please cite as:
-Gagolewski M (2021).
+Please cite it as:
+Gagolewski M (2022).
\pkg{stringi}: Fast and Portable Character String Processing in \proglang{R}.
-\textit{Journal of Statistical Software}, to appear.
-URL \url{https://stringi.gagolewski.com}.
+\textit{Journal of Statistical Software} 103(2):1--59, 2022.
+DOI \url{https://dx.doi.org/10.18637/jss.v103.i02}.
+
+The most recent, Web browser-friendly version thereof
+is available at \url{https://stringi.gagolewski.com}.
}
@@ -343,7 +346,7 @@ \section{Introduction}\label{Sec:intro}
\medskip
All code chunk outputs presented in this paper were obtained in
-\proglang{R}~4.1.1.
+\proglang{R}~4.1.2.
The \proglang{R} environment itself and all the packages used herein
are available from CRAN (the Comprehensive \proglang{R} Archive Network)
at \url{https://CRAN.R-project.org/}.
@@ -364,7 +367,7 @@ \section{Introduction}\label{Sec:intro}
\begin{knitrout}
\definecolor{shadecolor}{rgb}{0.969, 0.969, 0.969}\color{fgcolor}\begin{kframe}
\begin{verbatim}
-## stringi_1.7.4 (en_AU.UTF-8; ICU4C 69.1 [bundle]; Unicode 13.0)
+## stringi_1.7.7 (en_AU.UTF-8; ICU4C 69.1 [bundle]; Unicode 13.0)
\end{verbatim}
\end{kframe}
\end{knitrout}
@@ -374,7 +377,7 @@ \section{Introduction}\label{Sec:intro}
we can load and attach the package's namespace
and display some basic information thereon.
Hence, below we shall be working with
-\pkg{stringi} 1.7.4, however, as the package's
+\pkg{stringi} 1.7.7, however, as the package's
API is considered stable, the presented material should be relevant to
any later versions.
@@ -1276,11 +1279,11 @@ \subsection{Data flow}
\end{alltt}
\begin{verbatim}
## Unit: milliseconds
-## expr min lq mean median uq max neval
-## join2 35.520 36.950 49.907 37.359 48.415 92.546 100
-## join3 79.256 83.561 88.068 84.182 91.533 152.904 100
-## r_paste2 91.120 95.665 112.532 99.024 114.853 171.701 100
-## r_paste3 190.893 201.956 243.964 264.924 272.475 310.661 100
+## expr min lq mean median uq max neval
+## join2 38.709 41.789 56.454 44.435 54.954 112.79 100
+## join3 88.587 91.429 97.006 93.379 102.399 169.62 100
+## r_paste2 99.571 105.767 125.653 110.924 132.105 207.18 100
+## r_paste3 209.337 219.332 265.422 283.929 298.692 329.03 100
\end{verbatim}
\end{kframe}
\end{knitrout}
@@ -1303,19 +1306,19 @@ \subsection{Data flow}
\end{alltt}
\begin{verbatim}
## Unit: milliseconds
-## expr min lq mean median uq max neval
-## fixed 5.0097 5.173 5.2386 5.1945 5.2562 5.9541 100
-## regex 115.2815 116.205 117.2227 116.8521 118.1444 122.2590 100
-## coll 420.0686 424.757 426.5984 426.1803 427.9800 437.8152 100
-## r_tre 130.6598 132.098 133.2256 132.7224 133.6728 140.1607 100
-## r_pcre 76.3878 77.536 78.1141 77.9131 78.3949 81.2092 100
-## r_fixed 45.1295 45.738 46.0227 45.8944 46.1677 49.2137 100
+## expr min lq mean median uq max neval
+## fixed 5.4153 5.5769 5.7989 5.7007 5.9165 6.6195 100
+## regex 126.9775 129.5199 134.3005 131.5441 135.9457 158.8042 100
+## coll 404.0636 412.0106 428.2215 417.0754 435.0853 542.9017 100
+## r_tre 144.1503 147.8288 154.2366 150.4777 158.2060 220.8468 100
+## r_pcre 77.2884 79.2489 82.5763 80.6958 85.8608 97.1374 100
+## r_fixed 43.9071 44.8707 47.2532 45.8249 49.4261 56.0313 100
\end{verbatim}
\end{kframe}
\end{knitrout}
-\paragraph{Different default argument and greater configurability.}
+\paragraph{Different default arguments and greater configurability.}
Some functions in \pkg{stringi} have different,
more natural default arguments,
e.g., \code{paste()} has \code{sep=" "} but
@@ -2497,7 +2500,7 @@ \section{Regular expressions}\label{Sec:regex}
by \proglang{Java}'s \pkg{util.regex}
in \pkg{JDK 1.4}. Their syntax is mostly compatible with that of \pkg{PCRE},
although certain more advanced facets might not be supported (e.g., recursive
-patters). On the other hand, \pkg{ICU} regexes fully conform to the
+patterns). On the other hand, \pkg{ICU} regexes fully conform to the
Unicode Technical Standard \#18 \citep{uts18:regex} and hence provide
comprehensive support for Unicode.
@@ -2505,7 +2508,7 @@ \section{Regular expressions}\label{Sec:regex}
It is worth noting that most programming languages
as well as advanced text editors and development environments (including
\pkg{Kate}, \pkg{Eclipse}, \pkg{VSCode}, and \pkg{RStudio})
-support finding or replacing patters with regexes.
+support finding or replacing patterns with regexes.
Therefore, they should be amongst the instruments
at every data scientist's disposal.
One general introduction to regexes is \citep{friedl}.
@@ -2981,7 +2984,7 @@ \subsection{Alternating and grouping subexpressions}
Hence, if the left branch is a subset of the right one, the latter will
never be matched.
In particular, ``\code{(al|alga|algae)}'' can only match ``\code{al}''.
-To fix this, we can write ``\code{(algae|alga|al))}''.
+To fix this, we can write ``\code{(algae|alga|al)}''.
\paragraph{Non-grouping parentheses.}
@@ -3145,7 +3148,7 @@ \subsection{Quantifiers}
\begin{verbatim}
## stopped.
## user system elapsed
-## 16.818 0.000 16.823
+## 22.746 0.004 22.770
\end{verbatim}
\end{kframe}
\end{knitrout}
@@ -3689,7 +3692,7 @@ \subsection{Linear ordering of strings}
-Operators such that \code{\%s<\%}, \code{\%<=\%}, etc.,
+Operators such as \code{\%s<\%}, \code{\%<=\%}, etc.,
and the corresponding functions
\code{stri_cmp_lt()} (``less than''),
\code{stri_cmp_le()} (``less than or equal''), etc.,
@@ -4736,8 +4739,7 @@ \subsection{Character encodings}\label{Sec:encoding}
the ``\code{windows-1250}'' code page.
MacOS as well as most Linux boxes work with UTF-8 by default\footnote{
It is expected that future \proglang{R} releases will support UTF-8 natively
-thanks to the Universal \proglang{C} Runtime (UCRT) that is available for Windows 10.
-}
+thanks to the Universal \proglang{C} Runtime (UCRT) that is available for Windows 10.}.
All strings in \proglang{R} have an associated encoding mark
which can be read by calling \code{Encoding()} or, more conveniently,
@@ -4796,7 +4798,7 @@ \subsection{Reading and writing text files and converting between encodings}\lab
\paragraph{Detecting encoding.}
However, if a file's encoding is not known in advance, there are
a certain functions that can aid in encoding detection.
-First, we can read the resource in form of a raw-type vector:
+First, we can read the resource in the form of a raw-type vector:
\begin{knitrout}
diff --git a/devel/sphinx/bibliography.bib b/devel/sphinx/bibliography.bib
index 62dce0328..bcb3fda30 100644
--- a/devel/sphinx/bibliography.bib
+++ b/devel/sphinx/bibliography.bib
@@ -3,7 +3,10 @@ @article{stringi
title = {{stringi}: {F}ast and portable character string processing in {R}},
journal = {Journal of Statistical Software},
year = {2022},
- note = {to appear}
+ doi = {10.18637/jss.v103.i02},
+ volume = {103},
+ number = {2},
+ pages = {1--59}
}
@article{genome,
diff --git a/devel/sphinx/index.rst b/devel/sphinx/index.rst
index 514d647db..2f664dad7 100644
--- a/devel/sphinx/index.rst
+++ b/devel/sphinx/index.rst
@@ -1,5 +1,5 @@
-stringi: THE String Processing Package for R
-============================================
+stringi: Fast and Portable Character String Processing in R
+===========================================================
**stringi (pronounced “stringy”, IPA [strinɡi]) is THE R package
for very fast, portable, correct, consistent, and convenient string/text
@@ -9,9 +9,8 @@ stringi: THE String Processing Package for R
-Thanks to `ICU `_,
-*stringi* fully supports a wide range
-of `Unicode `_ standards
+Thanks to `ICU `_, *stringi* fully supports a wide
+range of `Unicode `_ standards
(see also `this video `_).
.. code-block:: r
@@ -73,8 +72,14 @@ The contributions from Bartłomiej Tartanus and
`many others `_
is greatly appreciated. Thanks!
-Also check out `stringx `_
-for a set of wrappers around *stringi* with a base R-compatible API.
+**See also**: `stringx `_ –
+a set of wrappers around *stringi* with a base R-compatible API.
+
+**Citation**: Gagolewski M.,
+*stringi*: Fast and portable character string processing in R,
+*Journal of Statistical Software* 103(2), 2022, 1–59,
+`doi:10.18637/jss.v103.i02 <https://dx.doi.org/10.18637/jss.v103.i02>`_.
+
.. COMMENT
@@ -134,6 +139,7 @@ for a set of wrappers around *stringi* with a base R-compatible API.
Source Code (GitHub)
Bug Tracker and Feature Suggestions
CRAN Entry
+ JStatSoft Paper
Author's Homepage
C++ API — Rcpp Example
news.md
diff --git a/devel/sphinx/install.md b/devel/sphinx/install.md
index ba4045edd..0fefbdf0a 100644
--- a/devel/sphinx/install.md
+++ b/devel/sphinx/install.md
@@ -14,10 +14,10 @@ install.packages("stringi")
However, due to the overwhelming complexity of the ICU4C library,
upon which *stringi* is based, and the colourful diversity of operating systems,
-their flavors, and particular setups, some users may still experience
+their flavours, and particular setups, some users may still experience
a few issues that hopefully can be resolved with the help of this short manual.
-Also, some additional build tweaks are possible in case we require a more
+Also, some additional build tweaks are possible if we require a more
customised installation.
@@ -32,25 +32,25 @@ If we install the package from sources and either:
the `libicu-devel` rpm on Fedora/CentOS/OpenSUSE,
`libicu-dev` on Ubuntu/Debian, etc.),
-* `pkg-config` is fails to find appropriate build settings
+* `pkg-config` fails to find appropriate build settings
for ICU-based projects, or
* `R CMD INSTALL` is called with the `--configure-args='--disable-pkg-config'`
- argument or environment variable `STRINGI_DISABLE_PKG_CONFIG` is
+ argument, or environment variable `STRINGI_DISABLE_PKG_CONFIG` is
set to non-zero or
`install.packages("stringi", configure.args="--disable-pkg-config")`
is executed,
then ICU will be built together with stringi.
A custom subset of ICU4C 69.1 is shipped with the package.
-We also include ICU4C 55.1 that can be used as a fallback version
+We also include ICU4C 55.1 which can be used as a fallback version
(e.g., on older Solaris boxes).
> To get the most out of stringi, you are strongly encouraged to rely on our
> ICU4C package bundle. This ensures maximum portability across all platforms
> (Windows and macOS users by default fetch the pre-compiled binaries
-> from CRAN built exactly this way).
+> from CRAN built precisely this way).
@@ -59,7 +59,7 @@ We also include ICU4C 55.1 that can be used as a fallback version
Note that if you choose to use our ICU4C bundle, then -- by default -- the
ICU data library will be downloaded from one of our mirror servers.
However, if you have already downloaded a version of `icudt*.zip` suitable
-for your platform (big/little endian), you may wish to install the
+for your platform (big/little-endian), you may wish to install the
package by calling:
```r
@@ -110,12 +110,13 @@ install.packages("stringi", configure.args="--with-extra-cxxflags='-std=c++11'")
```
Overall, your build chain may be misconfigured; check out,
-amongst others, `/etc/Makeconf`
-(e.g., are you using `-std=gnu++11` instead of `-std=c++11`?). Refer to
-https://cran.r-project.org/doc/manuals/r-release/R-admin.html for more details.
+amongst others, `/etc/Makeconf` (e.g., are you using
+`-std=gnu++11` instead of `-std=c++11`?). Refer to
+
+for more details.
-There is an option of using the fallback version of ICU4C 55.1
-which however requires the support of the `long long` type in a few functions,
+There is an option of using the fallback version of ICU4C 55.1.
+However, it requires the support of the `long long` type in a few functions,
(this is not part of the C++98 standard; works on Solaris, though). Try:
```r
@@ -154,7 +155,7 @@ Some influential environment variables:
path relative to `/src`; defaults to `icuXX/data`.
* `PKG_CONFIG_PATH`: An optional list of directories to search for
- `pkg-config`s `.pc` files.
+ `pkg-config`'s `.pc` files.
* `R_HOME`: Override the R directory, e.g.,
`/usr/lib64/R`. Note that `$R_HOME/bin/R` should point to the R executable.
@@ -162,15 +163,15 @@ Some influential environment variables:
* `CAT`: The `cat` command used to generate the list of source files to compile.
* `PKG_CONFIG`: The `pkg-config` command used to fetch the necessary compiler
- flags to link to and existing `libicu` installation.
+ flags to link to the existing `libicu` installation.
-* `STRINGI_DISABLE_CXX11`: Disable C++11,
+* `STRINGI_DISABLE_CXX11`: Disable C++11;
see also `--disable-cxx11`.
-* `STRINGI_DISABLE_PKG_CONFIG`: Compile ICU from sources,
+* `STRINGI_DISABLE_PKG_CONFIG`: Compile ICU from sources;
see also `--disable-pkg-config`.
-* `STRINGI_DISABLE_ICU_BUNDLE`: Enforce system ICU,
+* `STRINGI_DISABLE_ICU_BUNDLE`: Enforce system ICU;
see also `--disable-icu-bundle`.
* `STRINGI_CFLAGS`: see `--with-extra-cflags`.
@@ -190,7 +191,7 @@ Some influential environment variables:
We expect that with a correctly configured C++11 compiler and properly
installed system ICU4C distribution, you should face no problems
-with installing the package, especially if you use our ICU4C bundle and you
+installing the package, especially if you use our ICU4C bundle and
have working internet access.
If you do not manage to set up a successful stringi build, do not
diff --git a/devel/sphinx/news.md b/devel/sphinx/news.md
index 72f91b5bf..b12b54446 100644
--- a/devel/sphinx/news.md
+++ b/devel/sphinx/news.md
@@ -1,31 +1,19 @@
# What Is New in *stringi*
-## 1.7.6.9xxx (under development)
+## 1.7.7 (2022-07-02)
+
+* [DOCUMENTATION] Paper on *stringi* has been published in
+ the *Journal of Statistical Software*, see .
* [BUGFIX] #473, #397: Fixed buffer overflow in `stri_dup`.
`stri_dup`, `stri_paste`, ... fail more gracefully on attempts to
generate strings of length >= 2^31 each.
-* [NEW FEATURE] TODO.... #469: `stri_datetime_parse` .. new argument -
-`default_time`
- a Calendar set on input to the date and time to be used for missing values in the date/time string being parsed
-
-* [BUGFIX] TODO.... #469: `stri_datetime_parse` did not reset the `Calendar` object
- when parsing multiple dates.
-
-* [NEW FEATURE] TODO... #476 U_USING_DEFAULT_ERROR on unknown locales
-
-* [NEW FEATURE] TODO... #81 number format
-
-* [NEW FEATURE] TODO... #477 sprintf localised number format
-
-* [NEW FEATURE] #471: split into overlapping or non-overlapping chunks,
- possibly of different lengths
-
-* [DOCUMENTATION] #462...
-
+* [BUILD TIME] #480: Using `Rf_isNull` instead of `isNull`.
+* [DOCUMENTATION] #462: That the `numeric=TRUE` collator
+ does not handle negative numbers correctly is now mentioned in the manual.
## 1.7.6 (2021-11-29)
@@ -356,9 +344,9 @@ documentation object `stri_datetime_format`: `...`
* [BUGFIX] #319: Fixed overflow in `stri_rand_shuffle()`.
-* [BUGFIX] #337: Empty search patters in search functions (e.g.,
+* [BUGFIX] #337: Empty search patterns in search functions (e.g.,
`stri_split_regex()` and `stri_count_fixed()`) used to raise
- too many warnings on empty search patters.
+ too many warnings on empty search patterns.
## 1.2.4 (2018-07-20)
diff --git a/devel/sphinx/rapi/about_arguments.md b/devel/sphinx/rapi/about_arguments.md
index 431aa9731..609fbb732 100644
--- a/devel/sphinx/rapi/about_arguments.md
+++ b/devel/sphinx/rapi/about_arguments.md
@@ -40,4 +40,6 @@ Generally, all our functions drop input objects\' attributes (e.g., [`names`](ht
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other stringi_general_topics: [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
diff --git a/devel/sphinx/rapi/about_encoding.md b/devel/sphinx/rapi/about_encoding.md
index 853d21d6b..0f1a08eb6 100644
--- a/devel/sphinx/rapi/about_encoding.md
+++ b/devel/sphinx/rapi/about_encoding.md
@@ -96,6 +96,8 @@ Check out [`stri_enc_detect`](stri_enc_detect.md) (among others) for a useful fu
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other stringi_general_topics: [`about_arguments`](about_arguments.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
Other encoding_management: [`stri_enc_info()`](stri_enc_info.md), [`stri_enc_list()`](stri_enc_list.md), [`stri_enc_mark()`](stri_enc_mark.md), [`stri_enc_set()`](stri_enc_set.md)
diff --git a/devel/sphinx/rapi/about_locale.md b/devel/sphinx/rapi/about_locale.md
index 74f365bfe..051b1e85e 100644
--- a/devel/sphinx/rapi/about_locale.md
+++ b/devel/sphinx/rapi/about_locale.md
@@ -50,6 +50,8 @@ Other locale-sensitive functions include, e.g., [`stri_trans_tolower`](stri_tran
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other locale_management: [`stri_locale_info()`](stri_locale_info.md), [`stri_locale_list()`](stri_locale_list.md), [`stri_locale_set()`](stri_locale_set.md)
Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
diff --git a/devel/sphinx/rapi/about_search.md b/devel/sphinx/rapi/about_search.md
index 74226fc79..816032a02 100644
--- a/devel/sphinx/rapi/about_search.md
+++ b/devel/sphinx/rapi/about_search.md
@@ -44,6 +44,8 @@ Each search engine is able to perform many search-based operations. These may in
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other text_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
Other search_regex: [`about_search_regex`](about_search_regex.md), [`stri_opts_regex()`](stri_opts_regex.md)
diff --git a/devel/sphinx/rapi/about_search_boundaries.md b/devel/sphinx/rapi/about_search_boundaries.md
index fa153ef0f..aa74cc838 100644
--- a/devel/sphinx/rapi/about_search_boundaries.md
+++ b/devel/sphinx/rapi/about_search_boundaries.md
@@ -44,6 +44,8 @@ For technical details on different classes of text boundaries refer to the stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
Other text_boundaries: [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
diff --git a/devel/sphinx/rapi/about_search_charclass.md b/devel/sphinx/rapi/about_search_charclass.md
index 18239b858..bb98e5ca3 100644
--- a/devel/sphinx/rapi/about_search_charclass.md
+++ b/devel/sphinx/rapi/about_search_charclass.md
@@ -446,6 +446,8 @@ Therefore, a POSIX flavor of `[:punct:]` is more like `[\p{P}\p{S}]` in stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_charclass: [`about_search`](about_search.md), [`stri_trim_both()`](stri_trim.md)
Other stringi_general_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
diff --git a/devel/sphinx/rapi/about_search_coll.md b/devel/sphinx/rapi/about_search_coll.md
index 56f66b168..7ff318468 100644
--- a/devel/sphinx/rapi/about_search_coll.md
+++ b/devel/sphinx/rapi/about_search_coll.md
@@ -28,6 +28,8 @@ L. Werner, *Efficient Text Searching in Java*, 1999, stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_coll: [`about_search`](about_search.md), [`stri_opts_collator()`](stri_opts_collator.md)
Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
diff --git a/devel/sphinx/rapi/about_search_fixed.md b/devel/sphinx/rapi/about_search_fixed.md
index 76d761956..9cf8c75ff 100644
--- a/devel/sphinx/rapi/about_search_fixed.md
+++ b/devel/sphinx/rapi/about_search_fixed.md
@@ -30,6 +30,8 @@ Note that the conversion of input data to Unicode is done as usual.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_fixed: [`about_search`](about_search.md), [`stri_opts_fixed()`](stri_opts_fixed.md)
Other stringi_general_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
diff --git a/devel/sphinx/rapi/about_search_regex.md b/devel/sphinx/rapi/about_search_regex.md
index 5db8bc699..68e6fc2e2 100644
--- a/devel/sphinx/rapi/about_search_regex.md
+++ b/devel/sphinx/rapi/about_search_regex.md
@@ -364,6 +364,8 @@ J.E.F. Friedl, *Mastering Regular Expressions*, O\'Reilly, 2002
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_regex: [`about_search`](about_search.md), [`stri_opts_regex()`](stri_opts_regex.md)
Other stringi_general_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
diff --git a/devel/sphinx/rapi/about_stringi.md b/devel/sphinx/rapi/about_stringi.md
index 8bba31c3d..9d3099f97 100644
--- a/devel/sphinx/rapi/about_stringi.md
+++ b/devel/sphinx/rapi/about_stringi.md
@@ -1,4 +1,4 @@
-# about_stringi: THE String Processing Package
+# about_stringi: Fast and Portable Character String Processing in R
## Description
@@ -48,7 +48,7 @@ Refer to the following:
- [`stri_trim`](stri_trim.md) (among others) for trimming characters from the beginning or/and end of a string, see also [about_search_charclass](about_search_charclass.md), and [`stri_pad`](stri_pad.md) for padding strings so that they are of the same width. Additionally, [`stri_wrap`](stri_wrap.md) wraps text into lines.
-- [`stri_trans_tolower`](stri_trans_casemap.md) (among others) for case mapping, i.e., conversion to lower, UPPER, or Title Case, [`stri_trans_nfc`](stri_trans_nf.md) (among others) for Unicode normalization, [`stri_trans_char`](stri_trans_char.md) for translating individual code points, and [`stri_trans_general`](stri_trans_general.md) for other universal yet powerful text transforms, including transliteration.
+- [`stri_trans_tolower`](stri_trans_casemap.md) (among others) for case mapping, i.e., conversion to lower, UPPER, or Title Case, [`stri_trans_nfc`](stri_trans_nf.md) (among others) for Unicode normalization, [`stri_trans_char`](stri_trans_char.md) for translating individual code points, and [`stri_trans_general`](stri_trans_general.md) for other universal text transforms, including transliteration.
- [`stri_cmp`](stri_compare.md), [`%s<%`](+25s+3C+25.md), [`stri_order`](stri_order.md), [`stri_sort`](stri_sort.md), [`stri_rank`](stri_rank.md), [`stri_unique`](stri_unique.md), and [`stri_duplicated`](stri_duplicated.md) for collation-based, locale-aware operations, see also [about_locale](about_locale.md).
@@ -68,7 +68,9 @@ Marek Gagolewski, with contributions from Bartek Tartanus and many others. ICU4C
## References
-*stringi Package homepage*,
+*stringi Package Homepage*,
+
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
*ICU -- International Components for Unicode*,
@@ -76,10 +78,12 @@ Marek Gagolewski, with contributions from Bartek Tartanus and many others. ICU4C
*The Unicode Consortium*,
-*UTF-8, a transformation format of ISO 10646* -- RFC 3629,
+*UTF-8, A Transformation Format of ISO 10646* -- RFC 3629,
## See Also
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other stringi_general_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md)
diff --git a/devel/sphinx/rapi/operator_add.md b/devel/sphinx/rapi/operator_add.md
index 0394c88cf..8b4415df9 100644
--- a/devel/sphinx/rapi/operator_add.md
+++ b/devel/sphinx/rapi/operator_add.md
@@ -37,6 +37,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other join: [`stri_dup()`](stri_dup.md), [`stri_flatten()`](stri_flatten.md), [`stri_join_list()`](stri_join_list.md), [`stri_join()`](stri_join.md)
## Examples
diff --git a/devel/sphinx/rapi/operator_compare.md b/devel/sphinx/rapi/operator_compare.md
index 4649e7972..cf1db6192 100644
--- a/devel/sphinx/rapi/operator_compare.md
+++ b/devel/sphinx/rapi/operator_compare.md
@@ -66,6 +66,8 @@ All the functions return a logical vector indicating the result of a pairwise co
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other locale_sensitive: [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/devel/sphinx/rapi/operator_dollar.md b/devel/sphinx/rapi/operator_dollar.md
index 23133e2c8..5b00409d3 100644
--- a/devel/sphinx/rapi/operator_dollar.md
+++ b/devel/sphinx/rapi/operator_dollar.md
@@ -39,6 +39,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other length: [`stri_isempty()`](stri_isempty.md), [`stri_length()`](stri_length.md), [`stri_numbytes()`](stri_numbytes.md), [`stri_pad_both()`](stri_pad.md), [`stri_sprintf()`](stri_sprintf.md), [`stri_width()`](stri_width.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_compare.md b/devel/sphinx/rapi/stri_compare.md
index 59af81f06..c67d882bb 100644
--- a/devel/sphinx/rapi/stri_compare.md
+++ b/devel/sphinx/rapi/stri_compare.md
@@ -68,6 +68,8 @@ All the other functions return a logical vector that indicates whether a given r
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_count.md b/devel/sphinx/rapi/stri_count.md
index 3d90e3ebd..176b1d8c6 100644
--- a/devel/sphinx/rapi/stri_count.md
+++ b/devel/sphinx/rapi/stri_count.md
@@ -47,6 +47,8 @@ All the functions return an integer vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_count: [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_count_boundaries.md b/devel/sphinx/rapi/stri_count_boundaries.md
index b509614b6..9605ce658 100644
--- a/devel/sphinx/rapi/stri_count_boundaries.md
+++ b/devel/sphinx/rapi/stri_count_boundaries.md
@@ -45,6 +45,8 @@ Both functions return an integer vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_count: [`about_search`](about_search.md), [`stri_count()`](stri_count.md)
Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
diff --git a/devel/sphinx/rapi/stri_datetime_add.md b/devel/sphinx/rapi/stri_datetime_add.md
index 9529c7aeb..10619dbab 100644
--- a/devel/sphinx/rapi/stri_datetime_add.md
+++ b/devel/sphinx/rapi/stri_datetime_add.md
@@ -52,6 +52,8 @@ The replacement version of `stri_datetime_add` modifies the state of the `time`
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
## Examples
@@ -63,9 +65,9 @@ Other datetime: [`stri_datetime_create()`](stri_datetime_create.md), [`stri_date
x <- stri_datetime_now()
stri_datetime_add(x, units='months') <- 2
print(x)
-## [1] "2022-08-13 15:50:57 AEST"
+## [1] "2022-09-02 17:40:19 AEST"
stri_datetime_add(x, -2, units='months')
-## [1] "2022-06-13 15:50:57 AEST"
+## [1] "2022-07-02 17:40:19 AEST"
stri_datetime_add(stri_datetime_create(2014, 4, 20), 1, units='years')
## [1] "2015-04-20 12:00:00 AEST"
stri_datetime_add(stri_datetime_create(2014, 4, 20), 1, units='years', locale='@calendar=hebrew')
diff --git a/devel/sphinx/rapi/stri_datetime_create.md b/devel/sphinx/rapi/stri_datetime_create.md
index fb43c7735..b46c677ab 100644
--- a/devel/sphinx/rapi/stri_datetime_create.md
+++ b/devel/sphinx/rapi/stri_datetime_create.md
@@ -50,6 +50,8 @@ Returns an object of class [`POSIXct`](https://stat.ethz.ch/R-manual/R-devel/lib
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_datetime_fields.md b/devel/sphinx/rapi/stri_datetime_fields.md
index ebed1ce6a..6352db775 100644
--- a/devel/sphinx/rapi/stri_datetime_fields.md
+++ b/devel/sphinx/rapi/stri_datetime_fields.md
@@ -62,6 +62,8 @@ Returns a data frame with the following columns:
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
## Examples
@@ -72,16 +74,16 @@ Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_c
```r
stri_datetime_fields(stri_datetime_now())
## Year Month Day Hour Minute Second Millisecond WeekOfYear WeekOfMonth
-## 1 2022 6 13 15 50 57 879 25 3
+## 1 2022 7 2 17 40 19 644 27 1
## DayOfYear DayOfWeek Hour12 AmPm Era
-## 1 164 2 3 2 2
+## 1 183 7 5 2 2
stri_datetime_fields(stri_datetime_now(), locale='@calendar=hebrew')
## Year Month Day Hour Minute Second Millisecond WeekOfYear WeekOfMonth
-## 1 5782 10 14 15 50 57 882 41 3
+## 1 5782 11 3 17 40 19 647 43 1
## DayOfYear DayOfWeek Hour12 AmPm Era
-## 1 280 2 3 2 1
+## 1 299 7 5 2 1
stri_datetime_symbols(locale='@calendar=hebrew')$Month[
stri_datetime_fields(stri_datetime_now(), locale='@calendar=hebrew')$Month
]
-## [1] "Sivan"
+## [1] "Tamuz"
```
diff --git a/devel/sphinx/rapi/stri_datetime_format.md b/devel/sphinx/rapi/stri_datetime_format.md
index 38bf0513c..2f1d23140 100644
--- a/devel/sphinx/rapi/stri_datetime_format.md
+++ b/devel/sphinx/rapi/stri_datetime_format.md
@@ -174,6 +174,8 @@ A few examples:
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
## Examples
@@ -184,13 +186,13 @@ Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_c
```r
x <- c('2015-02-28', '2015-02-29')
stri_datetime_parse(x, 'yyyy-MM-dd')
-## [1] "2015-02-28 15:50:58 AEDT" NA
+## [1] "2015-02-28 17:40:19 AEDT" NA
stri_datetime_parse(x, 'yyyy-MM-dd', lenient=TRUE)
-## [1] "2015-02-28 15:50:58 AEDT" "2015-03-01 15:50:58 AEDT"
+## [1] "2015-02-28 17:40:19 AEDT" "2015-03-01 17:40:19 AEDT"
stri_datetime_parse(x %s+% " 00:00:00", "yyyy-MM-dd HH:mm:ss")
## [1] "2015-02-28 00:00:00 AEDT" NA
stri_datetime_parse('19 lipca 2015', 'date_long', locale='pl_PL')
-## [1] "2015-07-19 15:50:58 AEST"
+## [1] "2015-07-19 17:40:19 AEST"
stri_datetime_format(stri_datetime_now(), 'datetime_relative_medium')
-## [1] "today, 3:50:58 pm"
+## [1] "today, 5:40:19 pm"
```
diff --git a/devel/sphinx/rapi/stri_datetime_fstr.md b/devel/sphinx/rapi/stri_datetime_fstr.md
index 359294bfb..458bbbfdc 100644
--- a/devel/sphinx/rapi/stri_datetime_fstr.md
+++ b/devel/sphinx/rapi/stri_datetime_fstr.md
@@ -35,6 +35,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_datetime_now.md b/devel/sphinx/rapi/stri_datetime_now.md
index 542864ba8..87ee87974 100644
--- a/devel/sphinx/rapi/stri_datetime_now.md
+++ b/devel/sphinx/rapi/stri_datetime_now.md
@@ -26,4 +26,6 @@ Returns an object of class [`POSIXct`](https://stat.ethz.ch/R-manual/R-devel/lib
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
diff --git a/devel/sphinx/rapi/stri_datetime_symbols.md b/devel/sphinx/rapi/stri_datetime_symbols.md
index a292df962..b4e56c778 100644
--- a/devel/sphinx/rapi/stri_datetime_symbols.md
+++ b/devel/sphinx/rapi/stri_datetime_symbols.md
@@ -52,6 +52,8 @@ Returns a list with the following named components:
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_detect.md b/devel/sphinx/rapi/stri_detect.md
index b95b1a02a..4a36dcde2 100644
--- a/devel/sphinx/rapi/stri_detect.md
+++ b/devel/sphinx/rapi/stri_detect.md
@@ -74,6 +74,8 @@ Each function returns a logical vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_detect: [`about_search`](about_search.md), [`stri_startswith()`](stri_startsendswith.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_dup.md b/devel/sphinx/rapi/stri_dup.md
index b70963997..250ffb747 100644
--- a/devel/sphinx/rapi/stri_dup.md
+++ b/devel/sphinx/rapi/stri_dup.md
@@ -39,6 +39,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other join: [`%s+%()`](+25s+2B+25.md), [`stri_flatten()`](stri_flatten.md), [`stri_join_list()`](stri_join_list.md), [`stri_join()`](stri_join.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_duplicated.md b/devel/sphinx/rapi/stri_duplicated.md
index 4669de961..762c65594 100644
--- a/devel/sphinx/rapi/stri_duplicated.md
+++ b/devel/sphinx/rapi/stri_duplicated.md
@@ -62,6 +62,8 @@ See also [`stri_unique`](stri_unique.md) for extracting unique elements.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_enc_detect.md b/devel/sphinx/rapi/stri_enc_detect.md
index fe3f20eeb..f651c9f36 100644
--- a/devel/sphinx/rapi/stri_enc_detect.md
+++ b/devel/sphinx/rapi/stri_enc_detect.md
@@ -92,6 +92,8 @@ The guesses are ordered by decreasing confidence.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other encoding_detection: [`about_encoding`](about_encoding.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_enc_isascii()`](stri_enc_isascii.md), [`stri_enc_isutf16be()`](stri_enc_isutf16.md), [`stri_enc_isutf8()`](stri_enc_isutf8.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_enc_detect2.md b/devel/sphinx/rapi/stri_enc_detect2.md
index 6b78e4af4..9ae05bb2c 100644
--- a/devel/sphinx/rapi/stri_enc_detect2.md
+++ b/devel/sphinx/rapi/stri_enc_detect2.md
@@ -49,6 +49,8 @@ The guesses are ordered by decreasing confidence.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
Other encoding_detection: [`about_encoding`](about_encoding.md), [`stri_enc_detect()`](stri_enc_detect.md), [`stri_enc_isascii()`](stri_enc_isascii.md), [`stri_enc_isutf16be()`](stri_enc_isutf16.md), [`stri_enc_isutf8()`](stri_enc_isutf8.md)
diff --git a/devel/sphinx/rapi/stri_enc_fromutf32.md b/devel/sphinx/rapi/stri_enc_fromutf32.md
index f195a4741..cd85b7757 100644
--- a/devel/sphinx/rapi/stri_enc_fromutf32.md
+++ b/devel/sphinx/rapi/stri_enc_fromutf32.md
@@ -38,4 +38,6 @@ Returns a character vector (in UTF-8). `NULL`s in the input list are converted t
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other encoding_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_toascii()`](stri_enc_toascii.md), [`stri_enc_tonative()`](stri_enc_tonative.md), [`stri_enc_toutf32()`](stri_enc_toutf32.md), [`stri_enc_toutf8()`](stri_enc_toutf8.md), [`stri_encode()`](stri_encode.md)
diff --git a/devel/sphinx/rapi/stri_enc_info.md b/devel/sphinx/rapi/stri_enc_info.md
index 4cec4f8d1..c38e1ef25 100644
--- a/devel/sphinx/rapi/stri_enc_info.md
+++ b/devel/sphinx/rapi/stri_enc_info.md
@@ -48,4 +48,6 @@ Returns a list with the following components:
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other encoding_management: [`about_encoding`](about_encoding.md), [`stri_enc_list()`](stri_enc_list.md), [`stri_enc_mark()`](stri_enc_mark.md), [`stri_enc_set()`](stri_enc_set.md)
diff --git a/devel/sphinx/rapi/stri_enc_isascii.md b/devel/sphinx/rapi/stri_enc_isascii.md
index 87af5b9a9..e91d7a006 100644
--- a/devel/sphinx/rapi/stri_enc_isascii.md
+++ b/devel/sphinx/rapi/stri_enc_isascii.md
@@ -32,6 +32,8 @@ Returns a logical vector. The i-th element indicates whether the i-th string cor
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other encoding_detection: [`about_encoding`](about_encoding.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_enc_detect()`](stri_enc_detect.md), [`stri_enc_isutf16be()`](stri_enc_isutf16.md), [`stri_enc_isutf8()`](stri_enc_isutf8.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_enc_isutf16.md b/devel/sphinx/rapi/stri_enc_isutf16.md
index e57eadb8c..9c7488936 100644
--- a/devel/sphinx/rapi/stri_enc_isutf16.md
+++ b/devel/sphinx/rapi/stri_enc_isutf16.md
@@ -42,4 +42,6 @@ Returns a logical vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other encoding_detection: [`about_encoding`](about_encoding.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_enc_detect()`](stri_enc_detect.md), [`stri_enc_isascii()`](stri_enc_isascii.md), [`stri_enc_isutf8()`](stri_enc_isutf8.md)
diff --git a/devel/sphinx/rapi/stri_enc_isutf8.md b/devel/sphinx/rapi/stri_enc_isutf8.md
index 5d8706d13..3490d7399 100644
--- a/devel/sphinx/rapi/stri_enc_isutf8.md
+++ b/devel/sphinx/rapi/stri_enc_isutf8.md
@@ -36,6 +36,8 @@ Returns a logical vector. Its i-th element indicates whether the i-th string cor
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other encoding_detection: [`about_encoding`](about_encoding.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_enc_detect()`](stri_enc_detect.md), [`stri_enc_isascii()`](stri_enc_isascii.md), [`stri_enc_isutf16be()`](stri_enc_isutf16.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_enc_list.md b/devel/sphinx/rapi/stri_enc_list.md
index c09d947dc..620b8ccce 100644
--- a/devel/sphinx/rapi/stri_enc_list.md
+++ b/devel/sphinx/rapi/stri_enc_list.md
@@ -34,6 +34,8 @@ If `simplify` is `TRUE` (the default), then the resulting list is coerced to a c
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other encoding_management: [`about_encoding`](about_encoding.md), [`stri_enc_info()`](stri_enc_info.md), [`stri_enc_mark()`](stri_enc_mark.md), [`stri_enc_set()`](stri_enc_set.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_enc_mark.md b/devel/sphinx/rapi/stri_enc_mark.md
index 920eb3876..5699a49cd 100644
--- a/devel/sphinx/rapi/stri_enc_mark.md
+++ b/devel/sphinx/rapi/stri_enc_mark.md
@@ -38,4 +38,6 @@ This gives exactly the same data that is used by all the functions in stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other encoding_management: [`about_encoding`](about_encoding.md), [`stri_enc_info()`](stri_enc_info.md), [`stri_enc_list()`](stri_enc_list.md), [`stri_enc_set()`](stri_enc_set.md)
diff --git a/devel/sphinx/rapi/stri_enc_set.md b/devel/sphinx/rapi/stri_enc_set.md
index 0bbb622f4..e1c480e42 100644
--- a/devel/sphinx/rapi/stri_enc_set.md
+++ b/devel/sphinx/rapi/stri_enc_set.md
@@ -42,4 +42,6 @@ If you set a default encoding that is neither a superset of ASCII, nor an 8-bit
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other encoding_management: [`about_encoding`](about_encoding.md), [`stri_enc_info()`](stri_enc_info.md), [`stri_enc_list()`](stri_enc_list.md), [`stri_enc_mark()`](stri_enc_mark.md)
diff --git a/devel/sphinx/rapi/stri_enc_toascii.md b/devel/sphinx/rapi/stri_enc_toascii.md
index 90ace571d..c6d3e1868 100644
--- a/devel/sphinx/rapi/stri_enc_toascii.md
+++ b/devel/sphinx/rapi/stri_enc_toascii.md
@@ -36,4 +36,6 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other encoding_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_fromutf32()`](stri_enc_fromutf32.md), [`stri_enc_tonative()`](stri_enc_tonative.md), [`stri_enc_toutf32()`](stri_enc_toutf32.md), [`stri_enc_toutf8()`](stri_enc_toutf8.md), [`stri_encode()`](stri_encode.md)
diff --git a/devel/sphinx/rapi/stri_enc_tonative.md b/devel/sphinx/rapi/stri_enc_tonative.md
index a6d93249f..ff07199f9 100644
--- a/devel/sphinx/rapi/stri_enc_tonative.md
+++ b/devel/sphinx/rapi/stri_enc_tonative.md
@@ -34,4 +34,6 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other encoding_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_fromutf32()`](stri_enc_fromutf32.md), [`stri_enc_toascii()`](stri_enc_toascii.md), [`stri_enc_toutf32()`](stri_enc_toutf32.md), [`stri_enc_toutf8()`](stri_enc_toutf8.md), [`stri_encode()`](stri_encode.md)
diff --git a/devel/sphinx/rapi/stri_enc_toutf32.md b/devel/sphinx/rapi/stri_enc_toutf32.md
index f65d1682a..e80becbee 100644
--- a/devel/sphinx/rapi/stri_enc_toutf32.md
+++ b/devel/sphinx/rapi/stri_enc_toutf32.md
@@ -36,4 +36,6 @@ Returns a list of integer vectors. Missing values are converted to `NULL`s.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other encoding_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_fromutf32()`](stri_enc_fromutf32.md), [`stri_enc_toascii()`](stri_enc_toascii.md), [`stri_enc_tonative()`](stri_enc_tonative.md), [`stri_enc_toutf8()`](stri_enc_toutf8.md), [`stri_encode()`](stri_encode.md)
diff --git a/devel/sphinx/rapi/stri_enc_toutf8.md b/devel/sphinx/rapi/stri_enc_toutf8.md
index 1215ba0d1..b3faee679 100644
--- a/devel/sphinx/rapi/stri_enc_toutf8.md
+++ b/devel/sphinx/rapi/stri_enc_toutf8.md
@@ -42,4 +42,6 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other encoding_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_fromutf32()`](stri_enc_fromutf32.md), [`stri_enc_toascii()`](stri_enc_toascii.md), [`stri_enc_tonative()`](stri_enc_tonative.md), [`stri_enc_toutf32()`](stri_enc_toutf32.md), [`stri_encode()`](stri_encode.md)
diff --git a/devel/sphinx/rapi/stri_encode.md b/devel/sphinx/rapi/stri_encode.md
index 0e1662e3a..d9bced7f9 100644
--- a/devel/sphinx/rapi/stri_encode.md
+++ b/devel/sphinx/rapi/stri_encode.md
@@ -57,4 +57,6 @@ If `to_raw` is `FALSE`, then a character vector with encoded strings (and approp
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other encoding_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_fromutf32()`](stri_enc_fromutf32.md), [`stri_enc_toascii()`](stri_enc_toascii.md), [`stri_enc_tonative()`](stri_enc_tonative.md), [`stri_enc_toutf32()`](stri_enc_toutf32.md), [`stri_enc_toutf8()`](stri_enc_toutf8.md)
diff --git a/devel/sphinx/rapi/stri_escape_unicode.md b/devel/sphinx/rapi/stri_escape_unicode.md
index 3e4591684..61931b9f8 100644
--- a/devel/sphinx/rapi/stri_escape_unicode.md
+++ b/devel/sphinx/rapi/stri_escape_unicode.md
@@ -34,6 +34,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other escape: [`stri_unescape_unicode()`](stri_unescape_unicode.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_extract.md b/devel/sphinx/rapi/stri_extract.md
index 77e2a59bc..2b4c8ed5f 100644
--- a/devel/sphinx/rapi/stri_extract.md
+++ b/devel/sphinx/rapi/stri_extract.md
@@ -116,6 +116,8 @@ Note that `stri_extract_last_regex` searches from start to end, but skips overla
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_extract: [`about_search`](about_search.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_match_all()`](stri_match.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_extract_boundaries.md b/devel/sphinx/rapi/stri_extract_boundaries.md
index 449c13b90..a1e52c666 100644
--- a/devel/sphinx/rapi/stri_extract_boundaries.md
+++ b/devel/sphinx/rapi/stri_extract_boundaries.md
@@ -66,6 +66,8 @@ For `stri_extract_first_*` and `stri_extract_last_*`, a character vector is retu
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_extract: [`about_search`](about_search.md), [`stri_extract_all()`](stri_extract.md), [`stri_match_all()`](stri_match.md)
Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
diff --git a/devel/sphinx/rapi/stri_flatten.md b/devel/sphinx/rapi/stri_flatten.md
index ba17a87d3..f31627b44 100644
--- a/devel/sphinx/rapi/stri_flatten.md
+++ b/devel/sphinx/rapi/stri_flatten.md
@@ -39,6 +39,8 @@ Returns a single string, i.e., a character vector of length 1.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other join: [`%s+%()`](+25s+2B+25.md), [`stri_dup()`](stri_dup.md), [`stri_join_list()`](stri_join_list.md), [`stri_join()`](stri_join.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_info.md b/devel/sphinx/rapi/stri_info.md
index fb2b73077..a84c199ea 100644
--- a/devel/sphinx/rapi/stri_info.md
+++ b/devel/sphinx/rapi/stri_info.md
@@ -43,3 +43,5 @@ Otherwise, a list with the following components is returned:
## See Also
The official online manual of stringi at
+
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
diff --git a/devel/sphinx/rapi/stri_isempty.md b/devel/sphinx/rapi/stri_isempty.md
index 69a118a31..55de7b34d 100644
--- a/devel/sphinx/rapi/stri_isempty.md
+++ b/devel/sphinx/rapi/stri_isempty.md
@@ -32,6 +32,8 @@ Returns a logical vector of the same length as `str`.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other length: [`%s$%()`](+25s+24+25.md), [`stri_length()`](stri_length.md), [`stri_numbytes()`](stri_numbytes.md), [`stri_pad_both()`](stri_pad.md), [`stri_sprintf()`](stri_sprintf.md), [`stri_width()`](stri_width.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_join.md b/devel/sphinx/rapi/stri_join.md
index 8404ca134..c1b818977 100644
--- a/devel/sphinx/rapi/stri_join.md
+++ b/devel/sphinx/rapi/stri_join.md
@@ -47,6 +47,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other join: [`%s+%()`](+25s+2B+25.md), [`stri_dup()`](stri_dup.md), [`stri_flatten()`](stri_flatten.md), [`stri_join_list()`](stri_join_list.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_join_list.md b/devel/sphinx/rapi/stri_join_list.md
index 093b77172..1b929ddd9 100644
--- a/devel/sphinx/rapi/stri_join_list.md
+++ b/devel/sphinx/rapi/stri_join_list.md
@@ -42,6 +42,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other join: [`%s+%()`](+25s+2B+25.md), [`stri_dup()`](stri_dup.md), [`stri_flatten()`](stri_flatten.md), [`stri_join()`](stri_join.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_length.md b/devel/sphinx/rapi/stri_length.md
index f3ee31b89..0d327113e 100644
--- a/devel/sphinx/rapi/stri_length.md
+++ b/devel/sphinx/rapi/stri_length.md
@@ -36,6 +36,8 @@ Returns an integer vector of the same length as `str`.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other length: [`%s$%()`](+25s+24+25.md), [`stri_isempty()`](stri_isempty.md), [`stri_numbytes()`](stri_numbytes.md), [`stri_pad_both()`](stri_pad.md), [`stri_sprintf()`](stri_sprintf.md), [`stri_width()`](stri_width.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_list2matrix.md b/devel/sphinx/rapi/stri_list2matrix.md
index d4f7d9edf..7e199163e 100644
--- a/devel/sphinx/rapi/stri_list2matrix.md
+++ b/devel/sphinx/rapi/stri_list2matrix.md
@@ -48,6 +48,8 @@ Returns a character matrix.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other utils: [`stri_na2empty()`](stri_na2empty.md), [`stri_remove_empty()`](stri_remove_empty.md), [`stri_replace_na()`](stri_replace_na.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_locale_info.md b/devel/sphinx/rapi/stri_locale_info.md
index 47bf2740d..2eed39c0f 100644
--- a/devel/sphinx/rapi/stri_locale_info.md
+++ b/devel/sphinx/rapi/stri_locale_info.md
@@ -34,6 +34,8 @@ Returns a list with the following named character strings: `Language`, `Country`
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other locale_management: [`about_locale`](about_locale.md), [`stri_locale_list()`](stri_locale_list.md), [`stri_locale_set()`](stri_locale_set.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_locale_list.md b/devel/sphinx/rapi/stri_locale_list.md
index 75cbb4c9f..0856c9b20 100644
--- a/devel/sphinx/rapi/stri_locale_list.md
+++ b/devel/sphinx/rapi/stri_locale_list.md
@@ -28,6 +28,8 @@ Returns a character vector with locale identifiers that are known to stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other locale_management: [`about_locale`](about_locale.md), [`stri_locale_info()`](stri_locale_info.md), [`stri_locale_set()`](stri_locale_set.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_locale_set.md b/devel/sphinx/rapi/stri_locale_set.md
index e6809087a..74fdc9e86 100644
--- a/devel/sphinx/rapi/stri_locale_set.md
+++ b/devel/sphinx/rapi/stri_locale_set.md
@@ -38,6 +38,8 @@ See [stringi-locale](about_locale.md) for more information on the effect of chan
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other locale_management: [`about_locale`](about_locale.md), [`stri_locale_info()`](stri_locale_info.md), [`stri_locale_list()`](stri_locale_list.md)
## Examples
@@ -48,12 +50,12 @@ Other locale_management: [`about_locale`](about_locale.md), [`stri_locale_info()
```r
## Not run:
oldloc <- stri_locale_set('pt_BR')
-## You are now working with stringi_1.7.6.9002 (pt_BR.UTF-8; ICU4C 69.1 [bundle]; Unicode 13.0)
+## You are now working with stringi_1.7.7 (pt_BR.UTF-8; ICU4C 69.1 [bundle]; Unicode 13.0)
# ... some locale-dependent operations
# ... note that you may always modify a locale per-call
# ... changing the default locale is convenient if you perform
# ... many operations
stri_locale_set(oldloc) # restore the previous default locale
-## You are now working with stringi_1.7.6.9002 (en_AU.UTF-8; ICU4C 69.1 [bundle]; Unicode 13.0)
+## You are now working with stringi_1.7.7 (en_AU.UTF-8; ICU4C 69.1 [bundle]; Unicode 13.0)
## End(Not run)
```
diff --git a/devel/sphinx/rapi/stri_locate.md b/devel/sphinx/rapi/stri_locate.md
index 5fe4225d9..ec06130e2 100644
--- a/devel/sphinx/rapi/stri_locate.md
+++ b/devel/sphinx/rapi/stri_locate.md
@@ -156,6 +156,8 @@ If `capture_groups=TRUE`, then the outputs are equipped with the `capture_groups
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_locate: [`about_search`](about_search.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md)
Other indexing: [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_sub_all()`](stri_sub_all.md), [`stri_sub()`](stri_sub.md)
diff --git a/devel/sphinx/rapi/stri_locate_boundaries.md b/devel/sphinx/rapi/stri_locate_boundaries.md
index 98e1012f4..4600c4d8f 100644
--- a/devel/sphinx/rapi/stri_locate_boundaries.md
+++ b/devel/sphinx/rapi/stri_locate_boundaries.md
@@ -62,6 +62,8 @@ For `stri_locate_*_words`, just like in [`stri_extract_all_words`](stri_extract_
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_locate: [`about_search`](about_search.md), [`stri_locate_all()`](stri_locate.md)
Other indexing: [`stri_locate_all()`](stri_locate.md), [`stri_sub_all()`](stri_sub_all.md), [`stri_sub()`](stri_sub.md)
diff --git a/devel/sphinx/rapi/stri_match.md b/devel/sphinx/rapi/stri_match.md
index 6560ed331..6daf56a21 100644
--- a/devel/sphinx/rapi/stri_match.md
+++ b/devel/sphinx/rapi/stri_match.md
@@ -79,6 +79,8 @@ If regular expressions feature a named capture group, the matrix columns will be
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_extract: [`about_search`](about_search.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_extract_all()`](stri_extract.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_na2empty.md b/devel/sphinx/rapi/stri_na2empty.md
index ecad239b4..453fa7f54 100644
--- a/devel/sphinx/rapi/stri_na2empty.md
+++ b/devel/sphinx/rapi/stri_na2empty.md
@@ -28,6 +28,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other utils: [`stri_list2matrix()`](stri_list2matrix.md), [`stri_remove_empty()`](stri_remove_empty.md), [`stri_replace_na()`](stri_replace_na.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_numbytes.md b/devel/sphinx/rapi/stri_numbytes.md
index b53616446..6f69187a5 100644
--- a/devel/sphinx/rapi/stri_numbytes.md
+++ b/devel/sphinx/rapi/stri_numbytes.md
@@ -40,6 +40,8 @@ Returns an integer vector of the same length as `str`.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other length: [`%s$%()`](+25s+24+25.md), [`stri_isempty()`](stri_isempty.md), [`stri_length()`](stri_length.md), [`stri_pad_both()`](stri_pad.md), [`stri_sprintf()`](stri_sprintf.md), [`stri_width()`](stri_width.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_opts_brkiter.md b/devel/sphinx/rapi/stri_opts_brkiter.md
index 339487ad5..40d021391 100644
--- a/devel/sphinx/rapi/stri_opts_brkiter.md
+++ b/devel/sphinx/rapi/stri_opts_brkiter.md
@@ -64,4 +64,6 @@ Returns a named list object. Omitted `skip_*` values act as they have been set t
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other text_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
diff --git a/devel/sphinx/rapi/stri_opts_collator.md b/devel/sphinx/rapi/stri_opts_collator.md
index df47a54c3..4c4f0285d 100644
--- a/devel/sphinx/rapi/stri_opts_collator.md
+++ b/devel/sphinx/rapi/stri_opts_collator.md
@@ -46,7 +46,7 @@ stri_coll(
| `case_level` | single logical value; controls whether an extra case level (positioned before the third level) is generated or not |
| `normalization` | single logical value; if `TRUE`, then incremental check is performed to see whether the input data is in the FCD form. If the data is not in the FCD form, incremental NFD normalization is performed |
| `normalisation` | alias of `normalization` |
-| `numeric` | single logical value; when turned on, this attribute generates a collation key for the numeric value of substrings of digits; this is a way to get \'100\' to sort AFTER \'2\' |
+| `numeric` | single logical value; when turned on, this attribute generates a collation key for the numeric value of substrings of digits; this is a way to get \'100\' to sort AFTER \'2\'; note that negative numbers will not be ordered properly |
| `...` | \[DEPRECATED\] any other arguments passed to this function generate a warning; this argument will be removed in the future |
## Details
@@ -73,6 +73,8 @@ Returns a named list object; missing settings are left with default values.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
Other search_coll: [`about_search_coll`](about_search_coll.md), [`about_search`](about_search.md)
diff --git a/devel/sphinx/rapi/stri_opts_fixed.md b/devel/sphinx/rapi/stri_opts_fixed.md
index bd19acf18..ebdd6c6b5 100644
--- a/devel/sphinx/rapi/stri_opts_fixed.md
+++ b/devel/sphinx/rapi/stri_opts_fixed.md
@@ -40,6 +40,8 @@ Returns a named list object.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_fixed: [`about_search_fixed`](about_search_fixed.md), [`about_search`](about_search.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_opts_regex.md b/devel/sphinx/rapi/stri_opts_regex.md
index 28cbec32b..f3d054046 100644
--- a/devel/sphinx/rapi/stri_opts_regex.md
+++ b/devel/sphinx/rapi/stri_opts_regex.md
@@ -64,6 +64,8 @@ Returns a named list object; missing settings are left with default values.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_regex: [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_order.md b/devel/sphinx/rapi/stri_order.md
index af5107ab0..4904ac77e 100644
--- a/devel/sphinx/rapi/stri_order.md
+++ b/devel/sphinx/rapi/stri_order.md
@@ -46,6 +46,8 @@ The function yields an integer vector that gives the sort order.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_pad.md b/devel/sphinx/rapi/stri_pad.md
index 136c1242a..ae1e4220f 100644
--- a/devel/sphinx/rapi/stri_pad.md
+++ b/devel/sphinx/rapi/stri_pad.md
@@ -69,6 +69,8 @@ These functions return a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other length: [`%s$%()`](+25s+24+25.md), [`stri_isempty()`](stri_isempty.md), [`stri_length()`](stri_length.md), [`stri_numbytes()`](stri_numbytes.md), [`stri_sprintf()`](stri_sprintf.md), [`stri_width()`](stri_width.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_rand_lipsum.md b/devel/sphinx/rapi/stri_rand_lipsum.md
index 513d800a9..02fd3f336 100644
--- a/devel/sphinx/rapi/stri_rand_lipsum.md
+++ b/devel/sphinx/rapi/stri_rand_lipsum.md
@@ -36,6 +36,8 @@ Returns a character vector of length `n_paragraphs`.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other random: [`stri_rand_shuffle()`](stri_rand_shuffle.md), [`stri_rand_strings()`](stri_rand_strings.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_rand_shuffle.md b/devel/sphinx/rapi/stri_rand_shuffle.md
index e48d1ff22..05eedd6af 100644
--- a/devel/sphinx/rapi/stri_rand_shuffle.md
+++ b/devel/sphinx/rapi/stri_rand_shuffle.md
@@ -34,6 +34,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other random: [`stri_rand_lipsum()`](stri_rand_lipsum.md), [`stri_rand_strings()`](stri_rand_strings.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_rand_strings.md b/devel/sphinx/rapi/stri_rand_strings.md
index 0e950c151..1a2c9afa0 100644
--- a/devel/sphinx/rapi/stri_rand_strings.md
+++ b/devel/sphinx/rapi/stri_rand_strings.md
@@ -38,6 +38,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other random: [`stri_rand_lipsum()`](stri_rand_lipsum.md), [`stri_rand_shuffle()`](stri_rand_shuffle.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_rank.md b/devel/sphinx/rapi/stri_rank.md
index 3f18e3697..c6e00abdb 100644
--- a/devel/sphinx/rapi/stri_rank.md
+++ b/devel/sphinx/rapi/stri_rank.md
@@ -40,6 +40,8 @@ The result is a vector of ranks corresponding to each string in `str`.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_read_lines.md b/devel/sphinx/rapi/stri_read_lines.md
index c2a3d5d9c..4e16a20f2 100644
--- a/devel/sphinx/rapi/stri_read_lines.md
+++ b/devel/sphinx/rapi/stri_read_lines.md
@@ -39,4 +39,6 @@ Returns a character vector, each text line is a separate string. The output is a
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other files: [`stri_read_raw()`](stri_read_raw.md), [`stri_write_lines()`](stri_write_lines.md)
diff --git a/devel/sphinx/rapi/stri_read_raw.md b/devel/sphinx/rapi/stri_read_raw.md
index 54e834875..4852a2e8f 100644
--- a/devel/sphinx/rapi/stri_read_raw.md
+++ b/devel/sphinx/rapi/stri_read_raw.md
@@ -33,4 +33,6 @@ Returns a vector of type `raw`.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other files: [`stri_read_lines()`](stri_read_lines.md), [`stri_write_lines()`](stri_write_lines.md)
diff --git a/devel/sphinx/rapi/stri_remove_empty.md b/devel/sphinx/rapi/stri_remove_empty.md
index 20e639c00..318236211 100644
--- a/devel/sphinx/rapi/stri_remove_empty.md
+++ b/devel/sphinx/rapi/stri_remove_empty.md
@@ -43,6 +43,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other utils: [`stri_list2matrix()`](stri_list2matrix.md), [`stri_na2empty()`](stri_na2empty.md), [`stri_replace_na()`](stri_replace_na.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_replace.md b/devel/sphinx/rapi/stri_replace.md
index 668a110a4..ee7b3d5eb 100644
--- a/devel/sphinx/rapi/stri_replace.md
+++ b/devel/sphinx/rapi/stri_replace.md
@@ -120,6 +120,8 @@ All the functions return a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_replace: [`about_search`](about_search.md), [`stri_replace_rstr()`](stri_replace_rstr.md), [`stri_trim_both()`](stri_trim.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_replace_na.md b/devel/sphinx/rapi/stri_replace_na.md
index 4f74a9d09..4992a870d 100644
--- a/devel/sphinx/rapi/stri_replace_na.md
+++ b/devel/sphinx/rapi/stri_replace_na.md
@@ -33,6 +33,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other utils: [`stri_list2matrix()`](stri_list2matrix.md), [`stri_na2empty()`](stri_na2empty.md), [`stri_remove_empty()`](stri_remove_empty.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_replace_rstr.md b/devel/sphinx/rapi/stri_replace_rstr.md
index 1f46f0f89..614c91d36 100644
--- a/devel/sphinx/rapi/stri_replace_rstr.md
+++ b/devel/sphinx/rapi/stri_replace_rstr.md
@@ -28,4 +28,6 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_replace: [`about_search`](about_search.md), [`stri_replace_all()`](stri_replace.md), [`stri_trim_both()`](stri_trim.md)
diff --git a/devel/sphinx/rapi/stri_reverse.md b/devel/sphinx/rapi/stri_reverse.md
index bc440400b..fd44eda1b 100644
--- a/devel/sphinx/rapi/stri_reverse.md
+++ b/devel/sphinx/rapi/stri_reverse.md
@@ -34,6 +34,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
## Examples
diff --git a/devel/sphinx/rapi/stri_sort.md b/devel/sphinx/rapi/stri_sort.md
index 0d9733abb..891a1e23a 100644
--- a/devel/sphinx/rapi/stri_sort.md
+++ b/devel/sphinx/rapi/stri_sort.md
@@ -44,6 +44,8 @@ The result is a sorted version of `str`, i.e., a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_sort_key.md b/devel/sphinx/rapi/stri_sort_key.md
index 51f36206f..b9ebef2a4 100644
--- a/devel/sphinx/rapi/stri_sort_key.md
+++ b/devel/sphinx/rapi/stri_sort_key.md
@@ -40,6 +40,8 @@ The result is a character vector with the same length as `str` that contains the
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_split.md b/devel/sphinx/rapi/stri_split.md
index 008796f06..cd01355bf 100644
--- a/devel/sphinx/rapi/stri_split.md
+++ b/devel/sphinx/rapi/stri_split.md
@@ -91,6 +91,8 @@ Otherwise, [`stri_list2matrix`](stri_list2matrix.md) with `byrow=TRUE` and `n_mi
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_split: [`about_search`](about_search.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_split_boundaries.md b/devel/sphinx/rapi/stri_split_boundaries.md
index 7f64117b0..5b845d3fc 100644
--- a/devel/sphinx/rapi/stri_split_boundaries.md
+++ b/devel/sphinx/rapi/stri_split_boundaries.md
@@ -52,6 +52,8 @@ Otherwise, [`stri_list2matrix`](stri_list2matrix.md) with `byrow=TRUE` and `n_mi
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_split: [`about_search`](about_search.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_split()`](stri_split.md)
Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
diff --git a/devel/sphinx/rapi/stri_split_lines.md b/devel/sphinx/rapi/stri_split_lines.md
index d7f8b3583..4a7a4f5b1 100644
--- a/devel/sphinx/rapi/stri_split_lines.md
+++ b/devel/sphinx/rapi/stri_split_lines.md
@@ -49,6 +49,8 @@ These stringi functions follow UTR#18 rules, where a ne
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_split: [`about_search`](about_search.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split()`](stri_split.md)
Other text_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
diff --git a/devel/sphinx/rapi/stri_sprintf.md b/devel/sphinx/rapi/stri_sprintf.md
index dcf8e814b..9c542a434 100644
--- a/devel/sphinx/rapi/stri_sprintf.md
+++ b/devel/sphinx/rapi/stri_sprintf.md
@@ -88,6 +88,8 @@ The other functions return a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other length: [`%s$%()`](+25s+24+25.md), [`stri_isempty()`](stri_isempty.md), [`stri_length()`](stri_length.md), [`stri_numbytes()`](stri_numbytes.md), [`stri_pad_both()`](stri_pad.md), [`stri_width()`](stri_width.md)
## Examples
@@ -141,14 +143,14 @@ stri_printf("%+10.3f", c(-Inf, -0, 0, Inf, NaN, NA_real_),
## 💩
##
stri_sprintf("UNIX time %1$f is %1$s.", Sys.time())
-## [1] "UNIX time 1655099471.739465 is 2022-06-13 15:51:11."
+## [1] "UNIX time 1656747632.723760 is 2022-07-02 17:40:32."
# the following do not work in sprintf()
stri_sprintf("%1$#- *2$.*3$f", 1.23456, 10, 3) # two asterisks
## [1] " 1.235 "
stri_sprintf(c("%s", "%f"), pi) # re-coercion needed
## [1] "3.14159265358979" "3.141593"
stri_sprintf("%1$s is %1$f UNIX time.", Sys.time()) # re-coercion needed
-## [1] "2022-06-13 15:51:11 is 1655099471.742067 UNIX time."
+## [1] "2022-07-02 17:40:32 is 1656747632.726045 UNIX time."
stri_sprintf(c("%d", "%s"), factor(11:12)) # re-coercion needed
## [1] "1" "12"
stri_sprintf(c("%s", "%d"), factor(11:12)) # re-coercion needed
diff --git a/devel/sphinx/rapi/stri_startsendswith.md b/devel/sphinx/rapi/stri_startsendswith.md
index fb1e88f41..f437078ef 100644
--- a/devel/sphinx/rapi/stri_startsendswith.md
+++ b/devel/sphinx/rapi/stri_startsendswith.md
@@ -94,6 +94,8 @@ Each function returns a logical vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_detect: [`about_search`](about_search.md), [`stri_detect()`](stri_detect.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_stats_general.md b/devel/sphinx/rapi/stri_stats_general.md
index 23c30eb31..cd16e8b9c 100644
--- a/devel/sphinx/rapi/stri_stats_general.md
+++ b/devel/sphinx/rapi/stri_stats_general.md
@@ -44,6 +44,8 @@ Returns an integer vector with the following named elements:
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other stats: [`stri_stats_latex()`](stri_stats_latex.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_stats_latex.md b/devel/sphinx/rapi/stri_stats_latex.md
index 49fcfc34e..2da3547ab 100644
--- a/devel/sphinx/rapi/stri_stats_latex.md
+++ b/devel/sphinx/rapi/stri_stats_latex.md
@@ -46,6 +46,8 @@ Returns an integer vector with the following named elements:
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other stats: [`stri_stats_general()`](stri_stats_general.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_sub.md b/devel/sphinx/rapi/stri_sub.md
index 5a683a891..48ca4a167 100644
--- a/devel/sphinx/rapi/stri_sub.md
+++ b/devel/sphinx/rapi/stri_sub.md
@@ -68,6 +68,8 @@ Note that for some Unicode strings, the extracted substrings might not be well-f
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other indexing: [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_locate_all()`](stri_locate.md), [`stri_sub_all()`](stri_sub_all.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_sub_all.md b/devel/sphinx/rapi/stri_sub_all.md
index 532cf9be7..f90ae9618 100644
--- a/devel/sphinx/rapi/stri_sub_all.md
+++ b/devel/sphinx/rapi/stri_sub_all.md
@@ -71,6 +71,8 @@ In the replacement function, the index ranges must be sorted with respect to `fr
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other indexing: [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_locate_all()`](stri_locate.md), [`stri_sub()`](stri_sub.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_subset.md b/devel/sphinx/rapi/stri_subset.md
index 74decfb60..af8bd09f0 100644
--- a/devel/sphinx/rapi/stri_subset.md
+++ b/devel/sphinx/rapi/stri_subset.md
@@ -81,6 +81,8 @@ The `stri_subset_*<-` functions modifies `str` \'in-place\'.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_subset: [`about_search`](about_search.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_timezone_info.md b/devel/sphinx/rapi/stri_timezone_info.md
index fa5f59019..1bad92325 100644
--- a/devel/sphinx/rapi/stri_timezone_info.md
+++ b/devel/sphinx/rapi/stri_timezone_info.md
@@ -48,6 +48,8 @@ Returns a list with the following named components:
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_list()`](stri_timezone_list.md)
Other timezone: [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_list()`](stri_timezone_list.md)
diff --git a/devel/sphinx/rapi/stri_timezone_list.md b/devel/sphinx/rapi/stri_timezone_list.md
index 4abfbabf0..25c5b597d 100644
--- a/devel/sphinx/rapi/stri_timezone_list.md
+++ b/devel/sphinx/rapi/stri_timezone_list.md
@@ -51,6 +51,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md)
Other timezone: [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md)
diff --git a/devel/sphinx/rapi/stri_timezone_set.md b/devel/sphinx/rapi/stri_timezone_set.md
index d0edbd6df..671d5d380 100644
--- a/devel/sphinx/rapi/stri_timezone_set.md
+++ b/devel/sphinx/rapi/stri_timezone_set.md
@@ -44,6 +44,8 @@ Unless the default time zone has already been set using `stri_timezone_set`, the
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
Other timezone: [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
diff --git a/devel/sphinx/rapi/stri_trans_casemap.md b/devel/sphinx/rapi/stri_trans_casemap.md
index e7d36e429..b4613219b 100644
--- a/devel/sphinx/rapi/stri_trans_casemap.md
+++ b/devel/sphinx/rapi/stri_trans_casemap.md
@@ -59,6 +59,8 @@ Each function returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
Other transform: [`stri_trans_char()`](stri_trans_char.md), [`stri_trans_general()`](stri_trans_general.md), [`stri_trans_list()`](stri_trans_list.md), [`stri_trans_nfc()`](stri_trans_nf.md)
diff --git a/devel/sphinx/rapi/stri_trans_char.md b/devel/sphinx/rapi/stri_trans_char.md
index 28fb4de59..4378ad6c4 100644
--- a/devel/sphinx/rapi/stri_trans_char.md
+++ b/devel/sphinx/rapi/stri_trans_char.md
@@ -40,6 +40,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other transform: [`stri_trans_general()`](stri_trans_general.md), [`stri_trans_list()`](stri_trans_list.md), [`stri_trans_nfc()`](stri_trans_nf.md), [`stri_trans_tolower()`](stri_trans_casemap.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_trans_general.md b/devel/sphinx/rapi/stri_trans_general.md
index d35866b20..c716ab437 100644
--- a/devel/sphinx/rapi/stri_trans_general.md
+++ b/devel/sphinx/rapi/stri_trans_general.md
@@ -53,6 +53,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other transform: [`stri_trans_char()`](stri_trans_char.md), [`stri_trans_list()`](stri_trans_list.md), [`stri_trans_nfc()`](stri_trans_nf.md), [`stri_trans_tolower()`](stri_trans_casemap.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_trans_list.md b/devel/sphinx/rapi/stri_trans_list.md
index aa60e51bf..f6d6c4f93 100644
--- a/devel/sphinx/rapi/stri_trans_list.md
+++ b/devel/sphinx/rapi/stri_trans_list.md
@@ -26,6 +26,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other transform: [`stri_trans_char()`](stri_trans_char.md), [`stri_trans_general()`](stri_trans_general.md), [`stri_trans_nfc()`](stri_trans_nf.md), [`stri_trans_tolower()`](stri_trans_casemap.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_trans_nf.md b/devel/sphinx/rapi/stri_trans_nf.md
index f8b91f429..f65fdb8ec 100644
--- a/devel/sphinx/rapi/stri_trans_nf.md
+++ b/devel/sphinx/rapi/stri_trans_nf.md
@@ -50,7 +50,7 @@ The following Normalization Forms (NFs) are supported:
- NFKC_Casefold (combination of NFKC, case folding, and removing ignorable characters which was introduced with Unicode 5.2).
-Note that many W3C Specifications recommend using NFC for all content, because this form avoids potential interoperability problems arising from the use of canonically equivalent, yet different, character sequences in document formats on the Web. Thus, you will rather not use these functions in typical string processing activities. Most often you may assume that a string is in NFC, see RFC\\#5198.
+Note that many W3C Specifications recommend using NFC for all content, because this form avoids potential interoperability problems arising from the use of canonically equivalent, yet different, character sequences in document formats on the Web. Thus, you will rather not use these functions in typical string processing activities. Most often you may assume that a string is in NFC, see RFC5198.
As usual in stringi, if the input character vector is in the native encoding, it will be automatically converted to UTF-8.
@@ -70,7 +70,7 @@ The `stri_trans_nf*` functions return a character vector of the same length as i
*Unicode Normalization Forms* -- Unicode Standard Annex #15,
-*Unicode Format for Network Interchange* -- RFC\\#5198,
+*Unicode Format for Network Interchange* -- RFC5198,
*Character Model for the World Wide Web 1.0: Normalization* -- W3C Working Draft,
@@ -82,6 +82,8 @@ The `stri_trans_nf*` functions return a character vector of the same length as i
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other transform: [`stri_trans_char()`](stri_trans_char.md), [`stri_trans_general()`](stri_trans_general.md), [`stri_trans_list()`](stri_trans_list.md), [`stri_trans_tolower()`](stri_trans_casemap.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_trim.md b/devel/sphinx/rapi/stri_trim.md
index d840d25eb..897efdbe2 100644
--- a/devel/sphinx/rapi/stri_trim.md
+++ b/devel/sphinx/rapi/stri_trim.md
@@ -56,6 +56,8 @@ All functions return a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other search_replace: [`about_search`](about_search.md), [`stri_replace_all()`](stri_replace.md), [`stri_replace_rstr()`](stri_replace_rstr.md)
Other search_charclass: [`about_search_charclass`](about_search_charclass.md), [`about_search`](about_search.md)
diff --git a/devel/sphinx/rapi/stri_unescape_unicode.md b/devel/sphinx/rapi/stri_unescape_unicode.md
index bb0357f2d..a8e1da341 100644
--- a/devel/sphinx/rapi/stri_unescape_unicode.md
+++ b/devel/sphinx/rapi/stri_unescape_unicode.md
@@ -38,6 +38,8 @@ Returns a character vector. If an escape sequence is ill-formed, result will be
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other escape: [`stri_escape_unicode()`](stri_escape_unicode.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_unique.md b/devel/sphinx/rapi/stri_unique.md
index 239b13186..f4f9f8e3d 100644
--- a/devel/sphinx/rapi/stri_unique.md
+++ b/devel/sphinx/rapi/stri_unique.md
@@ -40,6 +40,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_width.md b/devel/sphinx/rapi/stri_width.md
index 0b779a141..c89a8229b 100644
--- a/devel/sphinx/rapi/stri_width.md
+++ b/devel/sphinx/rapi/stri_width.md
@@ -50,6 +50,8 @@ Returns an integer vector of the same length as `str`.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other length: [`%s$%()`](+25s+24+25.md), [`stri_isempty()`](stri_isempty.md), [`stri_length()`](stri_length.md), [`stri_numbytes()`](stri_numbytes.md), [`stri_pad_both()`](stri_pad.md), [`stri_sprintf()`](stri_sprintf.md)
## Examples
diff --git a/devel/sphinx/rapi/stri_wrap.md b/devel/sphinx/rapi/stri_wrap.md
index f5f584ec6..023bc23ee 100644
--- a/devel/sphinx/rapi/stri_wrap.md
+++ b/devel/sphinx/rapi/stri_wrap.md
@@ -71,6 +71,8 @@ D.E. Knuth, M.F. Plass, Breaking paragraphs into lines, *Software: Practice and
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md)
Other text_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md)
diff --git a/devel/sphinx/rapi/stri_write_lines.md b/devel/sphinx/rapi/stri_write_lines.md
index 7c856b221..db2a3ab75 100644
--- a/devel/sphinx/rapi/stri_write_lines.md
+++ b/devel/sphinx/rapi/stri_write_lines.md
@@ -44,4 +44,6 @@ This function returns nothing noteworthy.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other files: [`stri_read_lines()`](stri_read_lines.md), [`stri_read_raw()`](stri_read_raw.md)
diff --git a/devel/sphinx/weave/basic_operations.Rmd b/devel/sphinx/weave/basic_operations.Rmd
index c67ee7b4c..055ccebd0 100644
--- a/devel/sphinx/weave/basic_operations.Rmd
+++ b/devel/sphinx/weave/basic_operations.Rmd
@@ -12,7 +12,7 @@ opts_chunk$set(cache.path="cache/basic_operations_")
Computing Length and Width
--------------------------
-First we shall review the functions related to determining the number of
+First, we shall review the functions related to determining the number of
entities in each string.
Let's consider the following character vector:
@@ -38,8 +38,8 @@ stri_length(x)
The first string carries 4 ASCII (English) letters, the second consists
of 2 Chinese characters (U+4F60, U+597D; a greeting), and the third one
is comprised of 3 zero-width spaces (U+200B). Note that the 5th element
-in `x` is an empty string, `""`, hence its length is 0. Moreover, there
-is a missing (`NA`) value at index 4, therefore the corresponding length
+in `x` is an empty string, `""`. Hence, its length is 0. Moreover, there
+is a missing (`NA`) value at index 4. Therefore, the corresponding length
is undefined as well.
When formatting strings for display (e.g., in a report dynamically
@@ -48,8 +48,8 @@ generated with `Sweave()` or
estimate may be more informative -- an approximate number of text
columns it will occupy when printed using a monospaced font. In
particular, many Chinese, Japanese, Korean, and most emoji characters
-take up two text cells. Some code points, on the other hand, might be of
-width 0 (e.g., the said ZERO WIDTH SPACE, U+200B).
+take up two text cells. On the other hand,
+some code points might be of width 0 (e.g., the said ZERO WIDTH SPACE, U+200B).
```{r }
stri_width(x)
@@ -155,13 +155,13 @@ generated by, amongst others, `stri_sub_all()` and `stri_extract_all()`.
Extracting and Replacing Substrings
-----------------------------------
-Next group of functions deals with the extraction and replacement of
+The next group of functions deals with the extraction and replacement of
particular sequences of code points in given strings.
### Indexing Vectors
-Recall that in order to select a subsequence from any R vector, we use
+To select a subsequence from any R vector, we use
the square-bracket operator[^footsubset] with an index vector consisting of
either non-negative integers, negative integers, or logical values[^footnames].
@@ -199,7 +199,7 @@ x[!stri_isempty(x) & !is.na(x)]
### Extracting Substrings
-A character vector is, in its very own essence, a sequence of sequences
+A character vector is, in its very essence, a sequence of sequences
of code points. To extract specific substrings from each string in a
collection, we can use the `stri_sub()` function.
@@ -257,7 +257,7 @@ stri_list2matrix(z, by_row=TRUE, fill="", n_min=5)
The second parameter of both `stri_sub()` and `stri_sub_list()` can also
be fed with a two-column matrix of the form `cbind(from, to)`. Here, the
-first column gives the start indices and the second column defines the
+first column gives the start indices, and the second column defines the
end ones. Such matrices are generated, amongst others, by the
`stri_locate_()` functions (see below for details).
@@ -283,11 +283,11 @@ Note the difference between the above output and the following one:
stri_sub_all(c("abcdefgh", "ijklmnop"), from_to)
```
-This time, we extract four identical sections from each of the two
+This time, we extract four identical sections from the two
inputs.
Moreover, if the second column of the index matrix is named `"length"`
-(and only if this is exactly the case), i.e., the indexer is of the form
+(and only if this is the case), i.e., the indexer is of the form
`cbind(from, length=length)`, extraction will be based on the extracted
chunk size.
@@ -346,7 +346,7 @@ stri_sub(y, 7, length=3) <- "spam" # in-place replacement, egg → spam
print(y) # y has changed
```
-Note that the state of `y` has changed in such a way that the substring
+Note that the state of `y` has changed so that the substring
of length 3 starting at the 7th code point was replaced by a length-4
content.
diff --git a/devel/sphinx/weave/basic_operations.md b/devel/sphinx/weave/basic_operations.md
index 23e2d8388..2b1afb9e7 100644
--- a/devel/sphinx/weave/basic_operations.md
+++ b/devel/sphinx/weave/basic_operations.md
@@ -3,14 +3,14 @@ Basic String Operations
=======================
-> This tutorial is based on the [paper on *stringi*](https://stringi.gagolewski.com/_static/vignette/stringi.pdf) that will appear in the *Journal of Statistical Software*.
+> This tutorial is based on the [paper on *stringi*](https://dx.doi.org/10.18637/jss.v103.i02) that has recently been published in the *Journal of Statistical Software*, see {cite}`stringi`.
Computing Length and Width
--------------------------
-First we shall review the functions related to determining the number of
+First, we shall review the functions related to determining the number of
entities in each string.
Let's consider the following character vector:
@@ -41,8 +41,8 @@ stri_length(x)
The first string carries 4 ASCII (English) letters, the second consists
of 2 Chinese characters (U+4F60, U+597D; a greeting), and the third one
is comprised of 3 zero-width spaces (U+200B). Note that the 5th element
-in `x` is an empty string, `""`, hence its length is 0. Moreover, there
-is a missing (`NA`) value at index 4, therefore the corresponding length
+in `x` is an empty string, `""`. Hence, its length is 0. Moreover, there
+is a missing (`NA`) value at index 4. Therefore, the corresponding length
is undefined as well.
When formatting strings for display (e.g., in a report dynamically
@@ -51,8 +51,8 @@ generated with `Sweave()` or
estimate may be more informative -- an approximate number of text
columns it will occupy when printed using a monospaced font. In
particular, many Chinese, Japanese, Korean, and most emoji characters
-take up two text cells. Some code points, on the other hand, might be of
-width 0 (e.g., the said ZERO WIDTH SPACE, U+200B).
+take up two text cells. On the other hand,
+some code points might be of width 0 (e.g., the said ZERO WIDTH SPACE, U+200B).
```r
@@ -188,13 +188,13 @@ generated by, amongst others, `stri_sub_all()` and `stri_extract_all()`.
Extracting and Replacing Substrings
-----------------------------------
-Next group of functions deals with the extraction and replacement of
+The next group of functions deals with the extraction and replacement of
particular sequences of code points in given strings.
### Indexing Vectors
-Recall that in order to select a subsequence from any R vector, we use
+To select a subsequence from any R vector, we use
the square-bracket operator[^footsubset] with an index vector consisting of
either non-negative integers, negative integers, or logical values[^footnames].
@@ -239,7 +239,7 @@ x[!stri_isempty(x) & !is.na(x)]
### Extracting Substrings
-A character vector is, in its very own essence, a sequence of sequences
+A character vector is, in its very essence, a sequence of sequences
of code points. To extract specific substrings from each string in a
collection, we can use the `stri_sub()` function.
@@ -318,7 +318,7 @@ stri_list2matrix(z, by_row=TRUE, fill="", n_min=5)
The second parameter of both `stri_sub()` and `stri_sub_list()` can also
be fed with a two-column matrix of the form `cbind(from, to)`. Here, the
-first column gives the start indices and the second column defines the
+first column gives the start indices, and the second column defines the
end ones. Such matrices are generated, amongst others, by the
`stri_locate_()` functions (see below for details).
@@ -363,11 +363,11 @@ stri_sub_all(c("abcdefgh", "ijklmnop"), from_to)
## [1] "ij" "kl" "mn" "op"
```
-This time, we extract four identical sections from each of the two
+This time, we extract four identical sections from the two
inputs.
Moreover, if the second column of the index matrix is named `"length"`
-(and only if this is exactly the case), i.e., the indexer is of the form
+(and only if this is the case), i.e., the indexer is of the form
`cbind(from, length=length)`, extraction will be based on the extracted
chunk size.
@@ -435,7 +435,7 @@ print(y) # y has changed
## [1] "spam, spam, spam, spam, bacon, and spam"
```
-Note that the state of `y` has changed in such a way that the substring
+Note that the state of `y` has changed so that the substring
of length 3 starting at the 7th code point was replaced by a length-4
content.
diff --git a/devel/sphinx/weave/codepoint_comparing.Rmd b/devel/sphinx/weave/codepoint_comparing.Rmd
index 8a4b3ab3d..49b1dadb6 100644
--- a/devel/sphinx/weave/codepoint_comparing.Rmd
+++ b/devel/sphinx/weave/codepoint_comparing.Rmd
@@ -29,8 +29,8 @@ relation.
"actg" %s===% c("ACTG", "actg", "act", "actga", NA)
```
-Due to recycling, the first string was compared against the 5 strings in
-the 2nd operand. There is only 1 exact match.
+Due to recycling, the first string was compared against the five strings in
+the 2nd operand. There is only one exact match.
(Sec:strsearch)=
@@ -45,7 +45,7 @@ and *p* is the length of the pattern), has been implemented in *stringi*
The table below lists the string search functions available
in *stringi*. Below we explain their behaviour in the context of fixed
-pattern matching. Notably, their description is quite detailed, because
+pattern matching. Notably, their description is quite detailed because
-- as we shall soon find out -- the corresponding operations are
available for the two other search engines: based on regular expressions
and the *ICU* Collator, see {ref}`Sec:regex` and {ref}`Sec:collator`.
@@ -108,9 +108,10 @@ to the search functions (via "`...`"; they are redirected as-is to
First, we may switch on the simplistic[^footfixedcase] case-insensitive matching.
-[^footfixedcase]: Which is not suitable for real-world NLP tasks, as it assumes that
+[^footfixedcase]: Which is not suitable for real-world NLP tasks,
+ as it assumes that
changing the case of a single code point always produces one and
- only one item; This way, `"groß"` does not compare equal to
+ only one item. This way, `"groß"` does not compare equal to
`"GROSS"`, see {ref}`Sec:collator` (and to some extent
{ref}`Sec:regex`) for a workaround.
@@ -131,7 +132,7 @@ stri_count_fixed("acatgacaca", "aca", overlap=TRUE)
Detecting and Subsetting Patterns
---------------------------------
-A somewhat simplified version of the above search task involves asking
+A somewhat simplified version of the above search task asks
whether a pattern occurs in a string at all. Such an operation can be
performed with a call to `stri_detect_fixed()`.
@@ -163,8 +164,8 @@ stri_endswith_fixed(x, "abc") # to=-1 - match at end
[^footanchor]: Note that testing for a pattern match at the start or end of a
string has not been implemented separately for `regex` patterns,
- which support `"^"` and `"$"` anchors that serve exactly this very
- purpose, see {ref}`Sec:regex`.
+ which support `"^"` and `"$"` anchors that serve precisely this
+ purpose; see {ref}`Sec:regex`.
Pattern detection is often performed in conjunction with character
vector subsetting. This is why we have a specialised (and hence slightly
@@ -177,7 +178,7 @@ stri_subset_fixed(x, "abc", omit_na=TRUE)
The above is equivalent to `x[which(stri_detect_fixed(x, "abc"))]` (note
the argument responsible for the removal of missing values), but avoids
-writing `x` twice. It hence is particularly convenient when `x` is
+writing `x` twice. It is particularly convenient when `x` is
generated programmatically on the fly, using some complicated
expression. Also, it works well with the forward pipe operator, as we
can write "`x |> stri_subset_fixed("abc", omit_na=TRUE)`".
@@ -202,7 +203,7 @@ stri_locate_first_fixed(x, "aga")
stri_locate_last_fixed(x, "aga", get_length=TRUE)
```
-In both examples we obtain a two-column matrix with the number of rows
+In both examples, we obtain a two-column matrix with the number of rows
determined by the recycling rule (here: the length of `x`). In the
former case, we get a "from--to" matrix (`get_length=FALSE`; the
default) where missing values correspond to either missing inputs or
@@ -246,7 +247,7 @@ stri_replace_all_fixed(x, "aga", "~", case_insensitive=TRUE)
```
Note that the inputs that are not part of any match are left unchanged.
-The input object is left unchanged, because it is not a replacement
+The input object is left unchanged because it is not a replacement
function per se.
The operation is vectorised with respect to all the three arguments
diff --git a/devel/sphinx/weave/codepoint_comparing.md b/devel/sphinx/weave/codepoint_comparing.md
index 8de1ab6e9..22aaf61df 100644
--- a/devel/sphinx/weave/codepoint_comparing.md
+++ b/devel/sphinx/weave/codepoint_comparing.md
@@ -4,7 +4,7 @@ Code-Pointwise Comparing
-> This tutorial is based on the [paper on *stringi*](https://stringi.gagolewski.com/_static/vignette/stringi.pdf) that will appear in the *Journal of Statistical Software*.
+> This tutorial is based on the [paper on *stringi*](https://dx.doi.org/10.18637/jss.v103.i02) that has recently been published in the *Journal of Statistical Software*, see {cite}`stringi`.
There are many settings where we are faced with testing whether two
@@ -29,8 +29,8 @@ relation.
## [1] FALSE TRUE FALSE FALSE NA
```
-Due to recycling, the first string was compared against the 5 strings in
-the 2nd operand. There is only 1 exact match.
+Due to recycling, the first string was compared against the five strings in
+the 2nd operand. There is only one exact match.
(Sec:strsearch)=
@@ -45,7 +45,7 @@ and *p* is the length of the pattern), has been implemented in *stringi*
The table below lists the string search functions available
in *stringi*. Below we explain their behaviour in the context of fixed
-pattern matching. Notably, their description is quite detailed, because
+pattern matching. Notably, their description is quite detailed because
-- as we shall soon find out -- the corresponding operations are
available for the two other search engines: based on regular expressions
and the *ICU* Collator, see {ref}`Sec:regex` and {ref}`Sec:collator`.
@@ -114,9 +114,10 @@ to the search functions (via "`...`"; they are redirected as-is to
First, we may switch on the simplistic[^footfixedcase] case-insensitive matching.
-[^footfixedcase]: Which is not suitable for real-world NLP tasks, as it assumes that
+[^footfixedcase]: Which is not suitable for real-world NLP tasks,
+ as it assumes that
changing the case of a single code point always produces one and
- only one item; This way, `"groß"` does not compare equal to
+ only one item. This way, `"groß"` does not compare equal to
`"GROSS"`, see {ref}`Sec:collator` (and to some extent
{ref}`Sec:regex`) for a workaround.
@@ -142,7 +143,7 @@ stri_count_fixed("acatgacaca", "aca", overlap=TRUE)
Detecting and Subsetting Patterns
---------------------------------
-A somewhat simplified version of the above search task involves asking
+A somewhat simplified version of the above search task asks
whether a pattern occurs in a string at all. Such an operation can be
performed with a call to `stri_detect_fixed()`.
@@ -181,8 +182,8 @@ stri_endswith_fixed(x, "abc") # to=-1 - match at end
[^footanchor]: Note that testing for a pattern match at the start or end of a
string has not been implemented separately for `regex` patterns,
- which support `"^"` and `"$"` anchors that serve exactly this very
- purpose, see {ref}`Sec:regex`.
+ which support `"^"` and `"$"` anchors that serve precisely this
+ purpose; see {ref}`Sec:regex`.
Pattern detection is often performed in conjunction with character
vector subsetting. This is why we have a specialised (and hence slightly
@@ -197,7 +198,7 @@ stri_subset_fixed(x, "abc", omit_na=TRUE)
The above is equivalent to `x[which(stri_detect_fixed(x, "abc"))]` (note
the argument responsible for the removal of missing values), but avoids
-writing `x` twice. It hence is particularly convenient when `x` is
+writing `x` twice. It is particularly convenient when `x` is
generated programmatically on the fly, using some complicated
expression. Also, it works well with the forward pipe operator, as we
can write "`x |> stri_subset_fixed("abc", omit_na=TRUE)`".
@@ -235,7 +236,7 @@ stri_locate_last_fixed(x, "aga", get_length=TRUE)
## [4,] 9 3
```
-In both examples we obtain a two-column matrix with the number of rows
+In both examples, we obtain a two-column matrix with the number of rows
determined by the recycling rule (here: the length of `x`). In the
former case, we get a "from--to" matrix (`get_length=FALSE`; the
default) where missing values correspond to either missing inputs or
@@ -314,7 +315,7 @@ stri_replace_all_fixed(x, "aga", "~", case_insensitive=TRUE)
```
Note that the inputs that are not part of any match are left unchanged.
-The input object is left unchanged, because it is not a replacement
+The input object is left unchanged because it is not a replacement
function per se.
The operation is vectorised with respect to all the three arguments
diff --git a/devel/sphinx/weave/collation.Rmd b/devel/sphinx/weave/collation.Rmd
index ddb95ecab..9ea6ad116 100644
--- a/devel/sphinx/weave/collation.Rmd
+++ b/devel/sphinx/weave/collation.Rmd
@@ -9,10 +9,10 @@ opts_chunk$set(cache.path="cache/collation_")
```
-Historically, code-pointwise comparison had been used in most string
+Historically, the code-pointwise comparison had been used in most string
comparison activities, especially when strings in ASCII (i.e., English)
-were involved. However, nowadays this does not necessarily constitute
-the most suitable approach to the processing of natural-language texts.
+were involved. However, nowadays, this does not necessarily constitute
+the most suitable approach to processing natural-language texts.
In particular, a code-pointwise matching neither takes accented and
conjoined letters nor ignorable punctuation and case into account.
@@ -28,8 +28,9 @@ Locales
-------
String collation is amongst many locale-sensitive operations available
-in *stringi*. Before proceeding any further, we should first discuss how
-we can parameterise the *ICU* services so as to deliver the results that
+in *stringi*. However, before proceeding any further, we should first
+discuss how
+we can parameterise the *ICU* services to deliver the results that
reflect the expectations of a specific user community, such as the
speakers of different languages and their various regional variants.
@@ -93,7 +94,7 @@ during the request for a particular resource
(for more details, see the [*ICU* User Guide on Locales](https://unicode-org.github.io/icu/userguide/locale/)), which may depend on
the *ICU* library version actually in use as well as the way the *ICU*
Data Library (*icudt*) has been packaged. Therefore, for maximum
-portability, it is best to rely on the *ICU* library bundle that is
+portability, it is best to rely on the *ICU* library bundle
shipped with *stringi*. This is the case on Windows and macOS, whose
users typically download the pre-compiled versions of the package from
CRAN. However, on various flavours of GNU/Linux and other Unix-based
@@ -120,7 +121,7 @@ locale if no locale has been explicitly requested, i.e., when a
function's `locale` argument is left alone in its "`NULL`" state. The
default locale is initially set to match the system locale on the
current platform, and may be changed with `stri_locale_set()`, e.g., in
-the very rare case of improper automatic locale detection.
+the sporadic case of improper automatic locale detection.
As we have stated in the introduction, in this paper we use:
@@ -147,7 +148,7 @@ Testing for the Unicode equivalence between strings can be performed by
calling `%s==%` and, more generally, `stri_cmp_equiv()`, or their
negated versions, `%s!=%` and `stri_cmp_nequiv()`.
-In the example below we have: a followed by ogonek (two code points) vs
+In the example below, we have: a followed by ogonek (two code points) vs
a with ogonek (single code point).
```{r }
@@ -225,8 +226,8 @@ Collator Options
The table below lists the options that can be passed to
`stri_opts_collator()` via the dot-dot-dot parameter, "`...`", in all
-the functions that rely on the *ICU* Collator. Below we
-we play with some of them.
+the functions that rely on the *ICU* Collator.
+Below we play with some of them.
| Option | Purpose
@@ -314,7 +315,7 @@ stri_unique(x, alternate_shifted=TRUE) # strength=3
```
Here, when `strength = 3` is used (the default), punctuation
-differences are ignored, but case is deemed significant.
+differences are ignored, but the letter case is deemed significant.
```{r }
stri_unique(x, alternate_shifted=TRUE, strength=2)
@@ -330,10 +331,10 @@ stri_unique(x, strength=2)
### Backward Secondary Sorting
-The French Canadian Sorting Standard CAN/CSA Z243.4.1 (historically this
+The French Canadian Sorting Standard CAN/CSA Z243.4.1 (historically, this
had been the default for all French locales) requires the word ordering
with respect to the last accent difference. Such a behaviour can be
-applied either by setting the French-Canadian locale or by passing the
+applied either by setting the French-Canadian locale, or by passing the
`french=TRUE` option to the Collator.
```{r }
@@ -401,7 +402,7 @@ for.
Also, for "fuzzy" matching of strings,
the [*stringdist*](https://CRAN.R-project.org/package=stringdist) package
-might be useful.
+might be utile.
@@ -413,7 +414,7 @@ the occurrences of simple textual patterns. The counterparts of the
string search functions described in the section on
{ref}`Sec:fixed` have
their names ending with `*_coll()`. Albeit slower than the `*_fixed()`
-functions , they are more appropriate in natural language
+functions, they are more appropriate in natural language
processing activities.
For instance:
diff --git a/devel/sphinx/weave/collation.md b/devel/sphinx/weave/collation.md
index 73cf752bd..f939f8d22 100644
--- a/devel/sphinx/weave/collation.md
+++ b/devel/sphinx/weave/collation.md
@@ -4,13 +4,13 @@ Collation
-> This tutorial is based on the [paper on *stringi*](https://stringi.gagolewski.com/_static/vignette/stringi.pdf) that will appear in the *Journal of Statistical Software*.
+> This tutorial is based on the [paper on *stringi*](https://dx.doi.org/10.18637/jss.v103.i02) that has recently been published in the *Journal of Statistical Software*, see {cite}`stringi`.
-Historically, code-pointwise comparison had been used in most string
+Historically, the code-pointwise comparison had been used in most string
comparison activities, especially when strings in ASCII (i.e., English)
-were involved. However, nowadays this does not necessarily constitute
-the most suitable approach to the processing of natural-language texts.
+were involved. However, nowadays, this does not necessarily constitute
+the most suitable approach to processing natural-language texts.
In particular, a code-pointwise matching neither takes accented and
conjoined letters nor ignorable punctuation and case into account.
@@ -26,8 +26,9 @@ Locales
-------
String collation is amongst many locale-sensitive operations available
-in *stringi*. Before proceeding any further, we should first discuss how
-we can parameterise the *ICU* services so as to deliver the results that
+in *stringi*. However, before proceeding any further, we should first
+discuss how
+we can parameterise the *ICU* services to deliver the results that
reflect the expectations of a specific user community, such as the
speakers of different languages and their various regional variants.
@@ -94,7 +95,7 @@ during the request for a particular resource
(for more details, see the [*ICU* User Guide on Locales](https://unicode-org.github.io/icu/userguide/locale/)), which may depend on
the *ICU* library version actually in use as well as the way the *ICU*
Data Library (*icudt*) has been packaged. Therefore, for maximum
-portability, it is best to rely on the *ICU* library bundle that is
+portability, it is best to rely on the *ICU* library bundle
shipped with *stringi*. This is the case on Windows and macOS, whose
users typically download the pre-compiled versions of the package from
CRAN. However, on various flavours of GNU/Linux and other Unix-based
@@ -122,7 +123,7 @@ locale if no locale has been explicitly requested, i.e., when a
function's `locale` argument is left alone in its "`NULL`" state. The
default locale is initially set to match the system locale on the
current platform, and may be changed with `stri_locale_set()`, e.g., in
-the very rare case of improper automatic locale detection.
+the sporadic case of improper automatic locale detection.
As we have stated in the introduction, in this paper we use:
@@ -151,7 +152,7 @@ Testing for the Unicode equivalence between strings can be performed by
calling `%s==%` and, more generally, `stri_cmp_equiv()`, or their
negated versions, `%s!=%` and `stri_cmp_nequiv()`.
-In the example below we have: a followed by ogonek (two code points) vs
+In the example below, we have: a followed by ogonek (two code points) vs
a with ogonek (single code point).
@@ -246,8 +247,8 @@ Collator Options
The table below lists the options that can be passed to
`stri_opts_collator()` via the dot-dot-dot parameter, "`...`", in all
-the functions that rely on the *ICU* Collator. Below we
-we play with some of them.
+the functions that rely on the *ICU* Collator.
+Below we play with some of them.
| Option | Purpose
@@ -255,7 +256,7 @@ we play with some of them.
| `locale` | a string specifying the locale to use; `NULL` (default) or `""` for the current default locale as indicated by `stri_locale_get()` |
| `strength` | an integer in *{1,2,3,4}* defining collation strength; 1 for the most permissive collation rules, 4 for the strictest ones; defaults to 3 |
| `uppercase_first` | logical; `NA` (default) orders upper and lower case letters in accordance to their tertiary weights, `TRUE` forces upper case letters to sort before lower case letters, `FALSE` does the opposite |
-| `numeric` | logical; if `TRUE`, a collation key for the numeric value of substrings of digits is generated; this is a way to make `"100"` ordered after `"2"`; defaults to `FALSE` |
+| `numeric` | logical; if `TRUE`, a collation key for the numeric value of substrings of digits is generated; this is a way to make `"100"` ordered after `"2"`; however, negative numbers are not ordered correctly; defaults to `FALSE` |
| `case_level` | logical; if `TRUE`, an extra case level (positioned before the third level) is generated; defaults to `FALSE` |
| `normalisation` | logical; if `TRUE`, then an incremental check is performed to see whether input data are in the FCD ("fast C or D") form; if data are not in the FCD form, the incremental NFD normalisation is performed, see {ref}`Sec:normalisation`; defaults to `FALSE` |
| `alternate_shifted` | logical; if `FALSE` (default), all code points with non-ignorable primary weights are handled in the same way; `TRUE` causes the code points with primary weights that are less than or equal to the variable top value to be ignored on the primary level and moved to the quaternary level; this can be used to, e.g., ignore punctuation, see the examples provided |
@@ -344,7 +345,7 @@ stri_unique(x, alternate_shifted=TRUE) # strength=3
```
Here, when `strength = 3` is used (the default), punctuation
-differences are ignored, but case is deemed significant.
+differences are ignored, but the letter case is deemed significant.
```r
@@ -364,10 +365,10 @@ stri_unique(x, strength=2)
### Backward Secondary Sorting
-The French Canadian Sorting Standard CAN/CSA Z243.4.1 (historically this
+The French Canadian Sorting Standard CAN/CSA Z243.4.1 (historically, this
had been the default for all French locales) requires the word ordering
with respect to the last accent difference. Such a behaviour can be
-applied either by setting the French-Canadian locale or by passing the
+applied either by setting the French-Canadian locale, or by passing the
`french=TRUE` option to the Collator.
@@ -453,7 +454,7 @@ for.
Also, for "fuzzy" matching of strings,
the [*stringdist*](https://CRAN.R-project.org/package=stringdist) package
-might be useful.
+might be utile.
@@ -465,7 +466,7 @@ the occurrences of simple textual patterns. The counterparts of the
string search functions described in the section on
{ref}`Sec:fixed` have
their names ending with `*_coll()`. Albeit slower than the `*_fixed()`
-functions , they are more appropriate in natural language
+functions, they are more appropriate in natural language
processing activities.
For instance:
diff --git a/devel/sphinx/weave/common.R b/devel/sphinx/weave/common.R
index 050a967ae..180a56b33 100644
--- a/devel/sphinx/weave/common.R
+++ b/devel/sphinx/weave/common.R
@@ -28,4 +28,4 @@ options(scipen=10)
options(showWarnCalls=FALSE)
#options(stringsAsFactors=FALSE) # default in R 4.0
-cat("\n> This tutorial is based on the [paper on *stringi*](https://stringi.gagolewski.com/_static/vignette/stringi.pdf) that will appear in the *Journal of Statistical Software*.\n\n")
+cat("\n> This tutorial is based on the [paper on *stringi*](https://dx.doi.org/10.18637/jss.v103.i02) that has recently been published in the *Journal of Statistical Software*, see {cite}`stringi`.\n\n")
diff --git a/devel/sphinx/weave/design_principles.Rmd b/devel/sphinx/weave/design_principles.Rmd
index abf85b0d5..9e7e52b9c 100644
--- a/devel/sphinx/weave/design_principles.Rmd
+++ b/devel/sphinx/weave/design_principles.Rmd
@@ -8,7 +8,7 @@ opts_chunk$set(cache.path="cache/design_principles_")
```
-The API of the early releases of *stringi* has been designed so as to be
+The API of the early releases of *stringi* has been designed to be
fairly compatible with that of the 0.6.2 version of the [*stringr*](https://stringr.tidyverse.org/)
package {cite}`Wickham2010:stringr` (dated 2012[^footstringr]),
with some fixes in the consistency of the
@@ -19,16 +19,16 @@ and not really suitable for natural language processing tasks, all the
functionality has been implemented from the ground up, with the use of
[*ICU*](https://icu.unicode.org/) services wherever applicable.
Since the initial release, an
-abundance of new features has been added and the package can now be
-considered a comprehensive workhorse for text data processing.
+abundance of new features has been added, and the package can now be
+considered a complete workhorse for text data processing.
Note that
-the *stringi* API is stable. Future releases are aiming for as much
+the *stringi* API is stable. Future releases will aim for as much
backward compatibility as possible so that other software projects can
safely rely on it.
[^footstringr]: Interestingly, in 2015 the aforementioned
- *stringr* package has been rewritten as a set of wrappers around some of
+ *stringr* package was rewritten as a set of wrappers around some of
the *stringi* functions instead of the base R ones. In Section 14.7 of
*R for Data Science* {cite}`GrolemundWickham2017:rdatascience` we read:
"*stringr* is useful when you're learning because it exposes a minimal
@@ -80,7 +80,7 @@ Naming
Function and argument names use a combination of lowercase letters and
underscores (and no dots). To avoid namespace clashes, all function
-names feature the "`stri_`" prefix. Names are fairly self-explanatory,
+names feature the "`stri_`" prefix. Names are quite self-explanatory,
e.g., `stri_locate_first_regex` and `stri_locate_all_fixed` find,
respectively, the first match to a regular expression and all
occurrences of a substring as-is.
@@ -126,9 +126,10 @@ function call:
```
Due to vectorisation, we can generally avoid using the `for`- and
-`while`-loops ("for each string in a vector..."), which makes the code
+`while`-loops ("for each string in a vector..."). This can make the code
much more readable, maintainable, and faster to execute.
+
Acting Elementwise with Recycling
---------------------------------
@@ -206,11 +207,11 @@ haystack) as an illustration:
`haystack`, `"b"` in the 2nd row, and `"c"` in the 3rd;
in particular, there are 3 `"a"`s in `"aaa"`, 2 in `"aba"`,
and 1 `"b"` in `"baa"`;
- this is possible due to the fact that matrices are
+ this is possible because matrices are
represented as "flat" vectors of length `nrow*ncol`,
whose elements are read in a column-major (Fortran) order;
therefore, here,
- pattern "a" is being sought in the 1st, 4th, 7th, ...
+ the pattern `"a"` is being sought in the 1st, 4th, 7th, ...
string in `haystack`, i.e., `"aaa"`, `"aba"`, `"aca"`, ...;
pattern `"b"` in the 2nd, 5th, 8th, ... string;
and `"c"` in the 3rd, 6th, 9th, ... one);
@@ -263,18 +264,18 @@ Data Flow
---------
All vector-like arguments (including factors and objects) in *stringi*
-are treated in the same manner: for example, if a function expects a
-character vector on input and an object of other type is provided,
+are treated in the same manner. For example, if a function expects a
+character vector on input and an object of another type is provided,
`as.character()` is called first (we see that in the example above,
"`1:2`" is treated as `c("1", "2")`).
Following {cite}`Wickham2010:stringr`, *stringi* makes sure the output data
types are consistent and that different functions are interoperable.
-This makes operation chaining easier and less error prone.
+This makes operation chaining easier and less error-prone.
For example, `stri_extract_first_regex()` finds the first occurrence of
-a pattern in each string, therefore the output is a character of the
-same length as the input (with recycling rule in place if necessary).
+a pattern in each string. Therefore, the output is a character vector of the
+same length as the input (with the recycling rule in place if necessary).
```{r }
haystack <- c("bacon", "spam", "jam, spam, bacon, and spam")
@@ -293,7 +294,7 @@ yields a list of character vectors.
stri_extract_all_regex(haystack, "\\b\\w{1,4}\\b", omit_no_match=TRUE)
```
-If the 3rd argument was not specified, a no-match would be represented
+If the 3rd argument were not specified, a no-match would be represented
by a missing value (for consistency with the previous function).
Also, care is taken so that the "data" or "`x`" argument is most often
@@ -319,7 +320,7 @@ yields the same result as in the previous example, but refers to
Further Deviations from Base R
------------------------------
-*stringi* can be used as a replacement of the existing string processing
+*stringi* can be used as a replacement for the existing string processing
functions. Also, it offers many facilities not available in base R.
Except for being fully vectorised with respect to all crucial arguments,
propagating missing values and empty vectors consistently, and following
@@ -330,14 +331,14 @@ counterparts even further.
**Following Unicode Standards.**
Thanks to the comprehensive coverage of the most important services
provided by *ICU*, its users gain access to collation, pattern
-searching, normalisation, transliteration, etc., that follow the recent
+searching, normalisation, transliteration, etc., that follow the current
Unicode standards for text processing in any locale. Due to this, as we
-state in {ref}`Sec:encoding`, all inputs are converted to Unicode and
-outputs are always in UTF-8.
+state in {ref}`Sec:encoding`, all inputs are converted to Unicode.
+Furthermore, outputs are always in UTF-8.
**Portability Issues in Base R.**
-As we have mentioned in the introduction, base R string operations have
+As mentioned in the introduction, base R string operations have
traditionally been limited in scope. There also might be some issues
with regards to their portability, reasons for which may be plentiful.
For instance, varied versions of the [*PCRE*](https://www.pcre.org/)
@@ -348,7 +349,7 @@ character encoding IDs not fully compatible with that on GNU/Linux: to
select the Polish locale, we are required to pass `"Polish_Poland"` to
`Sys.setlocale()` on Windows whereas `"pl_PL"` on Linux. Interestingly,
R can be built against the system *ICU* so that it uses its Collator for
-comparing strings (e.g., using the "`<=`" operator), however this is
+comparing strings (e.g., using the "`<=`" operator). However, this is
only optional and does not provide access to any other Unicode services.
For example, let us consider the matching of "all letters" by means of
@@ -440,6 +441,6 @@ arguments have been introduced for more detailed tuning.
**Preserving Attributes.**
Generally, *stringi* preserves no object attributes whatsoever, but a
-user can make sure themself that this is becomes the case, e.g., by
+user can make sure themself that this is the case, e.g., by
calling "`x[] <- stri_...(x, ...)`" or
"\``attributes<-`\``(stri_...(x, ...), attributes(x))`".
diff --git a/devel/sphinx/weave/design_principles.md b/devel/sphinx/weave/design_principles.md
index fbeddda65..d61317976 100644
--- a/devel/sphinx/weave/design_principles.md
+++ b/devel/sphinx/weave/design_principles.md
@@ -3,10 +3,10 @@ General Design Principles
-> This tutorial is based on the [paper on *stringi*](https://stringi.gagolewski.com/_static/vignette/stringi.pdf) that will appear in the *Journal of Statistical Software*.
+> This tutorial is based on the [paper on *stringi*](https://dx.doi.org/10.18637/jss.v103.i02) that has recently been published in the *Journal of Statistical Software*, see {cite}`stringi`.
-The API of the early releases of *stringi* has been designed so as to be
+The API of the early releases of *stringi* has been designed to be
fairly compatible with that of the 0.6.2 version of the [*stringr*](https://stringr.tidyverse.org/)
package {cite}`Wickham2010:stringr` (dated 2012[^footstringr]),
with some fixes in the consistency of the
@@ -17,16 +17,16 @@ and not really suitable for natural language processing tasks, all the
functionality has been implemented from the ground up, with the use of
[*ICU*](https://icu.unicode.org/) services wherever applicable.
Since the initial release, an
-abundance of new features has been added and the package can now be
-considered a comprehensive workhorse for text data processing.
+abundance of new features has been added, and the package can now be
+considered a complete workhorse for text data processing.
Note that
-the *stringi* API is stable. Future releases are aiming for as much
+the *stringi* API is stable. Future releases will aim for as much
backward compatibility as possible so that other software projects can
safely rely on it.
[^footstringr]: Interestingly, in 2015 the aforementioned
- *stringr* package has been rewritten as a set of wrappers around some of
+ *stringr* package was rewritten as a set of wrappers around some of
the *stringi* functions instead of the base R ones. In Section 14.7 of
*R for Data Science* {cite}`GrolemundWickham2017:rdatascience` we read:
"*stringr* is useful when you're learning because it exposes a minimal
@@ -78,7 +78,7 @@ Naming
Function and argument names use a combination of lowercase letters and
underscores (and no dots). To avoid namespace clashes, all function
-names feature the "`stri_`" prefix. Names are fairly self-explanatory,
+names feature the "`stri_`" prefix. Names are quite self-explanatory,
e.g., `stri_locate_first_regex` and `stri_locate_all_fixed` find,
respectively, the first match to a regular expression and all
occurrences of a substring as-is.
@@ -138,9 +138,10 @@ function call:
```
Due to vectorisation, we can generally avoid using the `for`- and
-`while`-loops ("for each string in a vector..."), which makes the code
+`while`-loops ("for each string in a vector..."). This can make the code
much more readable, maintainable, and faster to execute.
+
Acting Elementwise with Recycling
---------------------------------
@@ -253,11 +254,11 @@ haystack) as an illustration:
`haystack`, `"b"` in the 2nd row, and `"c"` in the 3rd;
in particular, there are 3 `"a"`s in `"aaa"`, 2 in `"aba"`,
and 1 `"b"` in `"baa"`;
- this is possible due to the fact that matrices are
+ this is possible because matrices are
represented as "flat" vectors of length `nrow*ncol`,
whose elements are read in a column-major (Fortran) order;
therefore, here,
- pattern "a" is being sought in the 1st, 4th, 7th, ...
+ the pattern `"a"` is being sought in the 1st, 4th, 7th, ...
string in `haystack`, i.e., `"aaa"`, `"aba"`, `"aca"`, ...;
pattern `"b"` in the 2nd, 5th, 8th, ... string;
and `"c"` in the 3rd, 6th, 9th, ... one);
@@ -341,18 +342,18 @@ Data Flow
---------
All vector-like arguments (including factors and objects) in *stringi*
-are treated in the same manner: for example, if a function expects a
-character vector on input and an object of other type is provided,
+are treated in the same manner. For example, if a function expects a
+character vector on input and an object of another type is provided,
`as.character()` is called first (we see that in the example above,
"`1:2`" is treated as `c("1", "2")`).
Following {cite}`Wickham2010:stringr`, *stringi* makes sure the output data
types are consistent and that different functions are interoperable.
-This makes operation chaining easier and less error prone.
+This makes operation chaining easier and less error-prone.
For example, `stri_extract_first_regex()` finds the first occurrence of
-a pattern in each string, therefore the output is a character of the
-same length as the input (with recycling rule in place if necessary).
+a pattern in each string. Therefore, the output is a character vector of the
+same length as the input (with the recycling rule in place if necessary).
```r
@@ -382,7 +383,7 @@ stri_extract_all_regex(haystack, "\\b\\w{1,4}\\b", omit_no_match=TRUE)
## [1] "jam" "spam" "and" "spam"
```
-If the 3rd argument was not specified, a no-match would be represented
+If the 3rd argument were not specified, a no-match would be represented
by a missing value (for consistency with the previous function).
Also, care is taken so that the "data" or "`x`" argument is most often
@@ -409,7 +410,7 @@ yields the same result as in the previous example, but refers to
Further Deviations from Base R
------------------------------
-*stringi* can be used as a replacement of the existing string processing
+*stringi* can be used as a replacement for the existing string processing
functions. Also, it offers many facilities not available in base R.
Except for being fully vectorised with respect to all crucial arguments,
propagating missing values and empty vectors consistently, and following
@@ -420,14 +421,14 @@ counterparts even further.
**Following Unicode Standards.**
Thanks to the comprehensive coverage of the most important services
provided by *ICU*, its users gain access to collation, pattern
-searching, normalisation, transliteration, etc., that follow the recent
+searching, normalisation, transliteration, etc., that follow the current
Unicode standards for text processing in any locale. Due to this, as we
-state in {ref}`Sec:encoding`, all inputs are converted to Unicode and
-outputs are always in UTF-8.
+state in {ref}`Sec:encoding`, all inputs are converted to Unicode.
+Furthermore, outputs are always in UTF-8.
**Portability Issues in Base R.**
-As we have mentioned in the introduction, base R string operations have
+As mentioned in the introduction, base R string operations have
traditionally been limited in scope. There also might be some issues
with regards to their portability, reasons for which may be plentiful.
For instance, varied versions of the [*PCRE*](https://www.pcre.org/)
@@ -438,7 +439,7 @@ character encoding IDs not fully compatible with that on GNU/Linux: to
select the Polish locale, we are required to pass `"Polish_Poland"` to
`Sys.setlocale()` on Windows whereas `"pl_PL"` on Linux. Interestingly,
R can be built against the system *ICU* so that it uses its Collator for
-comparing strings (e.g., using the "`<=`" operator), however this is
+comparing strings (e.g., using the "`<=`" operator). However, this is
only optional and does not provide access to any other Unicode services.
For example, let us consider the matching of "all letters" by means of
@@ -504,10 +505,10 @@ microbenchmark::microbenchmark(
)
## Unit: milliseconds
## expr min lq mean median uq max neval
-## join2 35.055 35.843 47.088 37.069 53.653 85.487 100
-## join3 79.598 80.815 85.678 81.562 91.717 128.963 100
-## r_paste2 90.511 91.802 108.026 97.838 112.744 162.961 100
-## r_paste3 190.622 193.449 229.653 244.515 251.249 280.253 100
+## join2 36.622 37.777 49.123 38.678 49.026 99.879 100
+## join3 83.978 85.402 90.266 86.841 94.784 135.811 100
+## r_paste2 94.275 97.662 112.777 100.228 122.510 173.908 100
+## r_paste3 199.192 208.616 243.586 258.622 266.051 293.781 100
```
Another example -- timings of fixed pattern searching:
@@ -526,12 +527,12 @@ microbenchmark::microbenchmark(
)
## Unit: milliseconds
## expr min lq mean median uq max neval
-## fixed 4.8786 4.9576 5.0491 5.0353 5.0934 5.7554 100
-## regex 113.6029 114.8261 115.5398 115.2599 115.5966 121.5098 100
-## coll 388.7710 392.9170 396.2839 394.3113 396.7988 427.4617 100
-## r_tre 126.5528 127.3853 128.9907 127.8981 128.9863 141.2107 100
-## r_pcre 73.8114 74.4104 75.1880 74.7540 75.1607 80.1232 100
-## r_fixed 52.4524 53.0642 53.6158 53.3177 53.6758 57.4354 100
+## fixed 5.1298 5.2853 5.4114 5.3587 5.4548 6.8839 100
+## regex 121.2871 123.7408 125.0747 124.6750 126.3970 130.3374 100
+## coll 388.8468 392.2507 397.7151 396.5786 402.3163 424.3306 100
+## r_tre 137.4533 139.8173 141.7976 141.6391 143.0787 151.5654 100
+## r_pcre 72.7527 74.3581 75.7499 75.2514 76.4620 87.1007 100
+## r_fixed 41.7046 42.8004 43.5566 43.3757 43.9061 50.8317 100
```
@@ -545,6 +546,6 @@ arguments have been introduced for more detailed tuning.
**Preserving Attributes.**
Generally, *stringi* preserves no object attributes whatsoever, but a
-user can make sure themself that this is becomes the case, e.g., by
+user can make sure themself that this is the case, e.g., by
calling "`x[] <- stri_...(x, ...)`" or
"\``attributes<-`\``(stri_...(x, ...), attributes(x))`".
diff --git a/devel/sphinx/weave/example.Rmd b/devel/sphinx/weave/example.Rmd
index 57dbaaec9..cde26200e 100644
--- a/devel/sphinx/weave/example.Rmd
+++ b/devel/sphinx/weave/example.Rmd
@@ -142,4 +142,3 @@ and so forth.
For the climate data on other cities, very similar operations will need
to be performed -- the whole process of scraping and cleansing data can
be automated quite easily.
-
diff --git a/devel/sphinx/weave/example.md b/devel/sphinx/weave/example.md
index 9a8de0b3f..719e2a070 100644
--- a/devel/sphinx/weave/example.md
+++ b/devel/sphinx/weave/example.md
@@ -2,7 +2,7 @@ Example Use Case: Data Preparation
==================================
-> This tutorial is based on the [paper on *stringi*](https://stringi.gagolewski.com/_static/vignette/stringi.pdf) that will appear in the *Journal of Statistical Software*.
+> This tutorial is based on the [paper on *stringi*](https://dx.doi.org/10.18637/jss.v103.i02) that has recently been published in the *Journal of Statistical Software*, see {cite}`stringi`.
What follows is a short case study where we prepare a
web-scraped data set for further processing.
@@ -215,4 +215,3 @@ and so forth.
For the climate data on other cities, very similar operations will need
to be performed -- the whole process of scraping and cleansing data can
be automated quite easily.
-
diff --git a/devel/sphinx/weave/input_output.Rmd b/devel/sphinx/weave/input_output.Rmd
index 7c19045fb..f0da998f0 100644
--- a/devel/sphinx/weave/input_output.Rmd
+++ b/devel/sphinx/weave/input_output.Rmd
@@ -11,8 +11,8 @@ opts_chunk$set(cache.path="cache/input_output_")
This section deals with some more advanced topics related to the
operability of text processing applications between different platforms.
-In particular, we discuss how to assure that data read from various
-input connections are interpreted in the correct manner.
+In particular, we discuss how to ensure that data read from various
+input connections are interpreted correctly.
(Sec:codepoints)=
Dealing with Unicode Code Points
@@ -83,7 +83,8 @@ native encoding, is defined via the `LC_CTYPE` locale category in
`Sys.getlocale()`. This is the representation assumed, e.g., when
reading data from the standard input or from files (e.g., when `scan()`
is called). For instance, Central European versions of Windows will
-assume the "`windows-1250`" code page. MacOS as well as most Linux boxes
+assume the "`windows-1250`" code page. On the other hand,
+macOS as well as most Linux boxes
work with UTF-8 by default[^footucrt].
All strings in R have an associated encoding mark which can be read by
@@ -98,7 +99,7 @@ thought of as a superset of every other encoding. Moreover, in order to
guarantee the correctness and high performance of the string processing
pipelines, *stringi* always[^footuftout] outputs UTF-8 data.
-[^footucrt]: It is expected that future R releases will support UTF-8 natively
+[^footucrt]: It is expected that future R releases will support UTF-8 natively,
thanks to the Universal C Runtime (UCRT) that is available for
Windows 10.
@@ -163,7 +164,7 @@ stri_enc_detect(x) # based on heuristics
```
Nevertheless, encoding detection is an operation that relies on
-heuristics, therefore there is a chance that the output might be
+heuristics. Therefore, there is a chance that the output might be
imprecise or even misleading.
@@ -210,7 +211,7 @@ stri_enc_toutf32( # code points as decimals
Above we see some example code points before, after NFC, and after NFD
normalisation, respectively.
-It might be a good idea to always normalise all the strings read from
+It might be a good idea always to normalise all the strings read from
external sources (files, URLs) with NFC.
Compatibility composition and decomposition normalisation forms (NFKC
diff --git a/devel/sphinx/weave/input_output.md b/devel/sphinx/weave/input_output.md
index 051ae68e5..b1632e87a 100644
--- a/devel/sphinx/weave/input_output.md
+++ b/devel/sphinx/weave/input_output.md
@@ -4,13 +4,13 @@ Input and Output
-> This tutorial is based on the [paper on *stringi*](https://stringi.gagolewski.com/_static/vignette/stringi.pdf) that will appear in the *Journal of Statistical Software*.
+> This tutorial is based on the [paper on *stringi*](https://dx.doi.org/10.18637/jss.v103.i02) that has recently been published in the *Journal of Statistical Software*, see {cite}`stringi`.
This section deals with some more advanced topics related to the
operability of text processing applications between different platforms.
-In particular, we discuss how to assure that data read from various
-input connections are interpreted in the correct manner.
+In particular, we discuss how to ensure that data read from various
+input connections are interpreted correctly.
(Sec:codepoints)=
Dealing with Unicode Code Points
@@ -91,7 +91,8 @@ native encoding, is defined via the `LC_CTYPE` locale category in
`Sys.getlocale()`. This is the representation assumed, e.g., when
reading data from the standard input or from files (e.g., when `scan()`
is called). For instance, Central European versions of Windows will
-assume the "`windows-1250`" code page. MacOS as well as most Linux boxes
+assume the "`windows-1250`" code page. On the other hand,
+macOS as well as most Linux boxes
work with UTF-8 by default[^footucrt].
All strings in R have an associated encoding mark which can be read by
@@ -106,7 +107,7 @@ thought of as a superset of every other encoding. Moreover, in order to
guarantee the correctness and high performance of the string processing
pipelines, *stringi* always[^footuftout] outputs UTF-8 data.
-[^footucrt]: It is expected that future R releases will support UTF-8 natively
+[^footucrt]: It is expected that future R releases will support UTF-8 natively,
thanks to the Universal C Runtime (UCRT) that is available for
Windows 10.
@@ -155,7 +156,7 @@ Detecting Encodings
If a file's encoding is not known in advance, there are a
certain functions that can aid in encoding detection. First, we can read
-the resource in form of a raw-type vector:
+the resource in the form of a raw-type vector:
```r
@@ -189,7 +190,7 @@ stri_enc_detect(x) # based on heuristics
```
Nevertheless, encoding detection is an operation that relies on
-heuristics, therefore there is a chance that the output might be
+heuristics. Therefore, there is a chance that the output might be
imprecise or even misleading.
@@ -251,7 +252,7 @@ stri_enc_toutf32( # code points as decimals
Above we see some example code points before, after NFC, and after NFD
normalisation, respectively.
-It might be a good idea to always normalise all the strings read from
+It might be a good idea always to normalise all the strings read from
external sources (files, URLs) with NFC.
Compatibility composition and decomposition normalisation forms (NFKC
diff --git a/devel/sphinx/weave/other_operations.Rmd b/devel/sphinx/weave/other_operations.Rmd
index 22ac1bb4f..4ce01f07d 100644
--- a/devel/sphinx/weave/other_operations.Rmd
+++ b/devel/sphinx/weave/other_operations.Rmd
@@ -24,10 +24,10 @@ or words, locating particular text units (e.g., the 3rd sentence), etc.
Generally, text boundary analysis is a locale-sensitive operation
(see the [Unicode Standard Annex \#29: Unicode Text Segmentation](https://unicode.org/reports/tr29/)). For example, in Japanese and Chinese, spaces are
-not used for separation of words -- a line break can occur even in the
+not used for the separation of words -- a line break can occur even in the
middle of a word. Nevertheless, these languages have punctuation and
diacritical marks that cannot start or end a line, so this must also be
-taken into account.
+considered.
The *ICU* Break Iterator
(see the [*ICU* User Guide on Boundary Analysis](https://unicode-org.github.io/icu/userguide/boundaryanalysis/))
@@ -152,9 +152,9 @@ Transliterating
Transliteration, in its broad sense, deals with the substitution of
characters or their groups for different ones, according to some
-well-defined, possibly context-aware, rules. It may be useful, amongst
+well-defined, possibly context-aware, rules. It may be utile, amongst
others, when "normalising" pieces of strings or identifiers so that
-they can be more easily compared with each other.
+they can be more easily compared.
### Case Mapping
@@ -190,8 +190,8 @@ set.seed(12345)
sample(stri_trans_list(), 9) # a few random entries
```
-For example, below we apply a transliteration chain: first, we convert
-to upper case, and then we convert characters in the Latin script to
+In the example below, we apply a transliteration chain: we first convert
+to upper case, and then convert characters in the Latin script to
ASCII.
```{r }
@@ -261,5 +261,5 @@ stri_datetime_format(
locale="ja_JP@calendar=japanese")
```
-Above we have selected the Hebrew calendar within the English locale and
+We have selected the Hebrew calendar within the English locale and
the Japanese calendar in the Japanese locale.
diff --git a/devel/sphinx/weave/other_operations.md b/devel/sphinx/weave/other_operations.md
index 48f31fa31..3b58d9a34 100644
--- a/devel/sphinx/weave/other_operations.md
+++ b/devel/sphinx/weave/other_operations.md
@@ -3,7 +3,7 @@ Other Operations
================
-> This tutorial is based on the [paper on *stringi*](https://stringi.gagolewski.com/_static/vignette/stringi.pdf) that will appear in the *Journal of Statistical Software*.
+> This tutorial is based on the [paper on *stringi*](https://dx.doi.org/10.18637/jss.v103.i02) that has recently been published in the *Journal of Statistical Software*, see {cite}`stringi`.
In the sequel, we cover the functions that deal with text boundary
@@ -22,10 +22,10 @@ or words, locating particular text units (e.g., the 3rd sentence), etc.
Generally, text boundary analysis is a locale-sensitive operation
(see the [Unicode Standard Annex \#29: Unicode Text Segmentation](https://unicode.org/reports/tr29/)). For example, in Japanese and Chinese, spaces are
-not used for separation of words -- a line break can occur even in the
+not used for the separation of words -- a line break can occur even in the
middle of a word. Nevertheless, these languages have punctuation and
diacritical marks that cannot start or end a line, so this must also be
-taken into account.
+considered.
The *ICU* Break Iterator
(see the [*ICU* User Guide on Boundary Analysis](https://unicode-org.github.io/icu/userguide/boundaryanalysis/))
@@ -180,9 +180,9 @@ Transliterating
Transliteration, in its broad sense, deals with the substitution of
characters or their groups for different ones, according to some
-well-defined, possibly context-aware, rules. It may be useful, amongst
+well-defined, possibly context-aware, rules. It may be utile, amongst
others, when "normalising" pieces of strings or identifiers so that
-they can be more easily compared with each other.
+they can be more easily compared.
### Case Mapping
@@ -227,8 +227,8 @@ sample(stri_trans_list(), 9) # a few random entries
## [9] "Any-Kana"
```
-For example, below we apply a transliteration chain: first, we convert
-to upper case, and then we convert characters in the Latin script to
+In the example below, we apply a transliteration chain: we first convert
+to upper case, and then convert characters in the Latin script to
ASCII.
@@ -291,7 +291,7 @@ stri_datetime_format(
stri_datetime_add(stri_datetime_now(), 1, "day"), # add 1 day to 'now'
"datetime_relative_long", # full format, relative to 'now'
locale="en_NZ", tz="NZ")
-## [1] "tomorrow at 5:55:52 pm NZDT"
+## [1] "tomorrow at 7:16:04 pm NZST"
```
For example, here's how we can access different calendars:
@@ -312,5 +312,5 @@ stri_datetime_format(
## [1] "令和2年2月4日火曜日" "令和2年8月7日金曜日"
```
-Above we have selected the Hebrew calendar within the English locale and
+We have selected the Hebrew calendar within the English locale and
the Japanese calendar in the Japanese locale.
diff --git a/devel/sphinx/weave/regular_expressions.Rmd b/devel/sphinx/weave/regular_expressions.Rmd
index c57ba07b5..3f89d71f4 100644
--- a/devel/sphinx/weave/regular_expressions.Rmd
+++ b/devel/sphinx/weave/regular_expressions.Rmd
@@ -9,8 +9,8 @@ opts_chunk$set(cache.path="cache/regular_expressions_")
```
-Regular expressions (regexes) provide us with a concise grammar for
-defining systematic patterns which can be sought in character strings.
+Regular expressions (regexes) provide a concise grammar for
+defining systematic patterns that can be sought in character strings.
Examples of such patterns include: specific fixed substrings, emojis of
any kind, stand-alone sequences of lower-case Latin letters ("words"),
substrings that can be interpreted as real numbers (with or without
@@ -19,7 +19,8 @@ addresses, or URLs.
Theoretically, the concept of regular pattern matching dates back to the
so-called regular languages and finite state automata {cite}`kleene`, see
-also {cite}`hopcroftullman` and {cite}`automata`. Regexes in the form as we know today
+also {cite}`hopcroftullman` and {cite}`automata`. Regexes in the form
+we know today
have already been present in one of the pre-Unix implementations of the
command-line text editor *qed* ({cite}`qed`; the predecessor of the
well-known *sed*).
@@ -50,7 +51,7 @@ patterns). On the other hand, *ICU* regexes fully conform to the
[Unicode Technical Standard \#18](https://www.unicode.org/reports/tr18/)
and hence provide comprehensive support for Unicode.
-It is worth noting that most programming languages as well as advanced
+It is worth noting that most programming languages and advanced
text editors and development environments
(including [*Kate*](https://kate-editor.org/),
[*Eclipse*](https://www.eclipse.org/ide/),
@@ -131,7 +132,7 @@ stri_detect_regex("groß", "GROSS", case_insensitive=TRUE)
If we wish to include a special character as part of a regular
expression -- so that it is treated literally -- we'll need to escape
it with a backslash, "\\". Yet, the backlash itself has a special
-meaning to R, see `help("Quotes")`, therefore it needs to be preceded by
+meaning to R, see `help("Quotes")`. Therefore, it needs to be preceded by
another backslash.
```{r }
@@ -157,7 +158,7 @@ The above matches non-overlapping length-4 substrings that end with
"`am`".
The dot's insensitivity to the newline character is motivated by the
-need to maintain the compatibility with tools such as *grep* (when
+need to maintain compatibility with tools such as *grep* (when
searching within text files in a line-by-line manner). This behaviour
can be altered by setting the `dot_all` option to `TRUE`.
@@ -210,8 +211,8 @@ stri_extract_all_regex(x, "[^ ][^ ][^ ]")
### Defining Code Point Ranges
Each Unicode code point can be referenced by its unique numeric
-identifier, see {ref}`Sec:codepoints` for more details. For instance, "`a`" is
-assigned code U+0061 and "`z`" is mapped to U+007A. In the pre-Unicode
+identifier; see {ref}`Sec:codepoints` for more details. For instance, "`a`" is
+assigned code U+0061, and "`z`" is mapped to U+007A. In the pre-Unicode
era (mostly with regards to the ASCII codes, ≤ U+007F, representing
English letters, decimal digits, some punctuation characters, and a few
control characters), we were used to relying on specific code ranges;
@@ -226,8 +227,8 @@ stri_extract_all_regex("In 2020, Gągolewski had fun once.", "[0-9A-Za-z]")
The above pattern denotes a union of 3 code ranges: digits and ASCII
upper- and lowercase letters.
-Nowadays, in the processing of text in natural languages, this notation
-should rather be avoided. Note the missing "`ą`" (Polish "`a`" with
+Nowadays, when processing text in natural languages, this notation
+should be avoided. Note the missing "`ą`" (Polish "`a`" with
ogonek) in the result.
### Using Predefined Character Sets
@@ -266,7 +267,7 @@ denote their complements.
### Avoiding POSIX Classes
-The use of the POSIX-like character classes should be avoided, because
+The use of the POSIX-like character classes should be avoided because
they are generally not well-defined.
In particular, in POSIX-like regex engines, "`[:punct:]`" stands for the
@@ -316,12 +317,12 @@ x <- "spam, egg, ham, jam, algae, and an amalgam of spam, all al dente"
stri_extract_all_regex(x, "spam|ham")
```
-"`|`" has a very low precedence. Therefore, if we wish to introduce an
+"`|`" has very low precedence. Therefore, if we wish to introduce an
alternative of subexpressions, we need to group them, e.g., between
round brackets. For instance, "`(sp|h)am`" matches either "`spam`"
or "`ham`".
-Note that this has the side-effect of creating new capturing groups,
+Note that this has the side-effect of creating new capturing groups;
see {ref}`Sec:Capturing`.
@@ -372,7 +373,7 @@ Quantifiers
-----------
More often than not, a variable number of instances of the same
-subexpression needs to be captured or its presence should be made
+subexpression needs to be captured, or its presence should be
optional. This can be achieved by means of the following quantifiers:
- "`?`" matches 0 or 1 times;
@@ -426,7 +427,7 @@ stri_extract_all_regex("12, 34.5, 678.901234, 37...629, ...",
```
Here, the first regex matches digits, a dot, and another series of
-digits. The second one finds digits which are possibly (but not
+digits. The second one finds digits that are possibly (but not
necessarily) followed by a dot and a digit sequence.
### Performance Notes
@@ -547,13 +548,13 @@ syntax (the angle brackets are part of the token), as in, e.g.,
Anchoring
---------
-Lastly, let's mention the ways to match a pattern at a given abstract
+Lastly, let's mention how to match a pattern at a given abstract
position within a string.
### Matching at the Beginning or End of a String
-"`^`" and "`$`" match, respectively, start and end of the string (or
+"`^`" and "`$`" match, respectively, the start and the end of the string (or
each line within a string, if the `multi_line` option is set to `TRUE`).
```{r }
diff --git a/devel/sphinx/weave/regular_expressions.md b/devel/sphinx/weave/regular_expressions.md
index 487f8c7d8..7dc213da9 100644
--- a/devel/sphinx/weave/regular_expressions.md
+++ b/devel/sphinx/weave/regular_expressions.md
@@ -4,11 +4,11 @@ Regular Expressions
-> This tutorial is based on the [paper on *stringi*](https://stringi.gagolewski.com/_static/vignette/stringi.pdf) that will appear in the *Journal of Statistical Software*.
+> This tutorial is based on the [paper on *stringi*](https://dx.doi.org/10.18637/jss.v103.i02) that has recently been published in the *Journal of Statistical Software*; see {cite}`stringi`.
-Regular expressions (regexes) provide us with a concise grammar for
-defining systematic patterns which can be sought in character strings.
+Regular expressions (regexes) provide a concise grammar for
+defining systematic patterns that can be sought in character strings.
Examples of such patterns include: specific fixed substrings, emojis of
any kind, stand-alone sequences of lower-case Latin letters ("words"),
substrings that can be interpreted as real numbers (with or without
@@ -17,7 +17,8 @@ addresses, or URLs.
Theoretically, the concept of regular pattern matching dates back to the
so-called regular languages and finite state automata {cite}`kleene`, see
-also {cite}`hopcroftullman` and {cite}`automata`. Regexes in the form as we know today
+also {cite}`hopcroftullman` and {cite}`automata`. Regexes in the form
+we know today
were already present in one of the pre-Unix implementations of the
command-line text editor *qed* ({cite}`qed`; the predecessor of the
well-known *sed*).
@@ -48,7 +49,7 @@ patterns). On the other hand, *ICU* regexes fully conform to the
[Unicode Technical Standard \#18](https://www.unicode.org/reports/tr18/)
and hence provide comprehensive support for Unicode.
-It is worth noting that most programming languages as well as advanced
+It is worth noting that most programming languages and advanced
text editors and development environments
(including [*Kate*](https://kate-editor.org/),
[*Eclipse*](https://www.eclipse.org/ide/),
@@ -133,7 +134,7 @@ stri_detect_regex("groß", "GROSS", case_insensitive=TRUE)
If we wish to include a special character as part of a regular
expression -- so that it is treated literally -- we'll need to escape
it with a backslash, "\\". Yet, the backslash itself has a special
-meaning to R, see `help("Quotes")`, therefore it needs to be preceded by
+meaning to R, see `help("Quotes")`. Therefore, it needs to be preceded by
another backslash.
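
The double-backslash rule can be illustrated with a minimal sketch; the sample string and variable names below are made up for this illustration and do not come from the tutorial:

```r
library(stringi)

x <- "spam..."  # hypothetical sample string: 4 letters and 3 dots

# An unescaped dot is a metacharacter that matches any character:
n_any <- stri_count_regex(x, ".")

# In R source, "\\." denotes the two-character regex \. (a literal dot):
n_dot <- stri_count_regex(x, "\\.")

print(c(any = n_any, literal_dot = n_dot))  # 7 and 3
```

The first backslash is consumed by the R parser and the second one by the regex engine, which is why a single "`\.`" would not work here.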
@@ -164,7 +165,7 @@ The above matches non-overlapping length-4 substrings that end with
"`am`".
The dot's insensitivity to the newline character is motivated by the
-need to maintain the compatibility with tools such as *grep* (when
+need to maintain compatibility with tools such as *grep* (when
searching within text files in a line-by-line manner). This behaviour
can be altered by setting the `dot_all` option to `TRUE`.
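
As a small sketch of the effect of `dot_all` (the input string is an assumption made for illustration), compare how "`.+`" behaves on a two-line string:

```r
library(stringi)

x <- "spam\neggs"  # hypothetical two-line input

# By default, the dot does not match the newline, so each line is matched separately:
per_line <- stri_extract_all_regex(x, ".+")[[1]]

# With dot_all=TRUE, the dot also matches "\n" and the whole string is one match:
whole <- stri_extract_all_regex(x, ".+", dot_all=TRUE)[[1]]

print(per_line)  # "spam" "eggs"
print(whole)     # "spam\neggs"
```

Here the option is passed directly via `...`; it can presumably also be supplied through `opts_regex=stri_opts_regex(dot_all=TRUE)`.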
@@ -226,8 +227,8 @@ stri_extract_all_regex(x, "[^ ][^ ][^ ]")
### Defining Code Point Ranges
Each Unicode code point can be referenced by its unique numeric
-identifier, see {ref}`Sec:codepoints` for more details. For instance, "`a`" is
-assigned code U+0061 and "`z`" is mapped to U+007A. In the pre-Unicode
+identifier; see {ref}`Sec:codepoints` for more details. For instance, "`a`" is
+assigned code U+0061, and "`z`" is mapped to U+007A. In the pre-Unicode
era (mostly with regard to the ASCII codes, ≤ U+007F, representing
English letters, decimal digits, some punctuation characters, and a few
control characters), we were used to relying on specific code ranges;
@@ -246,8 +247,8 @@ stri_extract_all_regex("In 2020, Gągolewski had fun once.", "[0-9A-Za-z]")
The above pattern denotes a union of 3 code ranges: digits and ASCII
upper- and lowercase letters.
-Nowadays, in the processing of text in natural languages, this notation
-should rather be avoided. Note the missing "`ą`" (Polish "`a`" with
+Nowadays, when processing text in natural languages, this notation
+should be avoided. Note the missing "`ą`" (Polish "`a`" with
ogonek) in the result.
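
To make the problem concrete, here is a minimal sketch that contrasts an ASCII range with the Unicode "letter" general category, "`\p{L}`"; the shortened input is chosen only for illustration:

```r
library(stringi)

x <- "G\u0105golewski"  # "Gągolewski", with the ogonek letter written as an escape

# The ASCII ranges split the word at the non-ASCII letter:
ascii_only <- stri_extract_all_regex(x, "[A-Za-z]+")[[1]]

# The Unicode general category L (letter) keeps the word intact:
any_letter <- stri_extract_all_regex(x, "\\p{L}+")[[1]]

print(ascii_only)  # "G" "golewski"
print(any_letter)  # "Gągolewski"
```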
### Using Predefined Character Sets
@@ -313,7 +314,7 @@ denote their complements.
### Avoiding POSIX Classes
-The use of the POSIX-like character classes should be avoided, because
+The use of the POSIX-like character classes should be avoided because
they are generally not well-defined.
In particular, in POSIX-like regex engines, "`[:punct:]`" stands for the
@@ -378,12 +379,12 @@ stri_extract_all_regex(x, "spam|ham")
## [1] "spam" "ham" "spam"
```
-"`|`" has a very low precedence. Therefore, if we wish to introduce an
+"`|`" has very low precedence. Therefore, if we wish to introduce an
alternative of subexpressions, we need to group them, e.g., between
round brackets. For instance, "`(sp|h)am`" matches either "`spam`"
or "`ham`".
-Note that this has the side-effect of creating new capturing groups,
+Note that this has the side-effect of creating new capturing groups;
see {ref}`Sec:Capturing`.
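
The grouping and its side effect can be sketched as follows. The non-capturing variant, "`(?:...)`", is added here for comparison and is not part of the passage above:

```r
library(stringi)

x <- "spam, egg, ham, jam, algae, and an amalgam of spam, all al dente"

# Grouping restricts the alternative to the prefixes "sp" and "h":
found <- stri_extract_all_regex(x, "(sp|h)am")[[1]]
print(found)  # "spam" "ham" "spam"

# Round brackets create a capture group: one extra column in the match matrix.
capturing <- stri_match_all_regex(x, "(sp|h)am")[[1]]

# A non-capturing group, (?:...), groups without capturing:
non_capturing <- stri_match_all_regex(x, "(?:sp|h)am")[[1]]

print(c(ncol(capturing), ncol(non_capturing)))  # 2 1
```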
@@ -442,7 +443,7 @@ Quantifiers
-----------
More often than not, a variable number of instances of the same
-subexpression needs to be captured or its presence should be made
+subexpression needs to be captured, or its presence should be
optional. This can be achieved by means of the following quantifiers:
- "`?`" matches 0 or 1 times;
@@ -514,7 +515,7 @@ stri_extract_all_regex("12, 34.5, 678.901234, 37...629, ...",
```
Here, the first regex matches digits, a dot, and another series of
-digits. The second one finds digits which are possibly (but not
+digits. The second one finds digits that are possibly (but not
necessarily) followed by a dot and a digit sequence.
### Performance Notes
@@ -679,13 +680,13 @@ syntax (the angle brackets are part of the token), as in, e.g.,
Anchoring
---------
-Lastly, let's mention the ways to match a pattern at a given abstract
+Lastly, let's mention how to match a pattern at a given abstract
position within a string.
### Matching at the Beginning or End of a String
-"`^`" and "`$`" match, respectively, start and end of the string (or
+"`^`" and "`$`" match, respectively, the start and the end of the string (or
each line within a string, if the `multi_line` option is set to `TRUE`).
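
A brief sketch of both anchors, with and without `multi_line`; the two-line input is an assumption made for this illustration:

```r
library(stringi)

x <- "spam egg\nham jam"  # hypothetical input consisting of two lines

# Without multi_line, "^" matches only at the very start of the string:
first_word <- stri_extract_all_regex(x, "^\\w+")[[1]]

# With multi_line=TRUE, "^" and "$" also match at line boundaries:
line_starts <- stri_extract_all_regex(x, "^\\w+", multi_line=TRUE)[[1]]
line_ends <- stri_extract_all_regex(x, "\\w+$", multi_line=TRUE)[[1]]

print(first_word)   # "spam"
print(line_starts)  # "spam" "ham"
print(line_ends)    # "egg" "jam"
```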
diff --git a/docs/_sources/index.rst.txt b/docs/_sources/index.rst.txt
index 514d647db..2f664dad7 100644
--- a/docs/_sources/index.rst.txt
+++ b/docs/_sources/index.rst.txt
@@ -1,5 +1,5 @@
-stringi: THE String Processing Package for R
-============================================
+stringi: Fast and Portable Character String Processing in R
+===========================================================
**stringi (pronounced “stringy”, IPA [strinɡi]) is THE R package
for very fast, portable, correct, consistent, and convenient string/text
@@ -9,9 +9,8 @@ stringi: THE String Processing Package for R
-Thanks to `ICU `_,
-*stringi* fully supports a wide range
-of `Unicode `_ standards
+Thanks to `ICU `_, *stringi* fully supports a wide
+range of `Unicode `_ standards
(see also `this video `_).
.. code-block:: r
@@ -73,8 +72,14 @@ The contributions from Bartłomiej Tartanus and
`many others `_
are greatly appreciated. Thanks!
-Also check out `stringx `_
-for a set of wrappers around *stringi* with a base R-compatible API.
+**See also**: `stringx `_ –
+a set of wrappers around *stringi* with a base R-compatible API.
+
+**Citation**: Gagolewski M.,
+*stringi*: Fast and portable character string processing in R,
+*Journal of Statistical Software* 103(2), 2022, 1–59,
+`doi:10.18637/jss.v103.i02 `_.
+
.. COMMENT
@@ -134,6 +139,7 @@ for a set of wrappers around *stringi* with a base R-compatible API.
Source Code (GitHub)
Bug Tracker and Feature Suggestions
CRAN Entry
+ JStatSoft Paper
Author's Homepage
C++ API — Rcpp Example
news.md
diff --git a/docs/_sources/install.md.txt b/docs/_sources/install.md.txt
index ba4045edd..0fefbdf0a 100644
--- a/docs/_sources/install.md.txt
+++ b/docs/_sources/install.md.txt
@@ -14,10 +14,10 @@ install.packages("stringi")
However, due to the overwhelming complexity of the ICU4C library,
upon which *stringi* is based, and the colourful diversity of operating systems,
-their flavors, and particular setups, some users may still experience
+their flavours, and particular setups, some users may still experience
a few issues that hopefully can be resolved with the help of this short manual.
-Also, some additional build tweaks are possible in case we require a more
+Also, some additional build tweaks are possible if we require a more
customised installation.
@@ -32,25 +32,25 @@ If we install the package from sources and either:
the `libicu-devel` rpm on Fedora/CentOS/OpenSUSE,
`libicu-dev` on Ubuntu/Debian, etc.),
-* `pkg-config` is fails to find appropriate build settings
+* `pkg-config` fails to find appropriate build settings
for ICU-based projects, or
* `R CMD INSTALL` is called with the `--configure-args='--disable-pkg-config'`
- argument or environment variable `STRINGI_DISABLE_PKG_CONFIG` is
+ argument, or environment variable `STRINGI_DISABLE_PKG_CONFIG` is
set to non-zero or
`install.packages("stringi", configure.args="--disable-pkg-config")`
is executed,
then ICU will be built together with stringi.
A custom subset of ICU4C 69.1 is shipped with the package.
-We also include ICU4C 55.1 that can be used as a fallback version
+We also include ICU4C 55.1, which can be used as a fallback version
(e.g., on older Solaris boxes).
> To get the most out of stringi, you are strongly encouraged to rely on our
> ICU4C package bundle. This ensures maximum portability across all platforms
> (Windows and macOS users by default fetch the pre-compiled binaries
-> from CRAN built exactly this way).
+> from CRAN built precisely this way).
@@ -59,7 +59,7 @@ We also include ICU4C 55.1 that can be used as a fallback version
Note that if you choose to use our ICU4C bundle, then -- by default -- the
ICU data library will be downloaded from one of our mirror servers.
However, if you have already downloaded a version of `icudt*.zip` suitable
-for your platform (big/little endian), you may wish to install the
+for your platform (big/little-endian), you may wish to install the
package by calling:
```r
@@ -110,12 +110,13 @@ install.packages("stringi", configure.args="--with-extra-cxxflags='-std=c++11'")
```
Overall, your build chain may be misconfigured; check out,
-amongst others, `/etc/Makeconf`
-(e.g., are you using `-std=gnu++11` instead of `-std=c++11`?). Refer to
-https://cran.r-project.org/doc/manuals/r-release/R-admin.html for more details.
+amongst others, `/etc/Makeconf` (e.g., are you using
+`-std=gnu++11` instead of `-std=c++11`?). Refer to
+
+for more details.
-There is an option of using the fallback version of ICU4C 55.1
-which however requires the support of the `long long` type in a few functions,
+There is an option of using the fallback version of ICU4C 55.1.
+However, it requires the support of the `long long` type in a few functions
(this is not part of the C++98 standard; works on Solaris, though). Try:
```r
@@ -154,7 +155,7 @@ Some influential environment variables:
path relative to `/src`; defaults to `icuXX/data`.
* `PKG_CONFIG_PATH`: An optional list of directories to search for
- `pkg-config`s `.pc` files.
+ `pkg-config`'s `.pc` files.
* `R_HOME`: Override the R directory, e.g.,
`/usr/lib64/R`. Note that `$R_HOME/bin/R` should point to the R executable.
@@ -162,15 +163,15 @@ Some influential environment variables:
* `CAT`: The `cat` command used to generate the list of source files to compile.
* `PKG_CONFIG`: The `pkg-config` command used to fetch the necessary compiler
- flags to link to and existing `libicu` installation.
+ flags to link to the existing `libicu` installation.
-* `STRINGI_DISABLE_CXX11`: Disable C++11,
+* `STRINGI_DISABLE_CXX11`: Disable C++11;
see also `--disable-cxx11`.
-* `STRINGI_DISABLE_PKG_CONFIG`: Compile ICU from sources,
+* `STRINGI_DISABLE_PKG_CONFIG`: Compile ICU from sources;
see also `--disable-pkg-config`.
-* `STRINGI_DISABLE_ICU_BUNDLE`: Enforce system ICU,
+* `STRINGI_DISABLE_ICU_BUNDLE`: Enforce system ICU;
see also `--disable-icu-bundle`.
* `STRINGI_CFLAGS`: see `--with-extra-cflags`.
@@ -190,7 +191,7 @@ Some influential environment variables:
We expect that with a correctly configured C++11 compiler and properly
installed system ICU4C distribution, you should face no problems
-with installing the package, especially if you use our ICU4C bundle and you
+installing the package, especially if you use our ICU4C bundle and
have working internet access.
If you do not manage to set up a successful stringi build, do not
diff --git a/docs/_sources/news.md.txt b/docs/_sources/news.md.txt
index d14e003e9..b12b54446 100644
--- a/docs/_sources/news.md.txt
+++ b/docs/_sources/news.md.txt
@@ -1,5 +1,21 @@
# What Is New in *stringi*
+
+## 1.7.7 (2022-07-02)
+
+* [DOCUMENTATION] Paper on *stringi* has been published in
+ the *Journal of Statistical Software*, see .
+
+* [BUGFIX] #473, #397: Fixed buffer overflow in `stri_dup`.
+ `stri_dup`, `stri_paste`, ... fail more gracefully on attempts to
+ generate strings of length >= 2^31 each.
+
+* [BUILD TIME] #480: Using `Rf_isNull` instead of `isNull`.
+
+* [DOCUMENTATION] #462: The manual now mentions that the `numeric=TRUE`
+ collator does not handle negative numbers correctly.
+
+
## 1.7.6 (2021-11-29)
* [BUILD TIME] #463: Added loongarch support in ICU's double conversion
@@ -328,9 +344,9 @@ documentation object `stri_datetime_format`: `...`
* [BUGFIX] #319: Fixed overflow in `stri_rand_shuffle()`.
-* [BUGFIX] #337: Empty search patters in search functions (e.g.,
+* [BUGFIX] #337: Empty search patterns in search functions (e.g.,
`stri_split_regex()` and `stri_count_fixed()`) used to raise
- too many warnings on empty search patters.
+ too many warnings on empty search patterns.
## 1.2.4 (2018-07-20)
diff --git a/docs/_sources/rapi/about_arguments.md.txt b/docs/_sources/rapi/about_arguments.md.txt
index 27c984193..609fbb732 100644
--- a/docs/_sources/rapi/about_arguments.md.txt
+++ b/docs/_sources/rapi/about_arguments.md.txt
@@ -1,4 +1,4 @@
-# about\_arguments: Passing Arguments to Functions in stringi
+# about_arguments: Passing Arguments to Functions in stringi
## Description
@@ -40,4 +40,6 @@ Generally, all our functions drop input objects\' attributes (e.g., [`names`](ht
The official online manual of stringi at
-Other stringi\_general\_topics: [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other stringi_general_topics: [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
diff --git a/docs/_sources/rapi/about_encoding.md.txt b/docs/_sources/rapi/about_encoding.md.txt
index 897645c27..0f1a08eb6 100644
--- a/docs/_sources/rapi/about_encoding.md.txt
+++ b/docs/_sources/rapi/about_encoding.md.txt
@@ -1,4 +1,4 @@
-# about\_encoding: Character Encodings and stringi
+# about_encoding: Character Encodings and stringi
## Description
@@ -52,7 +52,7 @@ Moreover, there are two other cases:
- ASCII -- for strings consisting only of byte codes not greater than 127;
-- `native` (a.k.a. `unknown` in [`Encoding`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/Encoding.html); quite a misleading name: no explicit encoding mark) -- for strings that are assumed to be in your platform\'s native (default) encoding. This can represent UTF-8 if you are an OS X user, or some 8-bit Windows code page, for example. The native encoding used by **R** may be determined by examining the LC\_CTYPE category, see [`Sys.getlocale`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/locales.html).
+- `native` (a.k.a. `unknown` in [`Encoding`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/Encoding.html); quite a misleading name: no explicit encoding mark) -- for strings that are assumed to be in your platform\'s native (default) encoding. This can represent UTF-8 if you are an OS X user, or some 8-bit Windows code page, for example. The native encoding used by **R** may be determined by examining the LC_CTYPE category, see [`Sys.getlocale`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/locales.html).
Intuitively, "native" strings result from reading a string from stdin (e.g., keyboard input). This makes sense: your operating system works in some encoding and provides **R** with some data.
@@ -64,7 +64,7 @@ Finally, note that ICU. Note that converter names are case-insensitive and ICU tries to normalize the encoding specifiers. Leading zeroes are ignored in sequences of digits (if further digits follow), and all non-alphanumeric characters are ignored. Thus the strings \'UTF-8\', \'utf\_8\', \'u\*Tf08\' and \'Utf 8\' are equivalent.
+Apart from automatic conversion from the native encoding, you may re-encode a string manually, for example when you read it from a file created on a different platform. Call [`stri_enc_list`](stri_enc_list.md) for the list of encodings supported by ICU. Note that converter names are case-insensitive and ICU tries to normalize the encoding specifiers. Leading zeroes are ignored in sequences of digits (if further digits follow), and all non-alphanumeric characters are ignored. Thus the strings \'UTF-8\', \'utf_8\', \'u\*Tf08\' and \'Utf 8\' are equivalent.
The [`stri_encode`](stri_encode.md) function allows you to convert between any given encodings (in some cases you will obtain `bytes`-marked strings, or even lists of raw vectors (i.e., for UTF-16)). There are also some useful, more specialized functions, like [`stri_enc_toutf32`](stri_enc_toutf32.md) (converts a character vector to a list of integers, where one code point is exactly one numeric value) or [`stri_enc_toascii`](stri_enc_toascii.md) (substitutes all non-ASCII bytes with the SUBSTITUTE CHARACTER, which plays a role similar to **R**\'s `NA` value).
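
As a rough sketch of the code-point view described above, the following round-trips a short string through `stri_enc_toutf32` and `stri_enc_fromutf32`; the sample string is made up for this illustration:

```r
library(stringi)

x <- "a\u0105"  # hypothetical sample: "a" followed by "ą" (U+0105)

# One integer per code point: 97 (U+0061) and 261 (U+0105)
code_points <- stri_enc_toutf32(x)
print(code_points)

# The inverse operation yields the original (UTF-8) string
round_trip <- stri_enc_fromutf32(code_points)
print(round_trip)
```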
@@ -96,10 +96,12 @@ Check out [`stri_enc_detect`](stri_enc_detect.md) (among others) for a useful fu
The official online manual of stringi at
-Other stringi\_general\_topics: [`about_arguments`](about_arguments.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
-Other encoding\_management: [`stri_enc_info()`](stri_enc_info.md), [`stri_enc_list()`](stri_enc_list.md), [`stri_enc_mark()`](stri_enc_mark.md), [`stri_enc_set()`](stri_enc_set.md)
+Other stringi_general_topics: [`about_arguments`](about_arguments.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
-Other encoding\_detection: [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_enc_detect()`](stri_enc_detect.md), [`stri_enc_isascii()`](stri_enc_isascii.md), [`stri_enc_isutf16be()`](stri_enc_isutf16.md), [`stri_enc_isutf8()`](stri_enc_isutf8.md)
+Other encoding_management: [`stri_enc_info()`](stri_enc_info.md), [`stri_enc_list()`](stri_enc_list.md), [`stri_enc_mark()`](stri_enc_mark.md), [`stri_enc_set()`](stri_enc_set.md)
-Other encoding\_conversion: [`stri_enc_fromutf32()`](stri_enc_fromutf32.md), [`stri_enc_toascii()`](stri_enc_toascii.md), [`stri_enc_tonative()`](stri_enc_tonative.md), [`stri_enc_toutf32()`](stri_enc_toutf32.md), [`stri_enc_toutf8()`](stri_enc_toutf8.md), [`stri_encode()`](stri_encode.md)
+Other encoding_detection: [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_enc_detect()`](stri_enc_detect.md), [`stri_enc_isascii()`](stri_enc_isascii.md), [`stri_enc_isutf16be()`](stri_enc_isutf16.md), [`stri_enc_isutf8()`](stri_enc_isutf8.md)
+
+Other encoding_conversion: [`stri_enc_fromutf32()`](stri_enc_fromutf32.md), [`stri_enc_toascii()`](stri_enc_toascii.md), [`stri_enc_tonative()`](stri_enc_tonative.md), [`stri_enc_toutf32()`](stri_enc_toutf32.md), [`stri_enc_toutf8()`](stri_enc_toutf8.md), [`stri_encode()`](stri_encode.md)
diff --git a/docs/_sources/rapi/about_locale.md.txt b/docs/_sources/rapi/about_locale.md.txt
index 866f9baea..051b1e85e 100644
--- a/docs/_sources/rapi/about_locale.md.txt
+++ b/docs/_sources/rapi/about_locale.md.txt
@@ -1,4 +1,4 @@
-# about\_locale: Locales and stringi
+# about_locale: Locales and stringi
## Description
@@ -10,11 +10,11 @@ Because a locale is just an identifier of a region, no validity check is perform
## Locale Identifiers
-ICU services are parametrized by locale, to deliver culturally correct results. Locales are identified by character strings of the form `Language` code, `Language_Country` code, or `Language_Country_Variant` code, e.g., \'en\_US\'.
+ICU services are parametrized by locale, to deliver culturally correct results. Locales are identified by character strings of the form `Language` code, `Language_Country` code, or `Language_Country_Variant` code, e.g., \'en_US\'.
The two-letter `Language` code uses the ISO-639-1 standard, e.g., \'en\' stands for English, \'pl\' -- Polish, \'fr\' -- French, and \'de\' for German.
-`Country` is a two-letter code following the ISO-3166 standard. This is to reflect different language conventions within the same language, for example in US-English (\'en\_US\') and Australian-English (\'en\_AU\').
+`Country` is a two-letter code following the ISO-3166 standard. This is to reflect different language conventions within the same language, for example in US-English (\'en_US\') and Australian-English (\'en_AU\').
Differences may also appear in language conventions used within the same country. For example, the Euro currency may be used in several European countries while the individual country\'s currency is still in circulation. In such a case, ICU `Variant` \'\_EURO\' could be used for selecting locales that support the Euro currency.
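
To see why the locale identifier matters, consider the following sketch, modelled on the well-known Slovak "ch" digraph case. It assumes that the ICU build in use ships the Polish and Slovak collation data:

```r
library(stringi)

words <- c("hladny", "chladny")

# Polish collation: plain alphabetical order, "c" precedes "h"
sorted_pl <- stri_sort(words, locale="pl_PL")

# Slovak collation: "ch" is a separate letter that sorts after "h"
sorted_sk <- stri_sort(words, locale="sk_SK")

print(sorted_pl)  # "chladny" "hladny"
print(sorted_sk)  # "hladny" "chladny"
```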
@@ -50,8 +50,10 @@ Other locale-sensitive functions include, e.g., [`stri_trans_tolower`](stri_tran
The official online manual of stringi at
-Other locale\_management: [`stri_locale_info()`](stri_locale_info.md), [`stri_locale_list()`](stri_locale_list.md), [`stri_locale_set()`](stri_locale_set.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
-Other locale\_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+Other locale_management: [`stri_locale_info()`](stri_locale_info.md), [`stri_locale_list()`](stri_locale_list.md), [`stri_locale_set()`](stri_locale_set.md)
-Other stringi\_general\_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
+Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+
+Other stringi_general_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
diff --git a/docs/_sources/rapi/about_search.md.txt b/docs/_sources/rapi/about_search.md.txt
index 70877f0cc..816032a02 100644
--- a/docs/_sources/rapi/about_search.md.txt
+++ b/docs/_sources/rapi/about_search.md.txt
@@ -1,4 +1,4 @@
-# about\_search: String Searching
+# about_search: String Searching
## Description
@@ -8,15 +8,15 @@ This man page explains how to perform string search-based operations in stringi.
-- `stri_*_regex` -- ICU\'s regular expressions (regexes), see [about\_search\_regex](about_search_regex.md),
+- `stri_*_regex` -- ICU\'s regular expressions (regexes), see [about_search_regex](about_search_regex.md),
-- `stri_*_fixed` -- locale-independent byte-wise pattern matching, see [about\_search\_fixed](about_search_fixed.md),
+- `stri_*_fixed` -- locale-independent byte-wise pattern matching, see [about_search_fixed](about_search_fixed.md),
-- `stri_*_coll` -- ICU\'s `StringSearch`, locale-sensitive, Collator-based pattern search, useful for natural language processing tasks, see [about\_search\_coll](about_search_coll.md),
+- `stri_*_coll` -- ICU\'s `StringSearch`, locale-sensitive, Collator-based pattern search, useful for natural language processing tasks, see [about_search_coll](about_search_coll.md),
-- `stri_*_charclass` -- character classes search, e.g., Unicode General Categories or Binary Properties, see [about\_search\_charclass](about_search_charclass.md),
+- `stri_*_charclass` -- character classes search, e.g., Unicode General Categories or Binary Properties, see [about_search_charclass](about_search_charclass.md),
-- `stri_*_boundaries` -- text boundary analysis, see [about\_search\_boundaries](about_search_boundaries.md)
+- `stri_*_boundaries` -- text boundary analysis, see [about_search_boundaries](about_search_boundaries.md)
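
The differences between the engines can be hinted at with a small sketch, in which the same input is queried through three of them (the sample string is an assumption):

```r
library(stringi)

x <- "a.b.c"  # hypothetical sample string

# Fixed engine: "." is an ordinary character
n_fixed <- stri_count_fixed(x, ".")

# Regex engine: "." is a metacharacter matching any character
n_regex <- stri_count_regex(x, ".")

# Character class engine: count code points in the Unicode category L (letters)
n_letters <- stri_count_charclass(x, "\\p{L}")

print(c(fixed = n_fixed, regex = n_regex, charclass = n_letters))  # 2 5 3
```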
Each search engine is able to perform many search-based operations. These may include:
@@ -44,28 +44,30 @@ Each search engine is able to perform many search-based operations. These may in
The official online manual of stringi at
-Other text\_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
-Other search\_regex: [`about_search_regex`](about_search_regex.md), [`stri_opts_regex()`](stri_opts_regex.md)
+Other text_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
-Other search\_fixed: [`about_search_fixed`](about_search_fixed.md), [`stri_opts_fixed()`](stri_opts_fixed.md)
+Other search_regex: [`about_search_regex`](about_search_regex.md), [`stri_opts_regex()`](stri_opts_regex.md)
-Other search\_coll: [`about_search_coll`](about_search_coll.md), [`stri_opts_collator()`](stri_opts_collator.md)
+Other search_fixed: [`about_search_fixed`](about_search_fixed.md), [`stri_opts_fixed()`](stri_opts_fixed.md)
-Other search\_charclass: [`about_search_charclass`](about_search_charclass.md), [`stri_trim_both()`](stri_trim.md)
+Other search_coll: [`about_search_coll`](about_search_coll.md), [`stri_opts_collator()`](stri_opts_collator.md)
-Other search\_detect: [`stri_detect()`](stri_detect.md), [`stri_startswith()`](stri_startsendswith.md)
+Other search_charclass: [`about_search_charclass`](about_search_charclass.md), [`stri_trim_both()`](stri_trim.md)
-Other search\_count: [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_count()`](stri_count.md)
+Other search_detect: [`stri_detect()`](stri_detect.md), [`stri_startswith()`](stri_startsendswith.md)
-Other search\_locate: [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_locate_all()`](stri_locate.md)
+Other search_count: [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_count()`](stri_count.md)
-Other search\_replace: [`stri_replace_all()`](stri_replace.md), [`stri_replace_rstr()`](stri_replace_rstr.md), [`stri_trim_both()`](stri_trim.md)
+Other search_locate: [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_locate_all()`](stri_locate.md)
-Other search\_split: [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_split()`](stri_split.md)
+Other search_replace: [`stri_replace_all()`](stri_replace.md), [`stri_replace_rstr()`](stri_replace_rstr.md), [`stri_trim_both()`](stri_trim.md)
-Other search\_subset: [`stri_subset()`](stri_subset.md)
+Other search_split: [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_split()`](stri_split.md)
-Other search\_extract: [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_extract_all()`](stri_extract.md), [`stri_match_all()`](stri_match.md)
+Other search_subset: [`stri_subset()`](stri_subset.md)
-Other stringi\_general\_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_stringi`](about_stringi.md)
+Other search_extract: [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_extract_all()`](stri_extract.md), [`stri_match_all()`](stri_match.md)
+
+Other stringi_general_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_stringi`](about_stringi.md)
diff --git a/docs/_sources/rapi/about_search_boundaries.md.txt b/docs/_sources/rapi/about_search_boundaries.md.txt
index 0d24b9850..aa74cc838 100644
--- a/docs/_sources/rapi/about_search_boundaries.md.txt
+++ b/docs/_sources/rapi/about_search_boundaries.md.txt
@@ -1,4 +1,4 @@
-# about\_search\_boundaries: Text Boundary Analysis in stringi
+# about_search_boundaries: Text Boundary Analysis in stringi
## Description
@@ -44,8 +44,10 @@ For technical details on different classes of text boundaries refer to the stringi at
-Other locale\_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
-Other text\_boundaries: [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
+Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
-Other stringi\_general\_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
+Other text_boundaries: [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
+
+Other stringi_general_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
diff --git a/docs/_sources/rapi/about_search_charclass.md.txt b/docs/_sources/rapi/about_search_charclass.md.txt
index 3c7b1fcc4..bb98e5ca3 100644
--- a/docs/_sources/rapi/about_search_charclass.md.txt
+++ b/docs/_sources/rapi/about_search_charclass.md.txt
@@ -1,4 +1,4 @@
-# about\_search\_charclass: Character Classes in stringi
+# about_search_charclass: Character Classes in stringi
## Description
@@ -6,7 +6,7 @@ Here we describe how character classes (sets) can be specified in the stringi perform a single character (i.e., Unicode code point) search-based operations. You may obtain the same results using [about\_search\_regex](about_search_regex.md). However, these very functions aim to be faster.
+All `stri_*_charclass` functions in stringi perform a single character (i.e., Unicode code point) search-based operations. You may obtain the same results using [about_search_regex](about_search_regex.md). However, these very functions aim to be faster.
Character classes are defined using ICU\'s `UnicodeSet` patterns. Below we briefly summarize their syntax. For more details refer to the bibliographic References below.
@@ -430,7 +430,7 @@ Therefore, a POSIX flavor of `[:punct:]` is more like `[\p{P}\p{S}]` in
+*The Unicode Character Database* -- Unicode Standard Annex #44,
*UnicodeSet* -- ICU User Guide,
@@ -446,6 +446,8 @@ Therefore, a POSIX flavor of `[:punct:]` is more like `[\p{P}\p{S}]` in stringi at
-Other search\_charclass: [`about_search`](about_search.md), [`stri_trim_both()`](stri_trim.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
-Other stringi\_general\_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
+Other search_charclass: [`about_search`](about_search.md), [`stri_trim_both()`](stri_trim.md)
+
+Other stringi_general_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
diff --git a/docs/_sources/rapi/about_search_coll.md.txt b/docs/_sources/rapi/about_search_coll.md.txt
index 27046ef66..7ff318468 100644
--- a/docs/_sources/rapi/about_search_coll.md.txt
+++ b/docs/_sources/rapi/about_search_coll.md.txt
@@ -1,4 +1,4 @@
-# about\_search\_coll: Locale-Sensitive Text Searching in stringi
+# about_search_coll: Locale-Sensitive Text Searching in stringi
## Description
@@ -28,8 +28,10 @@ L. Werner, *Efficient Text Searching in Java*, 1999, stringi at
-Other search\_coll: [`about_search`](about_search.md), [`stri_opts_collator()`](stri_opts_collator.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
-Other locale\_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+Other search_coll: [`about_search`](about_search.md), [`stri_opts_collator()`](stri_opts_collator.md)
-Other stringi\_general\_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
+Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+
+Other stringi_general_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
diff --git a/docs/_sources/rapi/about_search_fixed.md.txt b/docs/_sources/rapi/about_search_fixed.md.txt
index 5984c3e33..9cf8c75ff 100644
--- a/docs/_sources/rapi/about_search_fixed.md.txt
+++ b/docs/_sources/rapi/about_search_fixed.md.txt
@@ -1,4 +1,4 @@
-# about\_search\_fixed: Locale-Insensitive Fixed Pattern Matching in stringi
+# about_search_fixed: Locale-Insensitive Fixed Pattern Matching in stringi
## Description
@@ -18,7 +18,7 @@ Be aware that, for natural language processing, fixed pattern searching might no
4. ignorable case,
-see also [about\_search\_coll](about_search_coll.md).
+see also [about_search_coll](about_search_coll.md).
Note that the conversion of input data to Unicode is done as usual.
@@ -30,6 +30,8 @@ Note that the conversion of input data to Unicode is done as usual.
The official online manual of stringi at
-Other search\_fixed: [`about_search`](about_search.md), [`stri_opts_fixed()`](stri_opts_fixed.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
-Other stringi\_general\_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
+Other search_fixed: [`about_search`](about_search.md), [`stri_opts_fixed()`](stri_opts_fixed.md)
+
+Other stringi_general_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
diff --git a/docs/_sources/rapi/about_search_regex.md.txt b/docs/_sources/rapi/about_search_regex.md.txt
index d2ffff73b..68e6fc2e2 100644
--- a/docs/_sources/rapi/about_search_regex.md.txt
+++ b/docs/_sources/rapi/about_search_regex.md.txt
@@ -1,4 +1,4 @@
-# about\_search\_regex: Regular Expressions in stringi
+# about_search_regex: Regular Expressions in stringi
## Description
@@ -8,7 +8,7 @@ A regular expression is a pattern describing, possibly in a very abstract way, a
All `stri_*_regex` functions in stringi use the ICU regex engine. Its settings may be tuned up (for example to perform case-insensitive search) via the [`stri_opts_regex`](stri_opts_regex.md) function.
-Regular expression patterns in ICU are quite similar in form and behavior to Perl\'s regexes. Their implementation is loosely inspired by JDK 1.4 `java.util.regex`. ICU Regular Expressions conform to the Unicode Technical Standard \#18 (see References section) and its features are summarized in the ICU User Guide (see below). A good general introduction to regexes is (Friedl, 2002). Some general topics are also covered in the **R** manual, see [regex](https://stat.ethz.ch/R-manual/R-devel/library/base/html/regex.html).
+Regular expression patterns in ICU are quite similar in form and behavior to Perl\'s regexes. Their implementation is loosely inspired by JDK 1.4 `java.util.regex`. ICU Regular Expressions conform to the Unicode Technical Standard #18 (see References section) and its features are summarized in the ICU User Guide (see below). A good general introduction to regexes is (Friedl, 2002). Some general topics are also covered in the **R** manual, see [regex](https://stat.ethz.ch/R-manual/R-devel/library/base/html/regex.html).
## ICU Regex Operators at a Glance
@@ -184,7 +184,7 @@ Here is a list of meta-characters provided by the ICU User Guide on regexes.
`\h`
-: Match a Horizontal White Space character. They are characters with Unicode General Category of Space\_Separator plus the ASCII tab, `\u0009`. \[Since ICU 55\]
+: Match a Horizontal White Space character. They are characters with Unicode General Category of Space_Separator plus the ASCII tab, `\u0009`. \[Since ICU 55\]
`\H`
@@ -300,7 +300,7 @@ Here is a list of meta-characters provided by the ICU User Guide on regexes.
## Character Classes
-The syntax is similar, but not 100% compatible with the one described in [about\_search\_charclass](about_search_charclass.md). In particular, whitespaces are not ignored and set-theoretic operations are denoted slightly differently. However, other than this [about\_search\_charclass](about_search_charclass.md) is a good reference on the capabilities offered.
+The syntax is similar, but not 100% compatible with the one described in [about_search_charclass](about_search_charclass.md). In particular, whitespaces are not ignored and set-theoretic operations are denoted slightly differently. However, other than this [about_search_charclass](about_search_charclass.md) is a good reference on the capabilities offered.
The ICU User Guide on regexes lists what follows.
@@ -344,7 +344,7 @@ The ICU User Guide on regexes lists what follows.
Note that if a given regex `pattern` is empty, then all the functions in stringi give `NA` in result and generate a warning. On a syntax error, a quite informative failure message is shown.
-If you wish to search for a fixed pattern, refer to [about\_search\_coll](about_search_coll.md) or [about\_search\_fixed](about_search_fixed.md). They allow to perform a locale-aware text lookup, or a very fast exact-byte search, respectively.
+If you wish to search for a fixed pattern, refer to [about_search_coll](about_search_coll.md) or [about_search_fixed](about_search_fixed.md). They allow to perform a locale-aware text lookup, or a very fast exact-byte search, respectively.
## Author(s)
@@ -356,7 +356,7 @@ If you wish to search for a fixed pattern, refer to [about\_search\_coll](about_
J.E.F. Friedl, *Mastering Regular Expressions*, O\'Reilly, 2002
-*Unicode Regular Expressions* -- Unicode Technical Standard \#18,
+*Unicode Regular Expressions* -- Unicode Technical Standard #18,
*Unicode Regular Expressions* -- Regex tutorial,
@@ -364,6 +364,8 @@ J.E.F. Friedl, *Mastering Regular Expressions*, O\'Reilly, 2002
The official online manual of stringi at
-Other search\_regex: [`about_search`](about_search.md), [`stri_opts_regex()`](stri_opts_regex.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
-Other stringi\_general\_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
+Other search_regex: [`about_search`](about_search.md), [`stri_opts_regex()`](stri_opts_regex.md)
+
+Other stringi_general_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search`](about_search.md), [`about_stringi`](about_stringi.md)
diff --git a/docs/_sources/rapi/about_stringi.md.txt b/docs/_sources/rapi/about_stringi.md.txt
index 4a7f8d50a..9d3099f97 100644
--- a/docs/_sources/rapi/about_stringi.md.txt
+++ b/docs/_sources/rapi/about_stringi.md.txt
@@ -1,4 +1,4 @@
-# about\_stringi: THE String Processing Package
+# about_stringi: Fast and Portable Character String Processing in R
## Description
@@ -14,27 +14,27 @@
Manual pages on general topics:
-- [about\_encoding](about_encoding.md) -- character encoding issues, including information on encoding management in stringi, as well as on encoding detection and conversion.
+- [about_encoding](about_encoding.md) -- character encoding issues, including information on encoding management in stringi, as well as on encoding detection and conversion.
-- [about\_locale](about_locale.md) -- locale issues, including locale management and specification in stringi, and the list of locale-sensitive operations. In particular, see [`stri_opts_collator`](stri_opts_collator.md) for a description of the string collation algorithm, which is used for string comparing, ordering, ranking, sorting, case-folding, and searching.
+- [about_locale](about_locale.md) -- locale issues, including locale management and specification in stringi, and the list of locale-sensitive operations. In particular, see [`stri_opts_collator`](stri_opts_collator.md) for a description of the string collation algorithm, which is used for string comparing, ordering, ranking, sorting, case-folding, and searching.
-- [about\_arguments](about_arguments.md) -- information on how stringi handles the arguments passed to its function.
+- [about_arguments](about_arguments.md) -- information on how stringi handles the arguments passed to its function.
## Facilities available
Refer to the following:
-- [about\_search](about_search.md) for string searching facilities; these include pattern searching, matching, string splitting, and so on. The following independent search engines are provided:
+- [about_search](about_search.md) for string searching facilities; these include pattern searching, matching, string splitting, and so on. The following independent search engines are provided:
- - [about\_search\_regex](about_search_regex.md) -- with ICU (Java-like) regular expressions,
+ - [about_search_regex](about_search_regex.md) -- with ICU (Java-like) regular expressions,
- - [about\_search\_fixed](about_search_fixed.md) -- fast, locale-independent, byte-wise pattern matching,
+ - [about_search_fixed](about_search_fixed.md) -- fast, locale-independent, byte-wise pattern matching,
- - [about\_search\_coll](about_search_coll.md) -- locale-aware pattern matching for natural language processing tasks,
+ - [about_search_coll](about_search_coll.md) -- locale-aware pattern matching for natural language processing tasks,
- - [about\_search\_charclass](about_search_charclass.md) -- seeking elements of particular character classes, like "all whites-paces" or "all digits",
+ - [about_search_charclass](about_search_charclass.md) -- seeking elements of particular character classes, like "all whites-paces" or "all digits",
- - [about\_search\_boundaries](about_search_boundaries.md) -- text boundary analysis.
+ - [about_search_boundaries](about_search_boundaries.md) -- text boundary analysis.
- [`stri_datetime_format`](stri_datetime_format.md) for date/time formatting and parsing. Also refer to the links therein for other date/time/time zone- related operations.
@@ -46,11 +46,11 @@ Refer to the following:
- [`stri_length`](stri_length.md) (among others) for determining the number of code points in a string. See also [`stri_count_boundaries`](stri_count_boundaries.md) for counting the number of Unicode characters and [`stri_width`](stri_width.md) for approximating the width of a string.
-- [`stri_trim`](stri_trim.md) (among others) for trimming characters from the beginning or/and end of a string, see also [about\_search\_charclass](about_search_charclass.md), and [`stri_pad`](stri_pad.md) for padding strings so that they are of the same width. Additionally, [`stri_wrap`](stri_wrap.md) wraps text into lines.
+- [`stri_trim`](stri_trim.md) (among others) for trimming characters from the beginning or/and end of a string, see also [about_search_charclass](about_search_charclass.md), and [`stri_pad`](stri_pad.md) for padding strings so that they are of the same width. Additionally, [`stri_wrap`](stri_wrap.md) wraps text into lines.
-- [`stri_trans_tolower`](stri_trans_casemap.md) (among others) for case mapping, i.e., conversion to lower, UPPER, or Title Case, [`stri_trans_nfc`](stri_trans_nf.md) (among others) for Unicode normalization, [`stri_trans_char`](stri_trans_char.md) for translating individual code points, and [`stri_trans_general`](stri_trans_general.md) for other universal yet powerful text transforms, including transliteration.
+- [`stri_trans_tolower`](stri_trans_casemap.md) (among others) for case mapping, i.e., conversion to lower, UPPER, or Title Case, [`stri_trans_nfc`](stri_trans_nf.md) (among others) for Unicode normalization, [`stri_trans_char`](stri_trans_char.md) for translating individual code points, and [`stri_trans_general`](stri_trans_general.md) for other universal text transforms, including transliteration.
-- [`stri_cmp`](stri_compare.md), [`%s<%`](+25s+3C+25.md), [`stri_order`](stri_order.md), [`stri_sort`](stri_sort.md), [`stri_rank`](stri_rank.md), [`stri_unique`](stri_unique.md), and [`stri_duplicated`](stri_duplicated.md) for collation-based, locale-aware operations, see also [about\_locale](about_locale.md).
+- [`stri_cmp`](stri_compare.md), [`%s<%`](+25s+3C+25.md), [`stri_order`](stri_order.md), [`stri_sort`](stri_sort.md), [`stri_rank`](stri_rank.md), [`stri_unique`](stri_unique.md), and [`stri_duplicated`](stri_duplicated.md) for collation-based, locale-aware operations, see also [about_locale](about_locale.md).
- [`stri_split_lines`](stri_split_lines.md) (among others) to split a string into text lines.
@@ -68,7 +68,9 @@ Marek Gagolewski, with contributions from Bartek Tartanus and many others. ICU4C
## References
-*stringi Package homepage*,
+*stringi Package Homepage*,
+
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
*ICU -- International Components for Unicode*,
@@ -76,10 +78,12 @@ Marek Gagolewski, with contributions from Bartek Tartanus and many others. ICU4C
*The Unicode Consortium*,
-*UTF-8, a transformation format of ISO 10646* -- RFC 3629,
+*UTF-8, A Transformation Format of ISO 10646* -- RFC 3629,
## See Also
The official online manual of stringi at
-Other stringi\_general\_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other stringi_general_topics: [`about_arguments`](about_arguments.md), [`about_encoding`](about_encoding.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_charclass`](about_search_charclass.md), [`about_search_coll`](about_search_coll.md), [`about_search_fixed`](about_search_fixed.md), [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md)
diff --git a/docs/_sources/rapi/operator_add.md.txt b/docs/_sources/rapi/operator_add.md.txt
index fa3483456..8b4415df9 100644
--- a/docs/_sources/rapi/operator_add.md.txt
+++ b/docs/_sources/rapi/operator_add.md.txt
@@ -1,4 +1,4 @@
-# operator\_add: Concatenate Two Character Vectors
+# operator_add: Concatenate Two Character Vectors
## Description
@@ -6,7 +6,7 @@ Binary operators for joining (concatenating) two character vectors, with a typic
## Usage
-```r
+``` r
e1 %s+% e2
e1 %stri+% e2
@@ -37,6 +37,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other join: [`stri_dup()`](stri_dup.md), [`stri_flatten()`](stri_flatten.md), [`stri_join_list()`](stri_join_list.md), [`stri_join()`](stri_join.md)
## Examples
diff --git a/docs/_sources/rapi/operator_compare.md.txt b/docs/_sources/rapi/operator_compare.md.txt
index d7ca8d030..cf1db6192 100644
--- a/docs/_sources/rapi/operator_compare.md.txt
+++ b/docs/_sources/rapi/operator_compare.md.txt
@@ -1,4 +1,4 @@
-# operator\_compare: Compare Strings with or without Collation
+# operator_compare: Compare Strings with or without Collation
## Description
@@ -6,7 +6,7 @@ Relational operators for comparing corresponding strings in two character vector
## Usage
-```r
+``` r
e1 %s<% e2
e1 %s<=% e2
@@ -66,7 +66,9 @@ All the functions return a logical vector indicating the result of a pairwise co
The official online manual of stringi at
-Other locale\_sensitive: [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other locale_sensitive: [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/docs/_sources/rapi/operator_dollar.md.txt b/docs/_sources/rapi/operator_dollar.md.txt
index a303e7ea9..5b00409d3 100644
--- a/docs/_sources/rapi/operator_dollar.md.txt
+++ b/docs/_sources/rapi/operator_dollar.md.txt
@@ -1,4 +1,4 @@
-# operator\_dollar: C-Style Formatting with [`stri_sprintf`](stri_sprintf.md) as a Binary Operator
+# operator_dollar: C-Style Formatting with [`stri_sprintf`](stri_sprintf.md) as a Binary Operator
## Description
@@ -8,7 +8,7 @@ Missing values and empty vectors are propagated as usual.
## Usage
-```r
+``` r
e1 %s$% e2
e1 %stri$% e2
@@ -39,6 +39,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other length: [`stri_isempty()`](stri_isempty.md), [`stri_length()`](stri_length.md), [`stri_numbytes()`](stri_numbytes.md), [`stri_pad_both()`](stri_pad.md), [`stri_sprintf()`](stri_sprintf.md), [`stri_width()`](stri_width.md)
## Examples
diff --git a/docs/_sources/rapi/stri_compare.md.txt b/docs/_sources/rapi/stri_compare.md.txt
index c82eef406..c67d882bb 100644
--- a/docs/_sources/rapi/stri_compare.md.txt
+++ b/docs/_sources/rapi/stri_compare.md.txt
@@ -1,4 +1,4 @@
-# stri\_compare: Compare Strings with or without Collation
+# stri_compare: Compare Strings with or without Collation
## Description
@@ -6,7 +6,7 @@ These functions may be used to determine if two strings are equal, canonically e
## Usage
-```r
+``` r
stri_compare(e1, e2, ..., opts_collator = NULL)
stri_cmp(e1, e2, ..., opts_collator = NULL)
@@ -68,7 +68,9 @@ All the other functions return a logical vector that indicates whether a given r
The official online manual of stringi at
-Other locale\_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/docs/_sources/rapi/stri_count.md.txt b/docs/_sources/rapi/stri_count.md.txt
index 00cd59b79..176b1d8c6 100644
--- a/docs/_sources/rapi/stri_count.md.txt
+++ b/docs/_sources/rapi/stri_count.md.txt
@@ -1,4 +1,4 @@
-# stri\_count: Count the Number of Pattern Occurrences
+# stri_count: Count the Number of Pattern Occurrences
## Description
@@ -6,7 +6,7 @@ These functions count the number of occurrences of a pattern in a string.
## Usage
-```r
+``` r
stri_count(str, ..., regex, fixed, coll, charclass)
stri_count_charclass(str, pattern)
@@ -47,7 +47,9 @@ All the functions return an integer vector.
The official online manual of stringi at
-Other search\_count: [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other search_count: [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md)
## Examples
diff --git a/docs/_sources/rapi/stri_count_boundaries.md.txt b/docs/_sources/rapi/stri_count_boundaries.md.txt
index 3fdd06146..9605ce658 100644
--- a/docs/_sources/rapi/stri_count_boundaries.md.txt
+++ b/docs/_sources/rapi/stri_count_boundaries.md.txt
@@ -1,4 +1,4 @@
-# stri\_count\_boundaries: Count the Number of Text Boundaries
+# stri_count_boundaries: Count the Number of Text Boundaries
## Description
@@ -6,7 +6,7 @@ These functions determine the number of text boundaries (like character, word, l
## Usage
-```r
+``` r
stri_count_boundaries(str, ..., opts_brkiter = NULL)
stri_count_words(str, locale = NULL)
@@ -45,11 +45,13 @@ Both functions return an integer vector.
The official online manual of stringi at
-Other search\_count: [`about_search`](about_search.md), [`stri_count()`](stri_count.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other search_count: [`about_search`](about_search.md), [`stri_count()`](stri_count.md)
-Other locale\_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
-Other text\_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`about_search`](about_search.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
+Other text_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`about_search`](about_search.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/docs/_sources/rapi/stri_datetime_add.md.txt b/docs/_sources/rapi/stri_datetime_add.md.txt
index 7dba5620e..10619dbab 100644
--- a/docs/_sources/rapi/stri_datetime_add.md.txt
+++ b/docs/_sources/rapi/stri_datetime_add.md.txt
@@ -1,4 +1,4 @@
-# stri\_datetime\_add: Date and Time Arithmetic
+# stri_datetime_add: Date and Time Arithmetic
## Description
@@ -6,7 +6,7 @@ Modifies a date-time object by adding a specific amount of time units.
## Usage
-```r
+``` r
stri_datetime_add(
time,
value = 1L,
@@ -52,6 +52,8 @@ The replacement version of `stri_datetime_add` modifies the state of the `time`
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
## Examples
@@ -63,9 +65,9 @@ Other datetime: [`stri_datetime_create()`](stri_datetime_create.md), [`stri_date
x <- stri_datetime_now()
stri_datetime_add(x, units='months') <- 2
print(x)
-## [1] "2022-01-29 10:28:47 AEDT"
+## [1] "2022-09-02 17:40:19 AEST"
stri_datetime_add(x, -2, units='months')
-## [1] "2021-11-29 10:28:47 AEDT"
+## [1] "2022-07-02 17:40:19 AEST"
stri_datetime_add(stri_datetime_create(2014, 4, 20), 1, units='years')
## [1] "2015-04-20 12:00:00 AEST"
stri_datetime_add(stri_datetime_create(2014, 4, 20), 1, units='years', locale='@calendar=hebrew')
diff --git a/docs/_sources/rapi/stri_datetime_create.md.txt b/docs/_sources/rapi/stri_datetime_create.md.txt
index 834edc064..b46c677ab 100644
--- a/docs/_sources/rapi/stri_datetime_create.md.txt
+++ b/docs/_sources/rapi/stri_datetime_create.md.txt
@@ -1,4 +1,4 @@
-# stri\_datetime\_create: Create a Date-Time Object
+# stri_datetime_create: Create a Date-Time Object
## Description
@@ -6,7 +6,7 @@ Constructs date-time objects from numeric representations.
## Usage
-```r
+``` r
stri_datetime_create(
year,
month,
@@ -50,6 +50,8 @@ Returns an object of class [`POSIXct`](https://stat.ethz.ch/R-manual/R-devel/lib
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
## Examples
diff --git a/docs/_sources/rapi/stri_datetime_fields.md.txt b/docs/_sources/rapi/stri_datetime_fields.md.txt
index 2a9c93e20..6352db775 100644
--- a/docs/_sources/rapi/stri_datetime_fields.md.txt
+++ b/docs/_sources/rapi/stri_datetime_fields.md.txt
@@ -1,4 +1,4 @@
-# stri\_datetime\_fields: Get Values for Date and Time Fields
+# stri_datetime_fields: Get Values for Date and Time Fields
## Description
@@ -6,7 +6,7 @@ Computes and returns values for all date and time fields.
## Usage
-```r
+``` r
stri_datetime_fields(time, tz = attr(time, "tzone"), locale = NULL)
```
@@ -62,6 +62,8 @@ Returns a data frame with the following columns:
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
## Examples
@@ -72,16 +74,16 @@ Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_c
```r
stri_datetime_fields(stri_datetime_now())
## Year Month Day Hour Minute Second Millisecond WeekOfYear WeekOfMonth
-## 1 2021 11 29 10 28 48 53 49 5
+## 1 2022 7 2 17 40 19 644 27 1
## DayOfYear DayOfWeek Hour12 AmPm Era
-## 1 333 2 10 1 2
+## 1 183 7 5 2 2
stri_datetime_fields(stri_datetime_now(), locale='@calendar=hebrew')
## Year Month Day Hour Minute Second Millisecond WeekOfYear WeekOfMonth
-## 1 5782 3 25 10 28 48 56 13 5
+## 1 5782 11 3 17 40 19 647 43 1
## DayOfYear DayOfWeek Hour12 AmPm Era
-## 1 84 2 10 1 1
+## 1 299 7 5 2 1
stri_datetime_symbols(locale='@calendar=hebrew')$Month[
stri_datetime_fields(stri_datetime_now(), locale='@calendar=hebrew')$Month
]
-## [1] "Kislev"
+## [1] "Tamuz"
```
diff --git a/docs/_sources/rapi/stri_datetime_format.md.txt b/docs/_sources/rapi/stri_datetime_format.md.txt
index f084461cf..2f1d23140 100644
--- a/docs/_sources/rapi/stri_datetime_format.md.txt
+++ b/docs/_sources/rapi/stri_datetime_format.md.txt
@@ -1,4 +1,4 @@
-# stri\_datetime\_format: Date and Time Formatting and Parsing
+# stri_datetime_format: Date and Time Formatting and Parsing
## Description
@@ -6,7 +6,7 @@ These functions convert a given date/time object to a character vector, or vice
## Usage
-```r
+``` r
stri_datetime_format(
time,
format = "uuuu-MM-dd HH:mm:ss",
@@ -40,6 +40,8 @@ Vectorized over `format` and `time` or `str`.
By default, `stri_datetime_format` (for the sake of compatibility with the [`strftime`](https://stat.ethz.ch/R-manual/R-devel/library/base/help/strftime.html) function) formats a date/time object using the current default time zone.
+Unspecified fields (e.g., seconds where only hours and minutes are given) are filled in from the current date and time.
+
`format` may be one of `DT_STYLE` or `DT_relative_STYLE`, where `DT` is equal to `date`, `time`, or `datetime`, and `STYLE` is equal to `full`, `long`, `medium`, or `short`. This gives a locale-dependent date and/or time format. Note that currently ICU does not support `relative` `time` formats, thus this flag is currently ignored in such a context.
Otherwise, `format` is a pattern: a string where specific sequences of characters are replaced with date/time data from a calendar when formatting or used to generate data for a calendar when parsing. For example, `y` stands for \'year\'. Characters may be used multiple times: `yy` might produce `99`, whereas `yyyy` yields `1999`. For most numerical fields, the number of characters specifies the field width. For example, if `h` is the hour, `h` might produce `5`, but `hh` yields `05`. For some characters, the count specifies whether an abbreviated or full form should be used.
@@ -122,7 +124,7 @@ Two single quotes represent a literal single quote, either inside or outside sin
| v | Time Zone: generic non-location | v | PT |
| | (falls back first to VVVV) | vvvv | Pacific Time or Los Angeles Time |
| V | Time Zone: short time zone ID | V | uslax |
-| | Time Zone: long time zone ID | VV | America/Los\_Angeles |
+| | Time Zone: long time zone ID | VV | America/Los_Angeles |
| | Time Zone: time zone exemplar city | VVV | Los Angeles |
| | Time Zone: generic location (falls back to OOOO) | VVVV | Los Angeles Time |
| X | Time Zone: ISO8601 basic hm?, with Z for 0 | X | -08, +0530, Z |
@@ -172,6 +174,8 @@ A few examples:
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
## Examples
@@ -180,12 +184,15 @@ Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_c
```r
-stri_datetime_parse(c('2015-02-28', '2015-02-29'), 'yyyy-MM-dd')
-## [1] "2015-02-28 10:28:48 AEDT" NA
-stri_datetime_parse(c('2015-02-28', '2015-02-29'), 'yyyy-MM-dd', lenient=TRUE)
-## [1] "2015-02-28 10:28:48 AEDT" "2015-03-01 10:28:48 AEDT"
+x <- c('2015-02-28', '2015-02-29')
+stri_datetime_parse(x, 'yyyy-MM-dd')
+## [1] "2015-02-28 17:40:19 AEDT" NA
+stri_datetime_parse(x, 'yyyy-MM-dd', lenient=TRUE)
+## [1] "2015-02-28 17:40:19 AEDT" "2015-03-01 17:40:19 AEDT"
+stri_datetime_parse(x %s+% " 00:00:00", "yyyy-MM-dd HH:mm:ss")
+## [1] "2015-02-28 00:00:00 AEDT" NA
stri_datetime_parse('19 lipca 2015', 'date_long', locale='pl_PL')
-## [1] "2015-07-19 10:28:48 AEST"
+## [1] "2015-07-19 17:40:19 AEST"
stri_datetime_format(stri_datetime_now(), 'datetime_relative_medium')
-## [1] "today, 10:28:48 am"
+## [1] "today, 5:40:19 pm"
```
diff --git a/docs/_sources/rapi/stri_datetime_fstr.md.txt b/docs/_sources/rapi/stri_datetime_fstr.md.txt
index fac2802eb..458bbbfdc 100644
--- a/docs/_sources/rapi/stri_datetime_fstr.md.txt
+++ b/docs/_sources/rapi/stri_datetime_fstr.md.txt
@@ -1,4 +1,4 @@
-# stri\_datetime\_fstr: Convert `strptime`-Style Format Strings
+# stri_datetime_fstr: Convert `strptime`-Style Format Strings
## Description
@@ -6,7 +6,7 @@ This function converts [`strptime`](https://stat.ethz.ch/R-manual/R-devel/librar
## Usage
-```r
+``` r
stri_datetime_fstr(x, ignore_special = TRUE)
```
@@ -35,6 +35,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
## Examples
diff --git a/docs/_sources/rapi/stri_datetime_now.md.txt b/docs/_sources/rapi/stri_datetime_now.md.txt
index f14ad40f8..87ee87974 100644
--- a/docs/_sources/rapi/stri_datetime_now.md.txt
+++ b/docs/_sources/rapi/stri_datetime_now.md.txt
@@ -1,4 +1,4 @@
-# stri\_datetime\_now: Get Current Date and Time
+# stri_datetime_now: Get Current Date and Time
## Description
@@ -6,7 +6,7 @@ Returns the current date and time.
## Usage
-```r
+``` r
stri_datetime_now()
```
@@ -26,4 +26,6 @@ Returns an object of class [`POSIXct`](https://stat.ethz.ch/R-manual/R-devel/lib
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
diff --git a/docs/_sources/rapi/stri_datetime_symbols.md.txt b/docs/_sources/rapi/stri_datetime_symbols.md.txt
index 9e5dd360c..b4e56c778 100644
--- a/docs/_sources/rapi/stri_datetime_symbols.md.txt
+++ b/docs/_sources/rapi/stri_datetime_symbols.md.txt
@@ -1,4 +1,4 @@
-# stri\_datetime\_symbols: List Localizable Date-Time Formatting Data
+# stri_datetime_symbols: List Localizable Date-Time Formatting Data
## Description
@@ -6,7 +6,7 @@ Returns a list of all localizable date-time formatting data, including month and
## Usage
-```r
+``` r
stri_datetime_symbols(locale = NULL, context = "standalone", width = "wide")
```
@@ -52,6 +52,8 @@ Returns a list with the following named components:
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
## Examples
diff --git a/docs/_sources/rapi/stri_detect.md.txt b/docs/_sources/rapi/stri_detect.md.txt
index 03861f3c1..4a36dcde2 100644
--- a/docs/_sources/rapi/stri_detect.md.txt
+++ b/docs/_sources/rapi/stri_detect.md.txt
@@ -1,4 +1,4 @@
-# stri\_detect: Detect Pattern Occurrences
+# stri_detect: Detect Pattern Occurrences
## Description
@@ -6,7 +6,7 @@ These functions determine, for each string in `str`, if there is at least one ma
## Usage
-```r
+``` r
stri_detect(str, ..., regex, fixed, coll, charclass)
stri_detect_fixed(
@@ -74,7 +74,9 @@ Each function returns a logical vector.
The official online manual of stringi at
-Other search\_detect: [`about_search`](about_search.md), [`stri_startswith()`](stri_startsendswith.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other search_detect: [`about_search`](about_search.md), [`stri_startswith()`](stri_startsendswith.md)
## Examples
diff --git a/docs/_sources/rapi/stri_dup.md.txt b/docs/_sources/rapi/stri_dup.md.txt
index 9af726823..250ffb747 100644
--- a/docs/_sources/rapi/stri_dup.md.txt
+++ b/docs/_sources/rapi/stri_dup.md.txt
@@ -1,4 +1,4 @@
-# stri\_dup: Duplicate Strings
+# stri_dup: Duplicate Strings
## Description
@@ -6,7 +6,7 @@ Duplicates each `str`(`e1`) string `times`(`e2`) times and concatenates the resu
## Usage
-```r
+``` r
stri_dup(str, times)
e1 %s*% e2
@@ -39,6 +39,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other join: [`%s+%()`](+25s+2B+25.md), [`stri_flatten()`](stri_flatten.md), [`stri_join_list()`](stri_join_list.md), [`stri_join()`](stri_join.md)
## Examples
diff --git a/docs/_sources/rapi/stri_duplicated.md.txt b/docs/_sources/rapi/stri_duplicated.md.txt
index 38ff988b0..762c65594 100644
--- a/docs/_sources/rapi/stri_duplicated.md.txt
+++ b/docs/_sources/rapi/stri_duplicated.md.txt
@@ -1,4 +1,4 @@
-# stri\_duplicated: Determine Duplicated Elements
+# stri_duplicated: Determine Duplicated Elements
## Description
@@ -8,7 +8,7 @@
## Usage
-```r
+``` r
stri_duplicated(
str,
from_last = FALSE,
@@ -62,7 +62,9 @@ See also [`stri_unique`](stri_unique.md) for extracting unique elements.
The official online manual of stringi at
-Other locale\_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/docs/_sources/rapi/stri_enc_detect.md.txt b/docs/_sources/rapi/stri_enc_detect.md.txt
index 802481e16..f651c9f36 100644
--- a/docs/_sources/rapi/stri_enc_detect.md.txt
+++ b/docs/_sources/rapi/stri_enc_detect.md.txt
@@ -1,4 +1,4 @@
-# stri\_enc\_detect: Detect Character Set and Language
+# stri_enc_detect: Detect Character Set and Language
## Description
@@ -6,7 +6,7 @@ This function uses the ICU engine to determine the char
## Usage
-```r
+``` r
stri_enc_detect(str, filter_angle_brackets = FALSE)
```
@@ -33,40 +33,40 @@ This function should most often be used for byte-marked input strings, especiall
The following table shows all the encodings that can be detected:
-| | |
-|:-------------------|:--------------------------------------------------------------------------------|
-| **Character\_Set** | **Languages** |
-| UTF-8 | \-- |
-| UTF-16BE | \-- |
-| UTF-16LE | \-- |
-| UTF-32BE | \-- |
-| UTF-32LE | \-- |
-| Shift\_JIS | Japanese |
-| ISO-2022-JP | Japanese |
-| ISO-2022-CN | Simplified Chinese |
-| ISO-2022-KR | Korean |
-| GB18030 | Chinese |
-| Big5 | Traditional Chinese |
-| EUC-JP | Japanese |
-| EUC-KR | Korean |
-| ISO-8859-1 | Danish, Dutch, English, French, German, Italian, Norwegian, Portuguese, Swedish |
-| ISO-8859-2 | Czech, Hungarian, Polish, Romanian |
-| ISO-8859-5 | Russian |
-| ISO-8859-6 | Arabic |
-| ISO-8859-7 | Greek |
-| ISO-8859-8 | Hebrew |
-| ISO-8859-9 | Turkish |
-| windows-1250 | Czech, Hungarian, Polish, Romanian |
-| windows-1251 | Russian |
-| windows-1252 | Danish, Dutch, English, French, German, Italian, Norwegian, Portuguese, Swedish |
-| windows-1253 | Greek |
-| windows-1254 | Turkish |
-| windows-1255 | Hebrew |
-| windows-1256 | Arabic |
-| KOI8-R | Russian |
-| IBM420 | Arabic |
-| IBM424 | Hebrew |
-| | |
+| | |
+|:------------------|:--------------------------------------------------------------------------------|
+| **Character_Set** | **Languages** |
+| UTF-8 | \-- |
+| UTF-16BE | \-- |
+| UTF-16LE | \-- |
+| UTF-32BE | \-- |
+| UTF-32LE | \-- |
+| Shift_JIS | Japanese |
+| ISO-2022-JP | Japanese |
+| ISO-2022-CN | Simplified Chinese |
+| ISO-2022-KR | Korean |
+| GB18030 | Chinese |
+| Big5 | Traditional Chinese |
+| EUC-JP | Japanese |
+| EUC-KR | Korean |
+| ISO-8859-1 | Danish, Dutch, English, French, German, Italian, Norwegian, Portuguese, Swedish |
+| ISO-8859-2 | Czech, Hungarian, Polish, Romanian |
+| ISO-8859-5 | Russian |
+| ISO-8859-6 | Arabic |
+| ISO-8859-7 | Greek |
+| ISO-8859-8 | Hebrew |
+| ISO-8859-9 | Turkish |
+| windows-1250 | Czech, Hungarian, Polish, Romanian |
+| windows-1251 | Russian |
+| windows-1252 | Danish, Dutch, English, French, German, Italian, Norwegian, Portuguese, Swedish |
+| windows-1253 | Greek |
+| windows-1254 | Turkish |
+| windows-1255 | Hebrew |
+| windows-1256 | Arabic |
+| KOI8-R | Russian |
+| IBM420 | Arabic |
+| IBM424 | Hebrew |
+| | |
## Value
@@ -92,7 +92,9 @@ The guesses are ordered by decreasing confidence.
The official online manual of stringi at
-Other encoding\_detection: [`about_encoding`](about_encoding.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_enc_isascii()`](stri_enc_isascii.md), [`stri_enc_isutf16be()`](stri_enc_isutf16.md), [`stri_enc_isutf8()`](stri_enc_isutf8.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other encoding_detection: [`about_encoding`](about_encoding.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_enc_isascii()`](stri_enc_isascii.md), [`stri_enc_isutf16be()`](stri_enc_isutf16.md), [`stri_enc_isutf8()`](stri_enc_isutf8.md)
## Examples
diff --git a/docs/_sources/rapi/stri_enc_detect2.md.txt b/docs/_sources/rapi/stri_enc_detect2.md.txt
index a85706cfb..9ae05bb2c 100644
--- a/docs/_sources/rapi/stri_enc_detect2.md.txt
+++ b/docs/_sources/rapi/stri_enc_detect2.md.txt
@@ -1,4 +1,4 @@
-# stri\_enc\_detect2: \[DEPRECATED\] Detect Locale-Sensitive Character Encoding
+# stri_enc_detect2: \[DEPRECATED\] Detect Locale-Sensitive Character Encoding
## Description
@@ -6,7 +6,7 @@ This function tries to detect character encoding in case the language of text is
## Usage
-```r
+``` r
stri_enc_detect2(str, locale = NULL)
```
@@ -49,6 +49,8 @@ The guesses are ordered by decreasing confidence.
The official online manual of stringi at
-Other locale\_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
-Other encoding\_detection: [`about_encoding`](about_encoding.md), [`stri_enc_detect()`](stri_enc_detect.md), [`stri_enc_isascii()`](stri_enc_isascii.md), [`stri_enc_isutf16be()`](stri_enc_isutf16.md), [`stri_enc_isutf8()`](stri_enc_isutf8.md)
+Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+
+Other encoding_detection: [`about_encoding`](about_encoding.md), [`stri_enc_detect()`](stri_enc_detect.md), [`stri_enc_isascii()`](stri_enc_isascii.md), [`stri_enc_isutf16be()`](stri_enc_isutf16.md), [`stri_enc_isutf8()`](stri_enc_isutf8.md)
diff --git a/docs/_sources/rapi/stri_enc_fromutf32.md.txt b/docs/_sources/rapi/stri_enc_fromutf32.md.txt
index 13a049852..cd85b7757 100644
--- a/docs/_sources/rapi/stri_enc_fromutf32.md.txt
+++ b/docs/_sources/rapi/stri_enc_fromutf32.md.txt
@@ -1,4 +1,4 @@
-# stri\_enc\_fromutf32: Convert From UTF-32
+# stri_enc_fromutf32: Convert From UTF-32
## Description
@@ -6,7 +6,7 @@ This function converts integer vectors, representing sequences of UTF-32 code po
## Usage
-```r
+``` r
stri_enc_fromutf32(vec)
```
@@ -38,4 +38,6 @@ Returns a character vector (in UTF-8). `NULL`s in the input list are converted t
The official online manual of stringi at
-Other encoding\_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_toascii()`](stri_enc_toascii.md), [`stri_enc_tonative()`](stri_enc_tonative.md), [`stri_enc_toutf32()`](stri_enc_toutf32.md), [`stri_enc_toutf8()`](stri_enc_toutf8.md), [`stri_encode()`](stri_encode.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other encoding_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_toascii()`](stri_enc_toascii.md), [`stri_enc_tonative()`](stri_enc_tonative.md), [`stri_enc_toutf32()`](stri_enc_toutf32.md), [`stri_enc_toutf8()`](stri_enc_toutf8.md), [`stri_encode()`](stri_encode.md)
diff --git a/docs/_sources/rapi/stri_enc_info.md.txt b/docs/_sources/rapi/stri_enc_info.md.txt
index 478205f2c..c38e1ef25 100644
--- a/docs/_sources/rapi/stri_enc_info.md.txt
+++ b/docs/_sources/rapi/stri_enc_info.md.txt
@@ -1,4 +1,4 @@
-# stri\_enc\_info: Query a Character Encoding
+# stri_enc_info: Query a Character Encoding
## Description
@@ -6,7 +6,7 @@ Gets basic information on a character encoding.
## Usage
-```r
+``` r
stri_enc_info(enc = NULL)
```
@@ -48,4 +48,6 @@ Returns a list with the following components:
The official online manual of stringi at
-Other encoding\_management: [`about_encoding`](about_encoding.md), [`stri_enc_list()`](stri_enc_list.md), [`stri_enc_mark()`](stri_enc_mark.md), [`stri_enc_set()`](stri_enc_set.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other encoding_management: [`about_encoding`](about_encoding.md), [`stri_enc_list()`](stri_enc_list.md), [`stri_enc_mark()`](stri_enc_mark.md), [`stri_enc_set()`](stri_enc_set.md)
diff --git a/docs/_sources/rapi/stri_enc_isascii.md.txt b/docs/_sources/rapi/stri_enc_isascii.md.txt
index 0af0a2a3f..e91d7a006 100644
--- a/docs/_sources/rapi/stri_enc_isascii.md.txt
+++ b/docs/_sources/rapi/stri_enc_isascii.md.txt
@@ -1,4 +1,4 @@
-# stri\_enc\_isascii: Check If a Data Stream Is Possibly in ASCII
+# stri_enc_isascii: Check If a Data Stream Is Possibly in ASCII
## Description
@@ -6,7 +6,7 @@ The function checks whether all bytes in a string are \<= 127.
## Usage
-```r
+``` r
stri_enc_isascii(str)
```
@@ -32,7 +32,9 @@ Returns a logical vector. The i-th element indicates whether the i-th string cor
The official online manual of stringi at
-Other encoding\_detection: [`about_encoding`](about_encoding.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_enc_detect()`](stri_enc_detect.md), [`stri_enc_isutf16be()`](stri_enc_isutf16.md), [`stri_enc_isutf8()`](stri_enc_isutf8.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other encoding_detection: [`about_encoding`](about_encoding.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_enc_detect()`](stri_enc_detect.md), [`stri_enc_isutf16be()`](stri_enc_isutf16.md), [`stri_enc_isutf8()`](stri_enc_isutf8.md)
## Examples
diff --git a/docs/_sources/rapi/stri_enc_isutf16.md.txt b/docs/_sources/rapi/stri_enc_isutf16.md.txt
index 2f83cd4ac..9c7488936 100644
--- a/docs/_sources/rapi/stri_enc_isutf16.md.txt
+++ b/docs/_sources/rapi/stri_enc_isutf16.md.txt
@@ -1,4 +1,4 @@
-# stri\_enc\_isutf16: Check If a Data Stream Is Possibly in UTF-16 or UTF-32
+# stri_enc_isutf16: Check If a Data Stream Is Possibly in UTF-16 or UTF-32
## Description
@@ -6,7 +6,7 @@ These functions detect whether a given byte stream is valid UTF-16LE, UTF-16BE,
## Usage
-```r
+``` r
stri_enc_isutf16be(str)
stri_enc_isutf16le(str)
@@ -42,4 +42,6 @@ Returns a logical vector.
The official online manual of stringi at
-Other encoding\_detection: [`about_encoding`](about_encoding.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_enc_detect()`](stri_enc_detect.md), [`stri_enc_isascii()`](stri_enc_isascii.md), [`stri_enc_isutf8()`](stri_enc_isutf8.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other encoding_detection: [`about_encoding`](about_encoding.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_enc_detect()`](stri_enc_detect.md), [`stri_enc_isascii()`](stri_enc_isascii.md), [`stri_enc_isutf8()`](stri_enc_isutf8.md)
diff --git a/docs/_sources/rapi/stri_enc_isutf8.md.txt b/docs/_sources/rapi/stri_enc_isutf8.md.txt
index 28144a241..3490d7399 100644
--- a/docs/_sources/rapi/stri_enc_isutf8.md.txt
+++ b/docs/_sources/rapi/stri_enc_isutf8.md.txt
@@ -1,4 +1,4 @@
-# stri\_enc\_isutf8: Check If a Data Stream Is Possibly in UTF-8
+# stri_enc_isutf8: Check If a Data Stream Is Possibly in UTF-8
## Description
@@ -6,7 +6,7 @@ The function checks whether given sequences of bytes forms a proper UTF-8 string
## Usage
-```r
+``` r
stri_enc_isutf8(str)
```
@@ -36,7 +36,9 @@ Returns a logical vector. Its i-th element indicates whether the i-th string cor
The official online manual of stringi at
-Other encoding\_detection: [`about_encoding`](about_encoding.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_enc_detect()`](stri_enc_detect.md), [`stri_enc_isascii()`](stri_enc_isascii.md), [`stri_enc_isutf16be()`](stri_enc_isutf16.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other encoding_detection: [`about_encoding`](about_encoding.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_enc_detect()`](stri_enc_detect.md), [`stri_enc_isascii()`](stri_enc_isascii.md), [`stri_enc_isutf16be()`](stri_enc_isutf16.md)
## Examples
diff --git a/docs/_sources/rapi/stri_enc_list.md.txt b/docs/_sources/rapi/stri_enc_list.md.txt
index 2eb2db44d..620b8ccce 100644
--- a/docs/_sources/rapi/stri_enc_list.md.txt
+++ b/docs/_sources/rapi/stri_enc_list.md.txt
@@ -1,4 +1,4 @@
-# stri\_enc\_list: List Known Character Encodings
+# stri_enc_list: List Known Character Encodings
## Description
@@ -6,7 +6,7 @@ Gives the list of encodings that are supported by ICU.
## Usage
-```r
+``` r
stri_enc_list(simplify = TRUE)
```
@@ -34,7 +34,9 @@ If `simplify` is `TRUE` (the default), then the resulting list is coerced to a c
The official online manual of stringi at
-Other encoding\_management: [`about_encoding`](about_encoding.md), [`stri_enc_info()`](stri_enc_info.md), [`stri_enc_mark()`](stri_enc_mark.md), [`stri_enc_set()`](stri_enc_set.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other encoding_management: [`about_encoding`](about_encoding.md), [`stri_enc_info()`](stri_enc_info.md), [`stri_enc_mark()`](stri_enc_mark.md), [`stri_enc_set()`](stri_enc_set.md)
## Examples
diff --git a/docs/_sources/rapi/stri_enc_mark.md.txt b/docs/_sources/rapi/stri_enc_mark.md.txt
index bd776447a..5699a49cd 100644
--- a/docs/_sources/rapi/stri_enc_mark.md.txt
+++ b/docs/_sources/rapi/stri_enc_mark.md.txt
@@ -1,4 +1,4 @@
-# stri\_enc\_mark: Get Declared Encodings of Each String
+# stri_enc_mark: Get Declared Encodings of Each String
## Description
@@ -6,7 +6,7 @@ Reads declared encodings for each string in a character vector as seen by stringi at
-Other encoding\_management: [`about_encoding`](about_encoding.md), [`stri_enc_info()`](stri_enc_info.md), [`stri_enc_list()`](stri_enc_list.md), [`stri_enc_set()`](stri_enc_set.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other encoding_management: [`about_encoding`](about_encoding.md), [`stri_enc_info()`](stri_enc_info.md), [`stri_enc_list()`](stri_enc_list.md), [`stri_enc_set()`](stri_enc_set.md)
diff --git a/docs/_sources/rapi/stri_enc_set.md.txt b/docs/_sources/rapi/stri_enc_set.md.txt
index 9771d429c..e1c480e42 100644
--- a/docs/_sources/rapi/stri_enc_set.md.txt
+++ b/docs/_sources/rapi/stri_enc_set.md.txt
@@ -1,4 +1,4 @@
-# stri\_enc\_set: Set or Get Default Character Encoding in stringi
+# stri_enc_set: Set or Get Default Character Encoding in stringi
## Description
@@ -6,7 +6,7 @@
## Usage
-```r
+``` r
stri_enc_set(enc)
stri_enc_get()
@@ -42,4 +42,6 @@ If you set a default encoding that is neither a superset of ASCII, nor an 8-bit
The official online manual of stringi at
-Other encoding\_management: [`about_encoding`](about_encoding.md), [`stri_enc_info()`](stri_enc_info.md), [`stri_enc_list()`](stri_enc_list.md), [`stri_enc_mark()`](stri_enc_mark.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other encoding_management: [`about_encoding`](about_encoding.md), [`stri_enc_info()`](stri_enc_info.md), [`stri_enc_list()`](stri_enc_list.md), [`stri_enc_mark()`](stri_enc_mark.md)
diff --git a/docs/_sources/rapi/stri_enc_toascii.md.txt b/docs/_sources/rapi/stri_enc_toascii.md.txt
index db567c7d0..c6d3e1868 100644
--- a/docs/_sources/rapi/stri_enc_toascii.md.txt
+++ b/docs/_sources/rapi/stri_enc_toascii.md.txt
@@ -1,4 +1,4 @@
-# stri\_enc\_toascii: Convert To ASCII
+# stri_enc_toascii: Convert To ASCII
## Description
@@ -6,7 +6,7 @@ This function converts input strings to ASCII, i.e., to character strings consis
## Usage
-```r
+``` r
stri_enc_toascii(str)
```
@@ -36,4 +36,6 @@ Returns a character vector.
The official online manual of stringi at
-Other encoding\_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_fromutf32()`](stri_enc_fromutf32.md), [`stri_enc_tonative()`](stri_enc_tonative.md), [`stri_enc_toutf32()`](stri_enc_toutf32.md), [`stri_enc_toutf8()`](stri_enc_toutf8.md), [`stri_encode()`](stri_encode.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other encoding_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_fromutf32()`](stri_enc_fromutf32.md), [`stri_enc_tonative()`](stri_enc_tonative.md), [`stri_enc_toutf32()`](stri_enc_toutf32.md), [`stri_enc_toutf8()`](stri_enc_toutf8.md), [`stri_encode()`](stri_encode.md)
diff --git a/docs/_sources/rapi/stri_enc_tonative.md.txt b/docs/_sources/rapi/stri_enc_tonative.md.txt
index 76cf6be96..ff07199f9 100644
--- a/docs/_sources/rapi/stri_enc_tonative.md.txt
+++ b/docs/_sources/rapi/stri_enc_tonative.md.txt
@@ -1,4 +1,4 @@
-# stri\_enc\_tonative: Convert Strings To Native Encoding
+# stri_enc_tonative: Convert Strings To Native Encoding
## Description
@@ -6,7 +6,7 @@ Converts character strings with declared encodings to the current native encodin
## Usage
-```r
+``` r
stri_enc_tonative(str)
```
@@ -34,4 +34,6 @@ Returns a character vector.
The official online manual of stringi at
-Other encoding\_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_fromutf32()`](stri_enc_fromutf32.md), [`stri_enc_toascii()`](stri_enc_toascii.md), [`stri_enc_toutf32()`](stri_enc_toutf32.md), [`stri_enc_toutf8()`](stri_enc_toutf8.md), [`stri_encode()`](stri_encode.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other encoding_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_fromutf32()`](stri_enc_fromutf32.md), [`stri_enc_toascii()`](stri_enc_toascii.md), [`stri_enc_toutf32()`](stri_enc_toutf32.md), [`stri_enc_toutf8()`](stri_enc_toutf8.md), [`stri_encode()`](stri_encode.md)
diff --git a/docs/_sources/rapi/stri_enc_toutf32.md.txt b/docs/_sources/rapi/stri_enc_toutf32.md.txt
index d094d33d8..e80becbee 100644
--- a/docs/_sources/rapi/stri_enc_toutf32.md.txt
+++ b/docs/_sources/rapi/stri_enc_toutf32.md.txt
@@ -1,4 +1,4 @@
-# stri\_enc\_toutf32: Convert Strings To UTF-32
+# stri_enc_toutf32: Convert Strings To UTF-32
## Description
@@ -6,7 +6,7 @@ UTF-32 is a 32-bit encoding where each Unicode code point corresponds to exactly
## Usage
-```r
+``` r
stri_enc_toutf32(str)
```
@@ -36,4 +36,6 @@ Returns a list of integer vectors. Missing values are converted to `NULL`s.
The official online manual of stringi at
-Other encoding\_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_fromutf32()`](stri_enc_fromutf32.md), [`stri_enc_toascii()`](stri_enc_toascii.md), [`stri_enc_tonative()`](stri_enc_tonative.md), [`stri_enc_toutf8()`](stri_enc_toutf8.md), [`stri_encode()`](stri_encode.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other encoding_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_fromutf32()`](stri_enc_fromutf32.md), [`stri_enc_toascii()`](stri_enc_toascii.md), [`stri_enc_tonative()`](stri_enc_tonative.md), [`stri_enc_toutf8()`](stri_enc_toutf8.md), [`stri_encode()`](stri_encode.md)
diff --git a/docs/_sources/rapi/stri_enc_toutf8.md.txt b/docs/_sources/rapi/stri_enc_toutf8.md.txt
index 23e0cadd6..b3faee679 100644
--- a/docs/_sources/rapi/stri_enc_toutf8.md.txt
+++ b/docs/_sources/rapi/stri_enc_toutf8.md.txt
@@ -1,4 +1,4 @@
-# stri\_enc\_toutf8: Convert Strings To UTF-8
+# stri_enc_toutf8: Convert Strings To UTF-8
## Description
@@ -6,7 +6,7 @@ Converts character strings with declared marked encodings to UTF-8 strings.
## Usage
-```r
+``` r
stri_enc_toutf8(str, is_unknown_8bit = FALSE, validate = FALSE)
```
@@ -42,4 +42,6 @@ Returns a character vector.
The official online manual of stringi at
-Other encoding\_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_fromutf32()`](stri_enc_fromutf32.md), [`stri_enc_toascii()`](stri_enc_toascii.md), [`stri_enc_tonative()`](stri_enc_tonative.md), [`stri_enc_toutf32()`](stri_enc_toutf32.md), [`stri_encode()`](stri_encode.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other encoding_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_fromutf32()`](stri_enc_fromutf32.md), [`stri_enc_toascii()`](stri_enc_toascii.md), [`stri_enc_tonative()`](stri_enc_tonative.md), [`stri_enc_toutf32()`](stri_enc_toutf32.md), [`stri_encode()`](stri_encode.md)
diff --git a/docs/_sources/rapi/stri_encode.md.txt b/docs/_sources/rapi/stri_encode.md.txt
index b3c76571b..d9bced7f9 100644
--- a/docs/_sources/rapi/stri_encode.md.txt
+++ b/docs/_sources/rapi/stri_encode.md.txt
@@ -1,4 +1,4 @@
-# stri\_encode: Convert Strings Between Given Encodings
+# stri_encode: Convert Strings Between Given Encodings
## Description
@@ -6,7 +6,7 @@ These functions convert strings between encodings. They aim to serve as a more p
## Usage
-```r
+``` r
stri_encode(str, from = NULL, to = NULL, to_raw = FALSE)
stri_conv(str, from = NULL, to = NULL, to_raw = FALSE)
@@ -57,4 +57,6 @@ If `to_raw` is `FALSE`, then a character vector with encoded strings (and approp
The official online manual of stringi at
-Other encoding\_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_fromutf32()`](stri_enc_fromutf32.md), [`stri_enc_toascii()`](stri_enc_toascii.md), [`stri_enc_tonative()`](stri_enc_tonative.md), [`stri_enc_toutf32()`](stri_enc_toutf32.md), [`stri_enc_toutf8()`](stri_enc_toutf8.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other encoding_conversion: [`about_encoding`](about_encoding.md), [`stri_enc_fromutf32()`](stri_enc_fromutf32.md), [`stri_enc_toascii()`](stri_enc_toascii.md), [`stri_enc_tonative()`](stri_enc_tonative.md), [`stri_enc_toutf32()`](stri_enc_toutf32.md), [`stri_enc_toutf8()`](stri_enc_toutf8.md)
diff --git a/docs/_sources/rapi/stri_escape_unicode.md.txt b/docs/_sources/rapi/stri_escape_unicode.md.txt
index d4fac8eb4..61931b9f8 100644
--- a/docs/_sources/rapi/stri_escape_unicode.md.txt
+++ b/docs/_sources/rapi/stri_escape_unicode.md.txt
@@ -1,4 +1,4 @@
-# stri\_escape\_unicode: Escape Unicode Code Points
+# stri_escape_unicode: Escape Unicode Code Points
## Description
@@ -6,7 +6,7 @@ Escapes all Unicode (not ASCII-printable) code points.
## Usage
-```r
+``` r
stri_escape_unicode(str)
```
@@ -34,6 +34,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other escape: [`stri_unescape_unicode()`](stri_unescape_unicode.md)
## Examples
diff --git a/docs/_sources/rapi/stri_extract.md.txt b/docs/_sources/rapi/stri_extract.md.txt
index 6ee43d097..2b4c8ed5f 100644
--- a/docs/_sources/rapi/stri_extract.md.txt
+++ b/docs/_sources/rapi/stri_extract.md.txt
@@ -1,4 +1,4 @@
-# stri\_extract: Extract Pattern Occurrences
+# stri_extract: Extract Pattern Occurrences
## Description
@@ -8,7 +8,7 @@ These functions extract all substrings matching a given pattern.
## Usage
-```r
+``` r
stri_extract_all(str, ..., regex, fixed, coll, charclass)
stri_extract_first(str, ..., regex, fixed, coll, charclass)
@@ -116,7 +116,9 @@ Note that `stri_extract_last_regex` searches from start to end, but skips overla
The official online manual of stringi at
-Other search\_extract: [`about_search`](about_search.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_match_all()`](stri_match.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other search_extract: [`about_search`](about_search.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_match_all()`](stri_match.md)
## Examples
diff --git a/docs/_sources/rapi/stri_extract_boundaries.md.txt b/docs/_sources/rapi/stri_extract_boundaries.md.txt
index ea5d154dd..a1e52c666 100644
--- a/docs/_sources/rapi/stri_extract_boundaries.md.txt
+++ b/docs/_sources/rapi/stri_extract_boundaries.md.txt
@@ -1,4 +1,4 @@
-# stri\_extract\_boundaries: Extract Data Between Text Boundaries
+# stri_extract_boundaries: Extract Data Between Text Boundaries
## Description
@@ -6,7 +6,7 @@ These functions extract data between text boundaries.
## Usage
-```r
+``` r
stri_extract_all_boundaries(
str,
simplify = FALSE,
@@ -66,11 +66,13 @@ For `stri_extract_first_*` and `stri_extract_last_*`, a character vector is retu
The official online manual of stringi at
-Other search\_extract: [`about_search`](about_search.md), [`stri_extract_all()`](stri_extract.md), [`stri_match_all()`](stri_match.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other search_extract: [`about_search`](about_search.md), [`stri_extract_all()`](stri_extract.md), [`stri_match_all()`](stri_match.md)
-Other locale\_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
-Other text\_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
+Other text_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/docs/_sources/rapi/stri_flatten.md.txt b/docs/_sources/rapi/stri_flatten.md.txt
index 7716f01f2..f31627b44 100644
--- a/docs/_sources/rapi/stri_flatten.md.txt
+++ b/docs/_sources/rapi/stri_flatten.md.txt
@@ -1,4 +1,4 @@
-# stri\_flatten: Flatten a String
+# stri_flatten: Flatten a String
## Description
@@ -6,7 +6,7 @@ Joins the elements of a character vector into one string.
## Usage
-```r
+``` r
stri_flatten(str, collapse = "", na_empty = FALSE, omit_empty = FALSE)
```
@@ -39,6 +39,8 @@ Returns a single string, i.e., a character vector of length 1.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other join: [`%s+%()`](+25s+2B+25.md), [`stri_dup()`](stri_dup.md), [`stri_join_list()`](stri_join_list.md), [`stri_join()`](stri_join.md)
## Examples
diff --git a/docs/_sources/rapi/stri_info.md.txt b/docs/_sources/rapi/stri_info.md.txt
index af5144bd4..a84c199ea 100644
--- a/docs/_sources/rapi/stri_info.md.txt
+++ b/docs/_sources/rapi/stri_info.md.txt
@@ -1,4 +1,4 @@
-# stri\_info: Query Default Settings for stringi
+# stri_info: Query Default Settings for stringi
## Description
@@ -6,7 +6,7 @@ Gives the current default settings used by the ICU libr
## Usage
-```r
+``` r
stri_info(short = FALSE)
```
@@ -43,3 +43,5 @@ Otherwise, a list with the following components is returned:
## See Also
The official online manual of stringi at
+
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
diff --git a/docs/_sources/rapi/stri_isempty.md.txt b/docs/_sources/rapi/stri_isempty.md.txt
index b48f38bc8..55de7b34d 100644
--- a/docs/_sources/rapi/stri_isempty.md.txt
+++ b/docs/_sources/rapi/stri_isempty.md.txt
@@ -1,4 +1,4 @@
-# stri\_isempty: Determine if a String is of Length Zero
+# stri_isempty: Determine if a String is of Length Zero
## Description
@@ -6,7 +6,7 @@ This is the fastest way to find out whether the elements of a character vector a
## Usage
-```r
+``` r
stri_isempty(str)
```
@@ -32,6 +32,8 @@ Returns a logical vector of the same length as `str`.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other length: [`%s$%()`](+25s+24+25.md), [`stri_length()`](stri_length.md), [`stri_numbytes()`](stri_numbytes.md), [`stri_pad_both()`](stri_pad.md), [`stri_sprintf()`](stri_sprintf.md), [`stri_width()`](stri_width.md)
## Examples
diff --git a/docs/_sources/rapi/stri_join.md.txt b/docs/_sources/rapi/stri_join.md.txt
index 96d07dcaf..c1b818977 100644
--- a/docs/_sources/rapi/stri_join.md.txt
+++ b/docs/_sources/rapi/stri_join.md.txt
@@ -1,4 +1,4 @@
-# stri\_join: Concatenate Character Vectors
+# stri_join: Concatenate Character Vectors
## Description
@@ -6,7 +6,7 @@ These are the stringi\'s equivalents of the built-in [`
## Usage
-```r
+``` r
stri_join(..., sep = "", collapse = NULL, ignore_null = FALSE)
stri_c(..., sep = "", collapse = NULL, ignore_null = FALSE)
@@ -47,6 +47,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other join: [`%s+%()`](+25s+2B+25.md), [`stri_dup()`](stri_dup.md), [`stri_flatten()`](stri_flatten.md), [`stri_join_list()`](stri_join_list.md)
## Examples
diff --git a/docs/_sources/rapi/stri_join_list.md.txt b/docs/_sources/rapi/stri_join_list.md.txt
index f68f23c17..1b929ddd9 100644
--- a/docs/_sources/rapi/stri_join_list.md.txt
+++ b/docs/_sources/rapi/stri_join_list.md.txt
@@ -1,4 +1,4 @@
-# stri\_join\_list: Concatenate Strings in a List
+# stri_join_list: Concatenate Strings in a List
## Description
@@ -6,7 +6,7 @@ These functions concatenate all the strings in each character vector in a given
## Usage
-```r
+``` r
stri_join_list(x, sep = "", collapse = NULL)
stri_c_list(x, sep = "", collapse = NULL)
@@ -42,6 +42,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other join: [`%s+%()`](+25s+2B+25.md), [`stri_dup()`](stri_dup.md), [`stri_flatten()`](stri_flatten.md), [`stri_join()`](stri_join.md)
## Examples
diff --git a/docs/_sources/rapi/stri_length.md.txt b/docs/_sources/rapi/stri_length.md.txt
index 631924a7c..0d327113e 100644
--- a/docs/_sources/rapi/stri_length.md.txt
+++ b/docs/_sources/rapi/stri_length.md.txt
@@ -1,4 +1,4 @@
-# stri\_length: Count the Number of Code Points
+# stri_length: Count the Number of Code Points
## Description
@@ -6,7 +6,7 @@ This function returns the number of code points in each string.
## Usage
-```r
+``` r
stri_length(str)
```
@@ -36,6 +36,8 @@ Returns an integer vector of the same length as `str`.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other length: [`%s$%()`](+25s+24+25.md), [`stri_isempty()`](stri_isempty.md), [`stri_numbytes()`](stri_numbytes.md), [`stri_pad_both()`](stri_pad.md), [`stri_sprintf()`](stri_sprintf.md), [`stri_width()`](stri_width.md)
## Examples
diff --git a/docs/_sources/rapi/stri_list2matrix.md.txt b/docs/_sources/rapi/stri_list2matrix.md.txt
index d12b4c6ab..7e199163e 100644
--- a/docs/_sources/rapi/stri_list2matrix.md.txt
+++ b/docs/_sources/rapi/stri_list2matrix.md.txt
@@ -1,4 +1,4 @@
-# stri\_list2matrix: Convert a List to a Character Matrix
+# stri_list2matrix: Convert a List to a Character Matrix
## Description
@@ -6,7 +6,7 @@ This function converts a given list of atomic vectors to a character matrix.
## Usage
-```r
+``` r
stri_list2matrix(
x,
byrow = FALSE,
@@ -48,6 +48,8 @@ Returns a character matrix.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other utils: [`stri_na2empty()`](stri_na2empty.md), [`stri_remove_empty()`](stri_remove_empty.md), [`stri_replace_na()`](stri_replace_na.md)
## Examples
diff --git a/docs/_sources/rapi/stri_locale_info.md.txt b/docs/_sources/rapi/stri_locale_info.md.txt
index 0f87d44fc..2eed39c0f 100644
--- a/docs/_sources/rapi/stri_locale_info.md.txt
+++ b/docs/_sources/rapi/stri_locale_info.md.txt
@@ -1,4 +1,4 @@
-# stri\_locale\_info: Query Given Locale
+# stri_locale_info: Query Given Locale
## Description
@@ -6,7 +6,7 @@ Provides some basic information on a given locale identifier.
## Usage
-```r
+``` r
stri_locale_info(locale = NULL)
```
@@ -18,7 +18,7 @@ stri_locale_info(locale = NULL)
## Details
-With this function you may obtain some basic information on any provided locale identifier, even if it is unsupported by ICU or if you pass a malformed locale identifier (the one that is not, e.g., of the form Language\_Country). See [stringi-locale](about_locale.md) for discussion.
+With this function you may obtain some basic information on any provided locale identifier, even if it is unsupported by ICU or if you pass a malformed locale identifier (the one that is not, e.g., of the form Language_Country). See [stringi-locale](about_locale.md) for discussion.
This function does not do anything really complicated. In many cases it is similar to a call to [`as.list(stri_split_fixed(locale, '_', 3L)[[1]])`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/list.html), with `locale` case mapped. It may be used, however, to get insight on how ICU understands a given locale identifier.
@@ -34,7 +34,9 @@ Returns a list with the following named character strings: `Language`, `Country`
The official online manual of stringi at
-Other locale\_management: [`about_locale`](about_locale.md), [`stri_locale_list()`](stri_locale_list.md), [`stri_locale_set()`](stri_locale_set.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other locale_management: [`about_locale`](about_locale.md), [`stri_locale_list()`](stri_locale_list.md), [`stri_locale_set()`](stri_locale_set.md)
## Examples
diff --git a/docs/_sources/rapi/stri_locale_list.md.txt b/docs/_sources/rapi/stri_locale_list.md.txt
index 665cc06f9..0856c9b20 100644
--- a/docs/_sources/rapi/stri_locale_list.md.txt
+++ b/docs/_sources/rapi/stri_locale_list.md.txt
@@ -1,4 +1,4 @@
-# stri\_locale\_list: List Available Locales
+# stri_locale_list: List Available Locales
## Description
@@ -6,7 +6,7 @@ Creates a character vector with all available locale identifies.
## Usage
-```r
+``` r
stri_locale_list()
```
@@ -28,7 +28,9 @@ Returns a character vector with locale identifiers that are known to stringi at
-Other locale\_management: [`about_locale`](about_locale.md), [`stri_locale_info()`](stri_locale_info.md), [`stri_locale_set()`](stri_locale_set.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other locale_management: [`about_locale`](about_locale.md), [`stri_locale_info()`](stri_locale_info.md), [`stri_locale_set()`](stri_locale_set.md)
## Examples
diff --git a/docs/_sources/rapi/stri_locale_set.md.txt b/docs/_sources/rapi/stri_locale_set.md.txt
index 85c0e7af4..74fdc9e86 100644
--- a/docs/_sources/rapi/stri_locale_set.md.txt
+++ b/docs/_sources/rapi/stri_locale_set.md.txt
@@ -1,4 +1,4 @@
-# stri\_locale\_set: Set or Get Default Locale in stringi
+# stri_locale_set: Set or Get Default Locale in stringi
## Description
@@ -6,7 +6,7 @@
## Usage
-```r
+``` r
stri_locale_set(locale)
stri_locale_get()
@@ -38,7 +38,9 @@ See [stringi-locale](about_locale.md) for more information on the effect of chan
The official online manual of stringi at
-Other locale\_management: [`about_locale`](about_locale.md), [`stri_locale_info()`](stri_locale_info.md), [`stri_locale_list()`](stri_locale_list.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other locale_management: [`about_locale`](about_locale.md), [`stri_locale_info()`](stri_locale_info.md), [`stri_locale_list()`](stri_locale_list.md)
## Examples
@@ -48,12 +50,12 @@ Other locale\_management: [`about_locale`](about_locale.md), [`stri_locale_info(
```r
## Not run:
oldloc <- stri_locale_set('pt_BR')
-## You are now working with stringi_1.7.6 (pt_BR.UTF-8; ICU4C 69.1 [bundle]; Unicode 13.0)
+## You are now working with stringi_1.7.7 (pt_BR.UTF-8; ICU4C 69.1 [bundle]; Unicode 13.0)
# ... some locale-dependent operations
# ... note that you may always modify a locale per-call
# ... changing the default locale is convenient if you perform
# ... many operations
stri_locale_set(oldloc) # restore the previous default locale
-## You are now working with stringi_1.7.6 (en_AU.UTF-8; ICU4C 69.1 [bundle]; Unicode 13.0)
+## You are now working with stringi_1.7.7 (en_AU.UTF-8; ICU4C 69.1 [bundle]; Unicode 13.0)
## End(Not run)
```
diff --git a/docs/_sources/rapi/stri_locate.md.txt b/docs/_sources/rapi/stri_locate.md.txt
index 5e02e6db7..ec06130e2 100644
--- a/docs/_sources/rapi/stri_locate.md.txt
+++ b/docs/_sources/rapi/stri_locate.md.txt
@@ -1,4 +1,4 @@
-# stri\_locate: Locate Pattern Occurrences
+# stri_locate: Locate Pattern Occurrences
## Description
@@ -6,7 +6,7 @@ These functions find the indexes (positions) where there is a match to some patt
## Usage
-```r
+``` r
stri_locate_all(str, ..., regex, fixed, coll, charclass)
stri_locate_first(str, ..., regex, fixed, coll, charclass)
@@ -156,7 +156,9 @@ If `capture_groups=TRUE`, then the outputs are equipped with the `capture_groups
The official online manual of stringi at
-Other search\_locate: [`about_search`](about_search.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other search_locate: [`about_search`](about_search.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md)
Other indexing: [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_sub_all()`](stri_sub_all.md), [`stri_sub()`](stri_sub.md)
diff --git a/docs/_sources/rapi/stri_locate_boundaries.md.txt b/docs/_sources/rapi/stri_locate_boundaries.md.txt
index 52aedba04..4600c4d8f 100644
--- a/docs/_sources/rapi/stri_locate_boundaries.md.txt
+++ b/docs/_sources/rapi/stri_locate_boundaries.md.txt
@@ -1,4 +1,4 @@
-# stri\_locate\_boundaries: Locate Text Boundaries
+# stri_locate_boundaries: Locate Text Boundaries
## Description
@@ -6,7 +6,7 @@ These functions locate text boundaries (like character, word, line, or sentence
## Usage
-```r
+``` r
stri_locate_all_boundaries(
str,
omit_no_match = FALSE,
@@ -62,13 +62,15 @@ For `stri_locate_*_words`, just like in [`stri_extract_all_words`](stri_extract_
The official online manual of stringi at
-Other search\_locate: [`about_search`](about_search.md), [`stri_locate_all()`](stri_locate.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other search_locate: [`about_search`](about_search.md), [`stri_locate_all()`](stri_locate.md)
Other indexing: [`stri_locate_all()`](stri_locate.md), [`stri_sub_all()`](stri_sub_all.md), [`stri_sub()`](stri_sub.md)
-Other locale\_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
-Other text\_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
+Other text_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/docs/_sources/rapi/stri_match.md.txt b/docs/_sources/rapi/stri_match.md.txt
index 585cbd337..6daf56a21 100644
--- a/docs/_sources/rapi/stri_match.md.txt
+++ b/docs/_sources/rapi/stri_match.md.txt
@@ -1,4 +1,4 @@
-# stri\_match: Extract Regex Pattern Matches, Together with Capture Groups
+# stri_match: Extract Regex Pattern Matches, Together with Capture Groups
## Description
@@ -6,7 +6,7 @@ These functions extract substrings in `str` that match a given regex `pattern`.
## Usage
-```r
+``` r
stri_match_all(str, ..., regex)
stri_match_first(str, ..., regex)
@@ -79,7 +79,9 @@ If regular expressions feature a named capture group, the matrix columns will be
The official online manual of stringi at
-Other search\_extract: [`about_search`](about_search.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_extract_all()`](stri_extract.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other search_extract: [`about_search`](about_search.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_extract_all()`](stri_extract.md)
## Examples
diff --git a/docs/_sources/rapi/stri_na2empty.md.txt b/docs/_sources/rapi/stri_na2empty.md.txt
index f2b8bb98a..453fa7f54 100644
--- a/docs/_sources/rapi/stri_na2empty.md.txt
+++ b/docs/_sources/rapi/stri_na2empty.md.txt
@@ -1,4 +1,4 @@
-# stri\_na2empty: Replace NAs with Empty Strings
+# stri_na2empty: Replace NAs with Empty Strings
## Description
@@ -6,7 +6,7 @@ This function replaces all missing values with empty strings. See [`stri_replace
## Usage
-```r
+``` r
stri_na2empty(x)
```
@@ -28,6 +28,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other utils: [`stri_list2matrix()`](stri_list2matrix.md), [`stri_remove_empty()`](stri_remove_empty.md), [`stri_replace_na()`](stri_replace_na.md)
## Examples
diff --git a/docs/_sources/rapi/stri_numbytes.md.txt b/docs/_sources/rapi/stri_numbytes.md.txt
index 87798e336..6f69187a5 100644
--- a/docs/_sources/rapi/stri_numbytes.md.txt
+++ b/docs/_sources/rapi/stri_numbytes.md.txt
@@ -1,4 +1,4 @@
-# stri\_numbytes: Count the Number of Bytes
+# stri_numbytes: Count the Number of Bytes
## Description
@@ -6,7 +6,7 @@ Counts the number of bytes needed to store each string in the computer\'s memory
## Usage
-```r
+``` r
stri_numbytes(str)
```
@@ -40,6 +40,8 @@ Returns an integer vector of the same length as `str`.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other length: [`%s$%()`](+25s+24+25.md), [`stri_isempty()`](stri_isempty.md), [`stri_length()`](stri_length.md), [`stri_pad_both()`](stri_pad.md), [`stri_sprintf()`](stri_sprintf.md), [`stri_width()`](stri_width.md)
## Examples
diff --git a/docs/_sources/rapi/stri_opts_brkiter.md.txt b/docs/_sources/rapi/stri_opts_brkiter.md.txt
index 2e4844096..40d021391 100644
--- a/docs/_sources/rapi/stri_opts_brkiter.md.txt
+++ b/docs/_sources/rapi/stri_opts_brkiter.md.txt
@@ -1,4 +1,4 @@
-# stri\_opts\_brkiter: Generate a List with BreakIterator Settings
+# stri_opts_brkiter: Generate a List with BreakIterator Settings
## Description
@@ -6,7 +6,7 @@ A convenience function to tune the ICU `BreakIterator`\
## Usage
-```r
+``` r
stri_opts_brkiter(
type,
locale,
@@ -64,4 +64,6 @@ Returns a named list object. Omitted `skip_*` values act as they have been set t
The official online manual of stringi at
-Other text\_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other text_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
diff --git a/docs/_sources/rapi/stri_opts_collator.md.txt b/docs/_sources/rapi/stri_opts_collator.md.txt
index f1b1c7498..4c4f0285d 100644
--- a/docs/_sources/rapi/stri_opts_collator.md.txt
+++ b/docs/_sources/rapi/stri_opts_collator.md.txt
@@ -1,4 +1,4 @@
-# stri\_opts\_collator: Generate a List with Collator Settings
+# stri_opts_collator: Generate a List with Collator Settings
## Description
@@ -6,7 +6,7 @@ A convenience function to tune the ICU Collator\'s beha
## Usage
-```r
+``` r
stri_opts_collator(
locale = NULL,
strength = 3L,
@@ -46,7 +46,7 @@ stri_coll(
| `case_level` | single logical value; controls whether an extra case level (positioned before the third level) is generated or not |
| `normalization` | single logical value; if `TRUE`, then incremental check is performed to see whether the input data is in the FCD form. If the data is not in the FCD form, incremental NFD normalization is performed |
| `normalisation` | alias of `normalization` |
-| `numeric` | single logical value; when turned on, this attribute generates a collation key for the numeric value of substrings of digits; this is a way to get \'100\' to sort AFTER \'2\' |
+| `numeric` | single logical value; when turned on, this attribute generates a collation key for the numeric value of substrings of digits; this is a way to get \'100\' to sort AFTER \'2\'; note that negative numbers will not be ordered properly |
| `...` | \[DEPRECATED\] any other arguments passed to this function generate a warning; this argument will be removed in the future |
## Details
@@ -73,9 +73,11 @@ Returns a named list object; missing settings are left with default values.
The official online manual of stringi at
-Other locale\_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
-Other search\_coll: [`about_search_coll`](about_search_coll.md), [`about_search`](about_search.md)
+Other search_coll: [`about_search_coll`](about_search_coll.md), [`about_search`](about_search.md)
## Examples
diff --git a/docs/_sources/rapi/stri_opts_fixed.md.txt b/docs/_sources/rapi/stri_opts_fixed.md.txt
index 3eb7e1caa..ebdd6c6b5 100644
--- a/docs/_sources/rapi/stri_opts_fixed.md.txt
+++ b/docs/_sources/rapi/stri_opts_fixed.md.txt
@@ -1,4 +1,4 @@
-# stri\_opts\_fixed: Generate a List with Fixed Pattern Search Engine\'s Settings
+# stri_opts_fixed: Generate a List with Fixed Pattern Search Engine\'s Settings
## Description
@@ -6,7 +6,7 @@ A convenience function used to tune up the behavior of `stri_*_fixed` functions,
## Usage
-```r
+``` r
stri_opts_fixed(case_insensitive = FALSE, overlap = FALSE, ...)
```
@@ -40,7 +40,9 @@ Returns a named list object.
The official online manual of stringi at
-Other search\_fixed: [`about_search_fixed`](about_search_fixed.md), [`about_search`](about_search.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other search_fixed: [`about_search_fixed`](about_search_fixed.md), [`about_search`](about_search.md)
## Examples
diff --git a/docs/_sources/rapi/stri_opts_regex.md.txt b/docs/_sources/rapi/stri_opts_regex.md.txt
index b1649549c..f3d054046 100644
--- a/docs/_sources/rapi/stri_opts_regex.md.txt
+++ b/docs/_sources/rapi/stri_opts_regex.md.txt
@@ -1,4 +1,4 @@
-# stri\_opts\_regex: Generate a List with Regex Matcher Settings
+# stri_opts_regex: Generate a List with Regex Matcher Settings
## Description
@@ -6,7 +6,7 @@ A convenience function to tune the ICU regular expressi
## Usage
-```r
+``` r
stri_opts_regex(
case_insensitive,
comments,
@@ -64,7 +64,9 @@ Returns a named list object; missing settings are left with default values.
The official online manual of stringi at
-Other search\_regex: [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other search_regex: [`about_search_regex`](about_search_regex.md), [`about_search`](about_search.md)
## Examples
diff --git a/docs/_sources/rapi/stri_order.md.txt b/docs/_sources/rapi/stri_order.md.txt
index 999236966..4904ac77e 100644
--- a/docs/_sources/rapi/stri_order.md.txt
+++ b/docs/_sources/rapi/stri_order.md.txt
@@ -1,4 +1,4 @@
-# stri\_order: Ordering Permutation
+# stri_order: Ordering Permutation
## Description
@@ -6,7 +6,7 @@ This function finds a permutation which rearranges the strings in a given charac
## Usage
-```r
+``` r
stri_order(str, decreasing = FALSE, na_last = TRUE, ..., opts_collator = NULL)
```
@@ -46,7 +46,9 @@ The function yields an integer vector that gives the sort order.
The official online manual of stringi at
-Other locale\_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/docs/_sources/rapi/stri_pad.md.txt b/docs/_sources/rapi/stri_pad.md.txt
index 59e63468c..ae1e4220f 100644
--- a/docs/_sources/rapi/stri_pad.md.txt
+++ b/docs/_sources/rapi/stri_pad.md.txt
@@ -1,4 +1,4 @@
-# stri\_pad: Pad (Center/Left/Right Align) a String
+# stri_pad: Pad (Center/Left/Right Align) a String
## Description
@@ -6,7 +6,7 @@ Add multiple `pad` characters at the given `side`(s) of each string so that each
## Usage
-```r
+``` r
stri_pad_both(
str,
width = floor(0.9 * getOption("width")),
@@ -69,6 +69,8 @@ These functions return a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other length: [`%s$%()`](+25s+24+25.md), [`stri_isempty()`](stri_isempty.md), [`stri_length()`](stri_length.md), [`stri_numbytes()`](stri_numbytes.md), [`stri_sprintf()`](stri_sprintf.md), [`stri_width()`](stri_width.md)
## Examples
diff --git a/docs/_sources/rapi/stri_rand_lipsum.md.txt b/docs/_sources/rapi/stri_rand_lipsum.md.txt
index 379440c8e..02fd3f336 100644
--- a/docs/_sources/rapi/stri_rand_lipsum.md.txt
+++ b/docs/_sources/rapi/stri_rand_lipsum.md.txt
@@ -1,4 +1,4 @@
-# stri\_rand\_lipsum: A Lorem Ipsum Generator
+# stri_rand_lipsum: A Lorem Ipsum Generator
## Description
@@ -6,7 +6,7 @@ Generates (pseudo)random *lorem ipsum* text consisting of a given number of text
## Usage
-```r
+``` r
stri_rand_lipsum(n_paragraphs, start_lipsum = TRUE, nparagraphs = n_paragraphs)
```
@@ -36,6 +36,8 @@ Returns a character vector of length `n_paragraphs`.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other random: [`stri_rand_shuffle()`](stri_rand_shuffle.md), [`stri_rand_strings()`](stri_rand_strings.md)
## Examples
diff --git a/docs/_sources/rapi/stri_rand_shuffle.md.txt b/docs/_sources/rapi/stri_rand_shuffle.md.txt
index 5a021bfd9..05eedd6af 100644
--- a/docs/_sources/rapi/stri_rand_shuffle.md.txt
+++ b/docs/_sources/rapi/stri_rand_shuffle.md.txt
@@ -1,4 +1,4 @@
-# stri\_rand\_shuffle: Randomly Shuffle Code Points in Each String
+# stri_rand_shuffle: Randomly Shuffle Code Points in Each String
## Description
@@ -6,7 +6,7 @@ Generates a (pseudo)random permutation of the code points in each string.
## Usage
-```r
+``` r
stri_rand_shuffle(str)
```
@@ -34,6 +34,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other random: [`stri_rand_lipsum()`](stri_rand_lipsum.md), [`stri_rand_strings()`](stri_rand_strings.md)
## Examples
diff --git a/docs/_sources/rapi/stri_rand_strings.md.txt b/docs/_sources/rapi/stri_rand_strings.md.txt
index eba65921d..1a2c9afa0 100644
--- a/docs/_sources/rapi/stri_rand_strings.md.txt
+++ b/docs/_sources/rapi/stri_rand_strings.md.txt
@@ -1,4 +1,4 @@
-# stri\_rand\_strings: Generate Random Strings
+# stri_rand_strings: Generate Random Strings
## Description
@@ -6,7 +6,7 @@ Generates (pseudo)random strings of desired lengths.
## Usage
-```r
+``` r
stri_rand_strings(n, length, pattern = "[A-Za-z0-9]")
```
@@ -38,6 +38,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other random: [`stri_rand_lipsum()`](stri_rand_lipsum.md), [`stri_rand_shuffle()`](stri_rand_shuffle.md)
## Examples
diff --git a/docs/_sources/rapi/stri_rank.md.txt b/docs/_sources/rapi/stri_rank.md.txt
index 51c8de87d..c6e00abdb 100644
--- a/docs/_sources/rapi/stri_rank.md.txt
+++ b/docs/_sources/rapi/stri_rank.md.txt
@@ -1,4 +1,4 @@
-# stri\_rank: Ranking
+# stri_rank: Ranking
## Description
@@ -6,7 +6,7 @@ This function ranks each string in a character vector according to a locale-depe
## Usage
-```r
+``` r
stri_rank(str, ..., opts_collator = NULL)
```
@@ -40,7 +40,9 @@ The result is a vector of ranks corresponding to each string in `str`.
The official online manual of stringi at
-Other locale\_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/docs/_sources/rapi/stri_read_lines.md.txt b/docs/_sources/rapi/stri_read_lines.md.txt
index ccba93893..4e16a20f2 100644
--- a/docs/_sources/rapi/stri_read_lines.md.txt
+++ b/docs/_sources/rapi/stri_read_lines.md.txt
@@ -1,4 +1,4 @@
-# stri\_read\_lines: Read Text Lines from a Text File
+# stri_read_lines: Read Text Lines from a Text File
## Description
@@ -6,7 +6,7 @@ Reads a text file in its entirety, re-encodes it, and splits it into text lines.
## Usage
-```r
+``` r
stri_read_lines(con, encoding = NULL, fname = con, fallback_encoding = NULL)
```
@@ -39,4 +39,6 @@ Returns a character vector, each text line is a separate string. The output is a
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other files: [`stri_read_raw()`](stri_read_raw.md), [`stri_write_lines()`](stri_write_lines.md)
diff --git a/docs/_sources/rapi/stri_read_raw.md.txt b/docs/_sources/rapi/stri_read_raw.md.txt
index c319a5d17..4852a2e8f 100644
--- a/docs/_sources/rapi/stri_read_raw.md.txt
+++ b/docs/_sources/rapi/stri_read_raw.md.txt
@@ -1,4 +1,4 @@
-# stri\_read\_raw: Read Text File as Raw
+# stri_read_raw: Read Text File as Raw
## Description
@@ -6,7 +6,7 @@ Reads a text file as-is, with no conversion or text line splitting.
## Usage
-```r
+``` r
stri_read_raw(con, fname = con)
```
@@ -33,4 +33,6 @@ Returns a vector of type `raw`.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other files: [`stri_read_lines()`](stri_read_lines.md), [`stri_write_lines()`](stri_write_lines.md)
diff --git a/docs/_sources/rapi/stri_remove_empty.md.txt b/docs/_sources/rapi/stri_remove_empty.md.txt
index 026cb3169..318236211 100644
--- a/docs/_sources/rapi/stri_remove_empty.md.txt
+++ b/docs/_sources/rapi/stri_remove_empty.md.txt
@@ -1,4 +1,4 @@
-# stri\_remove\_empty: Remove All Empty Strings from a Character Vector
+# stri_remove_empty: Remove All Empty Strings from a Character Vector
## Description
@@ -10,7 +10,7 @@
## Usage
-```r
+``` r
stri_remove_empty(x, na_empty = FALSE)
stri_omit_empty(x, na_empty = FALSE)
@@ -43,6 +43,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other utils: [`stri_list2matrix()`](stri_list2matrix.md), [`stri_na2empty()`](stri_na2empty.md), [`stri_replace_na()`](stri_replace_na.md)
## Examples
diff --git a/docs/_sources/rapi/stri_replace.md.txt b/docs/_sources/rapi/stri_replace.md.txt
index 6df3d0695..ee7b3d5eb 100644
--- a/docs/_sources/rapi/stri_replace.md.txt
+++ b/docs/_sources/rapi/stri_replace.md.txt
@@ -1,4 +1,4 @@
-# stri\_replace: Replace Pattern Occurrences
+# stri_replace: Replace Pattern Occurrences
## Description
@@ -6,7 +6,7 @@ These functions replace, with the given replacement string, every/first/last sub
## Usage
-```r
+``` r
stri_replace_all(str, replacement, ..., regex, fixed, coll, charclass)
stri_replace_first(str, replacement, ..., regex, fixed, coll, charclass)
@@ -120,7 +120,9 @@ All the functions return a character vector.
The official online manual of stringi at
-Other search\_replace: [`about_search`](about_search.md), [`stri_replace_rstr()`](stri_replace_rstr.md), [`stri_trim_both()`](stri_trim.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other search_replace: [`about_search`](about_search.md), [`stri_replace_rstr()`](stri_replace_rstr.md), [`stri_trim_both()`](stri_trim.md)
## Examples
diff --git a/docs/_sources/rapi/stri_replace_na.md.txt b/docs/_sources/rapi/stri_replace_na.md.txt
index 98d0d5630..4992a870d 100644
--- a/docs/_sources/rapi/stri_replace_na.md.txt
+++ b/docs/_sources/rapi/stri_replace_na.md.txt
@@ -1,4 +1,4 @@
-# stri\_replace\_na: Replace Missing Values in a Character Vector
+# stri_replace_na: Replace Missing Values in a Character Vector
## Description
@@ -6,7 +6,7 @@ This function gives a convenient way to replace each missing (`NA`) value with a
## Usage
-```r
+``` r
stri_replace_na(str, replacement = "NA")
```
@@ -33,6 +33,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other utils: [`stri_list2matrix()`](stri_list2matrix.md), [`stri_na2empty()`](stri_na2empty.md), [`stri_remove_empty()`](stri_remove_empty.md)
## Examples
diff --git a/docs/_sources/rapi/stri_replace_rstr.md.txt b/docs/_sources/rapi/stri_replace_rstr.md.txt
index 9246549fe..614c91d36 100644
--- a/docs/_sources/rapi/stri_replace_rstr.md.txt
+++ b/docs/_sources/rapi/stri_replace_rstr.md.txt
@@ -1,4 +1,4 @@
-# stri\_replace\_rstr: Convert gsub-Style Replacement Strings
+# stri_replace_rstr: Convert gsub-Style Replacement Strings
## Description
@@ -6,7 +6,7 @@ Converts a [`gsub`](https://stat.ethz.ch/R-manual/R-devel/library/base/help/gsub
## Usage
-```r
+``` r
stri_replace_rstr(x)
```
@@ -28,4 +28,6 @@ Returns a character vector.
The official online manual of stringi at
-Other search\_replace: [`about_search`](about_search.md), [`stri_replace_all()`](stri_replace.md), [`stri_trim_both()`](stri_trim.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other search_replace: [`about_search`](about_search.md), [`stri_replace_all()`](stri_replace.md), [`stri_trim_both()`](stri_trim.md)
diff --git a/docs/_sources/rapi/stri_reverse.md.txt b/docs/_sources/rapi/stri_reverse.md.txt
index bba5dade9..fd44eda1b 100644
--- a/docs/_sources/rapi/stri_reverse.md.txt
+++ b/docs/_sources/rapi/stri_reverse.md.txt
@@ -1,4 +1,4 @@
-# stri\_reverse: Reverse Each String
+# stri_reverse: Reverse Each String
## Description
@@ -6,7 +6,7 @@ Reverses the order of the code points in every string.
## Usage
-```r
+``` r
stri_reverse(str)
```
@@ -34,6 +34,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
## Examples
diff --git a/docs/_sources/rapi/stri_sort.md.txt b/docs/_sources/rapi/stri_sort.md.txt
index 391141e98..891a1e23a 100644
--- a/docs/_sources/rapi/stri_sort.md.txt
+++ b/docs/_sources/rapi/stri_sort.md.txt
@@ -1,4 +1,4 @@
-# stri\_sort: Sorting
+# stri_sort: Sorting
## Description
@@ -6,7 +6,7 @@ This function sorts a character vector according to a locale-dependent lexicogra
## Usage
-```r
+``` r
stri_sort(str, decreasing = FALSE, na_last = NA, ..., opts_collator = NULL)
```
@@ -44,7 +44,9 @@ The result is a sorted version of `str`, i.e., a character vector.
The official online manual of stringi at
-Other locale\_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/docs/_sources/rapi/stri_sort_key.md.txt b/docs/_sources/rapi/stri_sort_key.md.txt
index c2fb56ed9..b9ebef2a4 100644
--- a/docs/_sources/rapi/stri_sort_key.md.txt
+++ b/docs/_sources/rapi/stri_sort_key.md.txt
@@ -1,4 +1,4 @@
-# stri\_sort\_key: Sort Keys
+# stri_sort_key: Sort Keys
## Description
@@ -6,7 +6,7 @@ This function computes a locale-dependent sort key, which is an alternative char
## Usage
-```r
+``` r
stri_sort_key(str, ..., opts_collator = NULL)
```
@@ -40,7 +40,9 @@ The result is a character vector with the same length as `str` that contains the
The official online manual of stringi at
-Other locale\_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/docs/_sources/rapi/stri_split.md.txt b/docs/_sources/rapi/stri_split.md.txt
index f225b37d6..cd01355bf 100644
--- a/docs/_sources/rapi/stri_split.md.txt
+++ b/docs/_sources/rapi/stri_split.md.txt
@@ -1,4 +1,4 @@
-# stri\_split: Split a String By Pattern Matches
+# stri_split: Split a String By Pattern Matches
## Description
@@ -6,7 +6,7 @@ These functions split each element in `str` into substrings. `pattern` defines t
## Usage
-```r
+``` r
stri_split(str, ..., regex, fixed, coll, charclass)
stri_split_fixed(
@@ -91,7 +91,9 @@ Otherwise, [`stri_list2matrix`](stri_list2matrix.md) with `byrow=TRUE` and `n_mi
The official online manual of stringi at
-Other search\_split: [`about_search`](about_search.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other search_split: [`about_search`](about_search.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md)
## Examples
diff --git a/docs/_sources/rapi/stri_split_boundaries.md.txt b/docs/_sources/rapi/stri_split_boundaries.md.txt
index 28a3acb64..5b845d3fc 100644
--- a/docs/_sources/rapi/stri_split_boundaries.md.txt
+++ b/docs/_sources/rapi/stri_split_boundaries.md.txt
@@ -1,4 +1,4 @@
-# stri\_split\_boundaries: Split a String at Text Boundaries
+# stri_split_boundaries: Split a String at Text Boundaries
## Description
@@ -6,7 +6,7 @@ This function locates text boundaries (like character, word, line, or sentence b
## Usage
-```r
+``` r
stri_split_boundaries(
str,
n = -1L,
@@ -52,11 +52,13 @@ Otherwise, [`stri_list2matrix`](stri_list2matrix.md) with `byrow=TRUE` and `n_mi
The official online manual of stringi at
-Other search\_split: [`about_search`](about_search.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_split()`](stri_split.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other search_split: [`about_search`](about_search.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_split()`](stri_split.md)
-Other locale\_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
-Other text\_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
+Other text_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/docs/_sources/rapi/stri_split_lines.md.txt b/docs/_sources/rapi/stri_split_lines.md.txt
index 8c20cf3f8..4a7a4f5b1 100644
--- a/docs/_sources/rapi/stri_split_lines.md.txt
+++ b/docs/_sources/rapi/stri_split_lines.md.txt
@@ -1,4 +1,4 @@
-# stri\_split\_lines: Split a String Into Text Lines
+# stri_split_lines: Split a String Into Text Lines
## Description
@@ -6,7 +6,7 @@ These functions split each character string in a given vector into text lines.
## Usage
-```r
+``` r
stri_split_lines(str, omit_empty = FALSE)
stri_split_lines1(str)
@@ -27,7 +27,7 @@ Vectorized over `str` and `omit_empty`.
Newlines are represented with the Carriage Return (CR, 0x0D), Line Feed (LF, 0x0A), CRLF, or Next Line (NEL, 0x85) characters, depending on the platform. Moreover, the Unicode Standard defines two unambiguous separator characters, the Paragraph Separator (PS, 0x2029) and the Line Separator (LS, 0x2028). Sometimes also the Vertical Tab (VT, 0x0B) and the Form Feed (FF, 0x0C) are used for this purpose.
-These stringi functions follow UTR\#18 rules, where a newline sequence corresponds to the following regular expression: `(?:\u{D A}|(?!\u{D A})[\u{A}-\u{D}\u{85}\u{2028}\u{2029}]`. Each match serves as a text line separator.
+These stringi functions follow UTR#18 rules, where a newline sequence corresponds to the following regular expression: `(?:\u{D A}|(?!\u{D A})[\u{A}-\u{D}\u{85}\u{2028}\u{2029}])`. Each match serves as a text line separator.
## Value
@@ -41,14 +41,16 @@ These stringi functions follow UTR\#18 rules, where a n
## References
-*Unicode Newline Guidelines* -- Unicode Technical Report \#13,
+*Unicode Newline Guidelines* -- Unicode Technical Report #13,
-*Unicode Regular Expressions* -- Unicode Technical Standard \#18,
+*Unicode Regular Expressions* -- Unicode Technical Standard #18,
## See Also
The official online manual of stringi at
-Other search\_split: [`about_search`](about_search.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split()`](stri_split.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
-Other text\_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
+Other search_split: [`about_search`](about_search.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split()`](stri_split.md)
+
+Other text_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
diff --git a/docs/_sources/rapi/stri_sprintf.md.txt b/docs/_sources/rapi/stri_sprintf.md.txt
index 83de4e546..9c542a434 100644
--- a/docs/_sources/rapi/stri_sprintf.md.txt
+++ b/docs/_sources/rapi/stri_sprintf.md.txt
@@ -1,4 +1,4 @@
-# stri\_sprintf: Format Strings
+# stri_sprintf: Format Strings
## Description
@@ -6,7 +6,7 @@
## Usage
-```r
+``` r
stri_sprintf(
format,
...,
@@ -88,6 +88,8 @@ The other functions return a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other length: [`%s$%()`](+25s+24+25.md), [`stri_isempty()`](stri_isempty.md), [`stri_length()`](stri_length.md), [`stri_numbytes()`](stri_numbytes.md), [`stri_pad_both()`](stri_pad.md), [`stri_width()`](stri_width.md)
## Examples
@@ -141,14 +143,14 @@ stri_printf("%+10.3f", c(-Inf, -0, 0, Inf, NaN, NA_real_),
## 💩
##
stri_sprintf("UNIX time %1$f is %1$s.", Sys.time())
-## [1] "UNIX time 1638142141.168161 is 2021-11-29 10:29:01."
+## [1] "UNIX time 1656747632.723760 is 2022-07-02 17:40:32."
# the following do not work in sprintf()
stri_sprintf("%1$#- *2$.*3$f", 1.23456, 10, 3) # two asterisks
## [1] " 1.235 "
stri_sprintf(c("%s", "%f"), pi) # re-coercion needed
## [1] "3.14159265358979" "3.141593"
stri_sprintf("%1$s is %1$f UNIX time.", Sys.time()) # re-coercion needed
-## [1] "2021-11-29 10:29:01 is 1638142141.170445 UNIX time."
+## [1] "2022-07-02 17:40:32 is 1656747632.726045 UNIX time."
stri_sprintf(c("%d", "%s"), factor(11:12)) # re-coercion needed
## [1] "1" "12"
stri_sprintf(c("%s", "%d"), factor(11:12)) # re-coercion needed
diff --git a/docs/_sources/rapi/stri_startsendswith.md.txt b/docs/_sources/rapi/stri_startsendswith.md.txt
index 923532958..f437078ef 100644
--- a/docs/_sources/rapi/stri_startsendswith.md.txt
+++ b/docs/_sources/rapi/stri_startsendswith.md.txt
@@ -1,4 +1,4 @@
-# stri\_startsendswith: Determine if the Start or End of a String Matches a Pattern
+# stri_startsendswith: Determine if the Start or End of a String Matches a Pattern
## Description
@@ -6,7 +6,7 @@ These functions check if a string starts or ends with a match to a given pattern
## Usage
-```r
+``` r
stri_startswith(str, ..., fixed, coll, charclass)
stri_endswith(str, ..., fixed, coll, charclass)
@@ -94,7 +94,9 @@ Each function returns a logical vector.
The official online manual of stringi at
-Other search\_detect: [`about_search`](about_search.md), [`stri_detect()`](stri_detect.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other search_detect: [`about_search`](about_search.md), [`stri_detect()`](stri_detect.md)
## Examples
diff --git a/docs/_sources/rapi/stri_stats_general.md.txt b/docs/_sources/rapi/stri_stats_general.md.txt
index d55418fc7..cd16e8b9c 100644
--- a/docs/_sources/rapi/stri_stats_general.md.txt
+++ b/docs/_sources/rapi/stri_stats_general.md.txt
@@ -1,4 +1,4 @@
-# stri\_stats\_general: General Statistics for a Character Vector
+# stri_stats_general: General Statistics for a Character Vector
## Description
@@ -6,7 +6,7 @@ This function gives general statistics for a character vector, e.g., obtained by
## Usage
-```r
+``` r
stri_stats_general(str)
```
@@ -44,6 +44,8 @@ Returns an integer vector with the following named elements:
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other stats: [`stri_stats_latex()`](stri_stats_latex.md)
## Examples
diff --git a/docs/_sources/rapi/stri_stats_latex.md.txt b/docs/_sources/rapi/stri_stats_latex.md.txt
index 5f3c11a6e..2da3547ab 100644
--- a/docs/_sources/rapi/stri_stats_latex.md.txt
+++ b/docs/_sources/rapi/stri_stats_latex.md.txt
@@ -1,4 +1,4 @@
-# stri\_stats\_latex: Statistics for a Character Vector Containing LaTeX Commands
+# stri_stats_latex: Statistics for a Character Vector Containing LaTeX Commands
## Description
@@ -6,7 +6,7 @@ This function gives LaTeX-oriented statistics for a character vector, e.g., obta
## Usage
-```r
+``` r
stri_stats_latex(str)
```
@@ -46,6 +46,8 @@ Returns an integer vector with the following named elements:
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other stats: [`stri_stats_general()`](stri_stats_general.md)
## Examples
diff --git a/docs/_sources/rapi/stri_sub.md.txt b/docs/_sources/rapi/stri_sub.md.txt
index b93bb24f5..48ca4a167 100644
--- a/docs/_sources/rapi/stri_sub.md.txt
+++ b/docs/_sources/rapi/stri_sub.md.txt
@@ -1,4 +1,4 @@
-# stri\_sub: Extract a Substring From or Replace a Substring In a Character Vector
+# stri_sub: Extract a Substring From or Replace a Substring In a Character Vector
## Description
@@ -8,7 +8,7 @@ For extracting/replacing multiple substrings from/within each string, see [`stri
## Usage
-```r
+``` r
stri_sub(
str,
from = 1L,
@@ -68,6 +68,8 @@ Note that for some Unicode strings, the extracted substrings might not be well-f
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other indexing: [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_locate_all()`](stri_locate.md), [`stri_sub_all()`](stri_sub_all.md)
## Examples
diff --git a/docs/_sources/rapi/stri_sub_all.md.txt b/docs/_sources/rapi/stri_sub_all.md.txt
index 03b058482..f90ae9618 100644
--- a/docs/_sources/rapi/stri_sub_all.md.txt
+++ b/docs/_sources/rapi/stri_sub_all.md.txt
@@ -1,4 +1,4 @@
-# stri\_sub\_all: Extract or Replace Multiple Substrings
+# stri_sub_all: Extract or Replace Multiple Substrings
## Description
@@ -8,7 +8,7 @@ For extracting/replacing single substrings from/within each string, see [`stri_s
## Usage
-```r
+``` r
stri_sub_all(
str,
from = list(1L),
@@ -71,6 +71,8 @@ In the replacement function, the index ranges must be sorted with respect to `fr
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other indexing: [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_locate_all()`](stri_locate.md), [`stri_sub()`](stri_sub.md)
## Examples
diff --git a/docs/_sources/rapi/stri_subset.md.txt b/docs/_sources/rapi/stri_subset.md.txt
index b2877e392..af8bd09f0 100644
--- a/docs/_sources/rapi/stri_subset.md.txt
+++ b/docs/_sources/rapi/stri_subset.md.txt
@@ -1,4 +1,4 @@
-# stri\_subset: Select Elements that Match a Given Pattern
+# stri_subset: Select Elements that Match a Given Pattern
## Description
@@ -6,7 +6,7 @@ These functions return or modify a sub-vector where there is a match to a given
## Usage
-```r
+``` r
stri_subset(str, ..., regex, fixed, coll, charclass)
stri_subset(str, ..., regex, fixed, coll, charclass) <- value
@@ -81,7 +81,9 @@ The `stri_subset_*<-` functions modifies `str` \'in-place\'.
The official online manual of stringi at
-Other search\_subset: [`about_search`](about_search.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other search_subset: [`about_search`](about_search.md)
## Examples
diff --git a/docs/_sources/rapi/stri_timezone_info.md.txt b/docs/_sources/rapi/stri_timezone_info.md.txt
index 4b804a931..1bad92325 100644
--- a/docs/_sources/rapi/stri_timezone_info.md.txt
+++ b/docs/_sources/rapi/stri_timezone_info.md.txt
@@ -1,4 +1,4 @@
-# stri\_timezone\_info: Query a Given Time Zone
+# stri_timezone_info: Query a Given Time Zone
## Description
@@ -6,7 +6,7 @@ Provides some basic information on a given time zone identifier.
## Usage
-```r
+``` r
stri_timezone_info(tz = NULL, locale = NULL, display_type = "long")
```
@@ -48,6 +48,8 @@ Returns a list with the following named components:
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_list()`](stri_timezone_list.md)
Other timezone: [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_list()`](stri_timezone_list.md)
diff --git a/docs/_sources/rapi/stri_timezone_list.md.txt b/docs/_sources/rapi/stri_timezone_list.md.txt
index 1e0e719de..25c5b597d 100644
--- a/docs/_sources/rapi/stri_timezone_list.md.txt
+++ b/docs/_sources/rapi/stri_timezone_list.md.txt
@@ -1,4 +1,4 @@
-# stri\_timezone\_list: List Available Time Zone Identifiers
+# stri_timezone_list: List Available Time Zone Identifiers
## Description
@@ -6,7 +6,7 @@ Returns a list of available time zone identifiers.
## Usage
-```r
+``` r
stri_timezone_list(region = NA_character_, offset = NA_integer_)
```
@@ -51,6 +51,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md)
Other timezone: [`stri_timezone_get()`](stri_timezone_set.md), [`stri_timezone_info()`](stri_timezone_info.md)
diff --git a/docs/_sources/rapi/stri_timezone_set.md.txt b/docs/_sources/rapi/stri_timezone_set.md.txt
index 46b142118..671d5d380 100644
--- a/docs/_sources/rapi/stri_timezone_set.md.txt
+++ b/docs/_sources/rapi/stri_timezone_set.md.txt
@@ -1,4 +1,4 @@
-# stri\_timezone\_set: Set or Get Default Time Zone in stringi
+# stri_timezone_set: Set or Get Default Time Zone in stringi
## Description
@@ -10,7 +10,7 @@ For more information on time zone representation in ICU
## Usage
-```r
+``` r
stri_timezone_get()
stri_timezone_set(tz)
@@ -44,6 +44,8 @@ Unless the default time zone has already been set using `stri_timezone_set`, the
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other datetime: [`stri_datetime_add()`](stri_datetime_add.md), [`stri_datetime_create()`](stri_datetime_create.md), [`stri_datetime_fields()`](stri_datetime_fields.md), [`stri_datetime_format()`](stri_datetime_format.md), [`stri_datetime_fstr()`](stri_datetime_fstr.md), [`stri_datetime_now()`](stri_datetime_now.md), [`stri_datetime_symbols()`](stri_datetime_symbols.md), [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
Other timezone: [`stri_timezone_info()`](stri_timezone_info.md), [`stri_timezone_list()`](stri_timezone_list.md)
diff --git a/docs/_sources/rapi/stri_trans_casemap.md.txt b/docs/_sources/rapi/stri_trans_casemap.md.txt
index a97cb3122..b4613219b 100644
--- a/docs/_sources/rapi/stri_trans_casemap.md.txt
+++ b/docs/_sources/rapi/stri_trans_casemap.md.txt
@@ -1,4 +1,4 @@
-# stri\_trans\_casemap: Transform Strings with Case Mapping or Folding
+# stri_trans_casemap: Transform Strings with Case Mapping or Folding
## Description
@@ -6,7 +6,7 @@ These functions transform strings either to lower case, UPPER CASE, or Title Cas
## Usage
-```r
+``` r
stri_trans_tolower(str, locale = NULL)
stri_trans_toupper(str, locale = NULL)
@@ -59,11 +59,13 @@ Each function returns a character vector.
The official online manual of stringi at
-Other locale\_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_unique()`](stri_unique.md), [`stri_wrap()`](stri_wrap.md)
Other transform: [`stri_trans_char()`](stri_trans_char.md), [`stri_trans_general()`](stri_trans_general.md), [`stri_trans_list()`](stri_trans_list.md), [`stri_trans_nfc()`](stri_trans_nf.md)
-Other text\_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_wrap()`](stri_wrap.md)
+Other text_boundaries: [`about_search_boundaries`](about_search_boundaries.md), [`about_search`](about_search.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_brkiter()`](stri_opts_brkiter.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_split_lines()`](stri_split_lines.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/docs/_sources/rapi/stri_trans_char.md.txt b/docs/_sources/rapi/stri_trans_char.md.txt
index cac05918c..4378ad6c4 100644
--- a/docs/_sources/rapi/stri_trans_char.md.txt
+++ b/docs/_sources/rapi/stri_trans_char.md.txt
@@ -1,4 +1,4 @@
-# stri\_trans\_char: Translate Characters
+# stri_trans_char: Translate Characters
## Description
@@ -6,7 +6,7 @@ Translates Unicode code points in each input string.
## Usage
-```r
+``` r
stri_trans_char(str, pattern, replacement)
```
@@ -40,6 +40,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other transform: [`stri_trans_general()`](stri_trans_general.md), [`stri_trans_list()`](stri_trans_list.md), [`stri_trans_nfc()`](stri_trans_nf.md), [`stri_trans_tolower()`](stri_trans_casemap.md)
## Examples
diff --git a/docs/_sources/rapi/stri_trans_general.md.txt b/docs/_sources/rapi/stri_trans_general.md.txt
index f9c3aa24a..c716ab437 100644
--- a/docs/_sources/rapi/stri_trans_general.md.txt
+++ b/docs/_sources/rapi/stri_trans_general.md.txt
@@ -1,4 +1,4 @@
-# stri\_trans\_general: General Text Transforms, Including Transliteration
+# stri_trans_general: General Text Transforms, Including Transliteration
## Description
@@ -14,7 +14,7 @@
## Usage
-```r
+``` r
stri_trans_general(str, id, rules = FALSE, forward = TRUE)
```
@@ -53,6 +53,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other transform: [`stri_trans_char()`](stri_trans_char.md), [`stri_trans_list()`](stri_trans_list.md), [`stri_trans_nfc()`](stri_trans_nf.md), [`stri_trans_tolower()`](stri_trans_casemap.md)
## Examples
diff --git a/docs/_sources/rapi/stri_trans_list.md.txt b/docs/_sources/rapi/stri_trans_list.md.txt
index 630f31f6d..f6d6c4f93 100644
--- a/docs/_sources/rapi/stri_trans_list.md.txt
+++ b/docs/_sources/rapi/stri_trans_list.md.txt
@@ -1,4 +1,4 @@
-# stri\_trans\_list: List Available Text Transforms and Transliterators
+# stri_trans_list: List Available Text Transforms and Transliterators
## Description
@@ -6,7 +6,7 @@ Returns a list of available text transform identifiers. Each of them may be used
## Usage
-```r
+``` r
stri_trans_list()
```
@@ -26,6 +26,8 @@ Returns a character vector.
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other transform: [`stri_trans_char()`](stri_trans_char.md), [`stri_trans_general()`](stri_trans_general.md), [`stri_trans_nfc()`](stri_trans_nf.md), [`stri_trans_tolower()`](stri_trans_casemap.md)
## Examples
diff --git a/docs/_sources/rapi/stri_trans_nf.md.txt b/docs/_sources/rapi/stri_trans_nf.md.txt
index a065c1474..f65fdb8ec 100644
--- a/docs/_sources/rapi/stri_trans_nf.md.txt
+++ b/docs/_sources/rapi/stri_trans_nf.md.txt
@@ -1,12 +1,12 @@
-# stri\_trans\_nf: Perform or Check For Unicode Normalization
+# stri_trans_nf: Perform or Check For Unicode Normalization
## Description
-These functions convert strings to NFC, NFKC, NFD, NFKD, or NFKC\_Casefold Unicode Normalization Form or check whether strings are normalized.
+These functions convert strings to NFC, NFKC, NFD, NFKD, or NFKC_Casefold Unicode Normalization Form or check whether strings are normalized.
## Usage
-```r
+``` r
stri_trans_nfc(str)
stri_trans_nfd(str)
@@ -48,9 +48,9 @@ The following Normalization Forms (NFs) are supported:
- NFKD (Compatibility Decomposition),
-- NFKC\_Casefold (combination of NFKC, case folding, and removing ignorable characters which was introduced with Unicode 5.2).
+- NFKC_Casefold (combination of NFKC, case folding, and removing ignorable characters which was introduced with Unicode 5.2).
-Note that many W3C Specifications recommend using NFC for all content, because this form avoids potential interoperability problems arising from the use of canonically equivalent, yet different, character sequences in document formats on the Web. Thus, you will rather not use these functions in typical string processing activities. Most often you may assume that a string is in NFC, see RFC\\\#5198.
+Note that many W3C Specifications recommend using NFC for all content, because this form avoids potential interoperability problems arising from the use of canonically equivalent, yet different, character sequences in document formats on the Web. Thus, you will rarely need these functions in typical string processing activities. Most often, you may assume that a string is in NFC; see RFC5198.
As usual in stringi, if the input character vector is in the native encoding, it will be automatically converted to UTF-8.
@@ -68,9 +68,9 @@ The `stri_trans_nf*` functions return a character vector of the same length as i
## References
-*Unicode Normalization Forms* -- Unicode Standard Annex \#15,
+*Unicode Normalization Forms* -- Unicode Standard Annex #15,
-*Unicode Format for Network Interchange* -- RFC\\\#5198,
+*Unicode Format for Network Interchange* -- RFC5198,
*Character Model for the World Wide Web 1.0: Normalization* -- W3C Working Draft,
@@ -82,6 +82,8 @@ The `stri_trans_nf*` functions return a character vector of the same length as i
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other transform: [`stri_trans_char()`](stri_trans_char.md), [`stri_trans_general()`](stri_trans_general.md), [`stri_trans_list()`](stri_trans_list.md), [`stri_trans_tolower()`](stri_trans_casemap.md)
## Examples
diff --git a/docs/_sources/rapi/stri_trim.md.txt b/docs/_sources/rapi/stri_trim.md.txt
index 9a209a087..897efdbe2 100644
--- a/docs/_sources/rapi/stri_trim.md.txt
+++ b/docs/_sources/rapi/stri_trim.md.txt
@@ -1,4 +1,4 @@
-# stri\_trim: Trim Characters from the Left and/or Right Side of a String
+# stri_trim: Trim Characters from the Left and/or Right Side of a String
## Description
@@ -6,7 +6,7 @@ These functions may be used, e.g., to remove unnecessary white-spaces from strin
## Usage
-```r
+``` r
stri_trim_both(str, pattern = "\\P{Wspace}", negate = FALSE)
stri_trim_left(str, pattern = "\\P{Wspace}", negate = FALSE)
@@ -56,9 +56,11 @@ All functions return a character vector.
The official online manual of stringi at
-Other search\_replace: [`about_search`](about_search.md), [`stri_replace_all()`](stri_replace.md), [`stri_replace_rstr()`](stri_replace_rstr.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other search_replace: [`about_search`](about_search.md), [`stri_replace_all()`](stri_replace.md), [`stri_replace_rstr()`](stri_replace_rstr.md)
-Other search\_charclass: [`about_search_charclass`](about_search_charclass.md), [`about_search`](about_search.md)
+Other search_charclass: [`about_search_charclass`](about_search_charclass.md), [`about_search`](about_search.md)
## Examples
diff --git a/docs/_sources/rapi/stri_unescape_unicode.md.txt b/docs/_sources/rapi/stri_unescape_unicode.md.txt
index cc11aad01..a8e1da341 100644
--- a/docs/_sources/rapi/stri_unescape_unicode.md.txt
+++ b/docs/_sources/rapi/stri_unescape_unicode.md.txt
@@ -1,4 +1,4 @@
-# stri\_unescape\_unicode: Un-escape All Escape Sequences
+# stri_unescape_unicode: Un-escape All Escape Sequences
## Description
@@ -6,7 +6,7 @@ Un-escapes all known escape sequences
## Usage
-```r
+``` r
stri_unescape_unicode(str)
```
@@ -38,6 +38,8 @@ Returns a character vector. If an escape sequence is ill-formed, result will be
The official online manual of stringi at
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
Other escape: [`stri_escape_unicode()`](stri_escape_unicode.md)
## Examples
diff --git a/docs/_sources/rapi/stri_unique.md.txt b/docs/_sources/rapi/stri_unique.md.txt
index 50dc20f54..f4f9f8e3d 100644
--- a/docs/_sources/rapi/stri_unique.md.txt
+++ b/docs/_sources/rapi/stri_unique.md.txt
@@ -1,4 +1,4 @@
-# stri\_unique: Extract Unique Elements
+# stri_unique: Extract Unique Elements
## Description
@@ -6,7 +6,7 @@ This function returns a character vector like `str`, but with duplicate elements
## Usage
-```r
+``` r
stri_unique(str, ..., opts_collator = NULL)
```
@@ -40,7 +40,9 @@ Returns a character vector.
The official online manual of stringi at
-Other locale\_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
+Gagolewski M., stringi: Fast and portable character string processing in R, *Journal of Statistical Software* 103(2), 2022, 1-59, doi:
+
+Other locale_sensitive: [`%s<%()`](+25s+3C+25.md), [`about_locale`](about_locale.md), [`about_search_boundaries`](about_search_boundaries.md), [`about_search_coll`](about_search_coll.md), [`stri_compare()`](stri_compare.md), [`stri_count_boundaries()`](stri_count_boundaries.md), [`stri_duplicated()`](stri_duplicated.md), [`stri_enc_detect2()`](stri_enc_detect2.md), [`stri_extract_all_boundaries()`](stri_extract_boundaries.md), [`stri_locate_all_boundaries()`](stri_locate_boundaries.md), [`stri_opts_collator()`](stri_opts_collator.md), [`stri_order()`](stri_order.md), [`stri_rank()`](stri_rank.md), [`stri_sort_key()`](stri_sort_key.md), [`stri_sort()`](stri_sort.md), [`stri_split_boundaries()`](stri_split_boundaries.md), [`stri_trans_tolower()`](stri_trans_casemap.md), [`stri_wrap()`](stri_wrap.md)
## Examples
diff --git a/docs/_sources/rapi/stri_width.md.txt b/docs/_sources/rapi/stri_width.md.txt
index 6a2972ef0..c89a8229b 100644
--- a/docs/_sources/rapi/stri_width.md.txt
+++ b/docs/_sources/rapi/stri_width.md.txt
@@ -1,4 +1,4 @@
-# stri\_width: Determine the Width of Code Points
+# stri_width: Determine the Width of Code Points
## Description
@@ -6,7 +6,7 @@ Approximates the number of text columns the \'cat()\' function might use to prin
## Usage
-```r
+``` r
stri_width(str)
```
@@ -18,7 +18,7 @@ stri_width(str)
## Details
-The Unicode standard does not formalize the notion of a character width. Roughly based on