From f23c5c93085926f774ec72602a919e189d0c6947 Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Tue, 11 Mar 2025 13:51:08 -0700 Subject: [PATCH 001/117] docs: link-ify `vignette("backtesting", package="epipredict")` `downlit` fails to link these, maybe because we're not on CRAN yet. --- R/methods-epi_archive.R | 3 ++- man/epix_slide.Rd | 2 +- vignettes/epi_archive.Rmd | 3 ++- 3 files changed, 5 insertions(+), 3 deletions(-) diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index 362fd4eaa..4a8cd164b 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -629,7 +629,8 @@ epix_detailed_restricted_mutate <- function(.data, ...) { #' version-aware: the sliding computation at any given reference time t is #' performed on **data that would have been available as of t**. This function #' is intended for use in accurate backtesting of models; see -#' `vignette("backtesting", package="epipredict")` for a walkthrough. +#' \href{https://cmu-delphi.github.io/epipredict/articles/backtesting.html}{`vignette("backtesting", +#' package="epipredict")`} for a walkthrough. #' #' @param .x An [`epi_archive`] or [`grouped_epi_archive`] object. If ungrouped, #' all data in `x` will be treated as part of a single data group. diff --git a/man/epix_slide.Rd b/man/epix_slide.Rd index cadb19833..1d4009cb5 100644 --- a/man/epix_slide.Rd +++ b/man/epix_slide.Rd @@ -115,7 +115,7 @@ behaves similarly to \code{epi_slide()}, with the key exception that it is version-aware: the sliding computation at any given reference time t is performed on \strong{data that would have been available as of t}. This function is intended for use in accurate backtesting of models; see -\code{vignette("backtesting", package="epipredict")} for a walkthrough. +\href{https://cmu-delphi.github.io/epipredict/articles/backtesting.html}{\code{vignette("backtesting", package="epipredict")}} for a walkthrough. 
} \details{ A few key distinctions between the current function and \code{epi_slide()}: diff --git a/vignettes/epi_archive.Rmd b/vignettes/epi_archive.Rmd index f87ea2915..90eadad66 100644 --- a/vignettes/epi_archive.Rmd +++ b/vignettes/epi_archive.Rmd @@ -234,7 +234,8 @@ observation carried forward (LOCF). For more information, see `epix_merge()`. ## Backtesting forecasting models One of the most common use cases of `epiprocess::epi_archive()` object is for -accurate model backtesting. See `vignette("backtesting", package="epipredict")` +accurate model backtesting. See [`vignette("backtesting", +package="epipredict")`](https://cmu-delphi.github.io/epipredict/articles/backtesting.html) for an in-depth demo, using a pre-built forecaster in that package. ## Attribution From 4f90dda7621d0451a6105deb359d659345ce037c Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Tue, 11 Mar 2025 14:05:32 -0700 Subject: [PATCH 002/117] docs: fix `epi_slide()` column packing note --- R/slide.R | 8 +++++--- man/epi_slide.Rd | 8 +++++--- 2 files changed, 10 insertions(+), 6 deletions(-) diff --git a/R/slide.R b/R/slide.R index abc7c3b77..abc93259b 100644 --- a/R/slide.R +++ b/R/slide.R @@ -19,9 +19,11 @@ #' @template basic-slide-params #' @param .f Function, formula, or missing; together with `...` specifies the #' computation to slide. The return of the computation should either be a -#' scalar or a 1-row data frame. Data frame returns will be -#' `tidyr::unpack()`-ed, if named, and will be [`tidyr::pack`]-ed columns, if -#' not named. See examples. +#' scalar or a 1-row data frame; these outputs will be collected and form a +#' new column or columns in the `epi_slide()` result. Data frame returns will +#' be unpacked into multiple columns in the result by default, or +#' [`tidyr::pack`]ed into a single data-frame-type column if you provide a +#' name for such a column. See examples. #' #' - If `.f` is missing, then `...` will specify the computation via #' tidy-evaluation. 
This is usually the most convenient way to use diff --git a/man/epi_slide.Rd b/man/epi_slide.Rd index f053909db..f5b9f1c83 100644 --- a/man/epi_slide.Rd +++ b/man/epi_slide.Rd @@ -22,9 +22,11 @@ and any columns in \code{other_keys}. If grouped, we make sure the grouping is b \item{.f}{Function, formula, or missing; together with \code{...} specifies the computation to slide. The return of the computation should either be a -scalar or a 1-row data frame. Data frame returns will be -\code{tidyr::unpack()}-ed, if named, and will be \code{\link[tidyr:pack]{tidyr::pack}}-ed columns, if -not named. See examples. +scalar or a 1-row data frame; these outputs will be collected and form a +new column or columns in the \code{epi_slide()} result. Data frame returns will +be unpacked into multiple columns in the result by default, or +\code{\link[tidyr:pack]{tidyr::pack}}ed into a single data-frame-type column if you provide a +name for such a column. See examples. \itemize{ \item If \code{.f} is missing, then \code{...} will specify the computation via tidy-evaluation. This is usually the most convenient way to use From 51ce893d734f160e7ebb88347b3ad2dfa5a9cded Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Wed, 12 Mar 2025 12:40:33 -0700 Subject: [PATCH 003/117] docs: iterate on `epi_slide(.f, ...)` roxygen --- R/slide.R | 57 +++++++++++++++++++++++++++++++++--------------- man/epi_slide.Rd | 45 ++++++++++++++++++++++++-------------- 2 files changed, 68 insertions(+), 34 deletions(-) diff --git a/R/slide.R b/R/slide.R index abc93259b..3cf92865b 100644 --- a/R/slide.R +++ b/R/slide.R @@ -17,28 +17,49 @@ #' See `vignette("epi_df")` for more examples. #' #' @template basic-slide-params -#' @param .f Function, formula, or missing; together with `...` specifies the -#' computation to slide. The return of the computation should either be a -#' scalar or a 1-row data frame; these outputs will be collected and form a -#' new column or columns in the `epi_slide()` result. 
Data frame returns will -#' be unpacked into multiple columns in the result by default, or +#' @param .f,... The computation to slide. The input will be a time window of +#' the data for a single `geo_value` --- or a single combination of +#' `geo_value` and any [`other_keys`][as_epi_df] you used to specify +#' demographical breakdowns. The input will always have the same size, +#' determined by `.window_size`, and will fill in any missing `time_values`, +#' using `NA` values for missing measurements. The output should be a scalar +#' value or a 1-row data frame; these outputs will be collected and form a new +#' column or columns in the `epi_slide()` result. Data frame outputs will be +#' unpacked into multiple columns in the result by default, or #' [`tidyr::pack`]ed into a single data-frame-type column if you provide a -#' name for such a column. See examples. +#' name for such a column (e.g., via `.new_col_name`). +#' +#' You can specify the computation in one of the following ways: +#' +#' - Don't provide `.f`, and instead use one or more +#' [`dplyr::summarize`]-esque ["data-masking"][rlang::args_data_masking] +#' expressions in `...`, e.g., `cases_7dmed = median(cases)`. This is usually +#' the most convenient way to use `epi_slide`. See examples. +#' +#' - Provide a formula in `.f`, e.g., `~ median(.x$cases)`. In this formula, +#' `.x` is an `epi_df` containing data for a single time window as described +#' above, taken from the original `.x` fed into `epi_slide()`. +#' +#' - Provide a function in `.f`. The function should be of the form `function(x, +#' g, t)` or `function(x, g, t, <additional arguments>)`, where: #' -#' - If `.f` is missing, then `...` will specify the computation via -#' tidy-evaluation. This is usually the most convenient way to use -#' `epi_slide`. See examples. -#' - If `.f` is a formula, then the formula should use `.x` (not the same as -#' the input `epi_df`) to operate on the columns of the input `epi_df`, e.g.
-#' `~mean(.x$var)` to compute a mean of `var`. -#' - If a function, `.f` must have the form `function(x, g, t, ...)`, where: #' - `x` is a data frame with the same column names as the original object, -#' minus any grouping variables, with only the windowed data for one -#' group-`.ref_time_value` combination -#' - `g` is a one-row tibble containing the values of the grouping variables -#' for the associated group +#' minus any grouping variables, with only the windowed data for one +#' group-`.ref_time_value` combination +#' +#' - `g` is a one-row tibble specifying the `geo_value` and value of any +#' `other_keys` for this computation +#' #' - `t` is the `.ref_time_value` for the current window -#' - `...` are additional arguments +#' +#' - If you have a complex `.f` containing `<additional arguments>`, you can provide values for those arguments in the `...` +#' argument to `epi_slide()`. +#' +#' The values of `g` and `t` are also available to data-masking and +#' formula-based computations as `.group_key` and `.ref_time_value`, +#' respectively. Formula computations can also refer to them as `.y` and `.z`, +#' respectively. #' #' @param ... Additional arguments to pass to the function or formula specified #' via `.f`. Alternatively, if `.f` is missing, then the `...` is interpreted diff --git a/man/epi_slide.Rd b/man/epi_slide.Rd index f5b9f1c83..0405f3600 100644 --- a/man/epi_slide.Rd +++ b/man/epi_slide.Rd @@ -20,30 +20,43 @@ epi_slide( and any columns in \code{other_keys}. If grouped, we make sure the grouping is by \code{geo_value} and \code{other_keys}.} -\item{.f}{Function, formula, or missing; together with \code{...} specifies the -computation to slide. The return of the computation should either be a -scalar or a 1-row data frame; these outputs will be collected and form a -new column or columns in the \code{epi_slide()} result. Data frame returns will -be unpacked into multiple columns in the result by default, or +\item{.f, ...}{The computation to slide.
The input will be a time window of +the data for a single \code{geo_value} --- or a single combination of +\code{geo_value} and any \code{\link[=as_epi_df]{other_keys}} you used to specify +demographical breakdowns. The input will always have the same size, +determined by \code{.window_size}, and will fill in any missing \code{time_values}, +using \code{NA} values for missing measurements. The output should be a scalar +value or a 1-row data frame; these outputs will be collected and form a new +column or columns in the \code{epi_slide()} result. Data frame outputs will be +unpacked into multiple columns in the result by default, or \code{\link[tidyr:pack]{tidyr::pack}}ed into a single data-frame-type column if you provide a -name for such a column. See examples. +name for such a column (e.g., via \code{.new_col_name}). + +You can specify the computation in one of the following ways: \itemize{ -\item If \code{.f} is missing, then \code{...} will specify the computation via -tidy-evaluation. This is usually the most convenient way to use -\code{epi_slide}. See examples. -\item If \code{.f} is a formula, then the formula should use \code{.x} (not the same as -the input \code{epi_df}) to operate on the columns of the input \code{epi_df}, e.g. -\code{~mean(.x$var)} to compute a mean of \code{var}. -\item If a function, \code{.f} must have the form \verb{function(x, g, t, ...)}, where: +\item Don't provide \code{.f}, and instead use one or more +\code{\link[dplyr:summarise]{dplyr::summarize}}-esque \link[rlang:args_data_masking]{"data-masking"} +expressions in \code{...}, e.g., \code{cases_7dmed = median(cases)}. This is usually +the most convenient way to use \code{epi_slide}. See examples. +\item Provide a formula in \code{.f}, e.g., \code{~ median(.x$cases)}. In this formula, +\code{.x} is an \code{epi_df} containing data for a single time window as described +above, taken from the original \code{.x} fed into \code{epi_slide()}.
+\item Provide a function in \code{.f}. The function should be of the form \verb{function(x, g, t)} or \verb{function(x, g, t, <additional arguments>)}, where: \itemize{ \item \code{x} is a data frame with the same column names as the original object, minus any grouping variables, with only the windowed data for one group-\code{.ref_time_value} combination -\item \code{g} is a one-row tibble containing the values of the grouping variables -for the associated group +\item \code{g} is a one-row tibble specifying the \code{geo_value} and value of any +\code{other_keys} for this computation \item \code{t} is the \code{.ref_time_value} for the current window -\item \code{...} are additional arguments +\item If you have a complex \code{.f} containing \verb{<additional arguments>}, you can provide values for those arguments in the \code{...} +argument to \code{epi_slide()}. } + +The values of \code{g} and \code{t} are also available to data-masking and +formula-based computations as \code{.group_key} and \code{.ref_time_value}, +respectively. Formula computations can also refer to them as \code{.y} and \code{.z}, +respectively. }} \item{...}{Additional arguments to pass to the function or formula specified From 3868c7ddfb1996a73091a71182cab5160c48b64d Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Wed, 12 Mar 2025 15:30:26 -0700 Subject: [PATCH 004/117] docs: iterate on `?epi_slide` intro --- R/slide.R | 11 ++++++----- man/epi_slide.Rd | 16 ++++++++-------- 2 files changed, 14 insertions(+), 13 deletions(-) diff --git a/R/slide.R b/R/slide.R index 3cf92865b..017359cf3 100644 --- a/R/slide.R +++ b/R/slide.R @@ -1,9 +1,10 @@ -#' Slide a function over variables in an `epi_df` object +#' More general form of [`epi_slide_opt`] for rolling/running computations #' -#' @description Slides a given function over variables in an `epi_df` object. -#' This is useful for computations like rolling averages.
The function supports -#' many ways to specify the computation, but by far the most common use case is -#' as follows: +#' Check first whether you can use [`epi_slide_mean`], [`epi_slide_sum`], or the +#' medium-generality [`epi_slide_opt`] instead, as they are faster and more +#' convenient to use. You typically only need to use `epi_slide()` if you have a +#' computation that depends on multiple columns simultaneously, outputs multiple +#' columns simultaneously, or produces non-numeric output. #' #' ``` #' # Create new column `cases_7dmed` that contains a 7-day trailing median of cases diff --git a/man/epi_slide.Rd b/man/epi_slide.Rd index 0405f3600..1146da313 100644 --- a/man/epi_slide.Rd +++ b/man/epi_slide.Rd @@ -2,7 +2,7 @@ % Please edit documentation in R/slide.R \name{epi_slide} \alias{epi_slide} -\title{Slide a function over variables in an \code{epi_df} object} +\title{More general form of \code{\link{epi_slide_opt}} for rolling/running computations} \usage{ epi_slide( .x, @@ -110,11 +110,13 @@ added. It will be ungrouped if \code{.x} was ungrouped, and have the same groups as \code{.x} if \code{.x} was grouped. } \description{ -Slides a given function over variables in an \code{epi_df} object. -This is useful for computations like rolling averages. The function supports -many ways to specify the computation, but by far the most common use case is -as follows: - +Check first whether you can use \code{\link{epi_slide_mean}}, \code{\link{epi_slide_sum}}, or the +medium-generality \code{\link{epi_slide_opt}} instead, as they are faster and more +convenient to use. You typically only need to use \code{epi_slide()} if you have a +computation that depends on multiple columns simultaneously, outputs multiple +columns simultaneously, or produces non-numeric output. +} +\details{ \if{html}{\out{
}}\preformatted{# Create new column `cases_7dmed` that contains a 7-day trailing median of cases epi_slide(edf, cases_7dmed = median(cases), .window_size = 7) }\if{html}{\out{
}} @@ -124,8 +126,6 @@ faster than \code{epi_slide}: \code{epi_slide_mean()} and \code{epi_slide_sum()} recommend using these functions when possible. See \code{vignette("epi_df")} for more examples. -} -\details{ \subsection{Advanced uses of \code{.f} via tidy evaluation}{ If specifying \code{.f} via tidy evaluation, in addition to the standard \code{\link{.data}} From 618f84129daeb8c2ad5dc71fedfb4c8138ff47b9 Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Wed, 12 Mar 2025 17:01:01 -0700 Subject: [PATCH 005/117] docs: iterate on `epix_slide()` roxygen --- R/methods-epi_archive.R | 197 ++++++++++++++++++++-------------------- man/epix_slide.Rd | 192 +++++++++++++++++++-------------------- 2 files changed, 191 insertions(+), 198 deletions(-) diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index 4a8cd164b..d6e2815e5 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -622,67 +622,83 @@ epix_detailed_restricted_mutate <- function(.data, ...) { } -#' Slide a function over variables in an `epi_archive` or `grouped_epi_archive` +#' Take each requested (group and) version in an archive, run a computation (e.g., forecast) #' -#' Slides a given function over variables in an `epi_archive` object. This -#' behaves similarly to `epi_slide()`, with the key exception that it is -#' version-aware: the sliding computation at any given reference time t is -#' performed on **data that would have been available as of t**. This function -#' is intended for use in accurate backtesting of models; see +#' ... and collect the results. This is useful for more accurately simulating +#' how a forecaster, nowcaster, or other algorithm would have behaved in real +#' time, factoring in reporting latency and data revisions; see #' \href{https://cmu-delphi.github.io/epipredict/articles/backtesting.html}{`vignette("backtesting", #' package="epipredict")`} for a walkthrough. 
#' +#' This is similar to looping over versions and calling [`epix_as_of`], but has +#' some conveniences such as working naturally with [`grouped_epi_archive`]s, +#' optional time windowing, and syntactic sugar to make things shorter to write. +#' #' @param .x An [`epi_archive`] or [`grouped_epi_archive`] object. If ungrouped, #' all data in `x` will be treated as part of a single data group. #' @param .f Function, formula, or missing; together with `...` specifies the -#' computation to slide. To "slide" means to apply a computation over a -#' sliding (a.k.a. "rolling") time window for each data group. The window is -#' determined by the `.before` parameter (see details for more). If a -#' function, `.f` must have the form `function(x, g, t, ...)`, where -#' -#' - "x" is an epi_df with the same column names as the archive's `DT`, minus -#' the `version` column -#' - "g" is a one-row tibble containing the values of the grouping variables -#' for the associated group -#' - "t" is the ref_time_value for the current window -#' - "..." are additional arguments +#' computation. The computation will be run on each requested group-version +#' combination, with a time window filter applied if `.before` is supplied. +#' +#' - If `.f` is a function, it must have the form `function(x, g, v)` or +#' `function(x, g, v, <additional arguments>)`, where +#' +#' - `x` is an `epi_df` with the same column names as the archive's `DT`, +#' minus the `version` column. (Or, if `.all_versions = TRUE`, an +#' `epi_archive` with the requested partial version history.) +#' +#' - `g` is a one-row tibble containing the values of the grouping variables +#' for the associated group. +#' +#' - `v` (length-1) is the associated `version` (one of the requested +#' `.versions`) +#' +#' - `<additional arguments>` are optional; you can add such +#' arguments to your function and set them by passing them through the +#' `...` argument to `epix_slide()`.
#' #' If a formula, `.f` can operate directly on columns accessed via `.x$var` or #' `.$var`, as in `~ mean (.x$var)` to compute a mean of a column `var` for #' each group-`ref_time_value` combination. The group key can be accessed via -#' `.y` or `.group_key`, and the reference time value can be accessed via `.z` -#' or `.ref_time_value`. If `.f` is missing, then `...` will specify the -#' computation. +#' `.y` or `.group_key`, and the reference time value can be accessed via +#' `.z`, `.version`, or `.ref_time_value`. If `.f` is missing, then `...` will +#' specify the computation. #' @param ... Additional arguments to pass to the function or formula specified #' via `f`. Alternatively, if `.f` is missing, then the `...` is interpreted #' as a ["data-masking"][rlang::args_data_masking] expression or expressions #' for tidy evaluation; in addition to referring columns directly by name, the #' expressions have access to `.data` and `.env` pronouns as in `dplyr` verbs, #' and can also refer to `.x` (not the same as the input epi_archive), -#' `.group_key`, and `.ref_time_value`. See details for more. -#' @param .before How many time values before the `.ref_time_value` -#' should each snapshot handed to the function `.f` contain? If provided, it -#' should be a single value that is compatible with the time_type of the -#' time_value column (more below), but most commonly an integer. This window -#' endpoint is inclusive. For example, if `.before = 7`, `time_type` -#' in the archive is "day", and the `.ref_time_value` is January 8, then the -#' smallest time_value in the snapshot will be January 1. If missing, then the -#' default is no limit on the time values, so the full snapshot is given. -#' @param .versions Reference time values / versions for sliding -#' computations; each element of this vector serves both as the anchor point -#' for the `time_value` window for the computation and the `max_version` -#' `epix_as_of` which we fetch data in this window. 
If missing, then this will -#' set to a regularly-spaced sequence of values set to cover the range of -#' `version`s in the `DT` plus the `versions_end`; the spacing of values will -#' be guessed (using the GCD of the skips between values). +#' `.group_key` and `.version`/`.ref_time_value`. See details for more. +#' @param .before Optional; applies a `time_value` filter before running each +#' computation. The default is not to apply a `time_value` filter. If +#' provided, it should be a single integer or difftime that is compatible with +#' the time_type of the time_value column. If an integer, then the minimum +#' possible `time_value` included will be that many time steps (according to +#' the `time_type`) before each requested `.version`. This window endpoint is +#' inclusive. For example, if `.before = 14`, the `time_type` in the archive +#' is "day", and the requested `.version` is January 15, then the smallest +#' possible `time_value` in the snapshot will be January 1. Note that +#' this does not mean that there will be 14 or 15 distinct `time_value`s +#' actually appearing in the data; for most reporting streams, reporting as of +#' January 15 won't include `time_value`s all the way through January 14, due +#' to reporting latency. Unlike `epi_slide()`, `epix_slide()` won't fill in +#' any missing `time_values` in this window. +#' @param .versions Requested versions on which to run the computation. Each +#' requested `.version` also serves as the anchor point around which +#' the `time_value` window specified by `.before` is drawn. If `.versions` is +#' missing, it will be set to a regularly-spaced sequence of values set to +#' cover the range of `version`s in the `DT` plus the `versions_end`; the +#' spacing of values will be guessed (using the GCD of the skips between +#' values). #' @param .new_col_name Either `NULL` or a string indicating the name of the new #' column that will contain the derived values.
The default, `NULL`, will use #' the name "slide_value" unless your slide computations output data frames, -#' in which case they will be unpacked into the constituent columns and those -#' names used. If the resulting column name(s) overlap with the column names -#' used for labeling the computations, which are `group_vars(x)` and -#' `"version"`, then the values for these columns must be identical to the -#' labels we assign. +#' in which case they will be unpacked into the constituent columns and the +#' data frame's column names will be used instead. If the resulting column +#' name(s) overlap with the column names used for labeling the computations, +#' which are `group_vars(x)` and `"version"`, then the values for these +#' columns must be identical to the labels we assign. #' @param .all_versions (Not the same as `.all_rows` parameter of `epi_slide`.) #' If `.all_versions = TRUE`, then the slide computation will be passed the #' version history (all `version <= .version` where `.version` is one of the @@ -697,16 +713,17 @@ epix_detailed_restricted_mutate <- function(.data, ...) { #' @details A few key distinctions between the current function and `epi_slide()`: #' 1. In `.f` functions for `epix_slide`, one should not assume that the input #' data to contain any rows with `time_value` matching the computation's -#' `.ref_time_value` (accessible via `attributes()$metadata$as_of`); for -#' typical epidemiological surveillance data, observations pertaining to a -#' particular time period (`time_value`) are first reported `as_of` some -#' instant after that time period has ended. +#' `.version`, due to reporting latency; for typical epidemiological +#' surveillance data, observations pertaining to a particular time period +#' (`time_value`) are first reported `as_of` some instant after that time +#' period has ended. No time window completion is performed as in +#' `epi_slide()`. #' 2. 
The input class and columns are similar but different: `epix_slide` #' (with the default `.all_versions=FALSE`) keeps all columns and the #' `epi_df`-ness of the first argument to each computation; `epi_slide` only #' provides the grouping variables in the second input, and will convert the #' first input into a regular tibble if the grouping variables include the -#' essential `geo_value` column. (With .all_versions=TRUE`, `epix_slide` will +#' essential `geo_value` column. (With `.all_versions=TRUE`, `epix_slide` will #' will provide an `epi_archive` rather than an `epi-df` to each #' computation.) #' 3. The output class and columns are similar but different: `epix_slide()` @@ -726,75 +743,55 @@ epix_detailed_restricted_mutate <- function(.data, ...) { #' computations are allowed more flexibility in their outputs than in #' `epi_slide`, we can't guess a good representation for missing computations #' for excluded group-`.ref_time_value` pairs. -#' 76. The `.versions` default for `epix_slide` is based on making an +#' 6. The `.versions` default for `epix_slide` is based on making an #' evenly-spaced sequence out of the `version`s in the `DT` plus the #' `versions_end`, rather than the `time_value`s. +#' 7. `epix_slide()` computations can refer to the current element of +#' `.versions` as either `.version` or `.ref_time_value`, while `epi_slide()` +#' computations refer to the current element of `.ref_time_values` with +#' `.ref_time_value`. #' #' Apart from the above distinctions, the interfaces between `epix_slide()` and #' `epi_slide()` are the same. #' -#' Furthermore, the current function can be considerably slower than -#' `epi_slide()`, for two reasons: (1) it must repeatedly fetch -#' properly-versioned snapshots from the data archive (via `epix_as_of()`), -#' and (2) it performs a "manual" sliding of sorts, and does not benefit from -#' the highly efficient `slider` package. 
For this reason, it should never be -#' used in place of `epi_slide()`, and only used when version-aware sliding is -#' necessary (as it its purpose). -#' #' @examples #' library(dplyr) #' -#' # Reference time points for which we want to compute slide values: -#' versions <- seq(as.Date("2020-06-02"), -#' as.Date("2020-06-15"), -#' by = "1 day" -#' ) +#' # Request only a small set of versions, for example's sake: +#' requested_versions <- +#' seq(as.Date("2020-09-02"), as.Date("2020-09-15"), by = "1 day") #' -#' # A simple (but not very useful) example (see the archive vignette for a more -#' # realistic one): +#' # Investigate reporting lag of `percent_cli` signal (though normally we'd +#' # probably work off of the dedicated `revision_summary()` function instead): #' archive_cases_dv_subset %>% -#' group_by(geo_value) %>% #' epix_slide( -#' .f = ~ mean(.x$case_rate_7d_av), -#' .before = 2, -#' .versions = versions, -#' .new_col_name = "case_rate_7d_av_recent_av" -#' ) %>% -#' ungroup() -#' # We requested time windows that started 2 days before the corresponding time -#' # values. The actual number of `time_value`s in each computation depends on -#' # the reporting latency of the signal and `time_value` range covered by the -#' # archive (2020-06-01 -- 2021-11-30 in this example). In this case, we have -#' # * 0 `time_value`s, for ref time 2020-06-01 --> the result is automatically -#' # discarded -#' # * 1 `time_value`, for ref time 2020-06-02 -#' # * 2 `time_value`s, for the rest of the results -#' # * never the 3 `time_value`s we would get from `epi_slide`, since, because -#' # of data latency, we'll never have an observation -#' # `time_value == .ref_time_value` as of `.ref_time_value`. -#' # The example below shows this type of behavior in more detail. -#' -#' # Examining characteristics of the data passed to each computation with -#' # `all_versions=FALSE`. 
+#' geowide_percent_cli_max_time = max(time_value[!is.na(percent_cli)]), +#' geowide_percent_cli_rpt_lag = .version - geowide_percent_cli_max_time, +#' .versions = requested_versions +#' ) #' archive_cases_dv_subset %>% #' group_by(geo_value) %>% #' epix_slide( -#' function(x, gk, rtv) { -#' tibble( -#' time_range = if (nrow(x) == 0L) { -#' "0 `time_value`s" -#' } else { -#' sprintf("%s -- %s", min(x$time_value), max(x$time_value)) -#' }, -#' n = nrow(x), -#' class1 = class(x)[[1L]] -#' ) -#' }, -#' .before = 5, .all_versions = FALSE, -#' .versions = versions -#' ) %>% -#' ungroup() %>% -#' arrange(geo_value, version) +#' percent_cli_max_time = max(time_value[!is.na(percent_cli)]), +#' percent_cli_rpt_lag = .version - percent_cli_max_time, +#' .versions = requested_versions +#' ) +#' +#' # Backtest a forecaster "pseudoprospectively" (i.e., faithfully with respect +#' # to the data version history): +#' case_death_rate_archive %>% +#' epix_slide( +#' .versions = as.Date(c("2021-10-01", "2021-10-08")), +#' function(x, g, v) { +#' epipredict::arx_forecaster( +#' x, +#' outcome = "death_rate", +#' predictors = c("death_rate_7d_av", "case_rate_7d_av") +#' )$predictions +#' } +#' ) +#' # See `vignette("backtesting", package="epipredict")` for a full walkthrough +#' # on backtesting forecasters, including plots, etc. 
#' #' # --- Advanced: --- #' diff --git a/man/epix_slide.Rd b/man/epix_slide.Rd index 1d4009cb5..06c4972d5 100644 --- a/man/epix_slide.Rd +++ b/man/epix_slide.Rd @@ -4,7 +4,7 @@ \alias{epix_slide} \alias{epix_slide.epi_archive} \alias{epix_slide.grouped_epi_archive} -\title{Slide a function over variables in an \code{epi_archive} or \code{grouped_epi_archive}} +\title{Take each requested (group and) version in an archive, run a computation (e.g., forecast)} \usage{ epix_slide( .x, @@ -41,25 +41,31 @@ epix_slide( all data in \code{x} will be treated as part of a single data group.} \item{.f}{Function, formula, or missing; together with \code{...} specifies the -computation to slide. To "slide" means to apply a computation over a -sliding (a.k.a. "rolling") time window for each data group. The window is -determined by the \code{.before} parameter (see details for more). If a -function, \code{.f} must have the form \verb{function(x, g, t, ...)}, where +computation. The computation will be run on each requested group-version +combination, with a time window filter applied if \code{.before} is supplied. \itemize{ -\item "x" is an epi_df with the same column names as the archive's \code{DT}, minus -the \code{version} column -\item "g" is a one-row tibble containing the values of the grouping variables -for the associated group -\item "t" is the ref_time_value for the current window -\item "..." are additional arguments +\item If \code{.f} is a function, it must have the form \verb{function(x, g, v)} or +\verb{function(x, g, v, <additional arguments>)}, where +\itemize{ +\item \code{x} is an \code{epi_df} with the same column names as the archive's \code{DT}, +minus the \code{version} column. (Or, if \code{.all_versions = TRUE}, an +\code{epi_archive} with the requested partial version history.) +\item \code{g} is a one-row tibble containing the values of the grouping variables +for the associated group.
+\item \code{v} (length-1) is the associated \code{version} (one of the requested
+\code{.versions})
+\item \verb{<additional arguments>} are optional; you can add such
+arguments to your function and set them by passing them through the
+\code{...} argument to \code{epix_slide()}.
+}
}

If a formula, \code{.f} can operate directly on columns accessed via \code{.x$var}
or \code{.$var}, as in \code{~ mean (.x$var)} to compute a mean of a column \code{var} for
each group-\code{ref_time_value} combination. The group key can be accessed via
-\code{.y} or \code{.group_key}, and the reference time value can be accessed via \code{.z}
-or \code{.ref_time_value}. If \code{.f} is missing, then \code{...} will specify the
-computation.}
+\code{.y} or \code{.group_key}, and the reference time value can be accessed via
+\code{.z}, \code{.version}, or \code{.ref_time_value}. If \code{.f} is missing, then \code{...} will
+specify the computation.}

\item{...}{Additional arguments to pass to the function or formula specified
via \code{f}. Alternatively, if \code{.f} is missing, then the \code{...} is interpreted
@@ -67,33 +73,39 @@
as a \link[rlang:args_data_masking]{"data-masking"} expression or expressions
for tidy evaluation; in addition to referring columns directly by name, the
expressions have access to \code{.data} and \code{.env} pronouns as in \code{dplyr}
verbs, and can also refer to \code{.x} (not the same as the input epi_archive),
-\code{.group_key}, and \code{.ref_time_value}. See details for more.}
+\code{.group_key} and \code{.version}/\code{.ref_time_value}. See details for more.}

-\item{.before}{How many time values before the \code{.ref_time_value}
-should each snapshot handed to the function \code{.f} contain? If provided, it
-should be a single value that is compatible with the time_type of the
-time_value column (more below), but most commonly an integer. This window
-endpoint is inclusive.
For example, if \code{.before = 7}, \code{time_type}
-in the archive is "day", and the \code{.ref_time_value} is January 8, then the
-smallest time_value in the snapshot will be January 1. If missing, then the
-default is no limit on the time values, so the full snapshot is given.}
+\item{.before}{Optional; applies a \code{time_value} filter before running each
+computation. The default is not to apply a \code{time_value} filter. If
+provided, it should be a single integer or difftime that is compatible with
+the time_type of the time_value column. If an integer, then the minimum
+possible \code{time_value} included will be that many time steps (according to
+the \code{time_type}) before each requested \code{.version}. This window endpoint is
+inclusive. For example, if \code{.before = 14}, the \code{time_type} in the archive
+is "day", and the requested \code{.version} is January 15, then the smallest
+possible \code{time_value} in the snapshot will be January 1. Note that
+this does not mean that there will be 14 or 15 distinct \code{time_value}s
+actually appearing in the data; for most reporting streams, reporting as of
+January 15 won't include \code{time_value}s all the way through January 14, due
+to reporting latency. Unlike \code{epi_slide()}, \code{epix_slide()} won't fill in
+any missing \code{time_values} in this window.}

-\item{.versions}{Reference time values / versions for sliding
-computations; each element of this vector serves both as the anchor point
-for the \code{time_value} window for the computation and the \code{max_version}
-\code{epix_as_of} which we fetch data in this window. If missing, then this will
-set to a regularly-spaced sequence of values set to cover the range of
-\code{version}s in the \code{DT} plus the \code{versions_end}; the spacing of values will
-be guessed (using the GCD of the skips between values).}
+\item{.versions}{Requested versions on which to run the computation.
Each
+requested \code{.version} also serves as the anchor point around which
+the \code{time_value} window specified by \code{.before} is drawn. If \code{.versions} is
+missing, it will be set to a regularly-spaced sequence of values set to
+cover the range of \code{version}s in the \code{DT} plus the \code{versions_end}; the
+spacing of values will be guessed (using the GCD of the skips between
+values).}

\item{.new_col_name}{Either \code{NULL} or a string indicating the name of the
new column that will contain the derived values. The default, \code{NULL}, will
use the name "slide_value" unless your slide computations output data frames,
-in which case they will be unpacked into the constituent columns and those
-names used. If the resulting column name(s) overlap with the column names
-used for labeling the computations, which are \code{group_vars(x)} and
-\code{"version"}, then the values for these columns must be identical to the
-labels we assign.}
+in which case they will be unpacked into the constituent columns and the
+data frame's column names will be used instead. If the resulting column
+name(s) overlap with the column names used for labeling the computations,
+which are \code{group_vars(x)} and \code{"version"}, then the values for these
+columns must be identical to the labels we assign.}

\item{.all_versions}{(Not the same as \code{.all_rows} parameter of \code{epi_slide}.)
If \code{.all_versions = TRUE}, then the slide computation will be passed the
@@ -110,28 +122,32 @@
computation, and a column named according to the \code{.new_col_name} argument,
containing the slide values. It will be grouped by the grouping variables.
}
\description{
-Slides a given function over variables in an \code{epi_archive} object. This
-behaves similarly to \code{epi_slide()}, with the key exception that it is
-version-aware: the sliding computation at any given reference time t is
-performed on \strong{data that would have been available as of t}.
This function -is intended for use in accurate backtesting of models; see +... and collect the results. This is useful for more accurately simulating +how a forecaster, nowcaster, or other algorithm would have behaved in real +time, factoring in reporting latency and data revisions; see \href{https://cmu-delphi.github.io/epipredict/articles/backtesting.html}{\code{vignette("backtesting", package="epipredict")}} for a walkthrough. } \details{ +This is similar to looping over versions and calling \code{\link{epix_as_of}}, but has +some conveniences such as working naturally with \code{\link{grouped_epi_archive}}s, +optional time windowing, and syntactic sugar to make things shorter to write. + A few key distinctions between the current function and \code{epi_slide()}: \enumerate{ \item In \code{.f} functions for \code{epix_slide}, one should not assume that the input data to contain any rows with \code{time_value} matching the computation's -\code{.ref_time_value} (accessible via \verb{attributes()$metadata$as_of}); for -typical epidemiological surveillance data, observations pertaining to a -particular time period (\code{time_value}) are first reported \code{as_of} some -instant after that time period has ended. +\code{.version}, due to reporting latency; for typical epidemiological +surveillance data, observations pertaining to a particular time period +(\code{time_value}) are first reported \code{as_of} some instant after that time +period has ended. No time window completion is performed as in +\code{epi_slide()}. \item The input class and columns are similar but different: \code{epix_slide} (with the default \code{.all_versions=FALSE}) keeps all columns and the \code{epi_df}-ness of the first argument to each computation; \code{epi_slide} only provides the grouping variables in the second input, and will convert the first input into a regular tibble if the grouping variables include the -essential \code{geo_value} column. 
(With .all_versions=TRUE\verb{, }epix_slide\verb{will will provide an}epi_archive\verb{rather than an}epi-df` to each +essential \code{geo_value} column. (With \code{.all_versions=TRUE}, \code{epix_slide} will +will provide an \code{epi_archive} rather than an \code{epi-df} to each computation.) \item The output class and columns are similar but different: \code{epix_slide()} returns a tibble containing only the grouping variables, \code{time_value}, and @@ -153,73 +169,53 @@ for excluded group-\code{.ref_time_value} pairs. \item The \code{.versions} default for \code{epix_slide} is based on making an evenly-spaced sequence out of the \code{version}s in the \code{DT} plus the \code{versions_end}, rather than the \code{time_value}s. +\item \code{epix_slide()} computations can refer to the current element of +\code{.versions} as either \code{.version} or \code{.ref_time_value}, while \code{epi_slide()} +computations refer to the current element of \code{.ref_time_values} with +\code{.ref_time_value}. } Apart from the above distinctions, the interfaces between \code{epix_slide()} and \code{epi_slide()} are the same. - -Furthermore, the current function can be considerably slower than -\code{epi_slide()}, for two reasons: (1) it must repeatedly fetch -properly-versioned snapshots from the data archive (via \code{epix_as_of()}), -and (2) it performs a "manual" sliding of sorts, and does not benefit from -the highly efficient \code{slider} package. For this reason, it should never be -used in place of \code{epi_slide()}, and only used when version-aware sliding is -necessary (as it its purpose). 
} \examples{ library(dplyr) -# Reference time points for which we want to compute slide values: -versions <- seq(as.Date("2020-06-02"), - as.Date("2020-06-15"), - by = "1 day" -) +# Request only a small set of versions, for example's sake: +requested_versions <- + seq(as.Date("2020-09-02"), as.Date("2020-09-15"), by = "1 day") -# A simple (but not very useful) example (see the archive vignette for a more -# realistic one): +# Investigate reporting lag of `percent_cli` signal (though normally we'd +# probably work off of the dedicated `revision_summary()` function instead): archive_cases_dv_subset \%>\% - group_by(geo_value) \%>\% epix_slide( - .f = ~ mean(.x$case_rate_7d_av), - .before = 2, - .versions = versions, - .new_col_name = "case_rate_7d_av_recent_av" - ) \%>\% - ungroup() -# We requested time windows that started 2 days before the corresponding time -# values. The actual number of `time_value`s in each computation depends on -# the reporting latency of the signal and `time_value` range covered by the -# archive (2020-06-01 -- 2021-11-30 in this example). In this case, we have -# * 0 `time_value`s, for ref time 2020-06-01 --> the result is automatically -# discarded -# * 1 `time_value`, for ref time 2020-06-02 -# * 2 `time_value`s, for the rest of the results -# * never the 3 `time_value`s we would get from `epi_slide`, since, because -# of data latency, we'll never have an observation -# `time_value == .ref_time_value` as of `.ref_time_value`. -# The example below shows this type of behavior in more detail. - -# Examining characteristics of the data passed to each computation with -# `all_versions=FALSE`. 
+ geowide_percent_cli_max_time = max(time_value[!is.na(percent_cli)]), + geowide_percent_cli_rpt_lag = .version - geowide_percent_cli_max_time, + .versions = requested_versions + ) archive_cases_dv_subset \%>\% group_by(geo_value) \%>\% epix_slide( - function(x, gk, rtv) { - tibble( - time_range = if (nrow(x) == 0L) { - "0 `time_value`s" - } else { - sprintf("\%s -- \%s", min(x$time_value), max(x$time_value)) - }, - n = nrow(x), - class1 = class(x)[[1L]] - ) - }, - .before = 5, .all_versions = FALSE, - .versions = versions - ) \%>\% - ungroup() \%>\% - arrange(geo_value, version) + percent_cli_max_time = max(time_value[!is.na(percent_cli)]), + percent_cli_rpt_lag = .version - percent_cli_max_time, + .versions = requested_versions + ) + +# Backtest a forecaster "pseudoprospectively" (i.e., faithfully with respect +# to the data version history): +case_death_rate_archive \%>\% + epix_slide( + .versions = as.Date(c("2021-10-01", "2021-10-08")), + function(x, g, v) { + epipredict::arx_forecaster( + x, + outcome = "death_rate", + predictors = c("death_rate_7d_av", "case_rate_7d_av") + )$predictions + } + ) +# See `vignette("backtesting", package="epipredict")` for a full walkthrough +# on backtesting forecasters, including plots, etc. # --- Advanced: --- From 4a3eafc746dfd30caa161cb9a984b310b8f8ff8b Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Thu, 13 Mar 2025 10:32:19 -0700 Subject: [PATCH 006/117] docs(epi_slide): use "subpopulation" to help clarify grouping --- R/slide.R | 20 ++++++++++---------- man/epi_slide.Rd | 20 ++++++++++---------- 2 files changed, 20 insertions(+), 20 deletions(-) diff --git a/R/slide.R b/R/slide.R index 017359cf3..967f880fd 100644 --- a/R/slide.R +++ b/R/slide.R @@ -19,16 +19,16 @@ #' #' @template basic-slide-params #' @param .f,... The computation to slide. 
The input will be a time window of -#' the data for a single `geo_value` --- or a single combination of -#' `geo_value` and any [`other_keys`][as_epi_df] you used to specify -#' demographical breakdowns. The input will always have the same size, -#' determined by `.window_size`, and will fill in any missing `time_values`, -#' using `NA` values for missing measurements. The output should be a scalar -#' value or a 1-row data frame; these outputs will be collected and form a new -#' column or columns in the `epi_slide()` result. Data frame outputs will be -#' unpacked into multiple columns in the result by default, or -#' [`tidyr::pack`]ed into a single data-frame-type column if you provide a -#' name for such a column (e.g., via `.new_col_name`). +#' the data for a single subpopulation (i.e., a single `geo_value` and single +#' value for any [`other_keys`][as_epi_df] you set up for age groups, etc.). +#' The input will always have the same size, determined by `.window_size`, and +#' will fill in any missing `time_values`, using `NA` values for missing +#' measurements. The output should be a scalar value or a 1-row data frame; +#' these outputs will be collected and form a new column or columns in the +#' `epi_slide()` result. Data frame outputs will be unpacked into multiple +#' columns in the result by default, or [`tidyr::pack`]ed into a single +#' data-frame-type column if you provide a name for such a column (e.g., via +#' `.new_col_name`). #' #' You can specify the computation in one of the following ways: #' diff --git a/man/epi_slide.Rd b/man/epi_slide.Rd index 1146da313..83d1cc61a 100644 --- a/man/epi_slide.Rd +++ b/man/epi_slide.Rd @@ -21,16 +21,16 @@ and any columns in \code{other_keys}. If grouped, we make sure the grouping is b \code{geo_value} and \code{other_keys}.} \item{.f, ...}{The computation to slide. 
The input will be a time window of -the data for a single \code{geo_value} --- or a single combination of -\code{geo_value} and any \code{\link[=as_epi_df]{other_keys}} you used to specify -demographical breakdowns. The input will always have the same size, -determined by \code{.window_size}, and will fill in any missing \code{time_values}, -using \code{NA} values for missing measurements. The output should be a scalar -value or a 1-row data frame; these outputs will be collected and form a new -column or columns in the \code{epi_slide()} result. Data frame outputs will be -unpacked into multiple columns in the result by default, or -\code{\link[tidyr:pack]{tidyr::pack}}ed into a single data-frame-type column if you provide a -name for such a column (e.g., via \code{.new_col_name}). +the data for a single subpopulation (i.e., a single \code{geo_value} and single +value for any \code{\link[=as_epi_df]{other_keys}} you set up for age groups, etc.). +The input will always have the same size, determined by \code{.window_size}, and +will fill in any missing \code{time_values}, using \code{NA} values for missing +measurements. The output should be a scalar value or a 1-row data frame; +these outputs will be collected and form a new column or columns in the +\code{epi_slide()} result. Data frame outputs will be unpacked into multiple +columns in the result by default, or \code{\link[tidyr:pack]{tidyr::pack}}ed into a single +data-frame-type column if you provide a name for such a column (e.g., via +\code{.new_col_name}). You can specify the computation in one of the following ways: \itemize{ From 03bf732fb1b295bc76b664ab77d592178720971a Mon Sep 17 00:00:00 2001 From: "Logan C. 
Brooks" Date: Thu, 13 Mar 2025 11:37:56 -0700 Subject: [PATCH 007/117] docs(epi_slide_opt desc): cast as main time slide, +NA behavior, edits --- R/slide.R | 20 +++++++++++++++----- man/epi_slide_opt.Rd | 20 +++++++++++++++----- 2 files changed, 30 insertions(+), 10 deletions(-) diff --git a/R/slide.R b/R/slide.R index 967f880fd..ae82816fe 100644 --- a/R/slide.R +++ b/R/slide.R @@ -573,12 +573,22 @@ get_before_after_from_window <- function(window_size, align, time_type) { list(before = before, after = after) } -#' Optimized slide functions for common cases +#' Calculate rolling or running means, sums, etc., or custom calculations #' -#' @description `epi_slide_opt` allows sliding an n-timestep [data.table::froll] -#' or [slider::summary-slide] function over variables in an `epi_df` object. -#' These functions tend to be much faster than `epi_slide()`. See -#' `vignette("epi_df")` for more examples. +#' @description These methods take each subpopulation (i.e., a single +#' `geo_value` and combination of any `other_keys` you set up for age groups, +#' etc.) and perform a `.window_size`-width time window rolling/sliding +#' computation, or alternatively, a running/cumulative computation (with +#' `.window_size = Inf`) on the requested columns. Explicit `NA` measurements +#' are temporarily added to fill in any time gaps, and, for rolling +#' computations, to pad the time series to ensure that the first & last +#' computations are over exactly `.window_size` values. +#' +#' `epi_slide_opt` allows you to use any [data.table::froll] or +#' [slider::summary-slide] function. If none of the specialized functions here +#' work, you can use `data.table::frollapply` with your own function. See +#' [`epi_slide`] if you need to work with multiple columns at once or output a +#' custom type. 
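As a standalone illustration of the `data.table::frollapply` escape hatch mentioned above (a toy numeric vector, not tied to any `epi_df`; the window width and values are arbitrary):

```r
library(data.table)

x <- c(4, 9, 2, 7, 5)
# 3-value trailing windows, applying a plain (non-rolling) function:
frollapply(x, n = 3, FUN = median)
#> [1] NA NA  4  7  5
```

The leading `NA`s correspond to incomplete windows; within `epi_slide_opt()`, time gaps are filled with `NA` measurements first, so windows always span exactly `.window_size` time steps.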
#' #' @template basic-slide-params #' @param .col_names <[`tidy-select`][dplyr_tidy_select]> An unquoted column diff --git a/man/epi_slide_opt.Rd b/man/epi_slide_opt.Rd index 4b75e9ffb..c33a4208a 100644 --- a/man/epi_slide_opt.Rd +++ b/man/epi_slide_opt.Rd @@ -4,7 +4,7 @@ \alias{epi_slide_opt} \alias{epi_slide_mean} \alias{epi_slide_sum} -\title{Optimized slide functions for common cases} +\title{Calculate rolling or running means, sums, etc., or custom calculations} \usage{ epi_slide_opt( .x, @@ -134,10 +134,20 @@ added. It will be ungrouped if \code{.x} was ungrouped, and have the same groups as \code{.x} if \code{.x} was grouped. } \description{ -\code{epi_slide_opt} allows sliding an n-timestep \link[data.table:froll]{data.table::froll} -or \link[slider:summary-slide]{slider::summary-slide} function over variables in an \code{epi_df} object. -These functions tend to be much faster than \code{epi_slide()}. See -\code{vignette("epi_df")} for more examples. +These methods take each subpopulation (i.e., a single +\code{geo_value} and combination of any \code{other_keys} you set up for age groups, +etc.) and perform a \code{.window_size}-width time window rolling/sliding +computation, or alternatively, a running/cumulative computation (with +\code{.window_size = Inf}) on the requested columns. Explicit \code{NA} measurements +are temporarily added to fill in any time gaps, and, for rolling +computations, to pad the time series to ensure that the first & last +computations are over exactly \code{.window_size} values. + +\code{epi_slide_opt} allows you to use any \link[data.table:froll]{data.table::froll} or +\link[slider:summary-slide]{slider::summary-slide} function. If none of the specialized functions here +work, you can use \code{data.table::frollapply} with your own function. See +\code{\link{epi_slide}} if you need to work with multiple columns at once or output a +custom type. 
\code{epi_slide_mean} is a wrapper around \code{epi_slide_opt} with \code{.f = data.table::frollmean}. From 68d6c573338e2df3931c53dddf5e06e3ab29428d Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Thu, 13 Mar 2025 12:42:57 -0700 Subject: [PATCH 008/117] docs(epix_slide): remove inaccurate + misformatted `.version - before` Inaccurate in that `before` isn't `.before` and `-` isn't `time_type`-aware. Misformatted in that `-` was interpreted as a bullet point. --- R/methods-epi_archive.R | 8 ++++---- man/epix_slide.Rd | 9 ++++----- 2 files changed, 8 insertions(+), 9 deletions(-) diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index d6e2815e5..42a90f62a 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -701,10 +701,10 @@ epix_detailed_restricted_mutate <- function(.data, ...) { #' columns must be identical to the labels we assign. #' @param .all_versions (Not the same as `.all_rows` parameter of `epi_slide`.) #' If `.all_versions = TRUE`, then the slide computation will be passed the -#' version history (all `version <= .version` where `.version` is one of the -#' requested `.versions`) for rows having a `time_value` of at least `.version -#' - before`. Otherwise, the slide computation will be passed only the most -#' recent `version` for every unique `time_value`. Default is `FALSE`. +#' version history (all versions `<= .version` where `.version` is one of the +#' requested `.version`s), in `epi_archive` format. Otherwise, the slide +#' computation will be passed only the most recent `version` for every unique +#' `time_value`, in `epi_df` format. Default is `FALSE`. 
#' @return A tibble whose columns are: the grouping variables (if any), #' `time_value`, containing the reference time values for the slide #' computation, and a column named according to the `.new_col_name` argument, diff --git a/man/epix_slide.Rd b/man/epix_slide.Rd index 06c4972d5..6ab2ad265 100644 --- a/man/epix_slide.Rd +++ b/man/epix_slide.Rd @@ -109,11 +109,10 @@ columns must be identical to the labels we assign.} \item{.all_versions}{(Not the same as \code{.all_rows} parameter of \code{epi_slide}.) If \code{.all_versions = TRUE}, then the slide computation will be passed the -version history (all \code{version <= .version} where \code{.version} is one of the -requested \code{.versions}) for rows having a \code{time_value} of at least `.version -\itemize{ -\item before\verb{. Otherwise, the slide computation will be passed only the most recent }version\verb{for every unique}time_value\verb{. Default is }FALSE`. -}} +version history (all versions \verb{<= .version} where \code{.version} is one of the +requested \code{.version}s), in \code{epi_archive} format. Otherwise, the slide +computation will be passed only the most recent \code{version} for every unique +\code{time_value}, in \code{epi_df} format. Default is \code{FALSE}.} } \value{ A tibble whose columns are: the grouping variables (if any), From 01ab5f4095e6f45dd59b492dc973fe9c94eff070 Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Thu, 13 Mar 2025 14:18:27 -0700 Subject: [PATCH 009/117] docs(epix_slide): remove duplicated word --- R/methods-epi_archive.R | 2 +- man/epix_slide.Rd | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index 42a90f62a..76a1ab170 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -723,7 +723,7 @@ epix_detailed_restricted_mutate <- function(.data, ...) 
{ #' `epi_df`-ness of the first argument to each computation; `epi_slide` only #' provides the grouping variables in the second input, and will convert the #' first input into a regular tibble if the grouping variables include the -#' essential `geo_value` column. (With `.all_versions=TRUE`, `epix_slide` will +#' essential `geo_value` column. (With `.all_versions=TRUE`, `epix_slide` #' will provide an `epi_archive` rather than an `epi-df` to each #' computation.) #' 3. The output class and columns are similar but different: `epix_slide()` diff --git a/man/epix_slide.Rd b/man/epix_slide.Rd index 6ab2ad265..601d5811f 100644 --- a/man/epix_slide.Rd +++ b/man/epix_slide.Rd @@ -145,7 +145,7 @@ period has ended. No time window completion is performed as in \code{epi_df}-ness of the first argument to each computation; \code{epi_slide} only provides the grouping variables in the second input, and will convert the first input into a regular tibble if the grouping variables include the -essential \code{geo_value} column. (With \code{.all_versions=TRUE}, \code{epix_slide} will +essential \code{geo_value} column. (With \code{.all_versions=TRUE}, \code{epix_slide} will provide an \code{epi_archive} rather than an \code{epi-df} to each computation.) \item The output class and columns are similar but different: \code{epix_slide()} From 576e97d74d14f2d4887b8b90992fa08f3d1958f4 Mon Sep 17 00:00:00 2001 From: "Logan C. 
Brooks" Date: Mon, 17 Mar 2025 12:11:47 -0700 Subject: [PATCH 010/117] docs(epi_slide): iterate on intro, examples, motivation --- R/slide.R | 132 ++++++++++++++++++++++++++++--------------- man/epi_slide.Rd | 123 ++++++++++++++++++++++++++-------------- man/epi_slide_opt.Rd | 6 +- 3 files changed, 171 insertions(+), 90 deletions(-) diff --git a/R/slide.R b/R/slide.R index ae82816fe..863f68123 100644 --- a/R/slide.R +++ b/R/slide.R @@ -1,19 +1,27 @@ #' More general form of [`epi_slide_opt`] for rolling/running computations #' -#' Check first whether you can use [`epi_slide_mean`], [`epi_slide_sum`], or the -#' medium-generality [`epi_slide_opt`] instead, as they are faster and more -#' convenient to use. You typically only need to use `epi_slide()` if you have a -#' computation that depends on multiple columns simultaneously, outputs multiple -#' columns simultaneously, or produces non-numeric output. +#' Most rolling/running computations can be handled by [`epi_slide_mean`], +#' [`epi_slide_sum`], or the medium-generality [`epi_slide_opt`] functions +#' instead, which are much faster. You typically only need to consider +#' `epi_slide()` if you have a computation that depends on multiple columns +#' simultaneously, outputs multiple columns simultaneously, or produces +#' non-numeric output. For example, this computation depends on multiple +#' columns: #' #' ``` -#' # Create new column `cases_7dmed` that contains a 7-day trailing median of cases -#' epi_slide(edf, cases_7dmed = median(cases), .window_size = 7) +#' cases_deaths_subset %>% +#' epi_slide( +#' cfr_estimate_v0 = death_rate_7d_av[[22]]/case_rate_7d_av[[1]], +#' .window_size = 22 +#' ) %>% +#' print(n = 30) #' ``` #' -#' For two very common use cases, we provide optimized functions that are much -#' faster than `epi_slide`: `epi_slide_mean()` and `epi_slide_sum()`. We -#' recommend using these functions when possible. 
+#' (Here, the value 22 was selected using `epi_cor()` and averaging across
+#' `geo_value`s. See
+#' \href{https://www.medrxiv.org/content/10.1101/2024.12.27.24319518v1}{this
+#' manuscript} for some warnings & information about using similar
+#' types of CFR estimators.)
#'
#' See `vignette("epi_df")` for more examples.
#'
@@ -34,15 +42,19 @@
#'
#' - Don't provide `.f`, and instead use use one or more
#' [`dplyr::summarize`]-esque ["data-masking"][rlang::args_data_masking]
-#' expressions in `...`, e.g., `cases_7dmed = median(cases)`. This is usually
-#' the most convenient way to use `epi_slide`. See examples.
+#' expressions in `...`, e.g., `cfr_estimate_v0 =
+#' death_rate_7d_av[[22]]/case_rate_7d_av[[1]]`. This way is sometimes more
+#' convenient, but also has the most computational overhead.
#'
-#' - Provide a formula in `.f`, e.g., `~ median(.x$cases)`. In this formula,
-#' `.x` is an `epi_df` containing data for a single time window as described
-#' above, taken from the original `.x` fed into `epi_slide()`.
+#' - Provide a formula in `.f`, e.g., `~
+#' .x$death_rate_7d_av[[22]]/.x$case_rate_7d_av[[1]]`. In this formula, `.x`
+#' is an `epi_df` containing data for a single time window as described above,
+#' taken from the original `.x` fed into `epi_slide()`.
#'
-#' - Provide a function in `.f`. The function should be of the form `function(x,
-#' g, t)` or `function(x, g, t, )`, where:
+#' - Provide a function in `.f`, e.g., `function(x, g, t)
+#' x$death_rate_7d_av[[22]]/x$case_rate_7d_av[[1]]`. The function should be of
+#' the form `function(x, g, t)` or `function(x, g, t, <additional arguments>)`, where:
#'
#' - `x` is a data frame with the same column names as the original object,
#' minus any grouping variables, with only the windowed data for one
@@ -60,7 +72,8 @@
#' The values of `g` and `t` are also available to data-masking expression and
#' formula-based computations as `.group_key` and `.ref_time_value`,
#' respectively.
Formula computations also let you use `.y` or `.z`,
-#' respectively.
+#' respectively, as additional names for these same quantities (similar to
+#' [`dplyr::group_modify`]).
#'
#' @param ... Additional arguments to pass to the function or formula specified
#' via `.f`. Alternatively, if `.f` is missing, then the `...` is interpreted
@@ -73,6 +86,26 @@
#' be given names that clash with the existing columns of `.x`.
#'
#' @details
+#'
+#' ## Motivation and lower-level alternatives
+#'
+#' `epi_slide()` is focused on preventing errors and providing a convenient
+#' interface. If you need computational speed, many computations can be optimized
+#' by one of the following:
+#'
+#' * Performing core sliding operations with `epi_slide_opt()` and
+#' `frollapply`, and using potentially-grouped `mutate()`s to transform or
+#' combine the results.
+#'
+#' * Grouping by `geo_value` and any `other_keys`; [`complete()`]ing with
+#' `full_seq()` to fill in time gaps; `arrange()`ing by `time_value`s within
+#' each group; using `mutate()` with vectorized operations and shift operators
+#' like `dplyr::lead()` and `dplyr::lag()` to perform the core operations,
+#' being careful to give the desired results for the least and most recent
+#' `time_value`s (often `NA`s for the least recent); ungrouping; and
+#' `filter()`ing back down to only rows that existed before the `complete()`
+#' stage if necessary.
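The second lower-level recipe above can be sketched on a toy tibble (hedged: the column names and values here are invented for illustration, and a real `epi_df` would additionally need its class and metadata handled):

```r
library(dplyr)
library(tidyr)

toy <- tibble(
  geo_value = "ca",
  time_value = as.Date("2020-03-01") + c(0, 1, 3), # note the gap at day 2
  cases = c(5, 7, 6)
)

toy %>%
  group_by(geo_value) %>%
  complete(time_value = full_seq(time_value, period = 1)) %>% # fill the time gap with an NA row
  arrange(time_value, .by_group = TRUE) %>%
  mutate(cases_2d_sum = cases + lag(cases)) %>% # core vectorized operation; NA for the least recent time_value
  ungroup() %>%
  filter(!is.na(cases)) # drop rows that only existed because of complete(), if desired
```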
+#'
#' ## Advanced uses of `.f` via tidy evaluation
#'
#' If specifying `.f` via tidy evaluation, in addition to the standard [`.data`]
@@ -96,34 +129,43 @@
#' @examples
#' library(dplyr)
#'
-#' # Get the 7-day trailing standard deviation of cases and the 7-day trailing mean of cases
-#' cases_deaths_subset %>%
+#' # Generate some simple time-varying CFR estimates:
+#' with_cfr_estimates <- cases_deaths_subset %>%
#' epi_slide(
-#' cases_7sd = sd(cases, na.rm = TRUE),
-#' cases_7dav = mean(cases, na.rm = TRUE),
-#' .window_size = 7
-#' ) %>%
-#' select(geo_value, time_value, cases, cases_7sd, cases_7dav)
-#' # Note that epi_slide_mean could be used to more quickly calculate cases_7dav.
+#' cfr_estimate_v0 = death_rate_7d_av[[22]] / case_rate_7d_av[[1]],
+#' .window_size = 22
+#' )
+#' with_cfr_estimates %>%
+#' print(n = 30)
+#' # (Here, the value 22 was selected using `epi_cor()` and averaging across
+#' # `geo_value`s. See
+#' # https://www.medrxiv.org/content/10.1101/2024.12.27.24319518v1 for some
+#' # warnings & information about using CFR estimators along these lines.)
#' -#' # In addition to the [`dplyr::mutate`]-like syntax, you can feed in a function or -#' # formula in a way similar to [`dplyr::group_modify`]: -#' my_summarizer <- function(window_data) { -#' window_data %>% -#' summarize( -#' cases_7sd = sd(cases, na.rm = TRUE), -#' cases_7dav = mean(cases, na.rm = TRUE) -#' ) +#' # In addition to the [`dplyr::mutate`]-like syntax, you can feed in a +#' # function or formula in a way similar to [`dplyr::group_modify`]; these +#' # often run much more quickly: +#' my_computation <- function(window_data) { +#' tibble( +#' cfr_estimate_v0 = window_data$death_rate_7d_av[[nrow(window_data)]] / +#' window_data$case_rate_7d_av[[1]] +#' ) #' } -#' cases_deaths_subset %>% +#' with_cfr_estimates2 <- cases_deaths_subset %>% #' epi_slide( -#' ~ my_summarizer(.x), -#' .window_size = 7 -#' ) %>% -#' select(geo_value, time_value, cases, cases_7sd, cases_7dav) -#' -#' -#' +#' ~ my_computation(.x), +#' .window_size = 22 +#' ) +#' with_cfr_estimates3 <- cases_deaths_subset %>% +#' epi_slide( +#' function(window_data, g, t) { +#' tibble( +#' cfr_estimate_v0 = window_data$death_rate_7d_av[[nrow(window_data)]] / +#' window_data$case_rate_7d_av[[1]] +#' ) +#' }, +#' .window_size = 22 +#' ) #' #' #' #### Advanced: #### @@ -586,9 +628,9 @@ get_before_after_from_window <- function(window_size, align, time_type) { #' #' `epi_slide_opt` allows you to use any [data.table::froll] or #' [slider::summary-slide] function. If none of the specialized functions here -#' work, you can use `data.table::frollapply` with your own function. See -#' [`epi_slide`] if you need to work with multiple columns at once or output a -#' custom type. +#' work, you can use `data.table::frollapply` together with a non-rolling +#' function (e.g., `median`). See [`epi_slide`] if you need to work with +#' multiple columns at once or output a custom type. 
#' #' @template basic-slide-params #' @param .col_names <[`tidy-select`][dplyr_tidy_select]> An unquoted column diff --git a/man/epi_slide.Rd b/man/epi_slide.Rd index 83d1cc61a..f88605559 100644 --- a/man/epi_slide.Rd +++ b/man/epi_slide.Rd @@ -36,12 +36,13 @@ You can specify the computation in one of the following ways: \itemize{ \item Don't provide \code{.f}, and instead use use one or more \code{\link[dplyr:summarise]{dplyr::summarize}}-esque \link[rlang:args_data_masking]{"data-masking"} -expressions in \code{...}, e.g., \code{cases_7dmed = median(cases)}. This is usually -the most convenient way to use \code{epi_slide}. See examples. -\item Provide a formula in \code{.f}, e.g., \code{~ median(.x$cases)}. In this formula, -\code{.x} is an \code{epi_df} containing data for a single time window as described -above, taken from the original \code{.x} fed into \code{epi_slide()}. -\item Provide a function in \code{.f}. The function should be of the form \verb{function(x, g, t)} or \verb{function(x, g, t, )}, where: +expressions in \code{...}, e.g., \code{cfr_estimate_v0 = death_rate_7d_av[[22]]/case_rate_7d_av[[1]]}. This way is sometimes more +convenient, but also has the most computational overhead. +\item Provide a formula in \code{.f}, e.g., \code{~ .x$death_rate_7d_av[[22]]/.x$case_rate_7d_av[[1]]}. In this formula, \code{.x} +is an \code{epi_df} containing data for a single time window as described above, +taken from the original \code{.x} fed into \code{epi_slide()}. +\item Provide a function in \code{.f}, e.g., \code{function(x, g, t) x$death_rate_7d_av[[22]]/x$case_rate_7d_av[[1]]}. The function should be of +the form \verb{function(x, g, t)} or \verb{function(x, g, t, )}, where: \itemize{ \item \code{x} is a data frame with the same column names as the original object, minus any grouping variables, with only the windowed data for one @@ -56,7 +57,8 @@ argument to \code{epi_slide()}. 
The values of \code{g} and \code{t} are also available to data-masking expression and formula-based computations as \code{.group_key} and \code{.ref_time_value}, respectively. Formula computations also let you use \code{.y} or \code{.z}, -respectively. +respectively, as additional names for these same quantities (similar to +\code{\link[dplyr:group_map]{dplyr::group_modify}}). }} \item{...}{Additional arguments to pass to the function or formula specified @@ -110,22 +112,50 @@ added. It will be ungrouped if \code{.x} was ungrouped, and have the same groups as \code{.x} if \code{.x} was grouped. } \description{ -Check first whether you can use \code{\link{epi_slide_mean}}, \code{\link{epi_slide_sum}}, or the -medium-generality \code{\link{epi_slide_opt}} instead, as they are faster and more -convenient to use. You typically only need to use \code{epi_slide()} if you have a -computation that depends on multiple columns simultaneously, outputs multiple -columns simultaneously, or produces non-numeric output. +Most rolling/running computations can be handled by \code{\link{epi_slide_mean}}, +\code{\link{epi_slide_sum}}, or the medium-generality \code{\link{epi_slide_opt}} functions +instead, which are much faster. You typically only need to consider +\code{epi_slide()} if you have a computation that depends on multiple columns +simultaneously, outputs multiple columns simultaneously, or produces +non-numeric output. For example, this computation depends on multiple +columns: } \details{ -\if{html}{\out{
}}\preformatted{# Create new column `cases_7dmed` that contains a 7-day trailing median of cases -epi_slide(edf, cases_7dmed = median(cases), .window_size = 7) +\if{html}{\out{
}}\preformatted{cases_deaths_subset \%>\% + epi_slide( + cfr_estimate_v0 = death_rate_7d_av[[22]]/case_rate_7d_av[[1]], + .window_size = 22 + ) \%>\% + print(n = 30) }\if{html}{\out{
}} -For two very common use cases, we provide optimized functions that are much -faster than \code{epi_slide}: \code{epi_slide_mean()} and \code{epi_slide_sum()}. We -recommend using these functions when possible. +(Here, the value 22 was selected using \code{epi_cor()} and averaging across +\code{geo_value}s. See +\href{https://www.medrxiv.org/content/10.1101/2024.12.27.24319518v1}{this +manuscript} for some warnings & information about using similar +types of CFR estimators.) See \code{vignette("epi_df")} for more examples. +\subsection{Motivation and lower-level alternatives}{ + +\code{epi_slide()} is focused on preventing errors and providing a convenient +interface. If you need computational speed, many computations can be optimized +by one of the following: +\itemize{ +\item Performing core sliding operations with \code{epi_slide_opt()} with +\code{frollapply}, and using potentially-grouped \code{mutate()}s to transform or +combine the results. +\item Grouping by \code{geo_value} and any \code{other_keys}; \code{\link[=complete]{complete()}}ing with +\code{full_seq()} to fill in time gaps; \code{arrange()}ing by \code{time_value}s within +each group; using \code{mutate()} with vectorized operations and shift operators +like \code{dplyr::lead()} and \code{dplyr::lag()} to perform the core operations, +being careful to give the desired results for the least and most recent +\code{time_value}s (often \code{NA}s for the least recent); ungrouping; and +\code{filter()}ing back down to only rows that existed before the \code{complete()} +stage if necessary. +} +} + \subsection{Advanced uses of \code{.f} via tidy evaluation}{ If specifying \code{.f} via tidy evaluation, in addition to the standard \code{\link{.data}} @@ -146,34 +176,43 @@ determined the time window for the current computation.
\examples{ library(dplyr) -# Get the 7-day trailing standard deviation of cases and the 7-day trailing mean of cases -cases_deaths_subset \%>\% +# Generate some simple time-varying CFR estimates: +with_cfr_estimates <- cases_deaths_subset \%>\% epi_slide( - cases_7sd = sd(cases, na.rm = TRUE), - cases_7dav = mean(cases, na.rm = TRUE), - .window_size = 7 - ) \%>\% - select(geo_value, time_value, cases, cases_7sd, cases_7dav) -# Note that epi_slide_mean could be used to more quickly calculate cases_7dav. - -# In addition to the [`dplyr::mutate`]-like syntax, you can feed in a function or -# formula in a way similar to [`dplyr::group_modify`]: -my_summarizer <- function(window_data) { - window_data \%>\% - summarize( - cases_7sd = sd(cases, na.rm = TRUE), - cases_7dav = mean(cases, na.rm = TRUE) - ) + cfr_estimate_v0 = death_rate_7d_av[[22]] / case_rate_7d_av[[1]], + .window_size = 22 + ) +with_cfr_estimates \%>\% + print(n = 30) +# (Here, the value 22 was selected using `epi_cor()` and averaging across +# `geo_value`s. See +# https://www.medrxiv.org/content/10.1101/2024.12.27.24319518v1 for some +# warnings & information about using CFR estimators along these lines.)
+ +# In addition to the [`dplyr::mutate`]-like syntax, you can feed in a +# function or formula in a way similar to [`dplyr::group_modify`]; these +# often run much more quickly: +my_computation <- function(window_data) { + tibble( + cfr_estimate_v0 = window_data$death_rate_7d_av[[nrow(window_data)]] / + window_data$case_rate_7d_av[[1]] + ) } -cases_deaths_subset \%>\% +with_cfr_estimates2 <- cases_deaths_subset \%>\% epi_slide( - ~ my_summarizer(.x), - .window_size = 7 - ) \%>\% - select(geo_value, time_value, cases, cases_7sd, cases_7dav) - - - + ~ my_computation(.x), + .window_size = 22 + ) +with_cfr_estimates3 <- cases_deaths_subset \%>\% + epi_slide( + function(window_data, g, t) { + tibble( + cfr_estimate_v0 = window_data$death_rate_7d_av[[nrow(window_data)]] / + window_data$case_rate_7d_av[[1]] + ) + }, + .window_size = 22 + ) #### Advanced: #### diff --git a/man/epi_slide_opt.Rd b/man/epi_slide_opt.Rd index c33a4208a..687d2ac0a 100644 --- a/man/epi_slide_opt.Rd +++ b/man/epi_slide_opt.Rd @@ -145,9 +145,9 @@ computations are over exactly \code{.window_size} values. \code{epi_slide_opt} allows you to use any \link[data.table:froll]{data.table::froll} or \link[slider:summary-slide]{slider::summary-slide} function. If none of the specialized functions here -work, you can use \code{data.table::frollapply} with your own function. See -\code{\link{epi_slide}} if you need to work with multiple columns at once or output a -custom type. +work, you can use \code{data.table::frollapply} together with a non-rolling +function (e.g., \code{median}). See \code{\link{epi_slide}} if you need to work with +multiple columns at once or output a custom type. \code{epi_slide_mean} is a wrapper around \code{epi_slide_opt} with \code{.f = data.table::frollmean}. From 8ac32e54f0e2c4c12af381c827863c49cb818033 Mon Sep 17 00:00:00 2001 From: "Logan C. 
Brooks" Date: Mon, 17 Mar 2025 13:00:31 -0700 Subject: [PATCH 011/117] Remove reference to removed "advanced" vignette --- R/methods-epi_archive.R | 3 +-- man/epix_slide.Rd | 3 +-- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index 76a1ab170..e479ada45 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -737,8 +737,7 @@ epix_detailed_restricted_mutate <- function(.data, ...) { #' 4. There are no size stability checks or element/row recycling to maintain #' size stability in `epix_slide`, unlike in `epi_slide`. (`epix_slide` is #' roughly analogous to [`dplyr::group_modify`], while `epi_slide` is roughly -#' analogous to `dplyr::mutate` followed by `dplyr::arrange`) This is detailed -#' in the "advanced" vignette. +#' analogous to [`dplyr::mutate`].) #' 5. `.all_rows` is not supported in `epix_slide`; since the slide #' computations are allowed more flexibility in their outputs than in #' `epi_slide`, we can't guess a good representation for missing computations diff --git a/man/epix_slide.Rd b/man/epix_slide.Rd index 601d5811f..7f1090554 100644 --- a/man/epix_slide.Rd +++ b/man/epix_slide.Rd @@ -159,8 +159,7 @@ results as they are not supported by tibbles.) \item There are no size stability checks or element/row recycling to maintain size stability in \code{epix_slide}, unlike in \code{epi_slide}. (\code{epix_slide} is roughly analogous to \code{\link[dplyr:group_map]{dplyr::group_modify}}, while \code{epi_slide} is roughly -analogous to \code{dplyr::mutate} followed by \code{dplyr::arrange}) This is detailed -in the "advanced" vignette. +analogous to \code{\link[dplyr:mutate]{dplyr::mutate}}.) 
\item \code{.all_rows} is not supported in \code{epix_slide}; since the slide computations are allowed more flexibility in their outputs than in \code{epi_slide}, we can't guess a good representation for missing computations From 26cd412cec23b9e0eefaa01f15987bb41736fd75 Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Mon, 17 Mar 2025 15:15:44 -0700 Subject: [PATCH 012/117] docs(epix_slide): clarify .versions vs. epi_slide .ref_time_values defaults --- R/methods-epi_archive.R | 2 +- man/epix_slide.Rd | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index e479ada45..a33c91c9a 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -744,7 +744,7 @@ epix_detailed_restricted_mutate <- function(.data, ...) { #' for excluded group-`.ref_time_value` pairs. #' 6. The `.versions` default for `epix_slide` is based on making an #' evenly-spaced sequence out of the `version`s in the `DT` plus the -#' `versions_end`, rather than the `time_value`s. +#' `versions_end`, rather than all unique `time_value`s. #' 7. `epix_slide()` computations can refer to the current element of #' `.versions` as either `.version` or `.ref_time_value`, while `epi_slide()` #' computations refer to the current element of `.ref_time_values` with diff --git a/man/epix_slide.Rd b/man/epix_slide.Rd index 7f1090554..a72e2c685 100644 --- a/man/epix_slide.Rd +++ b/man/epix_slide.Rd @@ -166,7 +166,7 @@ computations are allowed more flexibility in their outputs than in for excluded group-\code{.ref_time_value} pairs. \item The \code{.versions} default for \code{epix_slide} is based on making an evenly-spaced sequence out of the \code{version}s in the \code{DT} plus the -\code{versions_end}, rather than the \code{time_value}s. +\code{versions_end}, rather than all unique \code{time_value}s. 
\item \code{epix_slide()} computations can refer to the current element of \code{.versions} as either \code{.version} or \code{.ref_time_value}, while \code{epi_slide()} computations refer to the current element of \code{.ref_time_values} with From 7a421f6bc1e83c9116e99a17aac11fa6979cfdac Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Mon, 17 Mar 2025 15:32:20 -0700 Subject: [PATCH 013/117] docs(validate_epi_archive): note omitted `*_type` checks --- R/archive.R | 3 +++ man/epi_archive.Rd | 3 +++ 2 files changed, 6 insertions(+) diff --git a/R/archive.R b/R/archive.R index 922f77835..f168e4ddb 100644 --- a/R/archive.R +++ b/R/archive.R @@ -329,6 +329,9 @@ new_epi_archive <- function( #' Perform second (costly) round of validation that `x` is a proper `epi_archive` #' +#' Does not validate `geo_type` or `time_type` against `geo_value` and +#' `time_value` columns. These are assumed to have been set compatibly. +#' #' @rdname epi_archive #' @export validate_epi_archive <- function(x) { diff --git a/man/epi_archive.Rd b/man/epi_archive.Rd index f98666e03..3918b4c1b 100644 --- a/man/epi_archive.Rd +++ b/man/epi_archive.Rd @@ -109,6 +109,9 @@ only performs some fast, basic checks on the inputs. \code{validate_epi_archive} can perform more costly validation checks on its output. But most users should use \code{as_epi_archive}, which performs all necessary checks and has some additional features. + +Does not validate \code{geo_type} or \code{time_type} against \code{geo_value} and +\code{time_value} columns. These are assumed to have been set compatibly. } \details{ An \code{epi_archive} contains a \code{data.table} object \code{DT} (from the From 0c9007e41530b4698a16493c0df043200e190af9 Mon Sep 17 00:00:00 2001 From: "Logan C.
Brooks" Date: Mon, 17 Mar 2025 15:46:47 -0700 Subject: [PATCH 014/117] docs: remove duplicate word, mention `time_type` `epi_df` attr --- R/epi_df.R | 8 +++++--- R/slide.R | 2 +- man/epi_df.Rd | 8 +++++--- man/epi_slide.Rd | 2 +- 4 files changed, 12 insertions(+), 8 deletions(-) diff --git a/R/epi_df.R b/R/epi_df.R index 4955ab08d..b9d999d97 100644 --- a/R/epi_df.R +++ b/R/epi_df.R @@ -6,7 +6,8 @@ #' which can be seen as measured variables at each key. In brief, an `epi_df` #' represents a snapshot of an epidemiological data set at a point in time. #' -#' @details An `epi_df` is a tibble with (at least) the following columns: +#' @details An `epi_df` is a kind of tibble with (at least) the following +#' columns: #' #' - `geo_value`: A character vector representing the geographical unit of #' observation. This could be a country code, a state name, a county code, @@ -14,10 +15,11 @@ #' - `time_value`: A date or integer vector representing the time of observation. #' #' Other columns can be considered as measured variables, which we also refer to -#' as signal variables. An `epi_df` object also has metadata with (at least) -#' the following fields: +#' as indicators or signals. An `epi_df` object also has metadata with (at +#' least) the following fields: #' #' * `geo_type`: the type for the geo values. +#' * `time_type`: the type for the time values. #' * `as_of`: the time value at which the given data were available. #' #' Most users should use `as_epi_df`. The input tibble `x` to the constructor diff --git a/R/slide.R b/R/slide.R index 863f68123..be112d070 100644 --- a/R/slide.R +++ b/R/slide.R @@ -40,7 +40,7 @@ #' #' You can specify the computation in one of the following ways: #' -#' - Don't provide `.f`, and instead use use one or more +#' - Don't provide `.f`, and instead use one or more #' [`dplyr::summarize`]-esque ["data-masking"][rlang::args_data_masking] #' expressions in `...`, e.g., `cfr_estimate_v0 = #' death_rate_7d_av[[22]]/case_rate_7d_av[[1]]`. 
This way is sometimes more diff --git a/man/epi_df.Rd b/man/epi_df.Rd index a67827187..0ec5830ca 100644 --- a/man/epi_df.Rd +++ b/man/epi_df.Rd @@ -89,7 +89,8 @@ which can be seen as measured variables at each key. In brief, an \code{epi_df} represents a snapshot of an epidemiological data set at a point in time. } \details{ -An \code{epi_df} is a tibble with (at least) the following columns: +An \code{epi_df} is a kind of tibble with (at least) the following +columns: \itemize{ \item \code{geo_value}: A character vector representing the geographical unit of observation. This could be a country code, a state name, a county code, @@ -98,10 +99,11 @@ etc. } Other columns can be considered as measured variables, which we also refer to -as signal variables. An \code{epi_df} object also has metadata with (at least) -the following fields: +as indicators or signals. An \code{epi_df} object also has metadata with (at +least) the following fields: \itemize{ \item \code{geo_type}: the type for the geo values. +\item \code{time_type}: the type for the time values. \item \code{as_of}: the time value at which the given data were available. } diff --git a/man/epi_slide.Rd b/man/epi_slide.Rd index f88605559..8f86466a7 100644 --- a/man/epi_slide.Rd +++ b/man/epi_slide.Rd @@ -34,7 +34,7 @@ data-frame-type column if you provide a name for such a column (e.g., via You can specify the computation in one of the following ways: \itemize{ -\item Don't provide \code{.f}, and instead use use one or more +\item Don't provide \code{.f}, and instead use one or more \code{\link[dplyr:summarise]{dplyr::summarize}}-esque \link[rlang:args_data_masking]{"data-masking"} expressions in \code{...}, e.g., \code{cfr_estimate_v0 = death_rate_7d_av[[22]]/case_rate_7d_av[[1]]}. This way is sometimes more convenient, but also has the most computational overhead. From 98fdca40daa4eb27580273f6f59cd30479d7a5d3 Mon Sep 17 00:00:00 2001 From: "Logan C. 
Brooks" Date: Mon, 17 Mar 2025 16:02:16 -0700 Subject: [PATCH 015/117] docs(NEWS.md): 0.12 NEWS entry + highlights for 0.11 --- NEWS.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/NEWS.md b/NEWS.md index 3ac814aa2..1a122e0e1 100644 --- a/NEWS.md +++ b/NEWS.md @@ -2,8 +2,22 @@ Pre-1.0.0 numbering scheme: 0.x will indicate releases, while 0.x.y will indicate PR's. +# epiprocess 0.12 + +## Improvements + +- Various documentation has been updated, simplified, and improved with better + examples. + # epiprocess 0.11 +## Highlights + +`{epiprocess}` should once again not require Rtools or a compiler to be able to +install! We've also updated some function interfaces to be more consistent +throughout the package & with tidyverse, and improved the generality of and +fixed bugs in various functions and documentation. + ## Breaking changes - `growth_rate()` argument order and names have changed. You will need to From 01a9d9f441e918375e6d43afbe3e212047482f46 Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Mon, 17 Mar 2025 16:46:05 -0700 Subject: [PATCH 016/117] Move ignored Suggests: version bounds to active checks --- DESCRIPTION | 3 ++- tests/testthat.R | 2 ++ 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/DESCRIPTION b/DESCRIPTION index 57981a195..92a19d145 100755 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -66,12 +66,13 @@ Suggests: distributional, epidatr, epipredict, + hardhat, here, knitr, outbreaks, readr, rmarkdown, - testthat (>= 3.1.5), + testthat, trendfilter, withr VignetteBuilder: diff --git a/tests/testthat.R b/tests/testthat.R index b26d274b5..3c4cbff24 100644 --- a/tests/testthat.R +++ b/tests/testthat.R @@ -1,4 +1,6 @@ library(testthat) library(epiprocess) +stopifnot(packageVersion("testthat") >= "3.1.5") + test_check("epiprocess") From b2a9860ce3a1ac0c5910305e5d0637d4979b5d98 Mon Sep 17 00:00:00 2001 From: "Logan C. 
Brooks" Date: Wed, 19 Mar 2025 14:57:42 -0700 Subject: [PATCH 017/117] Improve `epix_merge()` error message, tweak `format_chr_with_quotes()` --- R/key_colnames.R | 4 ++-- R/methods-epi_archive.R | 3 ++- R/utils.R | 31 +++++++++++++++++-------------- man/format_chr_with_quotes.Rd | 20 +++++++++++++++----- tests/testthat/test-epix_merge.R | 2 +- 5 files changed, 37 insertions(+), 23 deletions(-) diff --git a/R/key_colnames.R b/R/key_colnames.R index eeecce05b..f685b7210 100644 --- a/R/key_colnames.R +++ b/R/key_colnames.R @@ -91,8 +91,8 @@ key_colnames.epi_df <- function(x, ..., if (!identical(other_keys, expected_other_keys)) { cli_abort(c( "The provided `other_keys` argument didn't match the `other_keys` of `x`", - "*" = "`other_keys` was {format_chr_with_quotes(other_keys)}", - "*" = "`expected_other_keys` was {format_chr_with_quotes(expected_other_keys)}", + "*" = "`other_keys` was {format_chr_deparse(other_keys)}", + "*" = "`expected_other_keys` was {format_chr_deparse(expected_other_keys)}", "i" = "If you know that `x` will always be an `epi_df` and resolve this discrepancy by adjusting the metadata of `x`, you shouldn't have to pass `other_keys =` here anymore, diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index 362fd4eaa..a8421efc1 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -450,7 +450,8 @@ epix_merge <- function(x, y, y_nonby_colnames <- setdiff(names(y_dt), by) if (length(intersect(x_nonby_colnames, y_nonby_colnames)) != 0L) { cli_abort(" - `x` and `y` DTs have overlapping non-by column names; + `x` and `y` DTs both have measurement columns named + {format_chr_with_quotes(intersect(x_nonby_colnames, y_nonby_colnames))}; this is currently not supported; please manually fix up first: any overlapping columns that can are key-like should be incorporated into the key, and other columns should be renamed. 
diff --git a/R/utils.R b/R/utils.R index 66270d69c..6326efe7f 100644 --- a/R/utils.R +++ b/R/utils.R @@ -98,26 +98,29 @@ format_chr_deparse <- function(x) { paste(collapse = "", deparse(x)) } -#' Format a character vector as a string via deparsing/quoting each +#' Format each entry in a character vector via quoting; special replacement for length 0 +#' +#' Performs no escaping within the strings; if you want something that a reader +#' could copy-paste to debug, look into `format_deparse` (note that this +#' collapses into a single string). +#' +#' @param x chr; e.g., `colnames` of some data frame +#' @param empty chr, likely string; what should be output if `x` is of length 0? +#' @return chr; same `length` as `x` if `x` had nonzero length; value of `empty` otherwise +#' +#' @examples
#' cli::cli_inform('{format_chr_with_quotes("x")}') +#' cli::cli_inform('{format_chr_with_quotes(c("x","y"))}') +#' nms <- c("x","\"Total Cases\"") +#' cli::cli_inform('{format_chr_with_quotes(nms)}') +#' cli::cli_inform('{format_chr_with_quotes(character())}') #' -#' @param x `chr`; e.g., `colnames` of some data frame -#' @param empty string; what should be output if `x` is of length 0? -#' @return string #' @keywords internal format_chr_with_quotes <- function(x, empty = "*none*") { if (length(x) == 0L) { empty } else { - # Deparse to get quoted + escape-sequenced versions of varnames; collapse to - # single line (assuming no newlines in `x`). Though if we hand this to cli - # it may insert them (even in middle of quotes) while wrapping lines.
- deparsed_collapsed <- paste(collapse = "", deparse(x)) - if (length(x) == 1L) { - deparsed_collapsed - } else { - # remove surrounding `c()`: - substr(deparsed_collapsed, 3L, nchar(deparsed_collapsed) - 1L) - } + paste0('"', x, '"') } } diff --git a/man/format_chr_with_quotes.Rd b/man/format_chr_with_quotes.Rd index 49beffb00..156c6b85b 100644 --- a/man/format_chr_with_quotes.Rd +++ b/man/format_chr_with_quotes.Rd @@ -2,19 +2,29 @@ % Please edit documentation in R/utils.R \name{format_chr_with_quotes} \alias{format_chr_with_quotes} -\title{Format a character vector as a string via deparsing/quoting each} +\title{Format each entry in a character vector via quoting; special replacement for length 0} \usage{ format_chr_with_quotes(x, empty = "*none*") } \arguments{ -\item{x}{\code{chr}; e.g., \code{colnames} of some data frame} +\item{x}{chr; e.g., \code{colnames} of some data frame} -\item{empty}{string; what should be output if \code{x} is of length 0?} +\item{empty}{chr, likely string; what should be output if \code{x} is of length 0?} } \value{ -string +chr; same \code{length} as \code{x} if \code{x} had nonzero length; value of \code{empty} otherwise } \description{ -Format a character vector as a string via deparsing/quoting each +Performs no escaping within the strings; if you want something that a reader +could copy-paste to debug, look into \code{format_deparse} (note that this +collapses into a single string).
+} +\examples{ +cli::cli_inform('{format_chr_with_quotes("x")}') +cli::cli_inform('{format_chr_with_quotes(c("x","y"))}') +nms <- c("x","\"Total Cases\"") +cli::cli_inform('{format_chr_with_quotes(nms)}') +cli::cli_inform('{format_chr_with_quotes(character())}') + } \keyword{internal} diff --git a/tests/testthat/test-epix_merge.R b/tests/testthat/test-epix_merge.R index 5b3de284f..ded143531 100644 --- a/tests/testthat/test-epix_merge.R +++ b/tests/testthat/test-epix_merge.R @@ -175,7 +175,7 @@ test_that("epix_merge forbids and warns on metadata and naming issues", { as_epi_archive(tibble::tibble(geo_value = "ak", time_value = test_date, version = test_date + 1L, value = 1L)), as_epi_archive(tibble::tibble(geo_value = "ak", time_value = test_date, version = test_date + 1L, value = 2L)) ), - regexp = "overlapping.*names" + regexp = 'both have measurement columns named "value"' ) }) From 02d4d269e8fa79a949dcc0a37dc87323b6b199bc Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Wed, 19 Mar 2025 14:58:39 -0700 Subject: [PATCH 018/117] Update + skip forecast archive tests until other updates in --- DESCRIPTION | 2 +- R/utils.R | 6 ++--- man/format_chr_with_quotes.Rd | 6 ++--- tests/testthat/test-compactify.R | 45 ++++++++++++++++++++++---------- 4 files changed, 38 insertions(+), 21 deletions(-) diff --git a/DESCRIPTION b/DESCRIPTION index 92a19d145..abb0cf86c 100755 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Type: Package Package: epiprocess Title: Tools for basic signal processing in epidemiology -Version: 0.11.0 +Version: 0.11.1 Authors@R: c( person("Jacob", "Bien", role = "ctb"), person("Logan", "Brooks", , "lcbrooks+github@andrew.cmu.edu", role = c("aut", "cre")), diff --git a/R/utils.R b/R/utils.R index 6326efe7f..dfbfc8ac3 100644 --- a/R/utils.R +++ b/R/utils.R @@ -111,9 +111,9 @@ format_chr_deparse <- function(x) { #' @examples #' cli::cli_inform('{format_chr_with_quotes("x")}') #' cli::cli_inform('{format_chr_with_quotes(c("x","y"))}') -#' nms <- 
c("x","\"Total Cases\"") -#' cli::cli_inform('{format_chr_with_quotes(nms)}') -#' cli::cli_inform('{format_chr_with_quotes(character())}') +#' nms <- c("x", "\"Total Cases\"") +#' cli::cli_inform("{format_chr_with_quotes(nms)}") +#' cli::cli_inform("{format_chr_with_quotes(character())}") #' #' @keywords internal format_chr_with_quotes <- function(x, empty = "*none*") { diff --git a/man/format_chr_with_quotes.Rd b/man/format_chr_with_quotes.Rd index 156c6b85b..76bbf0b85 100644 --- a/man/format_chr_with_quotes.Rd +++ b/man/format_chr_with_quotes.Rd @@ -22,9 +22,9 @@ collapses into a single string). \examples{ cli::cli_inform('{format_chr_with_quotes("x")}') cli::cli_inform('{format_chr_with_quotes(c("x","y"))}') -nms <- c("x","\"Total Cases\"") -cli::cli_inform('{format_chr_with_quotes(nms)}') -cli::cli_inform('{format_chr_with_quotes(character())}') +nms <- c("x", "\"Total Cases\"") +cli::cli_inform("{format_chr_with_quotes(nms)}") +cli::cli_inform("{format_chr_with_quotes(character())}") } \keyword{internal} diff --git a/tests/testthat/test-compactify.R b/tests/testthat/test-compactify.R index 2eed5025f..3e6139303 100644 --- a/tests/testthat/test-compactify.R +++ b/tests/testthat/test-compactify.R @@ -120,37 +120,54 @@ test_that("compactify does not alter the default clobberable and observed versio expect_identical(ea_true$versions_end, ea_false$versions_end) }) +quantile_pred_once <- function(estimates_vec, levels_vec) { + hardhat::quantile_pred(t(as.matrix(estimates_vec)), levels_vec) +} test_that("compactify works on distributions", { + skip("Until #611 is merged or hardhat/epipredict is patched") forecasts <- tibble( ahead = 2L, geo_value = "ak", target_end_date = as.Date("2020-01-19"), - forecast_date = as.Date("2020-01-17") + 1:8, + forecast_date = as.Date("2020-01-17") + 1:6, actual = 25, .pred_distn = c( - epipredict::dist_quantiles(c(1, 5, 9), c(0.1, 0.5, 0.9)), - epipredict::dist_quantiles(c(1, NA, 9), c(0.1, 0.5, 0.9)), # single NA in quantiles - 
epipredict::dist_quantiles(c(NA, NA, NA), c(0.1, 0.5, 0.9)), # all NAs in quantiles - distributional::dist_missing(1), # the actual `NA` for distributions - epipredict::dist_quantiles(c(1, 5, 9), c(0.1, 0.5, 0.9)), # and back - epipredict::dist_quantiles(c(3, 5, 9), c(0.1, 0.5, 0.9)), # change quantile - epipredict::dist_quantiles(c(3, 5, 9), c(0.2, 0.5, 0.8)), # change level - epipredict::dist_quantiles(c(3, 5, 9), c(0.2, 0.5, 0.8)) # LOCF + quantile_pred_once(c(1, 5, 9), c(0.1, 0.5, 0.9)), + quantile_pred_once(c(1, NA, 9), c(0.1, 0.5, 0.9)), # single NA in quantiles + quantile_pred_once(c(NA, NA, NA), c(0.1, 0.5, 0.9)), # all NAs in quantiles (hardhat+vctrs+epipredict treats as missing) + quantile_pred_once(c(1, 5, 9), c(0.1, 0.5, 0.9)), # and back + quantile_pred_once(c(3, 5, 9), c(0.1, 0.5, 0.9)), # change quantile + quantile_pred_once(c(3, 5, 9), c(0.1, 0.5, 0.9)) # LOCF ) ) expect_equal( forecasts %>% - as_epi_archive( - other_keys = "ahead", time_value = target_end_date, version = forecast_date, - compactify = TRUE - ) %>% + as_epi_archive(other_keys = "ahead", time_value = target_end_date, version = forecast_date) %>% .$DT %>% as.data.frame() %>% as_tibble(), - forecasts[-8, ] %>% + forecasts[-6, ] %>% rename(time_value = target_end_date, version = forecast_date) ) }) +test_that("epix_merge works with distributions", { + skip("Until hardhat/epipredict is patched") + forecasts1ea <- tibble( + ahead = 2L, + geo_value = "ak", + target_end_date = as.Date("2020-01-19"), + forecast_date = as.Date("2020-01-17") + 1, + .pred_distn1 = quantile_pred_once(c(1, 5, 9), c(0.1, 0.5, 0.9)) + ) %>% as_epi_archive(other_keys = "ahead", time_value = target_end_date, version = forecast_date) + forecasts2ea <- tibble( + ahead = 2L, + geo_value = "ak", + target_end_date = as.Date("2020-01-19"), + forecast_date = as.Date("2020-01-17") + 2, + .pred_distn2 = quantile_pred_once(c(2, 4, 8), c(0.1, 0.5, 0.9)) + ) %>% as_epi_archive(other_keys = "ahead", time_value = target_end_date, 
version = forecast_date) + forecasts12ea <- epix_merge(forecasts1ea, forecasts2ea, sync = "locf") +}) test_that("Large compactify_abs_tol does not drop edf keys", { # several epikeytimes, each with a single version From 4e00ac39ef59c1f298e432b697285d574d2fec5a Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Thu, 20 Mar 2025 09:21:20 -0700 Subject: [PATCH 019/117] Fix missing epiprocess::: in internal examples; addr line length lint --- R/utils.R | 8 ++++---- man/format_chr_with_quotes.Rd | 8 ++++---- tests/testthat/test-compactify.R | 2 +- 3 files changed, 9 insertions(+), 9 deletions(-) diff --git a/R/utils.R b/R/utils.R index dfbfc8ac3..972170998 100644 --- a/R/utils.R +++ b/R/utils.R @@ -109,11 +109,11 @@ format_chr_deparse <- function(x) { #' @return chr; same `length` as `x` if `x` had nonzero length; value of `empty` otherwise #' #' @examples -#' cli::cli_inform('{format_chr_with_quotes("x")}') -#' cli::cli_inform('{format_chr_with_quotes(c("x","y"))}') +#' cli::cli_inform('{epiprocess:::format_chr_with_quotes("x")}') +#' cli::cli_inform('{epiprocess:::format_chr_with_quotes(c("x","y"))}') #' nms <- c("x", "\"Total Cases\"") -#' cli::cli_inform("{format_chr_with_quotes(nms)}") -#' cli::cli_inform("{format_chr_with_quotes(character())}") +#' cli::cli_inform("{epiprocess:::format_chr_with_quotes(nms)}") +#' cli::cli_inform("{epiprocess:::format_chr_with_quotes(character())}") #' #' @keywords internal format_chr_with_quotes <- function(x, empty = "*none*") { diff --git a/man/format_chr_with_quotes.Rd b/man/format_chr_with_quotes.Rd index 76bbf0b85..555dd402c 100644 --- a/man/format_chr_with_quotes.Rd +++ b/man/format_chr_with_quotes.Rd @@ -20,11 +20,11 @@ could copy-paste to debug, look into \code{format_deparse} (note that this collapses into a single string). 
} \examples{ -cli::cli_inform('{format_chr_with_quotes("x")}') -cli::cli_inform('{format_chr_with_quotes(c("x","y"))}') +cli::cli_inform('{epiprocess:::format_chr_with_quotes("x")}') +cli::cli_inform('{epiprocess:::format_chr_with_quotes(c("x","y"))}') nms <- c("x", "\"Total Cases\"") -cli::cli_inform("{format_chr_with_quotes(nms)}") -cli::cli_inform("{format_chr_with_quotes(character())}") +cli::cli_inform("{epiprocess:::format_chr_with_quotes(nms)}") +cli::cli_inform("{epiprocess:::format_chr_with_quotes(character())}") } \keyword{internal} diff --git a/tests/testthat/test-compactify.R b/tests/testthat/test-compactify.R index 3e6139303..229af8453 100644 --- a/tests/testthat/test-compactify.R +++ b/tests/testthat/test-compactify.R @@ -134,7 +134,7 @@ test_that("compactify works on distributions", { .pred_distn = c( quantile_pred_once(c(1, 5, 9), c(0.1, 0.5, 0.9)), quantile_pred_once(c(1, NA, 9), c(0.1, 0.5, 0.9)), # single NA in quantiles - quantile_pred_once(c(NA, NA, NA), c(0.1, 0.5, 0.9)), # all NAs in quantiles (hardhat+vctrs+epipredict treats as missing) + quantile_pred_once(c(NA, NA, NA), c(0.1, 0.5, 0.9)), # all NAs in quantiles quantile_pred_once(c(1, 5, 9), c(0.1, 0.5, 0.9)), # and back quantile_pred_once(c(3, 5, 9), c(0.1, 0.5, 0.9)), # change quantile quantile_pred_once(c(3, 5, 9), c(0.1, 0.5, 0.9)) # LOCF From 353848fc2879b5ea1ccf76bb8081067de04265f4 Mon Sep 17 00:00:00 2001 From: Dmitry Shemetov Date: Wed, 26 Mar 2025 18:08:12 -0700 Subject: [PATCH 020/117] feat: add is_epi_archive --- R/archive.R | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/R/archive.R b/R/archive.R index 922f77835..d298a8e43 100644 --- a/R/archive.R +++ b/R/archive.R @@ -808,3 +808,16 @@ clone.epi_archive <- function(x) { x$DT <- data.table::copy(x$DT) x } + +#' Test for `epi_archive` format +#' +#' @param x An object. +#' @return * Of `is_epi_archive`: `TRUE` if the object inherits from `epi_archive`, +#' otherwise `FALSE`. 
+#' +#' @rdname epi_archive +#' @order 1 +#' @export +is_epi_archive <- function(x) { + inherits(x, "epi_archive") +} From faf17fc4745a17b731a7c86c209bceb12bbea7b2 Mon Sep 17 00:00:00 2001 From: dshemetov Date: Thu, 27 Mar 2025 01:10:55 +0000 Subject: [PATCH 021/117] docs: document (GHA) --- NAMESPACE | 1 + man/epi_archive.Rd | 12 ++++++++++-- 2 files changed, 11 insertions(+), 2 deletions(-) diff --git a/NAMESPACE b/NAMESPACE index 2f97e5b3d..6f3ef6a11 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -86,6 +86,7 @@ export(group_modify) export(growth_rate) export(growth_rate_params) export(guess_period) +export(is_epi_archive) export(is_epi_df) export(is_grouped_epi_archive) export(key_colnames) diff --git a/man/epi_archive.Rd b/man/epi_archive.Rd index f98666e03..2ed9d15f8 100644 --- a/man/epi_archive.Rd +++ b/man/epi_archive.Rd @@ -1,12 +1,15 @@ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/archive.R -\name{epi_archive} +\name{is_epi_archive} +\alias{is_epi_archive} \alias{epi_archive} \alias{new_epi_archive} \alias{validate_epi_archive} \alias{as_epi_archive} -\title{\code{epi_archive} object} +\title{Test for \code{epi_archive} format} \usage{ +is_epi_archive(x) + new_epi_archive( x, geo_type, @@ -92,6 +95,11 @@ used to rename columns.} example \code{version = release_date}} } \value{ +\itemize{ +\item Of \code{is_epi_archive}: \code{TRUE} if the object inherits from \code{epi_archive}, +otherwise \code{FALSE}. +} + An \code{epi_archive} object. } \description{ From 2a0097535563f55706bf358db784be741c8cfec0 Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Fri, 28 Mar 2025 12:31:54 -0700 Subject: [PATCH 022/117] bump version, mention removed autoplot arg --- DESCRIPTION | 4 ++-- NEWS.md | 2 ++ 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/DESCRIPTION b/DESCRIPTION index 57981a195..18ae8e424 100755 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Type: Package Package: epiprocess Title: Tools for basic signal processing in epidemiology -Version: 0.11.0 +Version: 0.11.1 Authors@R: c( person("Jacob", "Bien", role = "ctb"), person("Logan", "Brooks", , "lcbrooks+github@andrew.cmu.edu", role = c("aut", "cre")), @@ -46,7 +46,7 @@ Imports: dplyr (>= 1.1.0), ggplot2, glue, - lifecycle (>= 1.0.1), + lifecycle, lubridate, magrittr, pkgconfig, diff --git a/NEWS.md b/NEWS.md index 3ac814aa2..5ec1522ee 100644 --- a/NEWS.md +++ b/NEWS.md @@ -42,6 +42,8 @@ Pre-1.0.0 numbering scheme: 0.x will indicate releases, while 0.x.y will indicat `new_epi_archive()`. - `validate_epi_archive()` now follows the validator convention of operating on an "unvalidated" `epi_archive` (from `new_epi_archive`) rather than arguments. +- `autoplot.epi_df()` argument `.max_facets` has been deprecated in favor of + `.facet_filter` which allows explicit selection of facets to show. ## Improvements - `revision_summary()` now supports all `time_type`s. From f336e097db2f3d2360fa06cf1df4033b348f7ff8 Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Fri, 28 Mar 2025 12:32:27 -0700 Subject: [PATCH 023/117] usethis::use_lifecycle() to add necessary deprecation svgs --- man/figures/lifecycle-deprecated.svg | 21 +++++++++++++++++++ man/figures/lifecycle-experimental.svg | 21 +++++++++++++++++++ man/figures/lifecycle-stable.svg | 29 ++++++++++++++++++++++++++ man/figures/lifecycle-superseded.svg | 21 +++++++++++++++++++ 4 files changed, 92 insertions(+) create mode 100644 man/figures/lifecycle-deprecated.svg create mode 100644 man/figures/lifecycle-experimental.svg create mode 100644 man/figures/lifecycle-stable.svg create mode 100644 man/figures/lifecycle-superseded.svg diff --git a/man/figures/lifecycle-deprecated.svg b/man/figures/lifecycle-deprecated.svg new file mode 100644 index 000000000..b61c57c3f --- /dev/null +++ b/man/figures/lifecycle-deprecated.svg @@ -0,0 +1,21 @@ + + lifecycle: deprecated + + + + + + + + + + + + + + + lifecycle + + deprecated + + diff --git a/man/figures/lifecycle-experimental.svg b/man/figures/lifecycle-experimental.svg new file mode 100644 index 000000000..5d88fc2c6 --- /dev/null +++ b/man/figures/lifecycle-experimental.svg @@ -0,0 +1,21 @@ + + lifecycle: experimental + + + + + + + + + + + + + + + lifecycle + + experimental + + diff --git a/man/figures/lifecycle-stable.svg b/man/figures/lifecycle-stable.svg new file mode 100644 index 000000000..9bf21e76b --- /dev/null +++ b/man/figures/lifecycle-stable.svg @@ -0,0 +1,29 @@ + + lifecycle: stable + + + + + + + + + + + + + + + + lifecycle + + + + stable + + + diff --git a/man/figures/lifecycle-superseded.svg b/man/figures/lifecycle-superseded.svg new file mode 100644 index 000000000..db8d757f7 --- /dev/null +++ b/man/figures/lifecycle-superseded.svg @@ -0,0 +1,21 @@ + + lifecycle: superseded + + + + + + + + + + + + + + + lifecycle + + superseded + + From 067ed22732e0a802d6a64306d573a11a57cc6841 Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Fri, 28 Mar 2025 12:32:51 -0700 Subject: [PATCH 024/117] allow autoplot to filter facets instead of arbitrary limit --- R/autoplot.R | 48 ++++++++++++++++++++++++++++++------------ man/autoplot.epi_df.Rd | 23 +++++++++++++++++--- 2 files changed, 55 insertions(+), 16 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index ecfe5f1c9..69a956add 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -19,8 +19,17 @@ #' @param .base_color Lines will be shown with this color. For example, with a #' single numeric variable and faceting by `geo_value`, all locations would #' share the same color line. -#' @param .max_facets Cut down of the number of facets displayed. Especially -#' useful for testing when there are many `geo_value`'s or keys. +#' @param .max_facets `r lifecycle::badge("deprecated")` +#' @param .facet_filter Select which facets will be displayed. Especially +#' useful for when there are many `geo_value`'s or keys. This is a +#' <[`rlang`][args_data_masking]> expression along the lines of `[dplyr::filter()]`. +#' However, it must be a single expression combined with the `&` operator. This +#' contrasts to the typical use case which allows multiple comma-separated expressions +#' which are implicitly combined with `&`. When multiple variables are selected +#' with `...`, their names can be filtered in combination with other factors +#' by using `.response_name`. See the examples below. 
+#' +#' #' #' @return A ggplot object #' @export @@ -41,16 +50,33 @@ #' autoplot(cases_deaths_subset, case_rate_7d_av, #' .base_color = "red", .facet_by = "geo_value" #' ) +#' +#' # filter to only some facets, must be explicitly combined +#' autoplot(cases_deaths_subset, cases, death_rate_7d_av, +#' .facet_by = "all", +#' .facet_filter = (.response_name == "cases" & geo_value %in% c("tx", "pa")) | +#' (.response_name == "death_rate_7d_av" & +#' geo_value %in% c("ca", "fl", "ga", "ny")) +#' ) autoplot.epi_df <- function( object, ..., .color_by = c("all_keys", "geo_value", "other_keys", ".response", "all", "none"), .facet_by = c(".response", "other_keys", "all_keys", "geo_value", "all", "none"), .base_color = "#3A448F", - .max_facets = Inf) { + .facet_filter = NULL, + .max_facets = deprecated() +) { .color_by <- rlang::arg_match(.color_by) .facet_by <- rlang::arg_match(.facet_by) + .facet_filter <- rlang::enquo(.facet_filter) - assert(anyInfinite(.max_facets), checkInt(.max_facets), combine = "or") + if (lifecycle::is_present(.max_facets)) { + lifecycle::deprecate_warn( + "0.11.1", + "autoplot.epi_df(.max_facets = )", + "autoplot.epi_df(.facet_filter = )" + ) + } assert_character(.base_color, len = 1) key_cols <- key_colnames(object) @@ -126,15 +152,11 @@ autoplot.epi_df <- function( ) ) - if (.max_facets < Inf && ".facets" %in% names(object)) { - n_facets <- nlevels(object$.facets) - if (n_facets > .max_facets) { - top_n <- levels(as.factor(object$.facets))[seq_len(.max_facets)] - object <- dplyr::filter(object, .data$.facets %in% top_n) %>% - dplyr::mutate(.facets = droplevels(.data$.facets)) - if (".colours" %in% names(object)) { - object <- dplyr::mutate(object, .colours = droplevels(.data$.colours)) - } + if (!rlang::quo_is_null(.facet_filter) && ".facets" %in% names(object)) { + object <- dplyr::filter(object, !!.facet_filter) %>% + dplyr::mutate(.facets = droplevels(.data$.facets)) + if (".colours" %in% names(object)) { + object <- dplyr::mutate(object, 
.colours = droplevels(.data$.colours)) } } diff --git a/man/autoplot.epi_df.Rd b/man/autoplot.epi_df.Rd index d53335c14..427c372c2 100644 --- a/man/autoplot.epi_df.Rd +++ b/man/autoplot.epi_df.Rd @@ -10,7 +10,8 @@ .color_by = c("all_keys", "geo_value", "other_keys", ".response", "all", "none"), .facet_by = c(".response", "other_keys", "all_keys", "geo_value", "all", "none"), .base_color = "#3A448F", - .max_facets = Inf + .facet_filter = NULL, + .max_facets = deprecated() ) } \arguments{ @@ -40,8 +41,16 @@ each numeric variable on a separate facet} single numeric variable and faceting by \code{geo_value}, all locations would share the same color line.} -\item{.max_facets}{Cut down of the number of facets displayed. Especially -useful for testing when there are many \code{geo_value}'s or keys.} +\item{.facet_filter}{Select which facets will be displayed. Especially +useful for when there are many \code{geo_value}'s or keys. This is a +<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \verb{[dplyr::filter()]}. +However, it must be a single expression combined with the \code{&} operator. This +contrasts to the typical use case which allows multiple comma-separated expressions +which are implicitly combined with \code{&}. When multiple variables are selected +with \code{...}, their names can be filtered in combination with other factors +by using \code{.response_name}. 
See the examples below.} + +\item{.max_facets}{\ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#deprecated}{\figure{lifecycle-deprecated.svg}{options: alt='[Deprecated]'}}}{\strong{[Deprecated]}}} } \value{ A ggplot object @@ -65,4 +74,12 @@ autoplot(cases_deaths_subset, case_rate_7d_av, autoplot(cases_deaths_subset, case_rate_7d_av, .base_color = "red", .facet_by = "geo_value" ) + +# filter to only some facets, must be explicitly combined +autoplot(cases_deaths_subset, cases, death_rate_7d_av, + .facet_by = "all", + .facet_filter = (.response_name == "cases" & geo_value \%in\% c("tx", "pa")) | + (.response_name == "death_rate_7d_av" & + geo_value \%in\% c("ca", "fl", "ga", "ny")) +) } From 74e08432c20e4639669f50ae705e4c8d6f768031 Mon Sep 17 00:00:00 2001 From: dajmcdon Date: Fri, 28 Mar 2025 19:42:57 +0000 Subject: [PATCH 025/117] style: styler (GHA) --- R/autoplot.R | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index 69a956add..95b8c5899 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -21,15 +21,15 @@ #' share the same color line. #' @param .max_facets `r lifecycle::badge("deprecated")` #' @param .facet_filter Select which facets will be displayed. Especially -#' useful for when there are many `geo_value`'s or keys. This is a +#' useful for when there are many `geo_value`'s or keys. This is a #' <[`rlang`][args_data_masking]> expression along the lines of `[dplyr::filter()]`. #' However, it must be a single expression combined with the `&` operator. This #' contrasts to the typical use case which allows multiple comma-separated expressions #' which are implicitly combined with `&`. When multiple variables are selected #' with `...`, their names can be filtered in combination with other factors #' by using `.response_name`. See the examples below. 
-#' -#' +#' +#' #' #' @return A ggplot object #' @export @@ -50,12 +50,12 @@ #' autoplot(cases_deaths_subset, case_rate_7d_av, #' .base_color = "red", .facet_by = "geo_value" #' ) -#' +#' #' # filter to only some facets, must be explicitly combined #' autoplot(cases_deaths_subset, cases, death_rate_7d_av, -#' .facet_by = "all", +#' .facet_by = "all", #' .facet_filter = (.response_name == "cases" & geo_value %in% c("tx", "pa")) | -#' (.response_name == "death_rate_7d_av" & +#' (.response_name == "death_rate_7d_av" & #' geo_value %in% c("ca", "fl", "ga", "ny")) #' ) autoplot.epi_df <- function( @@ -64,16 +64,15 @@ autoplot.epi_df <- function( .facet_by = c(".response", "other_keys", "all_keys", "geo_value", "all", "none"), .base_color = "#3A448F", .facet_filter = NULL, - .max_facets = deprecated() -) { + .max_facets = deprecated()) { .color_by <- rlang::arg_match(.color_by) .facet_by <- rlang::arg_match(.facet_by) .facet_filter <- rlang::enquo(.facet_filter) if (lifecycle::is_present(.max_facets)) { lifecycle::deprecate_warn( - "0.11.1", - "autoplot.epi_df(.max_facets = )", + "0.11.1", + "autoplot.epi_df(.max_facets = )", "autoplot.epi_df(.facet_filter = )" ) } From f640a2184b03f0ec1504467d212d7308dcd1ad53 Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Fri, 28 Mar 2025 16:04:33 -0700 Subject: [PATCH 026/117] basic idea works, adding docs --- R/autoplot.R | 186 +++++++++++++++++++++++++++++++++++++++------------ 1 file changed, 142 insertions(+), 44 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index 69a956add..ebc4f4fc8 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -16,9 +16,9 @@ #' * `none` - no coloring aesthetic is applied #' @param .facet_by Similar to `.color_by` except that the default is to display #' each numeric variable on a separate facet -#' @param .base_color Lines will be shown with this color. For example, with a -#' single numeric variable and faceting by `geo_value`, all locations would -#' share the same color line. 
+#' @param .base_color Lines will be shown with this color if `.color_by == "none"`. +#' For example, with a single numeric variable and faceting by `geo_value`, all +#' locations would share the same color line. #' @param .max_facets `r lifecycle::badge("deprecated")` #' @param .facet_filter Select which facets will be displayed. Especially #' useful for when there are many `geo_value`'s or keys. This is a @@ -68,7 +68,7 @@ autoplot.epi_df <- function( ) { .color_by <- rlang::arg_match(.color_by) .facet_by <- rlang::arg_match(.facet_by) - .facet_filter <- rlang::enquo(.facet_filter) + if (!rlang::is_quosure(.facet_filter)) .facet_filter <- rlang::enquo(.facet_filter) if (lifecycle::is_present(.max_facets)) { lifecycle::deprecate_warn( @@ -84,44 +84,14 @@ autoplot.epi_df <- function( geo_and_other_keys <- key_colnames(object, exclude = "time_value") # --- check for numeric variables - allowed <- purrr::map_lgl(object[non_key_cols], is.numeric) - allowed <- allowed[allowed] - if (length(allowed) == 0 && rlang::dots_n(...) == 0L) { - cli::cli_abort("No numeric variables were available to plot automatically.", - class = "epiprocess__no_numeric_vars_available" - ) - } - vars <- tidyselect::eval_select(rlang::expr(c(...)), object) - if (rlang::is_empty(vars)) { # find them automatically if unspecified - vars <- tidyselect::eval_select(names(allowed)[1], object) - cli::cli_warn( - "Plot variable was unspecified. Automatically selecting {.var {names(allowed)[1]}}.", - class = "epiprocess__unspecified_plot_var" - ) - } else { # if variables were specified, ensure that they are numeric - ok <- names(vars) %in% names(allowed) - if (!any(ok)) { - cli::cli_abort( - "None of the requested variables {.var {names(vars)}} are numeric.", - class = "epiprocess__all_requested_vars_not_numeric" - ) - } else if (!all(ok)) { - cli::cli_warn( - c( - "Only the requested variables {.var {names(vars)[ok]}} are numeric.", - i = "`autoplot()` cannot display {.var {names(vars)[!ok]}}." 
- ), - class = "epiprocess__some_requested_vars_not_numeric" - ) - vars <- vars[ok] - } - } + vars <- autoplot_check_viable_response_vars(object, ..., non_key_cols = non_key_cols) + nvars <- length(vars) # --- create a viable df to plot pos <- tidyselect::eval_select( rlang::expr(c("time_value", tidyselect::all_of(geo_and_other_keys), names(vars))), object ) - if (length(vars) > 1) { + if (nvars > 1) { object <- tidyr::pivot_longer( object[pos], tidyselect::all_of(names(vars)), values_to = ".response", @@ -132,22 +102,25 @@ autoplot.epi_df <- function( } all_keys <- rlang::syms(as.list(geo_and_other_keys)) other_keys <- rlang::syms(as.list(setdiff(geo_and_other_keys, "geo_value"))) - all_avail <- rlang::syms(as.list(c(geo_and_other_keys, ".response_name"))) + all_avail <- rlang::syms(as.list(c( + geo_and_other_keys, + if (nvars > 1) ".response_name" else NULL + ))) object <- object %>% dplyr::mutate( .colours = switch(.color_by, - all_keys = interaction(!!!all_keys, sep = "/"), + all_keys = interaction(!!!all_keys, sep = " / "), geo_value = .data$geo_value, - other_keys = interaction(!!!other_keys, sep = "/"), - all = interaction(!!!all_avail, sep = "/"), + other_keys = interaction(!!!other_keys, sep = " / "), + all = interaction(!!!all_avail, sep = " / "), NULL ), .facets = switch(.facet_by, - all_keys = interaction(!!!all_keys, sep = "/"), + all_keys = interaction(!!!all_keys, sep = " / "), geo_value = as.factor(.data$geo_value), - other_keys = interaction(!!!other_keys, sep = "/"), - all = interaction(!!!all_avail, sep = "/"), + other_keys = interaction(!!!other_keys, sep = " / "), + all = interaction(!!!all_avail, sep = " / "), NULL ) ) @@ -192,3 +165,128 @@ autoplot.epi_df <- function( } p } + +autoplot_check_viable_response_vars <- function( + object, ..., non_key_cols, call = caller_env() +) { + allowed <- purrr::map_lgl(object[non_key_cols], is.numeric) + allowed <- allowed[allowed] + if (length(allowed) == 0 && rlang::dots_n(...) 
== 0L) { + cli::cli_abort("No numeric variables were available to plot automatically.", + class = "epiprocess__no_numeric_vars_available", + call = call + ) + } + vars <- tidyselect::eval_select(rlang::expr(c(...)), object) + if (rlang::is_empty(vars)) { # find them automatically if unspecified + vars <- tidyselect::eval_select(names(allowed)[1], object) + cli::cli_warn( + "Plot variable was unspecified. Automatically selecting {.var {names(allowed)[1]}}.", + class = "epiprocess__unspecified_plot_var", + call = call + ) + } else { # if variables were specified, ensure that they are numeric + ok <- names(vars) %in% names(allowed) + if (!any(ok)) { + cli::cli_abort( + "None of the requested variables {.var {names(vars)}} are numeric.", + class = "epiprocess__all_requested_vars_not_numeric", + call = call + ) + } else if (!all(ok)) { + cli::cli_warn( + c( + "Only the requested variables {.var {names(vars)[ok]}} are numeric.", + i = "`autoplot()` cannot display {.var {names(vars)[!ok]}}." + ), + class = "epiprocess__some_requested_vars_not_numeric", + call = call + ) + vars <- vars[ok] + } + } + vars +} + + + +autoplot.epi_archive <- function(object, ..., + .base_color = "black", + .versions = NULL, + .facet_filter = NULL) { + .facet_filter <- rlang::enquo(.facet_filter) + time_type <- object$time_type + if (time_type == "custom") { + cli_abort( + "This `epi_archive` has custom `time_type`. 
This is currently unsupported.", + class = "epiprocess__autoplot_archive_custom_time_type" + ) + } + + max_version <- max(object$DT$version) + min_version <- min(object$DT$version) + + tt_lookup <- c("day" = "day", "week" = "week", "yearmonth" = "month") + .versions <- .versions %||% ifelse(time_type == "integer", 1L, unname(tt_lookup[time_type])) + if (is.character(.versions) || .versions == 1L) { + .versions <- seq(min_version, max_version - 1, by = .versions) + } else if (is(.versions, "Date") || is.numeric(.versions)) { + .versions <- .versions[.versions >= min_version & .versions <= max_version] + } else { + cli_abort( + "Requested `.versions` don't appear to match the available `time_type`.", + class = "epiprocess__autoplot_archive_bad_versions" + ) + } + + + finalized <- epix_as_of(object, max_version) + key_cols <- key_colnames(finalized) + non_key_cols <- setdiff(names(finalized), key_cols) + vars <- autoplot_check_viable_response_vars(finalized, ..., non_key_cols = non_key_cols) + nvars <- length(vars) + + bp <- autoplot( + finalized, ..., .base_color = .base_color, .facet_by = "all", + .facet_filter = .facet_filter, .color_by = "none" + ) + geo_and_other_keys <- key_colnames(object, exclude = c("time_value", "version")) + all_avail <- rlang::syms(as.list(c( + geo_and_other_keys, + if (nvars > 1) ".response_name" else NULL + ))) + + snapshots <- purrr::map( + .versions, + function(v) {dplyr::mutate(epix_as_of(object, v), version = v)} + ) %>% + purrr::list_rbind() %>% + dplyr::mutate(.facets = interaction(!!!all_avail, sep = " / ")) + + if (nvars > 1) { + snapshots <- tidyr::pivot_longer( + snapshots, tidyselect::all_of(names(vars)), + values_to = ".response", + names_to = ".response_name" + ) + } else { + snapshots <- dplyr::rename(snapshots, .response := !!names(vars)) # nolint: object_usage_linter + } + snapshots <- dplyr::filter(snapshots, !is.na(.response)) + + bp <- bp + + ggplot2::geom_line( + data = snapshots, + mapping = ggplot2::aes(y = 
.response, color = version, group = factor(version)) + ) + + if (is(.versions, "Date")) { + bp <- bp + ggplot2::scale_color_viridis_c(name = "Version", trans = "date") + } else { + bp <- bp + ggplot2::scale_color_viridis_c(name = "Version") + } + + # make the finalized layer last + bp$layers <- rev(bp$layers) + bp +} From 9ffd7679b90f14a26c3d10519676e175b1c631f8 Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Fri, 28 Mar 2025 17:13:12 -0700 Subject: [PATCH 027/117] bugs squashed --- NAMESPACE | 1 + R/autoplot.R | 42 ++++++++++++++++++++----- man/autoplot.epi_archive.Rd | 61 +++++++++++++++++++++++++++++++++++++ man/autoplot.epi_df.Rd | 8 ++--- 4 files changed, 101 insertions(+), 11 deletions(-) create mode 100644 man/autoplot.epi_archive.Rd diff --git a/NAMESPACE b/NAMESPACE index 2f97e5b3d..b216cda72 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -19,6 +19,7 @@ S3method(as_epi_df,tbl_df) S3method(as_epi_df,tbl_ts) S3method(as_tibble,epi_df) S3method(as_tsibble,epi_df) +S3method(autoplot,epi_archive) S3method(autoplot,epi_df) S3method(clone,epi_archive) S3method(clone,grouped_epi_archive) diff --git a/R/autoplot.R b/R/autoplot.R index ebc4f4fc8..90f33437f 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -31,7 +31,7 @@ #' #' #' -#' @return A ggplot object +#' @return A [ggplot2::ggplot] object #' @export #' #' @examples @@ -68,7 +68,7 @@ autoplot.epi_df <- function( ) { .color_by <- rlang::arg_match(.color_by) .facet_by <- rlang::arg_match(.facet_by) - if (!rlang::is_quosure(.facet_filter)) .facet_filter <- rlang::enquo(.facet_filter) + .facet_filter <- rlang::enquo(.facet_filter) if (lifecycle::is_present(.max_facets)) { lifecycle::deprecate_warn( @@ -210,11 +210,37 @@ autoplot_check_viable_response_vars <- function( +#' Automatically plot an epi_archive +#' +#' @param object An `epi_archive` +#' @inheritParams autoplot.epi_df +#' @param .versions Select which versions will be displayed. 
By default, +#' a separate line will be shown with the data as it would have appeared on +#' every day in the archive. This can sometimes become overwhelming. For +#' example, daily data would display a line for what the data would have looked +#' like on every single day. To override this, you can select specific dates, +#' by passing a vector of values here. Alternatively, a sequence can be +#' automatically created by passing a string like `"2 weeks"` or `"month"`. +#' For time types where the `time_value` is a date object, any string that +#' is interpretable by `[base::seq.Date()]` is allowed. +#' +#' For `time_type = "integer"`, an integer larger than 1 will give a subset +#' of versions. +#' +#' @return A [ggplot2::ggplot] object +#' @export +#' +#' @examples +#' autoplot(archive_cases_dv_subset, percent_cli, .versions = "week") +#' autoplot(archive_cases_dv_subset_all_states, percent_cli, +#' .versions = "week", +#' .facet_filter = geo_value %in% c("or", "az", "vt", "ms") +#' ) +#' autoplot(archive_cases_dv_subset, percent_cli, .versions = "month", .facet_filter = geo_value == "ca") autoplot.epi_archive <- function(object, ..., .base_color = "black", .versions = NULL, .facet_filter = NULL) { - .facet_filter <- rlang::enquo(.facet_filter) time_type <- object$time_type if (time_type == "custom") { cli_abort( @@ -228,7 +254,8 @@ autoplot.epi_archive <- function(object, ..., tt_lookup <- c("day" = "day", "week" = "week", "yearmonth" = "month") .versions <- .versions %||% ifelse(time_type == "integer", 1L, unname(tt_lookup[time_type])) - if (is.character(.versions) || .versions == 1L) { + if (is.character(.versions) || length(.versions) == 1L) { + if (is.numeric(.versions)) .versions <- round(abs(.versions)) .versions <- seq(min_version, max_version - 1, by = .versions) } else if (is(.versions, "Date") || is.numeric(.versions)) { .versions <- .versions[.versions >= min_version & .versions <= max_version] @@ -246,9 +273,9 @@ autoplot.epi_archive <- 
function(object, ..., vars <- autoplot_check_viable_response_vars(finalized, ..., non_key_cols = non_key_cols) nvars <- length(vars) - bp <- autoplot( + bp <- autoplot.epi_df( finalized, ..., .base_color = .base_color, .facet_by = "all", - .facet_filter = .facet_filter, .color_by = "none" + .facet_filter = {{ .facet_filter }}, .color_by = "none" ) geo_and_other_keys <- key_colnames(object, exclude = c("time_value", "version")) all_avail <- rlang::syms(as.list(c( @@ -272,7 +299,8 @@ autoplot.epi_archive <- function(object, ..., } else { snapshots <- dplyr::rename(snapshots, .response := !!names(vars)) # nolint: object_usage_linter } - snapshots <- dplyr::filter(snapshots, !is.na(.response)) + snapshots <- snapshots %>% + dplyr::filter(!is.na(.response), .data$.facets %in% unique(bp$data$.facets)) bp <- bp + ggplot2::geom_line( diff --git a/man/autoplot.epi_archive.Rd b/man/autoplot.epi_archive.Rd new file mode 100644 index 000000000..d68697bec --- /dev/null +++ b/man/autoplot.epi_archive.Rd @@ -0,0 +1,61 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/autoplot.R +\name{autoplot.epi_archive} +\alias{autoplot.epi_archive} +\title{Automatically plot an epi_archive} +\usage{ +\method{autoplot}{epi_archive}( + object, + ..., + .base_color = "black", + .versions = NULL, + .facet_filter = NULL +) +} +\arguments{ +\item{object}{An \code{epi_archive}} + +\item{...}{<\code{\link[=dplyr_tidy_select]{tidy-select}}> One or more unquoted +expressions separated by commas. Variable names can be used as if they +were positions in the data frame, so expressions like \code{x:y} can +be used to select a range of variables.} + +\item{.base_color}{Lines will be shown with this color if \code{.color_by == "none"}. +For example, with a single numeric variable and faceting by \code{geo_value}, all +locations would share the same color line.} + +\item{.versions}{Select which versions will be displayed. 
By default, +a separate line will be shown with the data as it would have appeared on +every day in the archive. This can sometimes become overwhelming. For +example, daily data would display a line for what the data would have looked +like on every single day. To override this, you can select specific dates, +by passing a vector of values here. Alternatively, a sequence can be +automatically created by passing a string like \code{"2 weeks"} or \code{"month"}. +For time types where the \code{time_value} is a date object, any string that +is interpretable by \verb{[base::seq.Date()]} is allowed. + +For \code{time_type = "integer"}, an integer larger than 1 will give a subset +of versions.} + +\item{.facet_filter}{Select which facets will be displayed. Especially +useful for when there are many \code{geo_value}'s or keys. This is a +<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \verb{[dplyr::filter()]}. +However, it must be a single expression combined with the \code{&} operator. This +contrasts to the typical use case which allows multiple comma-separated expressions +which are implicitly combined with \code{&}. When multiple variables are selected +with \code{...}, their names can be filtered in combination with other factors +by using \code{.response_name}. 
See the examples below.} +} +\value{ +A \link[ggplot2:ggplot]{ggplot2::ggplot} object +} +\description{ +Automatically plot an epi_archive +} +\examples{ +autoplot(archive_cases_dv_subset, percent_cli, .versions = "week") +autoplot(archive_cases_dv_subset_all_states, percent_cli, + .versions = "week", + .facet_filter = geo_value \%in\% c("or", "az", "vt", "ms") +) +} diff --git a/man/autoplot.epi_df.Rd b/man/autoplot.epi_df.Rd index 427c372c2..e558148e5 100644 --- a/man/autoplot.epi_df.Rd +++ b/man/autoplot.epi_df.Rd @@ -37,9 +37,9 @@ including the \code{geo_value} \item{.facet_by}{Similar to \code{.color_by} except that the default is to display each numeric variable on a separate facet} -\item{.base_color}{Lines will be shown with this color. For example, with a -single numeric variable and faceting by \code{geo_value}, all locations would -share the same color line.} +\item{.base_color}{Lines will be shown with this color if \code{.color_by == "none"}. +For example, with a single numeric variable and faceting by \code{geo_value}, all +locations would share the same color line.} \item{.facet_filter}{Select which facets will be displayed. Especially useful for when there are many \code{geo_value}'s or keys. This is a @@ -53,7 +53,7 @@ by using \code{.response_name}. See the examples below.} \item{.max_facets}{\ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#deprecated}{\figure{lifecycle-deprecated.svg}{options: alt='[Deprecated]'}}}{\strong{[Deprecated]}}} } \value{ -A ggplot object +A \link[ggplot2:ggplot]{ggplot2::ggplot} object } \description{ Automatically plot an epi_df From 6e9fbeda99ec6ad70a1a06589a40ddd9ca03d1ae Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Fri, 28 Mar 2025 17:22:34 -0700 Subject: [PATCH 028/117] rebuild docs --- R/autoplot.R | 7 ++++--- man/autoplot.epi_archive.Rd | 6 ++++-- man/autoplot.epi_df.Rd | 2 +- 3 files changed, 9 insertions(+), 6 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index 90f33437f..2f73f3196 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -22,7 +22,7 @@ #' @param .max_facets `r lifecycle::badge("deprecated")` #' @param .facet_filter Select which facets will be displayed. Especially #' useful for when there are many `geo_value`'s or keys. This is a -#' <[`rlang`][args_data_masking]> expression along the lines of `[dplyr::filter()]`. +#' <[`rlang`][args_data_masking]> expression along the lines of [dplyr::filter()]. #' However, it must be a single expression combined with the `&` operator. This #' contrasts to the typical use case which allows multiple comma-separated expressions #' which are implicitly combined with `&`. When multiple variables are selected @@ -222,7 +222,7 @@ autoplot_check_viable_response_vars <- function( #' by passing a vector of values here. Alternatively, a sequence can be #' automatically created by passing a string like `"2 weeks"` or `"month"`. #' For time types where the `time_value` is a date object, any string that -#' is interpretable by `[base::seq.Date()]` is allowed. +#' is interpretable by [seq.Date()] is allowed. #' #' For `time_type = "integer"`, an integer larger than 1 will give a subset #' of versions. 
@@ -236,7 +236,8 @@ autoplot_check_viable_response_vars <- function( #' .versions = "week", #' .facet_filter = geo_value %in% c("or", "az", "vt", "ms") #' ) -#' autoplot(archive_cases_dv_subset, percent_cli, .versions = "month", .facet_filter = geo_value == "ca") +#' autoplot(archive_cases_dv_subset, percent_cli, .versions = "month", +#' .facet_filter = geo_value == "ca") autoplot.epi_archive <- function(object, ..., .base_color = "black", .versions = NULL, diff --git a/man/autoplot.epi_archive.Rd b/man/autoplot.epi_archive.Rd index d68697bec..7578650d9 100644 --- a/man/autoplot.epi_archive.Rd +++ b/man/autoplot.epi_archive.Rd @@ -32,14 +32,14 @@ like on every single day. To override this, you can select specific dates, by passing a vector of values here. Alternatively, a sequence can be automatically created by passing a string like \code{"2 weeks"} or \code{"month"}. For time types where the \code{time_value} is a date object, any string that -is interpretable by \verb{[base::seq.Date()]} is allowed. +is interpretable by \code{\link[=seq.Date]{seq.Date()}} is allowed. For \code{time_type = "integer"}, an integer larger than 1 will give a subset of versions.} \item{.facet_filter}{Select which facets will be displayed. Especially useful for when there are many \code{geo_value}'s or keys. This is a -<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \verb{[dplyr::filter()]}. +<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \code{\link[dplyr:filter]{dplyr::filter()}}. However, it must be a single expression combined with the \code{&} operator. This contrasts to the typical use case which allows multiple comma-separated expressions which are implicitly combined with \code{&}. 
When multiple variables are selected @@ -58,4 +58,6 @@ autoplot(archive_cases_dv_subset_all_states, percent_cli, .versions = "week", .facet_filter = geo_value \%in\% c("or", "az", "vt", "ms") ) +autoplot(archive_cases_dv_subset, percent_cli, .versions = "month", + .facet_filter = geo_value == "ca") } diff --git a/man/autoplot.epi_df.Rd b/man/autoplot.epi_df.Rd index e558148e5..d618f74f1 100644 --- a/man/autoplot.epi_df.Rd +++ b/man/autoplot.epi_df.Rd @@ -43,7 +43,7 @@ locations would share the same color line.} \item{.facet_filter}{Select which facets will be displayed. Especially useful for when there are many \code{geo_value}'s or keys. This is a -<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \verb{[dplyr::filter()]}. +<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \code{\link[dplyr:filter]{dplyr::filter()}}. However, it must be a single expression combined with the \code{&} operator. This contrasts to the typical use case which allows multiple comma-separated expressions which are implicitly combined with \code{&}. When multiple variables are selected From ac6d69dc10800c9259a6365bd97228b3c7b18e9b Mon Sep 17 00:00:00 2001 From: dajmcdon Date: Sat, 29 Mar 2025 00:25:44 +0000 Subject: [PATCH 029/117] style: styler (GHA) --- R/autoplot.R | 66 ++++++++++++++++++++++++++++------------------------ 1 file changed, 35 insertions(+), 31 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index 1fc67e2bf..7baa2cebb 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -17,11 +17,11 @@ #' @param .facet_by Similar to `.color_by` except that the default is to display #' each numeric variable on a separate facet #' @param .base_color Lines will be shown with this color if `.color_by == "none"`. -#' For example, with a single numeric variable and faceting by `geo_value`, all +#' For example, with a single numeric variable and faceting by `geo_value`, all #' locations would share the same color line. 
#' @param .max_facets `r lifecycle::badge("deprecated")` #' @param .facet_filter Select which facets will be displayed. Especially -#' useful for when there are many `geo_value`'s or keys. This is a +#' useful for when there are many `geo_value`'s or keys. This is a #' <[`rlang`][args_data_masking]> expression along the lines of [dplyr::filter()]. #' However, it must be a single expression combined with the `&` operator. This #' contrasts to the typical use case which allows multiple comma-separated expressions @@ -102,7 +102,7 @@ autoplot.epi_df <- function( all_keys <- rlang::syms(as.list(geo_and_other_keys)) other_keys <- rlang::syms(as.list(setdiff(geo_and_other_keys, "geo_value"))) all_avail <- rlang::syms(as.list(c( - geo_and_other_keys, + geo_and_other_keys, if (nvars > 1) ".response_name" else NULL ))) @@ -166,14 +166,13 @@ autoplot.epi_df <- function( } autoplot_check_viable_response_vars <- function( - object, ..., non_key_cols, call = caller_env() -) { + object, ..., non_key_cols, call = caller_env()) { allowed <- purrr::map_lgl(object[non_key_cols], is.numeric) allowed <- allowed[allowed] if (length(allowed) == 0 && rlang::dots_n(...) == 0L) { cli::cli_abort("No numeric variables were available to plot automatically.", - class = "epiprocess__no_numeric_vars_available", - call = call + class = "epiprocess__no_numeric_vars_available", + call = call ) } vars <- tidyselect::eval_select(rlang::expr(c(...)), object) @@ -215,14 +214,14 @@ autoplot_check_viable_response_vars <- function( #' @inheritParams autoplot.epi_df #' @param .versions Select which versions will be displayed. By default, every #' a separate line will be shown with the data as it would have appeared on -#' every day in the archive. This can sometimes become overwhelming. For +#' every day in the archive. This can sometimes become overwhelming. For #' example, daily data would display a line for what the data would have looked -#' like on every single day. 
To override this, you can select specific dates, -#' by passing a vector of values here. Alternatively, a sequence can be +#' like on every single day. To override this, you can select specific dates, +#' by passing a vector of values here. Alternatively, a sequence can be #' automatically created by passing a string like `"2 weeks"` or `"month"`. #' For time types where the `time_value` is a date object, any string that #' is interpretable by [seq.Date()] is allowed. -#' +#' #' For `time_type = "integer"`, an integer larger than 1 will give a subset #' of versions. #' @@ -231,13 +230,15 @@ autoplot_check_viable_response_vars <- function( #' #' @examples #' autoplot(archive_cases_dv_subset, percent_cli, .versions = "week") -#' autoplot(archive_cases_dv_subset_all_states, percent_cli, -#' .versions = "week", +#' autoplot(archive_cases_dv_subset_all_states, percent_cli, +#' .versions = "week", #' .facet_filter = geo_value %in% c("or", "az", "vt", "ms") #' ) -#' autoplot(archive_cases_dv_subset, percent_cli, .versions = "month", -#' .facet_filter = geo_value == "ca") -autoplot.epi_archive <- function(object, ..., +#' autoplot(archive_cases_dv_subset, percent_cli, +#' .versions = "month", +#' .facet_filter = geo_value == "ca" +#' ) +autoplot.epi_archive <- function(object, ..., .base_color = "black", .versions = NULL, .facet_filter = NULL) { @@ -248,10 +249,10 @@ autoplot.epi_archive <- function(object, ..., class = "epiprocess__autoplot_archive_custom_time_type" ) } - + max_version <- max(object$DT$version) min_version <- min(object$DT$version) - + tt_lookup <- c("day" = "day", "week" = "week", "yearmonth" = "month") .versions <- .versions %||% ifelse(time_type == "integer", 1L, unname(tt_lookup[time_type])) if (is.character(.versions) || length(.versions) == 1L) { @@ -265,31 +266,34 @@ autoplot.epi_archive <- function(object, ..., class = "epiprocess__autoplot_archive_bad_versions" ) } - - + + finalized <- epix_as_of(object, max_version) key_cols <- 
key_colnames(finalized) non_key_cols <- setdiff(names(finalized), key_cols) vars <- autoplot_check_viable_response_vars(finalized, ..., non_key_cols = non_key_cols) nvars <- length(vars) - + bp <- autoplot.epi_df( - finalized, ..., .base_color = .base_color, .facet_by = "all", + finalized, ..., + .base_color = .base_color, .facet_by = "all", .facet_filter = {{ .facet_filter }}, .color_by = "none" ) geo_and_other_keys <- key_colnames(object, exclude = c("time_value", "version")) all_avail <- rlang::syms(as.list(c( - geo_and_other_keys, + geo_and_other_keys, if (nvars > 1) ".response_name" else NULL ))) - + snapshots <- purrr::map( .versions, - function(v) {dplyr::mutate(epix_as_of(object, v), version = v)} + function(v) { + dplyr::mutate(epix_as_of(object, v), version = v) + } ) %>% purrr::list_rbind() %>% dplyr::mutate(.facets = interaction(!!!all_avail, sep = " / ")) - + if (nvars > 1) { snapshots <- tidyr::pivot_longer( snapshots, tidyselect::all_of(names(vars)), @@ -301,19 +305,19 @@ autoplot.epi_archive <- function(object, ..., } snapshots <- snapshots %>% dplyr::filter(!is.na(.response), .data$.facets %in% unique(bp$data$.facets)) - - bp <- bp + + + bp <- bp + ggplot2::geom_line( - data = snapshots, + data = snapshots, mapping = ggplot2::aes(y = .response, color = version, group = factor(version)) ) - + if (is(.versions, "Date")) { bp <- bp + ggplot2::scale_color_viridis_c(name = "Version", trans = "date") } else { bp <- bp + ggplot2::scale_color_viridis_c(name = "Version") } - + # make the finalized layer last bp$layers <- rev(bp$layers) bp From 1115a9f0937820106f67ff263be9f3dfbafe410c Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Fri, 28 Mar 2025 17:29:33 -0700 Subject: [PATCH 030/117] needs prefix --- R/autoplot.R | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/R/autoplot.R b/R/autoplot.R index 1fc67e2bf..fc4d59785 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -308,7 +308,7 @@ autoplot.epi_archive <- function(object, ..., mapping = ggplot2::aes(y = .response, color = version, group = factor(version)) ) - if (is(.versions, "Date")) { + if (methods::is(.versions, "Date")) { bp <- bp + ggplot2::scale_color_viridis_c(name = "Version", trans = "date") } else { bp <- bp + ggplot2::scale_color_viridis_c(name = "Version") From 6e773748e207caa2b9d1b0051d46c0392d8dd737 Mon Sep 17 00:00:00 2001 From: dajmcdon Date: Sat, 29 Mar 2025 00:32:47 +0000 Subject: [PATCH 031/117] docs: document (GHA) --- man/autoplot.epi_archive.Rd | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/man/autoplot.epi_archive.Rd b/man/autoplot.epi_archive.Rd index 7578650d9..289c777b8 100644 --- a/man/autoplot.epi_archive.Rd +++ b/man/autoplot.epi_archive.Rd @@ -54,10 +54,12 @@ Automatically plot an epi_archive } \examples{ autoplot(archive_cases_dv_subset, percent_cli, .versions = "week") -autoplot(archive_cases_dv_subset_all_states, percent_cli, - .versions = "week", +autoplot(archive_cases_dv_subset_all_states, percent_cli, + .versions = "week", .facet_filter = geo_value \%in\% c("or", "az", "vt", "ms") ) -autoplot(archive_cases_dv_subset, percent_cli, .versions = "month", - .facet_filter = geo_value == "ca") +autoplot(archive_cases_dv_subset, percent_cli, + .versions = "month", + .facet_filter = geo_value == "ca" +) } From a1dfe1e306da1fc756ce4ee2e5ab97adfa587e4e Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Fri, 28 Mar 2025 17:41:13 -0700 Subject: [PATCH 032/117] add to pkgdown --- NEWS.md | 1 + _pkgdown.yml | 1 + 2 files changed, 2 insertions(+) diff --git a/NEWS.md b/NEWS.md index 5ec1522ee..ed8b8b8f6 100644 --- a/NEWS.md +++ b/NEWS.md @@ -50,6 +50,7 @@ Pre-1.0.0 numbering scheme: 0.x will indicate releases, while 0.x.y will indicat - The compactification tolerance setting now works with integer-type columns. - Various functions are now faster, using faster variants of core operations and avoiding reconstructing grouped `epi_df`s when unnecessary. +- Add `autoplot.epi_archive()` to display revision patterns. ## Bug fixes diff --git a/_pkgdown.yml b/_pkgdown.yml index 3742bd416..3fee4418f 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -74,6 +74,7 @@ reference: - print.epi_archive - clone - group_by.epi_archive + - autoplot.epi_archive - title: "`epi_archive` manipulation" desc: Functions operating on `epi_archive` objects. From ad0b1c5414b8998435a57638d2e7886794ee81a8 Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Fri, 28 Mar 2025 18:10:39 -0700 Subject: [PATCH 033/117] methods prefixing --- DESCRIPTION | 1 + R/autoplot.R | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/DESCRIPTION b/DESCRIPTION index 18ae8e424..76ca97a1c 100755 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -49,6 +49,7 @@ Imports: lifecycle, lubridate, magrittr, + methods, pkgconfig, purrr, rlang, diff --git a/R/autoplot.R b/R/autoplot.R index 15575db5e..4dad088e3 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -258,7 +258,7 @@ autoplot.epi_archive <- function(object, ..., if (is.character(.versions) || length(.versions) == 1L) { if (is.numeric(.versions)) .versions <- round(abs(.versions)) .versions <- seq(min_version, max_version - 1, by = .versions) - } else if (is(.versions, "Date") || is.numeric(.versions)) { + } else if (methods::is(.versions, "Date") || is.numeric(.versions)) { .versions <- .versions[.versions >= min_version & .versions <= max_version] } else { cli_abort( From 3ad49712ea581f818234363331951bc6b32cef00 Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Fri, 28 Mar 2025 12:31:54 -0700 Subject: [PATCH 034/117] bump version, mention removed autoplot arg --- DESCRIPTION | 2 +- NEWS.md | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/DESCRIPTION b/DESCRIPTION index abb0cf86c..b81625185 100755 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -46,7 +46,7 @@ Imports: dplyr (>= 1.1.0), ggplot2, glue, - lifecycle (>= 1.0.1), + lifecycle, lubridate, magrittr, pkgconfig, diff --git a/NEWS.md b/NEWS.md index 3ac814aa2..5ec1522ee 100644 --- a/NEWS.md +++ b/NEWS.md @@ -42,6 +42,8 @@ Pre-1.0.0 numbering scheme: 0.x will indicate releases, while 0.x.y will indicat `new_epi_archive()`. - `validate_epi_archive()` now follows the validator convention of operating on an "unvalidated" `epi_archive` (from `new_epi_archive`) rather than arguments. 
+- `autoplot.epi_df()` argument `.max_facets` has been deprecated in favor of + `.facet_filter` which allows explicit selection of facets to show. ## Improvements - `revision_summary()` now supports all `time_type`s. From 4fe293752a735d7b34c5fadea06e4c0382ef608a Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Fri, 28 Mar 2025 12:32:27 -0700 Subject: [PATCH 035/117] usethis::use_lifecycle() to add necessary deprecation svgs --- man/figures/lifecycle-deprecated.svg | 21 +++++++++++++++++++ man/figures/lifecycle-experimental.svg | 21 +++++++++++++++++++ man/figures/lifecycle-stable.svg | 29 ++++++++++++++++++++++++++ man/figures/lifecycle-superseded.svg | 21 +++++++++++++++++++ 4 files changed, 92 insertions(+) create mode 100644 man/figures/lifecycle-deprecated.svg create mode 100644 man/figures/lifecycle-experimental.svg create mode 100644 man/figures/lifecycle-stable.svg create mode 100644 man/figures/lifecycle-superseded.svg diff --git a/man/figures/lifecycle-deprecated.svg b/man/figures/lifecycle-deprecated.svg new file mode 100644 index 000000000..b61c57c3f --- /dev/null +++ b/man/figures/lifecycle-deprecated.svg @@ -0,0 +1,21 @@ [SVG content: "lifecycle: deprecated" badge] diff --git a/man/figures/lifecycle-experimental.svg b/man/figures/lifecycle-experimental.svg new file mode 100644 index 000000000..5d88fc2c6 --- /dev/null +++ b/man/figures/lifecycle-experimental.svg @@ -0,0 +1,21 @@ [SVG content: "lifecycle: experimental" badge] diff --git a/man/figures/lifecycle-stable.svg b/man/figures/lifecycle-stable.svg new file mode 100644 index 000000000..9bf21e76b --- /dev/null +++ b/man/figures/lifecycle-stable.svg @@ -0,0 +1,29 @@ [SVG content: "lifecycle: stable" badge] diff --git a/man/figures/lifecycle-superseded.svg b/man/figures/lifecycle-superseded.svg new file mode 100644 index 000000000..db8d757f7 --- /dev/null +++ 
b/man/figures/lifecycle-superseded.svg @@ -0,0 +1,21 @@ [SVG content: "lifecycle: superseded" badge] From 19ecb0a1e4d12092379391a008e7b6086389e244 Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Fri, 28 Mar 2025 12:32:51 -0700 Subject: [PATCH 036/117] allow autoplot to filter facets instead of arbitrary limit --- R/autoplot.R | 48 ++++++++++++++++++++++++++++++------------ man/autoplot.epi_df.Rd | 23 +++++++++++++++++--- 2 files changed, 55 insertions(+), 16 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index ecfe5f1c9..69a956add 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -19,8 +19,17 @@ #' @param .base_color Lines will be shown with this color. For example, with a #' single numeric variable and faceting by `geo_value`, all locations would #' share the same color line. -#' @param .max_facets Cut down of the number of facets displayed. Especially -#' useful for testing when there are many `geo_value`'s or keys. +#' @param .max_facets `r lifecycle::badge("deprecated")` +#' @param .facet_filter Select which facets will be displayed. Especially +#' useful for when there are many `geo_value`'s or keys. This is a +#' <[`rlang`][args_data_masking]> expression along the lines of `[dplyr::filter()]`. +#' However, it must be a single expression combined with the `&` operator. This +#' contrasts to the typical use case which allows multiple comma-separated expressions +#' which are implicitly combined with `&`. When multiple variables are selected +#' with `...`, their names can be filtered in combination with other factors +#' by using `.response_name`. See the examples below.
+#' +#' #' #' @return A ggplot object #' @export @@ -41,16 +50,33 @@ #' autoplot(cases_deaths_subset, case_rate_7d_av, #' .base_color = "red", .facet_by = "geo_value" #' ) +#' +#' # filter to only some facets, must be explicitly combined +#' autoplot(cases_deaths_subset, cases, death_rate_7d_av, +#' .facet_by = "all", +#' .facet_filter = (.response_name == "cases" & geo_value %in% c("tx", "pa")) | +#' (.response_name == "death_rate_7d_av" & +#' geo_value %in% c("ca", "fl", "ga", "ny")) +#' ) autoplot.epi_df <- function( object, ..., .color_by = c("all_keys", "geo_value", "other_keys", ".response", "all", "none"), .facet_by = c(".response", "other_keys", "all_keys", "geo_value", "all", "none"), .base_color = "#3A448F", - .max_facets = Inf) { + .facet_filter = NULL, + .max_facets = deprecated() +) { .color_by <- rlang::arg_match(.color_by) .facet_by <- rlang::arg_match(.facet_by) + .facet_filter <- rlang::enquo(.facet_filter) - assert(anyInfinite(.max_facets), checkInt(.max_facets), combine = "or") + if (lifecycle::is_present(.max_facets)) { + lifecycle::deprecate_warn( + "0.11.1", + "autoplot.epi_df(.max_facets = )", + "autoplot.epi_df(.facet_filter = )" + ) + } assert_character(.base_color, len = 1) key_cols <- key_colnames(object) @@ -126,15 +152,11 @@ autoplot.epi_df <- function( ) ) - if (.max_facets < Inf && ".facets" %in% names(object)) { - n_facets <- nlevels(object$.facets) - if (n_facets > .max_facets) { - top_n <- levels(as.factor(object$.facets))[seq_len(.max_facets)] - object <- dplyr::filter(object, .data$.facets %in% top_n) %>% - dplyr::mutate(.facets = droplevels(.data$.facets)) - if (".colours" %in% names(object)) { - object <- dplyr::mutate(object, .colours = droplevels(.data$.colours)) - } + if (!rlang::quo_is_null(.facet_filter) && ".facets" %in% names(object)) { + object <- dplyr::filter(object, !!.facet_filter) %>% + dplyr::mutate(.facets = droplevels(.data$.facets)) + if (".colours" %in% names(object)) { + object <- dplyr::mutate(object, 
.colours = droplevels(.data$.colours)) } } diff --git a/man/autoplot.epi_df.Rd b/man/autoplot.epi_df.Rd index d53335c14..427c372c2 100644 --- a/man/autoplot.epi_df.Rd +++ b/man/autoplot.epi_df.Rd @@ -10,7 +10,8 @@ .color_by = c("all_keys", "geo_value", "other_keys", ".response", "all", "none"), .facet_by = c(".response", "other_keys", "all_keys", "geo_value", "all", "none"), .base_color = "#3A448F", - .max_facets = Inf + .facet_filter = NULL, + .max_facets = deprecated() ) } \arguments{ @@ -40,8 +41,16 @@ each numeric variable on a separate facet} single numeric variable and faceting by \code{geo_value}, all locations would share the same color line.} -\item{.max_facets}{Cut down of the number of facets displayed. Especially -useful for testing when there are many \code{geo_value}'s or keys.} +\item{.facet_filter}{Select which facets will be displayed. Especially +useful for when there are many \code{geo_value}'s or keys. This is a +<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \verb{[dplyr::filter()]}. +However, it must be a single expression combined with the \code{&} operator. This +contrasts to the typical use case which allows multiple comma-separated expressions +which are implicitly combined with \code{&}. When multiple variables are selected +with \code{...}, their names can be filtered in combination with other factors +by using \code{.response_name}. 
See the examples below.} + +\item{.max_facets}{\ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#deprecated}{\figure{lifecycle-deprecated.svg}{options: alt='[Deprecated]'}}}{\strong{[Deprecated]}}} } \value{ A ggplot object @@ -65,4 +74,12 @@ autoplot(cases_deaths_subset, case_rate_7d_av, autoplot(cases_deaths_subset, case_rate_7d_av, .base_color = "red", .facet_by = "geo_value" ) + +# filter to only some facets, must be explicitly combined +autoplot(cases_deaths_subset, cases, death_rate_7d_av, + .facet_by = "all", + .facet_filter = (.response_name == "cases" & geo_value \%in\% c("tx", "pa")) | + (.response_name == "death_rate_7d_av" & + geo_value \%in\% c("ca", "fl", "ga", "ny")) +) } From c79915ed3d6601496a2331be32de66ae2740ddf7 Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Fri, 28 Mar 2025 16:04:33 -0700 Subject: [PATCH 037/117] basic idea works, adding docs --- R/autoplot.R | 186 +++++++++++++++++++++++++++++++++++++++------------ 1 file changed, 142 insertions(+), 44 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index 69a956add..ebc4f4fc8 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -16,9 +16,9 @@ #' * `none` - no coloring aesthetic is applied #' @param .facet_by Similar to `.color_by` except that the default is to display #' each numeric variable on a separate facet -#' @param .base_color Lines will be shown with this color. For example, with a -#' single numeric variable and faceting by `geo_value`, all locations would -#' share the same color line. +#' @param .base_color Lines will be shown with this color if `.color_by == "none"`. +#' For example, with a single numeric variable and faceting by `geo_value`, all +#' locations would share the same color line. #' @param .max_facets `r lifecycle::badge("deprecated")` #' @param .facet_filter Select which facets will be displayed. Especially #' useful for when there are many `geo_value`'s or keys. 
This is a @@ -68,7 +68,7 @@ autoplot.epi_df <- function( ) { .color_by <- rlang::arg_match(.color_by) .facet_by <- rlang::arg_match(.facet_by) - .facet_filter <- rlang::enquo(.facet_filter) + if (!rlang::is_quosure(.facet_filter)) .facet_filter <- rlang::enquo(.facet_filter) if (lifecycle::is_present(.max_facets)) { lifecycle::deprecate_warn( @@ -84,44 +84,14 @@ autoplot.epi_df <- function( geo_and_other_keys <- key_colnames(object, exclude = "time_value") # --- check for numeric variables - allowed <- purrr::map_lgl(object[non_key_cols], is.numeric) - allowed <- allowed[allowed] - if (length(allowed) == 0 && rlang::dots_n(...) == 0L) { - cli::cli_abort("No numeric variables were available to plot automatically.", - class = "epiprocess__no_numeric_vars_available" - ) - } - vars <- tidyselect::eval_select(rlang::expr(c(...)), object) - if (rlang::is_empty(vars)) { # find them automatically if unspecified - vars <- tidyselect::eval_select(names(allowed)[1], object) - cli::cli_warn( - "Plot variable was unspecified. Automatically selecting {.var {names(allowed)[1]}}.", - class = "epiprocess__unspecified_plot_var" - ) - } else { # if variables were specified, ensure that they are numeric - ok <- names(vars) %in% names(allowed) - if (!any(ok)) { - cli::cli_abort( - "None of the requested variables {.var {names(vars)}} are numeric.", - class = "epiprocess__all_requested_vars_not_numeric" - ) - } else if (!all(ok)) { - cli::cli_warn( - c( - "Only the requested variables {.var {names(vars)[ok]}} are numeric.", - i = "`autoplot()` cannot display {.var {names(vars)[!ok]}}." 
- ), - class = "epiprocess__some_requested_vars_not_numeric" - ) - vars <- vars[ok] - } - } + vars <- autoplot_check_viable_response_vars(object, ..., non_key_cols = non_key_cols) + nvars <- length(vars) # --- create a viable df to plot pos <- tidyselect::eval_select( rlang::expr(c("time_value", tidyselect::all_of(geo_and_other_keys), names(vars))), object ) - if (length(vars) > 1) { + if (nvars > 1) { object <- tidyr::pivot_longer( object[pos], tidyselect::all_of(names(vars)), values_to = ".response", @@ -132,22 +102,25 @@ autoplot.epi_df <- function( } all_keys <- rlang::syms(as.list(geo_and_other_keys)) other_keys <- rlang::syms(as.list(setdiff(geo_and_other_keys, "geo_value"))) - all_avail <- rlang::syms(as.list(c(geo_and_other_keys, ".response_name"))) + all_avail <- rlang::syms(as.list(c( + geo_and_other_keys, + if (nvars > 1) ".response_name" else NULL + ))) object <- object %>% dplyr::mutate( .colours = switch(.color_by, - all_keys = interaction(!!!all_keys, sep = "/"), + all_keys = interaction(!!!all_keys, sep = " / "), geo_value = .data$geo_value, - other_keys = interaction(!!!other_keys, sep = "/"), - all = interaction(!!!all_avail, sep = "/"), + other_keys = interaction(!!!other_keys, sep = " / "), + all = interaction(!!!all_avail, sep = " / "), NULL ), .facets = switch(.facet_by, - all_keys = interaction(!!!all_keys, sep = "/"), + all_keys = interaction(!!!all_keys, sep = " / "), geo_value = as.factor(.data$geo_value), - other_keys = interaction(!!!other_keys, sep = "/"), - all = interaction(!!!all_avail, sep = "/"), + other_keys = interaction(!!!other_keys, sep = " / "), + all = interaction(!!!all_avail, sep = " / "), NULL ) ) @@ -192,3 +165,128 @@ autoplot.epi_df <- function( } p } + +autoplot_check_viable_response_vars <- function( + object, ..., non_key_cols, call = caller_env() +) { + allowed <- purrr::map_lgl(object[non_key_cols], is.numeric) + allowed <- allowed[allowed] + if (length(allowed) == 0 && rlang::dots_n(...) 
== 0L) { + cli::cli_abort("No numeric variables were available to plot automatically.", + class = "epiprocess__no_numeric_vars_available", + call = call + ) + } + vars <- tidyselect::eval_select(rlang::expr(c(...)), object) + if (rlang::is_empty(vars)) { # find them automatically if unspecified + vars <- tidyselect::eval_select(names(allowed)[1], object) + cli::cli_warn( + "Plot variable was unspecified. Automatically selecting {.var {names(allowed)[1]}}.", + class = "epiprocess__unspecified_plot_var", + call = call + ) + } else { # if variables were specified, ensure that they are numeric + ok <- names(vars) %in% names(allowed) + if (!any(ok)) { + cli::cli_abort( + "None of the requested variables {.var {names(vars)}} are numeric.", + class = "epiprocess__all_requested_vars_not_numeric", + call = call + ) + } else if (!all(ok)) { + cli::cli_warn( + c( + "Only the requested variables {.var {names(vars)[ok]}} are numeric.", + i = "`autoplot()` cannot display {.var {names(vars)[!ok]}}." + ), + class = "epiprocess__some_requested_vars_not_numeric", + call = call + ) + vars <- vars[ok] + } + } + vars +} + + + +autoplot.epi_archive <- function(object, ..., + .base_color = "black", + .versions = NULL, + .facet_filter = NULL) { + .facet_filter <- rlang::enquo(.facet_filter) + time_type <- object$time_type + if (time_type == "custom") { + cli_abort( + "This `epi_archive` has custom `time_type`. 
This is currently unsupported.", + class = "epiprocess__autoplot_archive_custom_time_type" + ) + } + + max_version <- max(object$DT$version) + min_version <- min(object$DT$version) + + tt_lookup <- c("day" = "day", "week" = "week", "yearmonth" = "month") + .versions <- .versions %||% ifelse(time_type == "integer", 1L, unname(tt_lookup[time_type])) + if (is.character(.versions) || .versions == 1L) { + .versions <- seq(min_version, max_version - 1, by = .versions) + } else if (is(.versions, "Date") || is.numeric(.versions)) { + .versions <- .versions[.versions >= min_version & .versions <= max_version] + } else { + cli_abort( + "Requested `.versions` don't appear to match the available `time_type`.", + class = "epiprocess__autoplot_archive_bad_versions" + ) + } + + + finalized <- epix_as_of(object, max_version) + key_cols <- key_colnames(finalized) + non_key_cols <- setdiff(names(finalized), key_cols) + vars <- autoplot_check_viable_response_vars(finalized, ..., non_key_cols = non_key_cols) + nvars <- length(vars) + + bp <- autoplot( + finalized, ..., .base_color = .base_color, .facet_by = "all", + .facet_filter = .facet_filter, .color_by = "none" + ) + geo_and_other_keys <- key_colnames(object, exclude = c("time_value", "version")) + all_avail <- rlang::syms(as.list(c( + geo_and_other_keys, + if (nvars > 1) ".response_name" else NULL + ))) + + snapshots <- purrr::map( + .versions, + function(v) {dplyr::mutate(epix_as_of(object, v), version = v)} + ) %>% + purrr::list_rbind() %>% + dplyr::mutate(.facets = interaction(!!!all_avail, sep = " / ")) + + if (nvars > 1) { + snapshots <- tidyr::pivot_longer( + snapshots, tidyselect::all_of(names(vars)), + values_to = ".response", + names_to = ".response_name" + ) + } else { + snapshots <- dplyr::rename(snapshots, .response := !!names(vars)) # nolint: object_usage_linter + } + snapshots <- dplyr::filter(snapshots, !is.na(.response)) + + bp <- bp + + ggplot2::geom_line( + data = snapshots, + mapping = ggplot2::aes(y = 
.response, color = version, group = factor(version)) + ) + + if (is(.versions, "Date")) { + bp <- bp + ggplot2::scale_color_viridis_c(name = "Version", trans = "date") + } else { + bp <- bp + ggplot2::scale_color_viridis_c(name = "Version") + } + + # make the finalized layer last + bp$layers <- rev(bp$layers) + bp +} From c7d443dacec66cd762f7997c5bccca3eadfc4345 Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Fri, 28 Mar 2025 17:13:12 -0700 Subject: [PATCH 038/117] bugs squashed --- NAMESPACE | 1 + R/autoplot.R | 42 ++++++++++++++++++++----- man/autoplot.epi_archive.Rd | 61 +++++++++++++++++++++++++++++++++++++ man/autoplot.epi_df.Rd | 8 ++--- 4 files changed, 101 insertions(+), 11 deletions(-) create mode 100644 man/autoplot.epi_archive.Rd diff --git a/NAMESPACE b/NAMESPACE index 2f97e5b3d..b216cda72 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -19,6 +19,7 @@ S3method(as_epi_df,tbl_df) S3method(as_epi_df,tbl_ts) S3method(as_tibble,epi_df) S3method(as_tsibble,epi_df) +S3method(autoplot,epi_archive) S3method(autoplot,epi_df) S3method(clone,epi_archive) S3method(clone,grouped_epi_archive) diff --git a/R/autoplot.R b/R/autoplot.R index ebc4f4fc8..90f33437f 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -31,7 +31,7 @@ #' #' #' -#' @return A ggplot object +#' @return A [ggplot2::ggplot] object #' @export #' #' @examples @@ -68,7 +68,7 @@ autoplot.epi_df <- function( ) { .color_by <- rlang::arg_match(.color_by) .facet_by <- rlang::arg_match(.facet_by) - if (!rlang::is_quosure(.facet_filter)) .facet_filter <- rlang::enquo(.facet_filter) + .facet_filter <- rlang::enquo(.facet_filter) if (lifecycle::is_present(.max_facets)) { lifecycle::deprecate_warn( @@ -210,11 +210,37 @@ autoplot_check_viable_response_vars <- function( +#' Automatically plot an epi_archive +#' +#' @param object An `epi_archive` +#' @inheritParams autoplot.epi_df +#' @param .versions Select which versions will be displayed. 
By default, +#' a separate line will be shown with the data as it would have appeared on +#' every day in the archive. This can sometimes become overwhelming. For +#' example, daily data would display a line for what the data would have looked +#' like on every single day. To override this, you can select specific dates, +#' by passing a vector of values here. Alternatively, a sequence can be +#' automatically created by passing a string like `"2 weeks"` or `"month"`. +#' For time types where the `time_value` is a date object, any string that +#' is interpretable by `[base::seq.Date()]` is allowed. +#' +#' For `time_type = "integer"`, an integer larger than 1 will give a subset +#' of versions. +#' +#' @return A [ggplot2::ggplot] object +#' @export +#' +#' @examples +#' autoplot(archive_cases_dv_subset, percent_cli, .versions = "week") +#' autoplot(archive_cases_dv_subset_all_states, percent_cli, +#' .versions = "week", +#' .facet_filter = geo_value %in% c("or", "az", "vt", "ms") +#' ) +#' autoplot(archive_cases_dv_subset, percent_cli, .versions = "month", .facet_filter = geo_value == "ca") autoplot.epi_archive <- function(object, ..., .base_color = "black", .versions = NULL, .facet_filter = NULL) { - .facet_filter <- rlang::enquo(.facet_filter) time_type <- object$time_type if (time_type == "custom") { cli_abort( @@ -228,7 +254,8 @@ autoplot.epi_archive <- function(object, ..., tt_lookup <- c("day" = "day", "week" = "week", "yearmonth" = "month") .versions <- .versions %||% ifelse(time_type == "integer", 1L, unname(tt_lookup[time_type])) - if (is.character(.versions) || .versions == 1L) { + if (is.character(.versions) || length(.versions) == 1L) { + if (is.numeric(.versions)) .versions <- round(abs(.versions)) .versions <- seq(min_version, max_version - 1, by = .versions) } else if (is(.versions, "Date") || is.numeric(.versions)) { .versions <- .versions[.versions >= min_version & .versions <= max_version] @@ -246,9 +273,9 @@ autoplot.epi_archive <- 
function(object, ..., vars <- autoplot_check_viable_response_vars(finalized, ..., non_key_cols = non_key_cols) nvars <- length(vars) - bp <- autoplot( + bp <- autoplot.epi_df( finalized, ..., .base_color = .base_color, .facet_by = "all", - .facet_filter = .facet_filter, .color_by = "none" + .facet_filter = {{ .facet_filter }}, .color_by = "none" ) geo_and_other_keys <- key_colnames(object, exclude = c("time_value", "version")) all_avail <- rlang::syms(as.list(c( @@ -272,7 +299,8 @@ autoplot.epi_archive <- function(object, ..., } else { snapshots <- dplyr::rename(snapshots, .response := !!names(vars)) # nolint: object_usage_linter } - snapshots <- dplyr::filter(snapshots, !is.na(.response)) + snapshots <- snapshots %>% + dplyr::filter(!is.na(.response), .data$.facets %in% unique(bp$data$.facets)) bp <- bp + ggplot2::geom_line( diff --git a/man/autoplot.epi_archive.Rd b/man/autoplot.epi_archive.Rd new file mode 100644 index 000000000..d68697bec --- /dev/null +++ b/man/autoplot.epi_archive.Rd @@ -0,0 +1,61 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/autoplot.R +\name{autoplot.epi_archive} +\alias{autoplot.epi_archive} +\title{Automatically plot an epi_archive} +\usage{ +\method{autoplot}{epi_archive}( + object, + ..., + .base_color = "black", + .versions = NULL, + .facet_filter = NULL +) +} +\arguments{ +\item{object}{An \code{epi_archive}} + +\item{...}{<\code{\link[=dplyr_tidy_select]{tidy-select}}> One or more unquoted +expressions separated by commas. Variable names can be used as if they +were positions in the data frame, so expressions like \code{x:y} can +be used to select a range of variables.} + +\item{.base_color}{Lines will be shown with this color if \code{.color_by == "none"}. +For example, with a single numeric variable and faceting by \code{geo_value}, all +locations would share the same color line.} + +\item{.versions}{Select which versions will be displayed. 
By default,
+a separate line will be shown with the data as it would have appeared on
+every day in the archive. This can sometimes become overwhelming. For
+example, daily data would display a line for what the data would have looked
+like on every single day. To override this, you can select specific dates,
+by passing a vector of values here. Alternatively, a sequence can be
+automatically created by passing a string like \code{"2 weeks"} or \code{"month"}.
+For time types where the \code{time_value} is a date object, any string that
+is interpretable by \verb{[base::seq.Date()]} is allowed.
+
+For \code{time_type = "integer"}, an integer larger than 1 will give a subset
+of versions.}
+
+\item{.facet_filter}{Select which facets will be displayed. Especially
+useful for when there are many \code{geo_value}'s or keys. This is a
+<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \verb{[dplyr::filter()]}.
+However, it must be a single expression combined with the \code{&} operator. This
+contrasts to the typical use case which allows multiple comma-separated expressions
+which are implicitly combined with \code{&}. When multiple variables are selected
+with \code{...}, their names can be filtered in combination with other factors
+by using \code{.response_name}.
See the examples below.} +} +\value{ +A \link[ggplot2:ggplot]{ggplot2::ggplot} object +} +\description{ +Automatically plot an epi_archive +} +\examples{ +autoplot(archive_cases_dv_subset, percent_cli, .versions = "week") +autoplot(archive_cases_dv_subset_all_states, percent_cli, + .versions = "week", + .facet_filter = geo_value \%in\% c("or", "az", "vt", "ms") +) +} diff --git a/man/autoplot.epi_df.Rd b/man/autoplot.epi_df.Rd index 427c372c2..e558148e5 100644 --- a/man/autoplot.epi_df.Rd +++ b/man/autoplot.epi_df.Rd @@ -37,9 +37,9 @@ including the \code{geo_value} \item{.facet_by}{Similar to \code{.color_by} except that the default is to display each numeric variable on a separate facet} -\item{.base_color}{Lines will be shown with this color. For example, with a -single numeric variable and faceting by \code{geo_value}, all locations would -share the same color line.} +\item{.base_color}{Lines will be shown with this color if \code{.color_by == "none"}. +For example, with a single numeric variable and faceting by \code{geo_value}, all +locations would share the same color line.} \item{.facet_filter}{Select which facets will be displayed. Especially useful for when there are many \code{geo_value}'s or keys. This is a @@ -53,7 +53,7 @@ by using \code{.response_name}. See the examples below.} \item{.max_facets}{\ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#deprecated}{\figure{lifecycle-deprecated.svg}{options: alt='[Deprecated]'}}}{\strong{[Deprecated]}}} } \value{ -A ggplot object +A \link[ggplot2:ggplot]{ggplot2::ggplot} object } \description{ Automatically plot an epi_df From 75bf8e8bdcd15aae32bb57da91f5f29652a08461 Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Fri, 28 Mar 2025 17:22:34 -0700 Subject: [PATCH 039/117] rebuild docs --- R/autoplot.R | 7 ++++--- man/autoplot.epi_archive.Rd | 6 ++++-- man/autoplot.epi_df.Rd | 2 +- 3 files changed, 9 insertions(+), 6 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index 90f33437f..2f73f3196 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -22,7 +22,7 @@ #' @param .max_facets `r lifecycle::badge("deprecated")` #' @param .facet_filter Select which facets will be displayed. Especially #' useful for when there are many `geo_value`'s or keys. This is a -#' <[`rlang`][args_data_masking]> expression along the lines of `[dplyr::filter()]`. +#' <[`rlang`][args_data_masking]> expression along the lines of [dplyr::filter()]. #' However, it must be a single expression combined with the `&` operator. This #' contrasts to the typical use case which allows multiple comma-separated expressions #' which are implicitly combined with `&`. When multiple variables are selected @@ -222,7 +222,7 @@ autoplot_check_viable_response_vars <- function( #' by passing a vector of values here. Alternatively, a sequence can be #' automatically created by passing a string like `"2 weeks"` or `"month"`. #' For time types where the `time_value` is a date object, any string that -#' is interpretable by `[base::seq.Date()]` is allowed. +#' is interpretable by [seq.Date()] is allowed. #' #' For `time_type = "integer"`, an integer larger than 1 will give a subset #' of versions. 
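[Editor's note: the string-to-sequence behavior documented for `.versions` above can be sketched in plain R. This is only an illustrative approximation of the `seq()`-based expansion shown in these patches; the archive bounds below are made-up values, not real package data.]

```r
# Sketch only: how a character `.versions` such as "2 weeks" could expand
# into a sequence of snapshot versions, assuming Date-typed versions and
# hypothetical archive bounds.
min_version <- as.Date("2021-06-01")
max_version <- as.Date("2021-09-01")
versions <- seq(min_version, max_version - 1, by = "2 weeks")
length(versions) # 7 fortnightly versions, 2021-06-01 through 2021-08-24
```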
@@ -236,7 +236,8 @@ autoplot_check_viable_response_vars <- function( #' .versions = "week", #' .facet_filter = geo_value %in% c("or", "az", "vt", "ms") #' ) -#' autoplot(archive_cases_dv_subset, percent_cli, .versions = "month", .facet_filter = geo_value == "ca") +#' autoplot(archive_cases_dv_subset, percent_cli, .versions = "month", +#' .facet_filter = geo_value == "ca") autoplot.epi_archive <- function(object, ..., .base_color = "black", .versions = NULL, diff --git a/man/autoplot.epi_archive.Rd b/man/autoplot.epi_archive.Rd index d68697bec..7578650d9 100644 --- a/man/autoplot.epi_archive.Rd +++ b/man/autoplot.epi_archive.Rd @@ -32,14 +32,14 @@ like on every single day. To override this, you can select specific dates, by passing a vector of values here. Alternatively, a sequence can be automatically created by passing a string like \code{"2 weeks"} or \code{"month"}. For time types where the \code{time_value} is a date object, any string that -is interpretable by \verb{[base::seq.Date()]} is allowed. +is interpretable by \code{\link[=seq.Date]{seq.Date()}} is allowed. For \code{time_type = "integer"}, an integer larger than 1 will give a subset of versions.} \item{.facet_filter}{Select which facets will be displayed. Especially useful for when there are many \code{geo_value}'s or keys. This is a -<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \verb{[dplyr::filter()]}. +<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \code{\link[dplyr:filter]{dplyr::filter()}}. However, it must be a single expression combined with the \code{&} operator. This contrasts to the typical use case which allows multiple comma-separated expressions which are implicitly combined with \code{&}. 
When multiple variables are selected @@ -58,4 +58,6 @@ autoplot(archive_cases_dv_subset_all_states, percent_cli, .versions = "week", .facet_filter = geo_value \%in\% c("or", "az", "vt", "ms") ) +autoplot(archive_cases_dv_subset, percent_cli, .versions = "month", + .facet_filter = geo_value == "ca") } diff --git a/man/autoplot.epi_df.Rd b/man/autoplot.epi_df.Rd index e558148e5..d618f74f1 100644 --- a/man/autoplot.epi_df.Rd +++ b/man/autoplot.epi_df.Rd @@ -43,7 +43,7 @@ locations would share the same color line.} \item{.facet_filter}{Select which facets will be displayed. Especially useful for when there are many \code{geo_value}'s or keys. This is a -<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \verb{[dplyr::filter()]}. +<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \code{\link[dplyr:filter]{dplyr::filter()}}. However, it must be a single expression combined with the \code{&} operator. This contrasts to the typical use case which allows multiple comma-separated expressions which are implicitly combined with \code{&}. When multiple variables are selected From 6baae73fce461d4984381df52b0b82e103749a72 Mon Sep 17 00:00:00 2001 From: dajmcdon Date: Fri, 28 Mar 2025 19:42:57 +0000 Subject: [PATCH 040/117] style: styler (GHA) --- R/autoplot.R | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index 2f73f3196..1fc67e2bf 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -28,8 +28,8 @@ #' which are implicitly combined with `&`. When multiple variables are selected #' with `...`, their names can be filtered in combination with other factors #' by using `.response_name`. See the examples below. 
-#' -#' +#' +#' #' #' @return A [ggplot2::ggplot] object #' @export @@ -50,12 +50,12 @@ #' autoplot(cases_deaths_subset, case_rate_7d_av, #' .base_color = "red", .facet_by = "geo_value" #' ) -#' +#' #' # filter to only some facets, must be explicitly combined #' autoplot(cases_deaths_subset, cases, death_rate_7d_av, -#' .facet_by = "all", +#' .facet_by = "all", #' .facet_filter = (.response_name == "cases" & geo_value %in% c("tx", "pa")) | -#' (.response_name == "death_rate_7d_av" & +#' (.response_name == "death_rate_7d_av" & #' geo_value %in% c("ca", "fl", "ga", "ny")) #' ) autoplot.epi_df <- function( @@ -64,16 +64,15 @@ autoplot.epi_df <- function( .facet_by = c(".response", "other_keys", "all_keys", "geo_value", "all", "none"), .base_color = "#3A448F", .facet_filter = NULL, - .max_facets = deprecated() -) { + .max_facets = deprecated()) { .color_by <- rlang::arg_match(.color_by) .facet_by <- rlang::arg_match(.facet_by) .facet_filter <- rlang::enquo(.facet_filter) if (lifecycle::is_present(.max_facets)) { lifecycle::deprecate_warn( - "0.11.1", - "autoplot.epi_df(.max_facets = )", + "0.11.1", + "autoplot.epi_df(.max_facets = )", "autoplot.epi_df(.facet_filter = )" ) } From ecc23dd12e44779bfc2448be6ba6cb674fa720de Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Fri, 28 Mar 2025 17:29:33 -0700 Subject: [PATCH 041/117] needs prefix --- R/autoplot.R | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/R/autoplot.R b/R/autoplot.R index 1fc67e2bf..fc4d59785 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -308,7 +308,7 @@ autoplot.epi_archive <- function(object, ..., mapping = ggplot2::aes(y = .response, color = version, group = factor(version)) ) - if (is(.versions, "Date")) { + if (methods::is(.versions, "Date")) { bp <- bp + ggplot2::scale_color_viridis_c(name = "Version", trans = "date") } else { bp <- bp + ggplot2::scale_color_viridis_c(name = "Version") From 2546e36997b8fad2c69f9c774ca46a74e0cd883e Mon Sep 17 00:00:00 2001 From: dajmcdon Date: Sat, 29 Mar 2025 00:25:44 +0000 Subject: [PATCH 042/117] style: styler (GHA) --- R/autoplot.R | 65 +++++++++++++++++++++++++++------------------------- 1 file changed, 34 insertions(+), 31 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index fc4d59785..6be032ae8 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -17,11 +17,11 @@ #' @param .facet_by Similar to `.color_by` except that the default is to display #' each numeric variable on a separate facet #' @param .base_color Lines will be shown with this color if `.color_by == "none"`. -#' For example, with a single numeric variable and faceting by `geo_value`, all +#' For example, with a single numeric variable and faceting by `geo_value`, all #' locations would share the same color line. #' @param .max_facets `r lifecycle::badge("deprecated")` #' @param .facet_filter Select which facets will be displayed. Especially -#' useful for when there are many `geo_value`'s or keys. This is a +#' useful for when there are many `geo_value`'s or keys. This is a #' <[`rlang`][args_data_masking]> expression along the lines of [dplyr::filter()]. #' However, it must be a single expression combined with the `&` operator. 
This
#' contrasts to the typical use case which allows multiple comma-separated expressions
#' which are implicitly combined with `&`. When multiple variables are selected
#' with `...`, their names can be filtered in combination with other factors
#' by using `.response_name`. See the examples below.
@@ -102,7 +102,7 @@ autoplot.epi_df <- function(
all_keys <- rlang::syms(as.list(geo_and_other_keys))
other_keys <- rlang::syms(as.list(setdiff(geo_and_other_keys, "geo_value")))
all_avail <- rlang::syms(as.list(c(
- geo_and_other_keys,
+ geo_and_other_keys,
if (nvars > 1) ".response_name" else NULL
)))
@@ -166,14 +166,13 @@ autoplot.epi_df <- function(
}
autoplot_check_viable_response_vars <- function(
- object, ..., non_key_cols, call = caller_env()
-) {
+ object, ..., non_key_cols, call = caller_env()) {
allowed <- purrr::map_lgl(object[non_key_cols], is.numeric)
allowed <- allowed[allowed]
if (length(allowed) == 0 && rlang::dots_n(...) == 0L) {
cli::cli_abort("No numeric variables were available to plot automatically.",
- class = "epiprocess__no_numeric_vars_available",
- call = call
+ class = "epiprocess__no_numeric_vars_available",
+ call = call
)
}
vars <- tidyselect::eval_select(rlang::expr(c(...)), object)
@@ -215,14 +214,14 @@ autoplot_check_viable_response_vars <- function(
#' @inheritParams autoplot.epi_df
#' @param .versions Select which versions will be displayed. By default,
#' a separate line will be shown with the data as it would have appeared on
-#' every day in the archive. This can sometimes become overwhelming. For
+#' every day in the archive. This can sometimes become overwhelming. For
#' example, daily data would display a line for what the data would have looked
-#' like on every single day. To override this, you can select specific dates,
-#' by passing a vector of values here. Alternatively, a sequence can be
+#' like on every single day. To override this, you can select specific dates,
+#' by passing a vector of values here. Alternatively, a sequence can be
#' automatically created by passing a string like `"2 weeks"` or `"month"`.
#' For time types where the `time_value` is a date object, any string that #' is interpretable by [seq.Date()] is allowed. -#' +#' #' For `time_type = "integer"`, an integer larger than 1 will give a subset #' of versions. #' @@ -231,13 +230,15 @@ autoplot_check_viable_response_vars <- function( #' #' @examples #' autoplot(archive_cases_dv_subset, percent_cli, .versions = "week") -#' autoplot(archive_cases_dv_subset_all_states, percent_cli, -#' .versions = "week", +#' autoplot(archive_cases_dv_subset_all_states, percent_cli, +#' .versions = "week", #' .facet_filter = geo_value %in% c("or", "az", "vt", "ms") #' ) -#' autoplot(archive_cases_dv_subset, percent_cli, .versions = "month", -#' .facet_filter = geo_value == "ca") -autoplot.epi_archive <- function(object, ..., +#' autoplot(archive_cases_dv_subset, percent_cli, +#' .versions = "month", +#' .facet_filter = geo_value == "ca" +#' ) +autoplot.epi_archive <- function(object, ..., .base_color = "black", .versions = NULL, .facet_filter = NULL) { @@ -248,10 +249,10 @@ autoplot.epi_archive <- function(object, ..., class = "epiprocess__autoplot_archive_custom_time_type" ) } - + max_version <- max(object$DT$version) min_version <- min(object$DT$version) - + tt_lookup <- c("day" = "day", "week" = "week", "yearmonth" = "month") .versions <- .versions %||% ifelse(time_type == "integer", 1L, unname(tt_lookup[time_type])) if (is.character(.versions) || length(.versions) == 1L) { @@ -265,31 +266,34 @@ autoplot.epi_archive <- function(object, ..., class = "epiprocess__autoplot_archive_bad_versions" ) } - - + + finalized <- epix_as_of(object, max_version) key_cols <- key_colnames(finalized) non_key_cols <- setdiff(names(finalized), key_cols) vars <- autoplot_check_viable_response_vars(finalized, ..., non_key_cols = non_key_cols) nvars <- length(vars) - + bp <- autoplot.epi_df( - finalized, ..., .base_color = .base_color, .facet_by = "all", + finalized, ..., + .base_color = .base_color, .facet_by = "all", .facet_filter = {{ 
.facet_filter }}, .color_by = "none" ) geo_and_other_keys <- key_colnames(object, exclude = c("time_value", "version")) all_avail <- rlang::syms(as.list(c( - geo_and_other_keys, + geo_and_other_keys, if (nvars > 1) ".response_name" else NULL ))) - + snapshots <- purrr::map( .versions, - function(v) {dplyr::mutate(epix_as_of(object, v), version = v)} + function(v) { + dplyr::mutate(epix_as_of(object, v), version = v) + } ) %>% purrr::list_rbind() %>% dplyr::mutate(.facets = interaction(!!!all_avail, sep = " / ")) - + if (nvars > 1) { snapshots <- tidyr::pivot_longer( snapshots, tidyselect::all_of(names(vars)), @@ -301,19 +305,18 @@ autoplot.epi_archive <- function(object, ..., } snapshots <- snapshots %>% dplyr::filter(!is.na(.response), .data$.facets %in% unique(bp$data$.facets)) - - bp <- bp + + + bp <- bp + ggplot2::geom_line( - data = snapshots, + data = snapshots, mapping = ggplot2::aes(y = .response, color = version, group = factor(version)) ) - if (methods::is(.versions, "Date")) { bp <- bp + ggplot2::scale_color_viridis_c(name = "Version", trans = "date") } else { bp <- bp + ggplot2::scale_color_viridis_c(name = "Version") } - + # make the finalized layer last bp$layers <- rev(bp$layers) bp From 0b3f2e245723b8ec176b5313833152d7d58f4c05 Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Fri, 28 Mar 2025 17:41:13 -0700 Subject: [PATCH 043/117] add to pkgdown --- NEWS.md | 1 + _pkgdown.yml | 1 + 2 files changed, 2 insertions(+) diff --git a/NEWS.md b/NEWS.md index 5ec1522ee..ed8b8b8f6 100644 --- a/NEWS.md +++ b/NEWS.md @@ -50,6 +50,7 @@ Pre-1.0.0 numbering scheme: 0.x will indicate releases, while 0.x.y will indicat - The compactification tolerance setting now works with integer-type columns. - Various functions are now faster, using faster variants of core operations and avoiding reconstructing grouped `epi_df`s when unnecessary. +- Add `autoplot.epi_archive()` to display revision patterns. 
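[Editor's note: as a usage sketch of the new method recorded in this NEWS entry, reusing the `archive_cases_dv_subset` dataset that the patch examples already rely on, explicit snapshot dates can be passed instead of a period string. The specific dates below are illustrative; values outside the archive's version range would be dropped per the bounds check in the patches above.]

```r
library(epiprocess)
# Sketch: pass a vector of specific versions rather than a string like "month".
autoplot(archive_cases_dv_subset, percent_cli,
  .versions = as.Date(c("2021-07-01", "2021-09-01", "2021-11-01")),
  .facet_filter = geo_value == "ca"
)
```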
## Bug fixes diff --git a/_pkgdown.yml b/_pkgdown.yml index 3742bd416..3fee4418f 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -74,6 +74,7 @@ reference: - print.epi_archive - clone - group_by.epi_archive + - autoplot.epi_archive - title: "`epi_archive` manipulation" desc: Functions operating on `epi_archive` objects. From 5a031ccb379a01d6ba01300706d8b77bdb5da89a Mon Sep 17 00:00:00 2001 From: dajmcdon Date: Sat, 29 Mar 2025 00:32:47 +0000 Subject: [PATCH 044/117] docs: document (GHA) --- man/autoplot.epi_archive.Rd | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/man/autoplot.epi_archive.Rd b/man/autoplot.epi_archive.Rd index 7578650d9..289c777b8 100644 --- a/man/autoplot.epi_archive.Rd +++ b/man/autoplot.epi_archive.Rd @@ -54,10 +54,12 @@ Automatically plot an epi_archive } \examples{ autoplot(archive_cases_dv_subset, percent_cli, .versions = "week") -autoplot(archive_cases_dv_subset_all_states, percent_cli, - .versions = "week", +autoplot(archive_cases_dv_subset_all_states, percent_cli, + .versions = "week", .facet_filter = geo_value \%in\% c("or", "az", "vt", "ms") ) -autoplot(archive_cases_dv_subset, percent_cli, .versions = "month", - .facet_filter = geo_value == "ca") +autoplot(archive_cases_dv_subset, percent_cli, + .versions = "month", + .facet_filter = geo_value == "ca" +) } From 9cef8ff301e975b8f3bb20bfd0ce9fc7ad1b0386 Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Fri, 28 Mar 2025 18:10:39 -0700 Subject: [PATCH 045/117] methods prefixing --- DESCRIPTION | 1 + R/autoplot.R | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/DESCRIPTION b/DESCRIPTION index b81625185..b68249c17 100755 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -49,6 +49,7 @@ Imports: lifecycle, lubridate, magrittr, + methods, pkgconfig, purrr, rlang, diff --git a/R/autoplot.R b/R/autoplot.R index 6be032ae8..47083cf57 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -258,7 +258,7 @@ autoplot.epi_archive <- function(object, ..., if (is.character(.versions) || length(.versions) == 1L) { if (is.numeric(.versions)) .versions <- round(abs(.versions)) .versions <- seq(min_version, max_version - 1, by = .versions) - } else if (is(.versions, "Date") || is.numeric(.versions)) { + } else if (methods::is(.versions, "Date") || is.numeric(.versions)) { .versions <- .versions[.versions >= min_version & .versions <= max_version] } else { cli_abort( From cfa9622f96fec6936e2bc91d1a26427d02fc8ade Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Fri, 28 Mar 2025 17:29:33 -0700 Subject: [PATCH 046/117] needs prefix --- R/autoplot.R | 1 + 1 file changed, 1 insertion(+) diff --git a/R/autoplot.R b/R/autoplot.R index 47083cf57..4dad088e3 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -311,6 +311,7 @@ autoplot.epi_archive <- function(object, ..., data = snapshots, mapping = ggplot2::aes(y = .response, color = version, group = factor(version)) ) + if (methods::is(.versions, "Date")) { bp <- bp + ggplot2::scale_color_viridis_c(name = "Version", trans = "date") } else { From a5af0d8d95c391b0122cc1c0655c8c95e230bbbb Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Sat, 29 Mar 2025 13:07:51 -0700 Subject: [PATCH 047/117] complete rebase and redocument --- man/autoplot.epi_df.Rd | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/man/autoplot.epi_df.Rd b/man/autoplot.epi_df.Rd index d618f74f1..23ca8e10b 100644 --- a/man/autoplot.epi_df.Rd +++ b/man/autoplot.epi_df.Rd @@ -77,9 +77,9 @@ autoplot(cases_deaths_subset, case_rate_7d_av, # filter to only some facets, must be explicitly combined autoplot(cases_deaths_subset, cases, death_rate_7d_av, - .facet_by = "all", + .facet_by = "all", .facet_filter = (.response_name == "cases" & geo_value \%in\% c("tx", "pa")) | - (.response_name == "death_rate_7d_av" & + (.response_name == "death_rate_7d_av" & geo_value \%in\% c("ca", "fl", "ga", "ny")) ) } From 2d5e221f87b9833e68cc93bdc9f19195ad76dcc4 Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Sat, 29 Mar 2025 13:58:47 -0700 Subject: [PATCH 048/117] bump version --- DESCRIPTION | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/DESCRIPTION b/DESCRIPTION index b68249c17..4f7bf7992 100755 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Type: Package Package: epiprocess Title: Tools for basic signal processing in epidemiology -Version: 0.11.1 +Version: 0.11.2 Authors@R: c( person("Jacob", "Bien", role = "ctb"), person("Logan", "Brooks", , "lcbrooks+github@andrew.cmu.edu", role = c("aut", "cre")), From 3bfc32caf70238a61b0a3bb5ab831924458041f7 Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Mon, 31 Mar 2025 09:58:25 -0700 Subject: [PATCH 049/117] feat: mark versions with vertical if desired --- R/autoplot.R | 21 ++++++++++++++++++++- man/autoplot.epi_archive.Rd | 9 +++++++++ 2 files changed, 29 insertions(+), 1 deletion(-) diff --git a/R/autoplot.R b/R/autoplot.R index 4dad088e3..91b8eddbe 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -224,6 +224,8 @@ autoplot_check_viable_response_vars <- function( #' #' For `time_type = "integer"`, an integer larger than 1 will give a subset #' of versions. +#' @param .mark_versions Logical. Indicate whether to mark each version with +#' a vertical line. Note that displaying many versions can become busy. #' #' @return A [ggplot2::ggplot] object #' @export @@ -238,11 +240,18 @@ autoplot_check_viable_response_vars <- function( #' .versions = "month", #' .facet_filter = geo_value == "ca" #' ) +#' autoplot(archive_cases_dv_subset_all_states, percent_cli, +#' .versions = "1 month", +#' .facet_filter = geo_value %in% c("or", "az", "vt", "ms"), +#' .mark_versions = TRUE +#' ) autoplot.epi_archive <- function(object, ..., .base_color = "black", .versions = NULL, + .mark_versions = FALSE, .facet_filter = NULL) { time_type <- object$time_type + checkmate::assert_logical(.mark_versions, len = 1L) if (time_type == "custom") { cli_abort( "This `epi_archive` has custom `time_type`. 
This is currently unsupported.", @@ -278,7 +287,7 @@ autoplot.epi_archive <- function(object, ..., finalized, ..., .base_color = .base_color, .facet_by = "all", .facet_filter = {{ .facet_filter }}, .color_by = "none" - ) + ) + ggplot2::xlab("Date") geo_and_other_keys <- key_colnames(object, exclude = c("time_value", "version")) all_avail <- rlang::syms(as.list(c( geo_and_other_keys, @@ -318,6 +327,16 @@ autoplot.epi_archive <- function(object, ..., bp <- bp + ggplot2::scale_color_viridis_c(name = "Version") } + if (.mark_versions) { + bp <- bp + + ggplot2::geom_vline( + data = snapshots, + ggplot2::aes(color = version, xintercept = version), + linewidth = .5, + linetype = 3, + show.legend = FALSE + ) + } # make the finalized layer last bp$layers <- rev(bp$layers) bp diff --git a/man/autoplot.epi_archive.Rd b/man/autoplot.epi_archive.Rd index 289c777b8..5a9dfafc6 100644 --- a/man/autoplot.epi_archive.Rd +++ b/man/autoplot.epi_archive.Rd @@ -9,6 +9,7 @@ ..., .base_color = "black", .versions = NULL, + .mark_versions = FALSE, .facet_filter = NULL ) } @@ -37,6 +38,9 @@ is interpretable by \code{\link[=seq.Date]{seq.Date()}} is allowed. For \code{time_type = "integer"}, an integer larger than 1 will give a subset of versions.} +\item{.mark_versions}{Logical. Indicate whether to mark each version with +a vertical line. Note that displaying many versions can become busy.} + \item{.facet_filter}{Select which facets will be displayed. Especially useful for when there are many \code{geo_value}'s or keys. This is a <\code{\link[=args_data_masking]{rlang}}> expression along the lines of \code{\link[dplyr:filter]{dplyr::filter()}}. 
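[Editor's note: the "single expression combined with `&`" requirement for `.facet_filter` documented here contrasts with `dplyr::filter()`'s comma syntax. A sketch, reusing the `cases_deaths_subset` dataset from the package's other examples:]

```r
# dplyr::filter() accepts comma-separated conditions, implicitly AND-ed:
#   filter(df, .response_name == "cases", geo_value %in% c("tx", "pa"))
# `.facet_filter` takes exactly one expression, so combine with `&`:
autoplot(cases_deaths_subset, cases, death_rate_7d_av,
  .facet_by = "all",
  .facet_filter = .response_name == "cases" & geo_value %in% c("tx", "pa")
)
```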
@@ -62,4 +66,9 @@ autoplot(archive_cases_dv_subset, percent_cli, .versions = "month", .facet_filter = geo_value == "ca" ) +autoplot(archive_cases_dv_subset_all_states, percent_cli, + .versions = "1 month", + .facet_filter = geo_value \%in\% c("or", "az", "vt", "ms"), + .mark_versions = TRUE +) } From c4d90f21bca6dee9ecb478f4e6be1b7e389f9ce6 Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Mon, 31 Mar 2025 10:09:23 -0700 Subject: [PATCH 050/117] docs(?epi_archive): improve ordering, `@return` completeness --- R/archive.R | 16 ++++++--- man/epi_archive.Rd | 82 ++++++++++++++++++++++++++-------------------- 2 files changed, 58 insertions(+), 40 deletions(-) diff --git a/R/archive.R b/R/archive.R index d298a8e43..922371f1c 100644 --- a/R/archive.R +++ b/R/archive.R @@ -230,15 +230,12 @@ next_after.Date <- function(x) x + 1L #' value of `clobberable_versions_start` does not fully trust these empty #' updates, and assumes that any version `>= max(x$version)` could be #' clobbered.) If `nrow(x) == 0`, then this argument is mandatory. -#' @return An `epi_archive` object. +#' @return * Of `new_epi_archive`: an (unvalidated) `epi_archive` #' #' @seealso [`epix_as_of`] [`epix_merge`] [`epix_slide`] #' @importFrom dplyr if_any if_all everything #' @importFrom utils capture.output #' -#' @name epi_archive -#' @export -#' #' @examples #' # Simple ex. 
with necessary keys #' tib <- tibble::tibble( @@ -277,6 +274,9 @@ next_after.Date <- function(x) x + 1L #' #' x <- df %>% as_epi_archive(other_keys = "county") #' +#' @name epi_archive +#' @order 3 +#' @export new_epi_archive <- function( x, geo_type, @@ -329,7 +329,11 @@ new_epi_archive <- function( #' Perform second (costly) round of validation that `x` is a proper `epi_archive` #' +#' @return * Of `validate_epi_archive`: an `epi_archive`, +#' [invisibly][base::invisible] (or raises an error if `x` was invalid) +#' #' @rdname epi_archive +#' @order 4 #' @export validate_epi_archive <- function(x) { assert_class(x, "epi_archive") @@ -515,8 +519,10 @@ is_locf <- function(vec, abs_tol, is_key) { # nolint: object_usage_linter #' @param .versions_end location based versions_end, used to avoid prefix #' `version = issue` from being assigned to `versions_end` instead of being #' used to rename columns. +#' @return * Of `as_epi_archive`: an `epi_archive` object #' #' @rdname epi_archive +#' @order 1 #' #' @export as_epi_archive <- function( @@ -816,7 +822,7 @@ clone.epi_archive <- function(x) { #' otherwise `FALSE`. 
#' #' @rdname epi_archive -#' @order 1 +#' @order 2 #' @export is_epi_archive <- function(x) { inherits(x, "epi_archive") diff --git a/man/epi_archive.Rd b/man/epi_archive.Rd index 2ed9d15f8..b92cd5057 100644 --- a/man/epi_archive.Rd +++ b/man/epi_archive.Rd @@ -1,26 +1,14 @@ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/archive.R -\name{is_epi_archive} +\name{as_epi_archive} +\alias{as_epi_archive} \alias{is_epi_archive} \alias{epi_archive} \alias{new_epi_archive} \alias{validate_epi_archive} -\alias{as_epi_archive} -\title{Test for \code{epi_archive} format} +\title{\code{as_epi_archive} converts a data frame, data table, or tibble into an +\code{epi_archive} object.} \usage{ -is_epi_archive(x) - -new_epi_archive( - x, - geo_type, - time_type, - other_keys, - clobberable_versions_start, - versions_end -) - -validate_epi_archive(x) - as_epi_archive( x, geo_type = deprecated(), @@ -33,6 +21,19 @@ as_epi_archive( ..., versions_end = .versions_end ) + +is_epi_archive(x) + +new_epi_archive( + x, + geo_type, + time_type, + other_keys, + clobberable_versions_start, + versions_end +) + +validate_epi_archive(x) } \arguments{ \item{x}{A data.frame, data.table, or tibble, with columns \code{geo_value}, @@ -50,6 +51,17 @@ that should be considered key variables (in the language of \code{data.table}) apart from "geo_value", "time_value", and "version". Typical examples are "age" or more granular geographies.} +\item{compactify}{Optional; \code{TRUE}, \code{FALSE}, or \code{"message"}. \code{TRUE} will +remove some redundant rows, \code{FALSE} will not. \code{"message"} is like \code{TRUE} +but will emit a message if anything was changed. Default is \code{TRUE}. See +more information below under "Compactification:".} + +\item{compactify_abs_tol}{Optional; double. A tolerance level used to detect +approximate equality for compactification. The default is 0, which +corresponds to exact equality. 
Consider using this if your value columns +undergo tiny nonmeaningful revisions and the archive object with the +default setting is too large.} + \item{clobberable_versions_start}{Optional; \code{length}-1; either a value of the same \code{class} and \code{typeof} as \code{x$version}, or an \code{NA} of any \code{class} and \code{typeof}: specifically, either (a) the earliest version that could be @@ -67,6 +79,13 @@ causes vary between different data pipelines. The default value is \code{NA}, which doesn't consider any versions to be clobberable. Another setting that may be appropriate for some pipelines is \code{max_version_with_row_in(x)}.} +\item{.versions_end}{location based versions_end, used to avoid prefix +\code{version = issue} from being assigned to \code{versions_end} instead of being +used to rename columns.} + +\item{...}{used for specifying column names, as in \code{\link[dplyr:rename]{dplyr::rename}}. For +example \code{version = release_date}} + \item{versions_end}{Optional; length-1, same \code{class} and \code{typeof} as \code{x$version}: what is the last version we have observed? The default is \code{max_version_with_row_in(x)}, but values greater than this could also be @@ -75,32 +94,25 @@ beyond \code{max(x$version)}, but they all contained empty updates. (The default value of \code{clobberable_versions_start} does not fully trust these empty updates, and assumes that any version \verb{>= max(x$version)} could be clobbered.) If \code{nrow(x) == 0}, then this argument is mandatory.} - -\item{compactify}{Optional; \code{TRUE}, \code{FALSE}, or \code{"message"}. \code{TRUE} will -remove some redundant rows, \code{FALSE} will not. \code{"message"} is like \code{TRUE} -but will emit a message if anything was changed. Default is \code{TRUE}. See -more information below under "Compactification:".} - -\item{compactify_abs_tol}{Optional; double. A tolerance level used to detect -approximate equality for compactification. 
The default is 0, which -corresponds to exact equality. Consider using this if your value columns -undergo tiny nonmeaningful revisions and the archive object with the -default setting is too large.} - -\item{.versions_end}{location based versions_end, used to avoid prefix -\code{version = issue} from being assigned to \code{versions_end} instead of being -used to rename columns.} - -\item{...}{used for specifying column names, as in \code{\link[dplyr:rename]{dplyr::rename}}. For -example \code{version = release_date}} } \value{ +\itemize{ +\item Of \code{as_epi_archive}: an \code{epi_archive} object +} + \itemize{ \item Of \code{is_epi_archive}: \code{TRUE} if the object inherits from \code{epi_archive}, otherwise \code{FALSE}. } -An \code{epi_archive} object. +\itemize{ +\item Of \code{new_epi_archive}: an (unvalidated) \code{epi_archive} +} + +\itemize{ +\item Of \code{validate_epi_archive}: an \code{epi_archive}, +\link[base:invisible]{invisibly} (or raises an error if \code{x} was invalid) +} } \description{ The second main data structure for storing time series in From 25a1d52f3d442149eb9897ca33be4b5cafedcabc Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Mon, 31 Mar 2025 10:12:02 -0700 Subject: [PATCH 051/117] docs: bump version, add NEWS.md entry --- DESCRIPTION | 2 +- NEWS.md | 6 ++++++ 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/DESCRIPTION b/DESCRIPTION index 57981a195..ae165ac9c 100755 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Type: Package Package: epiprocess Title: Tools for basic signal processing in epidemiology -Version: 0.11.0 +Version: 0.11.1 Authors@R: c( person("Jacob", "Bien", role = "ctb"), person("Logan", "Brooks", , "lcbrooks+github@andrew.cmu.edu", role = c("aut", "cre")), diff --git a/NEWS.md b/NEWS.md index 3ac814aa2..39274ff4a 100644 --- a/NEWS.md +++ b/NEWS.md @@ -2,6 +2,12 @@ Pre-1.0.0 numbering scheme: 0.x will indicate releases, while 0.x.y will indicate PR's. 
+# epiprocess 0.12 + +## New features + +- `is_epi_archive` function has been reintroduced. + # epiprocess 0.11 ## Breaking changes From 153d4448e903d743af1c9cba2c41472e131d1632 Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Fri, 28 Mar 2025 12:32:51 -0700 Subject: [PATCH 052/117] allow autoplot to filter facets instead of arbitrary limit --- R/autoplot.R | 5 +++-- man/autoplot.epi_archive.Rd | 2 +- man/autoplot.epi_df.Rd | 2 +- man/growth_rate_params.Rd | 45 ------------------------------------- 4 files changed, 5 insertions(+), 49 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index 4dad088e3..eb8224283 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -22,7 +22,7 @@ #' @param .max_facets `r lifecycle::badge("deprecated")` #' @param .facet_filter Select which facets will be displayed. Especially #' useful for when there are many `geo_value`'s or keys. This is a -#' <[`rlang`][args_data_masking]> expression along the lines of [dplyr::filter()]. +#' <[`rlang`][args_data_masking]> expression along the lines of `[dplyr::filter()]`. #' However, it must be a single expression combined with the `&` operator. This #' contrasts to the typical use case which allows multiple comma-separated expressions #' which are implicitly combined with `&`. When multiple variables are selected @@ -64,7 +64,8 @@ autoplot.epi_df <- function( .facet_by = c(".response", "other_keys", "all_keys", "geo_value", "all", "none"), .base_color = "#3A448F", .facet_filter = NULL, - .max_facets = deprecated()) { + .max_facets = deprecated() +) { .color_by <- rlang::arg_match(.color_by) .facet_by <- rlang::arg_match(.facet_by) .facet_filter <- rlang::enquo(.facet_filter) diff --git a/man/autoplot.epi_archive.Rd b/man/autoplot.epi_archive.Rd index 289c777b8..7f6fb41d3 100644 --- a/man/autoplot.epi_archive.Rd +++ b/man/autoplot.epi_archive.Rd @@ -39,7 +39,7 @@ of versions.} \item{.facet_filter}{Select which facets will be displayed. 
Especially useful for when there are many \code{geo_value}'s or keys. This is a -<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \code{\link[dplyr:filter]{dplyr::filter()}}. +<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \verb{[dplyr::filter()]}. However, it must be a single expression combined with the \code{&} operator. This contrasts to the typical use case which allows multiple comma-separated expressions which are implicitly combined with \code{&}. When multiple variables are selected diff --git a/man/autoplot.epi_df.Rd b/man/autoplot.epi_df.Rd index 23ca8e10b..0366390e6 100644 --- a/man/autoplot.epi_df.Rd +++ b/man/autoplot.epi_df.Rd @@ -43,7 +43,7 @@ locations would share the same color line.} \item{.facet_filter}{Select which facets will be displayed. Especially useful for when there are many \code{geo_value}'s or keys. This is a -<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \code{\link[dplyr:filter]{dplyr::filter()}}. +<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \verb{[dplyr::filter()]}. However, it must be a single expression combined with the \code{&} operator. This contrasts to the typical use case which allows multiple comma-separated expressions which are implicitly combined with \code{&}. When multiple variables are selected diff --git a/man/growth_rate_params.Rd b/man/growth_rate_params.Rd index b0bb00c80..81667df20 100644 --- a/man/growth_rate_params.Rd +++ b/man/growth_rate_params.Rd @@ -66,51 +66,6 @@ or \code{df}.} \item{penalty}{the coefficient of the penalty for degrees of freedom in the GCV criterion.} - -\item{k}{Integer. Degree of the piecewise polynomial curve to be -estimated. For example, \code{k = 0} corresponds to a piecewise constant -curve.} - -\item{family}{Character or function. Specifies the loss function -to use. 
Valid options are: -\itemize{ -\item \code{"gaussian"} - least squares loss (the default), -\item \code{"binomial"} - logistic loss (classification), -\item \code{"poisson"} - Poisson loss for count data -} - -For any other type, a valid \code{\link[stats:family]{stats::family()}} object may be passed. Note -that these will generally be much slower to estimate than the built-in -options passed as strings. So for example, \code{family = "gaussian"} and -\code{family = gaussian()} will produce the same results, but the first -will be much faster.character.} - -\item{nlambda}{Integer. Number of lambda values to use in the sequence.} - -\item{lambda_max}{Optional value for the largest \code{lambda} to use.} - -\item{lambda_min}{Optional value for the smallest \code{lambda} to use (> 0).} - -\item{lambda_min_ratio}{If neither \code{lambda} nor \code{lambda_min} is specified, -\code{lambda_min = lambda_max * lambda_min_ratio}. -A very small value will lead to the solution \code{theta = y} (for the Gaussian -loss). This argument has no effect if there is a user-defined \code{lambda} -sequence.} - -\item{error_measure}{Metric used to calculate cross validation scores. May -be \code{mse}, \code{mae}, or \code{deviance}.} - -\item{nfolds}{Integer. The number of folds to use. For leave-vth-out cross -validation, every vth \code{y} value and its corresponding position (and weight) -are placed into the same fold. The first and last observations are not -assigned to any folds. This value must be at least 2. As an example, with -15 data points and \code{nfolds = 4}, the points are assigned to folds in the -following way: -\deqn{ - 0 \; 1 \; 2 \; 3 \; 4 \; 1 \; 2 \; 3 \; 4 \; 1 \; 2 \; 3 \; 4 \; 1 \; 0 - }{0 1 2 3 4 1 2 3 4 1 2 3 4 1 0} where 0 indicates no assignment. -Therefore, the folds are not random and running \code{cv_trendfilter()} twice -will give the same result.} } \value{ A list of parameter configurations. 
From 384b03c883727a34d2f5aa2c816a8ec5fb10404e Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Fri, 28 Mar 2025 16:04:33 -0700 Subject: [PATCH 053/117] basic idea works, adding docs --- R/autoplot.R | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index eb8224283..f03bc4936 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -68,7 +68,7 @@ autoplot.epi_df <- function( ) { .color_by <- rlang::arg_match(.color_by) .facet_by <- rlang::arg_match(.facet_by) - .facet_filter <- rlang::enquo(.facet_filter) + if (!rlang::is_quosure(.facet_filter)) .facet_filter <- rlang::enquo(.facet_filter) if (lifecycle::is_present(.max_facets)) { lifecycle::deprecate_warn( @@ -167,7 +167,8 @@ autoplot.epi_df <- function( } autoplot_check_viable_response_vars <- function( - object, ..., non_key_cols, call = caller_env()) { + object, ..., non_key_cols, call = caller_env() +) { allowed <- purrr::map_lgl(object[non_key_cols], is.numeric) allowed <- allowed[allowed] if (length(allowed) == 0 && rlang::dots_n(...) == 0L) { @@ -243,6 +244,7 @@ autoplot.epi_archive <- function(object, ..., .base_color = "black", .versions = NULL, .facet_filter = NULL) { + .facet_filter <- rlang::enquo(.facet_filter) time_type <- object$time_type if (time_type == "custom") { cli_abort( From a7a26e54c46f5f71b2faa292c3b706d0143ca43b Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Fri, 28 Mar 2025 17:13:12 -0700 Subject: [PATCH 054/117] bugs squashed --- R/autoplot.R | 5 ++--- man/autoplot.epi_archive.Rd | 2 +- 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index f03bc4936..ad82eb5dc 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -68,7 +68,7 @@ autoplot.epi_df <- function( ) { .color_by <- rlang::arg_match(.color_by) .facet_by <- rlang::arg_match(.facet_by) - if (!rlang::is_quosure(.facet_filter)) .facet_filter <- rlang::enquo(.facet_filter) + .facet_filter <- rlang::enquo(.facet_filter) if (lifecycle::is_present(.max_facets)) { lifecycle::deprecate_warn( @@ -222,7 +222,7 @@ autoplot_check_viable_response_vars <- function( #' by passing a vector of values here. Alternatively, a sequence can be #' automatically created by passing a string like `"2 weeks"` or `"month"`. #' For time types where the `time_value` is a date object, any string that -#' is interpretable by [seq.Date()] is allowed. +#' is interpretable by `[base::seq.Date()]` is allowed. #' #' For `time_type = "integer"`, an integer larger than 1 will give a subset #' of versions. @@ -244,7 +244,6 @@ autoplot.epi_archive <- function(object, ..., .base_color = "black", .versions = NULL, .facet_filter = NULL) { - .facet_filter <- rlang::enquo(.facet_filter) time_type <- object$time_type if (time_type == "custom") { cli_abort( diff --git a/man/autoplot.epi_archive.Rd b/man/autoplot.epi_archive.Rd index 7f6fb41d3..f9323f5ac 100644 --- a/man/autoplot.epi_archive.Rd +++ b/man/autoplot.epi_archive.Rd @@ -32,7 +32,7 @@ like on every single day. To override this, you can select specific dates, by passing a vector of values here. Alternatively, a sequence can be automatically created by passing a string like \code{"2 weeks"} or \code{"month"}. For time types where the \code{time_value} is a date object, any string that -is interpretable by \code{\link[=seq.Date]{seq.Date()}} is allowed. 
+is interpretable by \verb{[base::seq.Date()]} is allowed. For \code{time_type = "integer"}, an integer larger than 1 will give a subset of versions.} From 65d50073397e5be953c06e641241041325ee9c10 Mon Sep 17 00:00:00 2001 From: dajmcdon Date: Fri, 28 Mar 2025 19:42:57 +0000 Subject: [PATCH 055/117] style: styler (GHA) --- R/autoplot.R | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index ad82eb5dc..5bc971203 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -64,8 +64,7 @@ autoplot.epi_df <- function( .facet_by = c(".response", "other_keys", "all_keys", "geo_value", "all", "none"), .base_color = "#3A448F", .facet_filter = NULL, - .max_facets = deprecated() -) { + .max_facets = deprecated()) { .color_by <- rlang::arg_match(.color_by) .facet_by <- rlang::arg_match(.facet_by) .facet_filter <- rlang::enquo(.facet_filter) From 57a0e92ab60d7e152d75799811e2bb90d8a00bb3 Mon Sep 17 00:00:00 2001 From: dajmcdon Date: Sat, 29 Mar 2025 00:25:44 +0000 Subject: [PATCH 056/117] style: styler (GHA) --- R/autoplot.R | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index 5bc971203..77ad05ae9 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -166,8 +166,7 @@ autoplot.epi_df <- function( } autoplot_check_viable_response_vars <- function( - object, ..., non_key_cols, call = caller_env() -) { + object, ..., non_key_cols, call = caller_env()) { allowed <- purrr::map_lgl(object[non_key_cols], is.numeric) allowed <- allowed[allowed] if (length(allowed) == 0 && rlang::dots_n(...) == 0L) { From 192a2ea3c774a307b34e7ce5b9684ffafacda360 Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Sat, 29 Mar 2025 13:58:47 -0700 Subject: [PATCH 057/117] bump version --- DESCRIPTION | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/DESCRIPTION b/DESCRIPTION index b68249c17..4f7bf7992 100755 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Type: Package Package: epiprocess Title: Tools for basic signal processing in epidemiology -Version: 0.11.1 +Version: 0.11.2 Authors@R: c( person("Jacob", "Bien", role = "ctb"), person("Logan", "Brooks", , "lcbrooks+github@andrew.cmu.edu", role = c("aut", "cre")), From 1d5706203a24ea2c0b02cf59eb3bec8c66053eae Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Mon, 31 Mar 2025 09:58:25 -0700 Subject: [PATCH 058/117] feat: mark versions with vertical if desired --- R/autoplot.R | 21 ++++++++++++++++++++- man/autoplot.epi_archive.Rd | 9 +++++++++ 2 files changed, 29 insertions(+), 1 deletion(-) diff --git a/R/autoplot.R b/R/autoplot.R index 77ad05ae9..a6cf8ca7e 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -224,6 +224,8 @@ autoplot_check_viable_response_vars <- function( #' #' For `time_type = "integer"`, an integer larger than 1 will give a subset #' of versions. +#' @param .mark_versions Logical. Indicate whether to mark each version with +#' a vertical line. Note that displaying many versions can become busy. #' #' @return A [ggplot2::ggplot] object #' @export @@ -238,11 +240,18 @@ autoplot_check_viable_response_vars <- function( #' .versions = "month", #' .facet_filter = geo_value == "ca" #' ) +#' autoplot(archive_cases_dv_subset_all_states, percent_cli, +#' .versions = "1 month", +#' .facet_filter = geo_value %in% c("or", "az", "vt", "ms"), +#' .mark_versions = TRUE +#' ) autoplot.epi_archive <- function(object, ..., .base_color = "black", .versions = NULL, + .mark_versions = FALSE, .facet_filter = NULL) { time_type <- object$time_type + checkmate::assert_logical(.mark_versions, len = 1L) if (time_type == "custom") { cli_abort( "This `epi_archive` has custom `time_type`. 
This is currently unsupported.", @@ -278,7 +287,7 @@ autoplot.epi_archive <- function(object, ..., finalized, ..., .base_color = .base_color, .facet_by = "all", .facet_filter = {{ .facet_filter }}, .color_by = "none" - ) + ) + ggplot2::xlab("Date") geo_and_other_keys <- key_colnames(object, exclude = c("time_value", "version")) all_avail <- rlang::syms(as.list(c( geo_and_other_keys, @@ -318,6 +327,16 @@ autoplot.epi_archive <- function(object, ..., bp <- bp + ggplot2::scale_color_viridis_c(name = "Version") } + if (.mark_versions) { + bp <- bp + + ggplot2::geom_vline( + data = snapshots, + ggplot2::aes(color = version, xintercept = version), + linewidth = .5, + linetype = 3, + show.legend = FALSE + ) + } # make the finalized layer last bp$layers <- rev(bp$layers) bp diff --git a/man/autoplot.epi_archive.Rd b/man/autoplot.epi_archive.Rd index f9323f5ac..3a12b163e 100644 --- a/man/autoplot.epi_archive.Rd +++ b/man/autoplot.epi_archive.Rd @@ -9,6 +9,7 @@ ..., .base_color = "black", .versions = NULL, + .mark_versions = FALSE, .facet_filter = NULL ) } @@ -37,6 +38,9 @@ is interpretable by \verb{[base::seq.Date()]} is allowed. For \code{time_type = "integer"}, an integer larger than 1 will give a subset of versions.} +\item{.mark_versions}{Logical. Indicate whether to mark each version with +a vertical line. Note that displaying many versions can become busy.} + \item{.facet_filter}{Select which facets will be displayed. Especially useful for when there are many \code{geo_value}'s or keys. This is a <\code{\link[=args_data_masking]{rlang}}> expression along the lines of \verb{[dplyr::filter()]}. 
@@ -62,4 +66,9 @@ autoplot(archive_cases_dv_subset, percent_cli, .versions = "month", .facet_filter = geo_value == "ca" ) +autoplot(archive_cases_dv_subset_all_states, percent_cli, + .versions = "1 month", + .facet_filter = geo_value \%in\% c("or", "az", "vt", "ms"), + .mark_versions = TRUE +) } From 5139e96f2c2e2341db22f98f86279eadb5cc374d Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Fri, 28 Mar 2025 12:32:51 -0700 Subject: [PATCH 059/117] allow autoplot to filter facets instead of arbitrary limit --- R/autoplot.R | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/R/autoplot.R b/R/autoplot.R index a6cf8ca7e..70976ff63 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -64,7 +64,8 @@ autoplot.epi_df <- function( .facet_by = c(".response", "other_keys", "all_keys", "geo_value", "all", "none"), .base_color = "#3A448F", .facet_filter = NULL, - .max_facets = deprecated()) { + .max_facets = deprecated() +) { .color_by <- rlang::arg_match(.color_by) .facet_by <- rlang::arg_match(.facet_by) .facet_filter <- rlang::enquo(.facet_filter) From 9ae8f9c02c824f8198dba40422369f89e88effa6 Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Fri, 28 Mar 2025 16:04:33 -0700 Subject: [PATCH 060/117] basic idea works, adding docs --- R/autoplot.R | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index 70976ff63..8c25b1166 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -68,7 +68,7 @@ autoplot.epi_df <- function( ) { .color_by <- rlang::arg_match(.color_by) .facet_by <- rlang::arg_match(.facet_by) - .facet_filter <- rlang::enquo(.facet_filter) + if (!rlang::is_quosure(.facet_filter)) .facet_filter <- rlang::enquo(.facet_filter) if (lifecycle::is_present(.max_facets)) { lifecycle::deprecate_warn( @@ -167,7 +167,8 @@ autoplot.epi_df <- function( } autoplot_check_viable_response_vars <- function( - object, ..., non_key_cols, call = caller_env()) { + object, ..., non_key_cols, call = caller_env() +) { allowed <- purrr::map_lgl(object[non_key_cols], is.numeric) allowed <- allowed[allowed] if (length(allowed) == 0 && rlang::dots_n(...) == 0L) { @@ -251,6 +252,7 @@ autoplot.epi_archive <- function(object, ..., .versions = NULL, .mark_versions = FALSE, .facet_filter = NULL) { + .facet_filter <- rlang::enquo(.facet_filter) time_type <- object$time_type checkmate::assert_logical(.mark_versions, len = 1L) if (time_type == "custom") { From c53c10e9628a1fc0ca2e37da52502f4f1c56db9f Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Fri, 28 Mar 2025 17:13:12 -0700 Subject: [PATCH 061/117] bugs squashed --- R/autoplot.R | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index 8c25b1166..c7e38761a 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -68,7 +68,7 @@ autoplot.epi_df <- function( ) { .color_by <- rlang::arg_match(.color_by) .facet_by <- rlang::arg_match(.facet_by) - if (!rlang::is_quosure(.facet_filter)) .facet_filter <- rlang::enquo(.facet_filter) + .facet_filter <- rlang::enquo(.facet_filter) if (lifecycle::is_present(.max_facets)) { lifecycle::deprecate_warn( @@ -216,10 +216,10 @@ autoplot_check_viable_response_vars <- function( #' @inheritParams autoplot.epi_df #' @param .versions Select which versions will be displayed. By default, every #' a separate line will be shown with the data as it would have appeared on -#' every day in the archive. This can sometimes become overwhelming. For +#' every day in the archive. This can sometimes become overwhelming. For #' example, daily data would display a line for what the data would have looked -#' like on every single day. To override this, you can select specific dates, -#' by passing a vector of values here. Alternatively, a sequence can be +#' like on every single day. To override this, you can select specific dates, +#' by passing a vector of values here. Alternatively, a sequence can be #' automatically created by passing a string like `"2 weeks"` or `"month"`. #' For time types where the `time_value` is a date object, any string that #' is interpretable by `[base::seq.Date()]` is allowed. 
@@ -252,7 +252,6 @@ autoplot.epi_archive <- function(object, ..., .versions = NULL, .mark_versions = FALSE, .facet_filter = NULL) { - .facet_filter <- rlang::enquo(.facet_filter) time_type <- object$time_type checkmate::assert_logical(.mark_versions, len = 1L) if (time_type == "custom") { From 52171ed873b6bb79c7a57969ae353b3ef509ee4e Mon Sep 17 00:00:00 2001 From: dajmcdon Date: Fri, 28 Mar 2025 19:42:57 +0000 Subject: [PATCH 062/117] style: styler (GHA) --- R/autoplot.R | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index c7e38761a..2c16aa55f 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -64,8 +64,7 @@ autoplot.epi_df <- function( .facet_by = c(".response", "other_keys", "all_keys", "geo_value", "all", "none"), .base_color = "#3A448F", .facet_filter = NULL, - .max_facets = deprecated() -) { + .max_facets = deprecated()) { .color_by <- rlang::arg_match(.color_by) .facet_by <- rlang::arg_match(.facet_by) .facet_filter <- rlang::enquo(.facet_filter) From 33a735e437b85e4ef6cc1bc42ced991bc919203e Mon Sep 17 00:00:00 2001 From: dajmcdon Date: Sat, 29 Mar 2025 00:25:44 +0000 Subject: [PATCH 063/117] style: styler (GHA) --- R/autoplot.R | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index 2c16aa55f..a6cf8ca7e 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -166,8 +166,7 @@ autoplot.epi_df <- function( } autoplot_check_viable_response_vars <- function( - object, ..., non_key_cols, call = caller_env() -) { + object, ..., non_key_cols, call = caller_env()) { allowed <- purrr::map_lgl(object[non_key_cols], is.numeric) allowed <- allowed[allowed] if (length(allowed) == 0 && rlang::dots_n(...) == 0L) { @@ -215,10 +214,10 @@ autoplot_check_viable_response_vars <- function( #' @inheritParams autoplot.epi_df #' @param .versions Select which versions will be displayed. 
By default, every #' a separate line will be shown with the data as it would have appeared on -#' every day in the archive. This can sometimes become overwhelming. For +#' every day in the archive. This can sometimes become overwhelming. For #' example, daily data would display a line for what the data would have looked -#' like on every single day. To override this, you can select specific dates, -#' by passing a vector of values here. Alternatively, a sequence can be +#' like on every single day. To override this, you can select specific dates, +#' by passing a vector of values here. Alternatively, a sequence can be #' automatically created by passing a string like `"2 weeks"` or `"month"`. #' For time types where the `time_value` is a date object, any string that #' is interpretable by `[base::seq.Date()]` is allowed. From 31350b878679d7a7036e0668dbf5cd4b9330a066 Mon Sep 17 00:00:00 2001 From: dsweber2 Date: Mon, 31 Mar 2025 17:23:07 +0000 Subject: [PATCH 064/117] docs: document (GHA) --- man/growth_rate_params.Rd | 45 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+) diff --git a/man/growth_rate_params.Rd b/man/growth_rate_params.Rd index 81667df20..b0bb00c80 100644 --- a/man/growth_rate_params.Rd +++ b/man/growth_rate_params.Rd @@ -66,6 +66,51 @@ or \code{df}.} \item{penalty}{the coefficient of the penalty for degrees of freedom in the GCV criterion.} + +\item{k}{Integer. Degree of the piecewise polynomial curve to be +estimated. For example, \code{k = 0} corresponds to a piecewise constant +curve.} + +\item{family}{Character or function. Specifies the loss function +to use. Valid options are: +\itemize{ +\item \code{"gaussian"} - least squares loss (the default), +\item \code{"binomial"} - logistic loss (classification), +\item \code{"poisson"} - Poisson loss for count data +} + +For any other type, a valid \code{\link[stats:family]{stats::family()}} object may be passed. 
Note
+that these will generally be much slower to estimate than the built-in
+options passed as strings. So for example, \code{family = "gaussian"} and
+\code{family = gaussian()} will produce the same results, but the first
+will be much faster.}
+
+\item{nlambda}{Integer. Number of lambda values to use in the sequence.}
+
+\item{lambda_max}{Optional value for the largest \code{lambda} to use.}
+
+\item{lambda_min}{Optional value for the smallest \code{lambda} to use (> 0).}
+
+\item{lambda_min_ratio}{If neither \code{lambda} nor \code{lambda_min} is specified,
+\code{lambda_min = lambda_max * lambda_min_ratio}.
+A very small value will lead to the solution \code{theta = y} (for the Gaussian
+loss). This argument has no effect if there is a user-defined \code{lambda}
+sequence.}
+
+\item{error_measure}{Metric used to calculate cross validation scores. May
+be \code{mse}, \code{mae}, or \code{deviance}.}
+
+\item{nfolds}{Integer. The number of folds to use. For leave-vth-out cross
+validation, every vth \code{y} value and its corresponding position (and weight)
+are placed into the same fold. The first and last observations are not
+assigned to any folds. This value must be at least 2. As an example, with
+15 data points and \code{nfolds = 4}, the points are assigned to folds in the
+following way:
+\deqn{
+  0 \; 1 \; 2 \; 3 \; 4 \; 1 \; 2 \; 3 \; 4 \; 1 \; 2 \; 3 \; 4 \; 1 \; 0
+  }{0 1 2 3 4 1 2 3 4 1 2 3 4 1 0} where 0 indicates no assignment.
+Therefore, the folds are not random and running \code{cv_trendfilter()} twice
+will give the same result.}
 }
 \value{
 A list of parameter configurations.
From f606b428054ff2ea5e786a7a130e12d88b1e1198 Mon Sep 17 00:00:00 2001 From: Daniel McDonald Date: Mon, 31 Mar 2025 13:23:07 -0700 Subject: [PATCH 065/117] Apply suggestions from code review Co-authored-by: David Weber --- R/autoplot.R | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index a6cf8ca7e..1d73e661a 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -212,7 +212,7 @@ autoplot_check_viable_response_vars <- function( #' #' @param object An `epi_archive` #' @inheritParams autoplot.epi_df -#' @param .versions Select which versions will be displayed. By default, every +#' @param .versions Select which versions will be displayed. By default #' a separate line will be shown with the data as it would have appeared on #' every day in the archive. This can sometimes become overwhelming. For #' example, daily data would display a line for what the data would have looked @@ -268,7 +268,7 @@ autoplot.epi_archive <- function(object, ..., if (is.numeric(.versions)) .versions <- round(abs(.versions)) .versions <- seq(min_version, max_version - 1, by = .versions) } else if (methods::is(.versions, "Date") || is.numeric(.versions)) { - .versions <- .versions[.versions >= min_version & .versions <= max_version] + .versions <- .versions[min_version <= .versions & .versions <= max_version] } else { cli_abort( "Requested `.versions` don't appear to match the available `time_type`.", From 748e188ceb25b2c6d5fcf9f28865bc361ad29c8c Mon Sep 17 00:00:00 2001 From: dajmcdon Date: Mon, 31 Mar 2025 20:25:19 +0000 Subject: [PATCH 066/117] docs: document (GHA) --- man/autoplot.epi_archive.Rd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/man/autoplot.epi_archive.Rd b/man/autoplot.epi_archive.Rd index 3a12b163e..6692330aa 100644 --- a/man/autoplot.epi_archive.Rd +++ b/man/autoplot.epi_archive.Rd @@ -25,7 +25,7 @@ be used to select a range of variables.} For example, with a single numeric variable and faceting by 
\code{geo_value}, all locations would share the same color line.} -\item{.versions}{Select which versions will be displayed. By default, every +\item{.versions}{Select which versions will be displayed. By default a separate line will be shown with the data as it would have appeared on every day in the archive. This can sometimes become overwhelming. For example, daily data would display a line for what the data would have looked From 4343549e7516eae2147cb927d534012ed34e0abc Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Mon, 31 Mar 2025 13:54:19 -0700 Subject: [PATCH 067/117] add aliases to plot(), consolidate docs --- NAMESPACE | 2 + R/autoplot.R | 36 ++++++++-- _pkgdown.yml | 10 +-- man/{autoplot.epi_df.Rd => autoplot-epi.Rd} | 72 ++++++++++++++++++-- man/autoplot.epi_archive.Rd | 74 --------------------- 5 files changed, 105 insertions(+), 89 deletions(-) rename man/{autoplot.epi_df.Rd => autoplot-epi.Rd} (57%) delete mode 100644 man/autoplot.epi_archive.Rd diff --git a/NAMESPACE b/NAMESPACE index b216cda72..728de77d4 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -47,6 +47,8 @@ S3method(key_colnames,epi_archive) S3method(key_colnames,epi_df) S3method(key_colnames,tbl_ts) S3method(mean,epi_df) +S3method(plot,epi_archive) +S3method(plot,epi_df) S3method(print,epi_archive) S3method(print,epi_df) S3method(print,grouped_epi_archive) diff --git a/R/autoplot.R b/R/autoplot.R index 91b8eddbe..3b4b86800 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -1,6 +1,6 @@ -#' Automatically plot an epi_df +#' Automatically plot an epi_df or epi_archive #' -#' @param object An `epi_df` +#' @param object,x An `epi_df` or `epi_archive` #' @param ... <[`tidy-select`][dplyr_tidy_select]> One or more unquoted #' expressions separated by commas. 
Variable names can be used as if they #' were positions in the data frame, so expressions like `x:y` can @@ -33,8 +33,10 @@ #' #' @return A [ggplot2::ggplot] object #' @export +#' @name autoplot-epi #' #' @examples +#' # -- Use it on an `epi_df` #' autoplot(cases_deaths_subset, cases, death_rate_7d_av) #' autoplot(cases_deaths_subset, case_rate_7d_av, .facet_by = "geo_value") #' autoplot(cases_deaths_subset, case_rate_7d_av, @@ -58,6 +60,13 @@ #' (.response_name == "death_rate_7d_av" & #' geo_value %in% c("ca", "fl", "ga", "ny")) #' ) +#' # Just an alias for convenience +#' plot(cases_deaths_subset, cases, death_rate_7d_av, +#' .facet_by = "all", +#' .facet_filter = (.response_name == "cases" & geo_value %in% c("tx", "pa")) | +#' (.response_name == "death_rate_7d_av" & +#' geo_value %in% c("ca", "fl", "ga", "ny")) +#' ) autoplot.epi_df <- function( object, ..., .color_by = c("all_keys", "geo_value", "other_keys", ".response", "all", "none"), @@ -208,10 +217,6 @@ autoplot_check_viable_response_vars <- function( -#' Automatically plot an epi_archive -#' -#' @param object An `epi_archive` -#' @inheritParams autoplot.epi_df #' @param .versions Select which versions will be displayed. By default, every #' a separate line will be shown with the data as it would have appeared on #' every day in the archive. This can sometimes become overwhelming. For @@ -227,10 +232,13 @@ autoplot_check_viable_response_vars <- function( #' @param .mark_versions Logical. Indicate whether to mark each version with #' a vertical line. Note that displaying many versions can become busy. 
#' -#' @return A [ggplot2::ggplot] object #' @export +#' @rdname autoplot-epi #' #' @examples +#' +#' # -- Use it on an archive +#' #' autoplot(archive_cases_dv_subset, percent_cli, .versions = "week") #' autoplot(archive_cases_dv_subset_all_states, percent_cli, #' .versions = "week", @@ -245,6 +253,12 @@ autoplot_check_viable_response_vars <- function( #' .facet_filter = geo_value %in% c("or", "az", "vt", "ms"), #' .mark_versions = TRUE #' ) +#' # Just an alias for convenience +#' plot(archive_cases_dv_subset_all_states, percent_cli, +#' .versions = "1 month", +#' .facet_filter = geo_value %in% c("or", "az", "vt", "ms"), +#' .mark_versions = TRUE +#' ) autoplot.epi_archive <- function(object, ..., .base_color = "black", .versions = NULL, @@ -341,3 +355,11 @@ autoplot.epi_archive <- function(object, ..., bp$layers <- rev(bp$layers) bp } + +#' @export +#' @rdname autoplot-epi +plot.epi_df <- function(x, ...) { autoplot(x, ...) } + +#' @export +#' @rdname autoplot-epi +plot.epi_archive <- function(x, ...) { autoplot(x, ...) } diff --git a/_pkgdown.yml b/_pkgdown.yml index 3fee4418f..4dc391a9c 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -48,9 +48,7 @@ reference: desc: Constructors and information for `epi_df` objects. - contents: - epi_df - - print.epi_df - group_epi_df - - autoplot.epi_df - title: "`epi_df` manipulation" desc: Functions operating on `epi_df` objects. @@ -71,10 +69,8 @@ reference: desc: Constructors and information for `epi_archive` objects. - contents: - epi_archive - - print.epi_archive - clone - group_by.epi_archive - - autoplot.epi_archive - title: "`epi_archive` manipulation" desc: Functions operating on `epi_archive` objects. 
@@ -85,6 +81,12 @@ reference: - revision_summary - epix_fill_through_version - epix_truncate_versions_after + + - title: Basic analysis and visualization + - contents: + - starts_with("autoplot") + - starts_with("print") + - revision_summary - title: Example data - contents: diff --git a/man/autoplot.epi_df.Rd b/man/autoplot-epi.Rd similarity index 57% rename from man/autoplot.epi_df.Rd rename to man/autoplot-epi.Rd index 23ca8e10b..470a76bdb 100644 --- a/man/autoplot.epi_df.Rd +++ b/man/autoplot-epi.Rd @@ -1,8 +1,12 @@ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/autoplot.R -\name{autoplot.epi_df} +\name{autoplot-epi} +\alias{autoplot-epi} \alias{autoplot.epi_df} -\title{Automatically plot an epi_df} +\alias{autoplot.epi_archive} +\alias{plot.epi_df} +\alias{plot.epi_archive} +\title{Automatically plot an epi_df or epi_archive} \usage{ \method{autoplot}{epi_df}( object, @@ -13,9 +17,22 @@ .facet_filter = NULL, .max_facets = deprecated() ) + +\method{autoplot}{epi_archive}( + object, + ..., + .base_color = "black", + .versions = NULL, + .mark_versions = FALSE, + .facet_filter = NULL +) + +\method{plot}{epi_df}(x, ...) + +\method{plot}{epi_archive}(x, ...) } \arguments{ -\item{object}{An \code{epi_df}} +\item{object, x}{An \code{epi_df} or \code{epi_archive}} \item{...}{<\code{\link[=dplyr_tidy_select]{tidy-select}}> One or more unquoted expressions separated by commas. Variable names can be used as if they @@ -51,14 +68,31 @@ with \code{...}, their names can be filtered in combination with other factors by using \code{.response_name}. See the examples below.} \item{.max_facets}{\ifelse{html}{\href{https://lifecycle.r-lib.org/articles/stages.html#deprecated}{\figure{lifecycle-deprecated.svg}{options: alt='[Deprecated]'}}}{\strong{[Deprecated]}}} + +\item{.versions}{Select which versions will be displayed. By default, every +a separate line will be shown with the data as it would have appeared on +every day in the archive. 
This can sometimes become overwhelming. For +example, daily data would display a line for what the data would have looked +like on every single day. To override this, you can select specific dates, +by passing a vector of values here. Alternatively, a sequence can be +automatically created by passing a string like \code{"2 weeks"} or \code{"month"}. +For time types where the \code{time_value} is a date object, any string that +is interpretable by \code{\link[=seq.Date]{seq.Date()}} is allowed. + +For \code{time_type = "integer"}, an integer larger than 1 will give a subset +of versions.} + +\item{.mark_versions}{Logical. Indicate whether to mark each version with +a vertical line. Note that displaying many versions can become busy.} } \value{ A \link[ggplot2:ggplot]{ggplot2::ggplot} object } \description{ -Automatically plot an epi_df +Automatically plot an epi_df or epi_archive } \examples{ +# -- Use it on an `epi_df` autoplot(cases_deaths_subset, cases, death_rate_7d_av) autoplot(cases_deaths_subset, case_rate_7d_av, .facet_by = "geo_value") autoplot(cases_deaths_subset, case_rate_7d_av, @@ -82,4 +116,34 @@ autoplot(cases_deaths_subset, cases, death_rate_7d_av, (.response_name == "death_rate_7d_av" & geo_value \%in\% c("ca", "fl", "ga", "ny")) ) +# Just an alias for convenience +plot(cases_deaths_subset, cases, death_rate_7d_av, + .facet_by = "all", + .facet_filter = (.response_name == "cases" & geo_value \%in\% c("tx", "pa")) | + (.response_name == "death_rate_7d_av" & + geo_value \%in\% c("ca", "fl", "ga", "ny")) +) + +# -- Use it on an archive + +autoplot(archive_cases_dv_subset, percent_cli, .versions = "week") +autoplot(archive_cases_dv_subset_all_states, percent_cli, + .versions = "week", + .facet_filter = geo_value \%in\% c("or", "az", "vt", "ms") +) +autoplot(archive_cases_dv_subset, percent_cli, + .versions = "month", + .facet_filter = geo_value == "ca" +) +autoplot(archive_cases_dv_subset_all_states, percent_cli, + .versions = "1 month", + .facet_filter 
= geo_value \%in\% c("or", "az", "vt", "ms"), + .mark_versions = TRUE +) +# Just an alias for convenience +plot(archive_cases_dv_subset_all_states, percent_cli, + .versions = "1 month", + .facet_filter = geo_value \%in\% c("or", "az", "vt", "ms"), + .mark_versions = TRUE +) } diff --git a/man/autoplot.epi_archive.Rd b/man/autoplot.epi_archive.Rd deleted file mode 100644 index 5a9dfafc6..000000000 --- a/man/autoplot.epi_archive.Rd +++ /dev/null @@ -1,74 +0,0 @@ -% Generated by roxygen2: do not edit by hand -% Please edit documentation in R/autoplot.R -\name{autoplot.epi_archive} -\alias{autoplot.epi_archive} -\title{Automatically plot an epi_archive} -\usage{ -\method{autoplot}{epi_archive}( - object, - ..., - .base_color = "black", - .versions = NULL, - .mark_versions = FALSE, - .facet_filter = NULL -) -} -\arguments{ -\item{object}{An \code{epi_archive}} - -\item{...}{<\code{\link[=dplyr_tidy_select]{tidy-select}}> One or more unquoted -expressions separated by commas. Variable names can be used as if they -were positions in the data frame, so expressions like \code{x:y} can -be used to select a range of variables.} - -\item{.base_color}{Lines will be shown with this color if \code{.color_by == "none"}. -For example, with a single numeric variable and faceting by \code{geo_value}, all -locations would share the same color line.} - -\item{.versions}{Select which versions will be displayed. By default, every -a separate line will be shown with the data as it would have appeared on -every day in the archive. This can sometimes become overwhelming. For -example, daily data would display a line for what the data would have looked -like on every single day. To override this, you can select specific dates, -by passing a vector of values here. Alternatively, a sequence can be -automatically created by passing a string like \code{"2 weeks"} or \code{"month"}. 
-For time types where the \code{time_value} is a date object, any string that -is interpretable by \code{\link[=seq.Date]{seq.Date()}} is allowed. - -For \code{time_type = "integer"}, an integer larger than 1 will give a subset -of versions.} - -\item{.mark_versions}{Logical. Indicate whether to mark each version with -a vertical line. Note that displaying many versions can become busy.} - -\item{.facet_filter}{Select which facets will be displayed. Especially -useful for when there are many \code{geo_value}'s or keys. This is a -<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \code{\link[dplyr:filter]{dplyr::filter()}}. -However, it must be a single expression combined with the \code{&} operator. This -contrasts to the typical use case which allows multiple comma-separated expressions -which are implicitly combined with \code{&}. When multiple variables are selected -with \code{...}, their names can be filtered in combination with other factors -by using \code{.response_name}. 
See the examples below.} -} -\value{ -A \link[ggplot2:ggplot]{ggplot2::ggplot} object -} -\description{ -Automatically plot an epi_archive -} -\examples{ -autoplot(archive_cases_dv_subset, percent_cli, .versions = "week") -autoplot(archive_cases_dv_subset_all_states, percent_cli, - .versions = "week", - .facet_filter = geo_value \%in\% c("or", "az", "vt", "ms") -) -autoplot(archive_cases_dv_subset, percent_cli, - .versions = "month", - .facet_filter = geo_value == "ca" -) -autoplot(archive_cases_dv_subset_all_states, percent_cli, - .versions = "1 month", - .facet_filter = geo_value \%in\% c("or", "az", "vt", "ms"), - .mark_versions = TRUE -) -} From 19a9ec6d03fb1113232c283fe1e7ce07cb9de5f4 Mon Sep 17 00:00:00 2001 From: dajmcdon Date: Mon, 31 Mar 2025 20:59:31 +0000 Subject: [PATCH 068/117] style: styler (GHA) --- R/autoplot.R | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index 67d645f1d..15d577ec4 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -358,8 +358,12 @@ autoplot.epi_archive <- function(object, ..., #' @export #' @rdname autoplot-epi -plot.epi_df <- function(x, ...) { autoplot(x, ...) } +plot.epi_df <- function(x, ...) { + autoplot(x, ...) +} #' @export #' @rdname autoplot-epi -plot.epi_archive <- function(x, ...) { autoplot(x, ...) } +plot.epi_archive <- function(x, ...) { + autoplot(x, ...) +} From 8546548612f8c09c61c11abcf694b0c012542695 Mon Sep 17 00:00:00 2001 From: "Logan C. 
Brooks" Date: Wed, 2 Apr 2025 10:38:39 -0700 Subject: [PATCH 069/117] feat: add filter.epi_archive --- NAMESPACE | 1 + NEWS.md | 6 ++ R/methods-epi_archive.R | 116 ++++++++++++++++++++++++++++++++++++++ man/filter.epi_archive.Rd | 61 ++++++++++++++++++++ 4 files changed, 184 insertions(+) create mode 100644 man/filter.epi_archive.Rd diff --git a/NAMESPACE b/NAMESPACE index 2f97e5b3d..3ad5e252b 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -31,6 +31,7 @@ S3method(epix_slide,epi_archive) S3method(epix_slide,grouped_epi_archive) S3method(epix_truncate_versions_after,epi_archive) S3method(epix_truncate_versions_after,grouped_epi_archive) +S3method(filter,epi_archive) S3method(group_by,epi_archive) S3method(group_by,epi_df) S3method(group_by,grouped_epi_archive) diff --git a/NEWS.md b/NEWS.md index 3ac814aa2..50af85357 100644 --- a/NEWS.md +++ b/NEWS.md @@ -2,6 +2,12 @@ Pre-1.0.0 numbering scheme: 0.x will indicate releases, while 0.x.y will indicate PR's. +# epiprocess 0.12 + +## New features + +- Added `dplyr::filter` implementation for `epi_archive`s. + # epiprocess 0.11 ## Breaking changes diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index a8421efc1..5b93ba2bd 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -984,3 +984,119 @@ dplyr_col_modify.col_modify_recorder_df <- function(data, cols) { attr(data, "epiprocess::col_modify_recorder_df::cols") <- cols data } + + + +#' [`dplyr::filter`] for `epi_archive`s +#' +#' @param .data an `epi_archive` +#' @param ... as in [`dplyr::filter`]; using the `version` column is not allowed +#' unless you use `.format_aware = TRUE`; see details. +#' @param .by as in [`dplyr::filter`] +#' @param .format_aware optional, `TRUE` or `FALSE`; default `FALSE`. See +#' details. +#' +#' @details +#' +#' By default, using the `version` column is disabled as it's easy to +#' get unexpected results. See if either [`epix_as_of`] or [`epix_slide`] +#' works as an alternative. 
If they don't cover your use case, then you can +#' set `.format_aware = TRUE` to enable usage of the `version` column, but be +#' careful to: +#' * Factor in that `.data$DT` may be using a "compact" format based on diffing +#' consecutive versions; see details of [`as_epi_archive`] +#' * Set `clobberable_versions_start` and `versions_end` of the result +#' appropriately after the `filter` call. They will be initialized with the +#' same values as in `.data`. +#' +#' `dplyr::filter` also has an optional argument `.preserve`, which should not +#' have an impact on (ungrouped) `epi_archive`s, and `grouped_epi_archive`s do +#' not currently support `dplyr::filter`. +#' +#' @examples +#' +#' # Filter to one location and a particular time range: +#' archive_cases_dv_subset %>% +#' filter(geo_value == "fl", time_value >= as.Date("2020-10-01")) +#' +#' # Convert to weekly by taking the Saturday data for each week, so that +#' # `case_rate_7d_av` represents a Sun--Sat average: +#' archive_cases_dv_subset %>% +#' filter(as.POSIXlt(time_value)$wday == 6L) +#' +#' # Filtering involving versions requires extra care. See epix_as_of and +#' # epix_slide instead for some common operations. One semi-common operation +#' # that ends up being fairly simple is treating observations as finalized +#' # after some amount of time, and ignoring any revisions that were made after +#' # that point: +#' archive_cases_dv_subset %>% +#' filter(version <= time_value + as.difftime(60, units = "days"), +#' .format_aware = TRUE +#' ) +#' +#' @export +filter.epi_archive <- function(.data, ..., .by = NULL, .format_aware = FALSE) { + in_tbl <- tibble::as_tibble(as.list(.data$DT), .name_repair = "minimal") + if (.format_aware) { + out_tbl <- in_tbl %>% + filter(..., .by = .by) + } else { + out_tbl <- in_tbl %>% + filter( + # Add our own fake filter arg to the user's ..., to update the data mask + # to prevent `version` column usage. + { + # We should be evaluating inside the data mask. 
To disable both + # `version` and `.data$version`, we need to go to the data mask's + # ------ + e <- environment() + while (!identical(e, globalenv()) && !identical(e, emptyenv())) { + if ("version" %in% names(e)) { + # "version" is expected to be an active binding, and directly + # assigning over it has issues; explicitly `rm` first. + rm(list = "version", envir = e) + delayedAssign("version", cli::cli_abort(c( + "Using `version` in `filter` may produce unexpected results.", + ">" = "See if `epix_as_of` or `epix_slide` would work instead.", + ">" = "If not, see `?filter.epi_archive` details for how to proceed." + )), assign.env = e) + break + } + e <- parent.env(e) + } + TRUE + }, + ..., + .by = .by + ) + } + out_geo_type <- + if (.data$geo_type == "custom") { + # We might be going from a multi-resolution to single-resolution archive; + # e.g. national+state -> state; try to re-infer: + guess_geo_type(out_tbl$geo_value) + } else { + # We risk less-understandable inference failures such as inferring "hhs" + # from selecting hrr 10 data; just use the old geo_type: + .data$geo_type + } + # We might be going from daily to weekly; re-infer: + out_time_type <- guess_time_type(out_tbl$time_value) + # Even if they narrow down to just a single value of an other_keys column, + # it's probably still better (& simpler) to treat it as an other_keys column + # since it still exists in the result: + out_other_keys <- .data$other_keys + # `filter` makes no guarantees about not aliasing columns in its result when + # the filter condition is all TRUE, so don't setDT. 
+ out_dtbl <- as.data.table(out_tbl, key = out_other_keys) + result <- new_epi_archive( + out_dtbl, + out_geo_type, out_time_type, out_other_keys, + # Assume version-related metadata unchanged; part of why we want to push + # back on filter expressions like `.data$version <= .env$as_of`: + .data$clobberable_versions_start, .data$versions_end + ) + # Filtering down rows while keeping all (ukey) columns should preserve ukey + # uniqueness. + result +} diff --git a/man/filter.epi_archive.Rd b/man/filter.epi_archive.Rd new file mode 100644 index 000000000..b0012bd21 --- /dev/null +++ b/man/filter.epi_archive.Rd @@ -0,0 +1,61 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/methods-epi_archive.R +\name{filter.epi_archive} +\alias{filter.epi_archive} +\title{\code{\link[dplyr:filter]{dplyr::filter}} for \code{epi_archive}s} +\usage{ +\method{filter}{epi_archive}(.data, ..., .by = NULL, .format_aware = FALSE) +} +\arguments{ +\item{.data}{an \code{epi_archive}} + +\item{...}{as in \code{\link[dplyr:filter]{dplyr::filter}}; using the \code{version} column is not allowed +unless you use \code{.format_aware = TRUE}; see details.} + +\item{.by}{as in \code{\link[dplyr:filter]{dplyr::filter}}} + +\item{.format_aware}{optional, \code{TRUE} or \code{FALSE}; default \code{FALSE}. See +details.} +} +\description{ +\code{\link[dplyr:filter]{dplyr::filter}} for \code{epi_archive}s +} +\details{ +By default, using the \code{version} column is disabled as it's easy to +get unexpected results. See if either \code{\link{epix_as_of}} or \code{\link{epix_slide}} +works as an alternative. 
If they don't cover your use case, then you can +set \code{.format_aware = TRUE} to enable usage of the \code{version} column, but be +careful to: +\itemize{ +\item Factor in that \code{.data$DT} may be using a "compact" format based on diffing +consecutive versions; see details of \code{\link{as_epi_archive}} +\item Set \code{clobberable_versions_start} and \code{versions_end} of the result +appropriately after the \code{filter} call. They will be initialized with the +same values as in \code{.data}. +} + +\code{dplyr::filter} also has an optional argument \code{.preserve}, which should not +have an impact on (ungrouped) \code{epi_archive}s, and \code{grouped_epi_archive}s do +not currently support \code{dplyr::filter}. +} +\examples{ + +# Filter to one location and a particular time range: +archive_cases_dv_subset \%>\% + filter(geo_value == "fl", time_value >= as.Date("2020-10-01")) + +# Convert to weekly by taking the Saturday data for each week, so that +# `case_rate_7d_av` represents a Sun--Sat average: +archive_cases_dv_subset \%>\% + filter(as.POSIXlt(time_value)$wday == 6L) + +# Filtering involving versions requires extra care. See epix_as_of and +# epix_slide instead for some common operations. One semi-common operation +# that ends up being fairly simple is treating observations as finalized +# after some amount of time, and ignoring any revisions that were made after +# that point: +archive_cases_dv_subset \%>\% + filter(version <= time_value + as.difftime(60, units = "days"), + .format_aware = TRUE) + +} From 9a4a6018ba0ddba0a939c47dd34f682e7c63a736 Mon Sep 17 00:00:00 2001 From: "Logan C. 
Brooks" Date: Wed, 2 Apr 2025 17:35:45 -0700 Subject: [PATCH 070/117] In filter.epi_archive, favor avoiding incorrect geo_type reinferences --- R/methods-epi_archive.R | 15 +++++---------- 1 file changed, 5 insertions(+), 10 deletions(-) diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index 5b93ba2bd..5c2534e34 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -1070,16 +1070,11 @@ filter.epi_archive <- function(.data, ..., .by = NULL, .format_aware = FALSE) { .by = .by ) } - out_geo_type <- - if (.data$geo_type == "custom") { - # We might be going from a multi-resolution to single-resolution archive; - # e.g. national+state -> state; try to re-infer: - guess_geo_type(out_tbl$geo_value) - } else { - # We risk less-understandable inference failures such as inferring "hhs" - # from selecting hrr 10 data; just use the old geo_type: - .data$geo_type - } + # We could try to re-infer the geo_type, e.g., when filtering from + # national+state to just state. However, we risk inference failures such as + # "hrr" -> "hhs" from filtering to hrr 10, or "custom" -> USA-related when + # working with non-USA data: + out_geo_type <- .data$geo_type # We might be going from daily to weekly; re-infer: out_time_type <- guess_time_type(out_tbl$time_value) # Even if they narrow down to just a single value of an other_keys column, From 1f2bd85386c6fe00a510efedbc32a27e054c3cc4 Mon Sep 17 00:00:00 2001 From: "Logan C. 
Brooks" Date: Wed, 2 Apr 2025 18:08:24 -0700
Subject: [PATCH 071/117] feat(filter.epi_archive): also guard against using
 measurement columns

---
 R/methods-epi_archive.R   | 53 ++++++++++++++++++++++++++--------------------
 man/filter.epi_archive.Rd | 32 ++++++++++++-----------
 2 files changed, 52 insertions(+), 33 deletions(-)

diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R
index 5c2534e34..5357212f4 100644
--- a/R/methods-epi_archive.R
+++ b/R/methods-epi_archive.R
@@ -998,13 +998,15 @@ dplyr_col_modify.col_modify_recorder_df <- function(data, cols) {
 #'
 #' @details
 #'
-#' By default, using the `version` column is disabled as it's easy to
-#' get unexpected results. See if either [`epix_as_of`] or [`epix_slide`]
-#' works as an alternative. If they don't cover your use case, then you can
-#' set `.format_aware = TRUE` to enable usage of the `version` column, but be
-#' careful to:
-#' * Factor in that `.data$DT` may be using a "compact" format based on diffing
-#' consecutive versions; see details of [`as_epi_archive`]
+#' By default, using the `version` column or measurement columns is disabled as
+#' it's easy to get unexpected results. See if either [`epix_as_of`] or
+#' [`epix_slide`] works as an alternative. If they don't cover your use case,
+#' then you can set `.format_aware = TRUE` to enable usage of these columns, but
+#' be careful to:
+#' * Factor in that `.data$DT` may have been converted into a compact format
+#' based on diffing consecutive versions, and the last version of each
+#' observation in `.data$DT` will always be carried forward to future
+#' `version`s; see details of [`as_epi_archive`].
+#' * Set `clobberable_versions_start` and `versions_end` of the result
 #' appropriately after the `filter` call. They will be initialized with the
 #' same values as in `.data`.
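The column guard this patch goes on to implement rests on a base R mechanism: remove a binding from an environment, then re-create it with `delayedAssign()` so that merely forcing the name signals an error. A minimal sketch outside any dplyr data mask (the environment and error message below are illustrative only, not epiprocess code):

```r
e <- new.env()
e$version <- as.Date("2020-01-01") + 0:2

# Drop the real binding, then install a promise that errors when forced.
rm(list = "version", envir = e)
delayedAssign(
  "version",
  stop("Using `version` here may produce unexpected results."),
  assign.env = e
)

exists("version", envir = e) # the name is still bound...
try(e$version)               # ...but forcing the promise raises the error
```

This is why the patch `rm()`s each forbidden column binding before `delayedAssign()`-ing over it: the data mask's column bindings are active bindings, so assigning over them directly has issues.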
@@ -1024,13 +1026,14 @@ dplyr_col_modify.col_modify_recorder_df <- function(data, cols) { #' archive_cases_dv_subset %>% #' filter(as.POSIXlt(time_value)$wday == 6L) #' -#' # Filtering involving versions requires extra care. See epix_as_of and -#' # epix_slide instead for some common operations. One semi-common operation -#' # that ends up being fairly simple is treating observations as finalized -#' # after some amount of time, and ignoring any revisions that were made after -#' # that point: +#' # Filtering involving the `version` column or measurement columns requires +#' # extra care. See epix_as_of and epix_slide instead for some common +#' # operations. One semi-common operation that ends up being fairly simple is +#' # treating observations as finalized after some amount of time, and ignoring +#' # any revisions that were made after that point: #' archive_cases_dv_subset %>% -#' filter(version <= time_value + as.difftime(60, units = "days"), +#' filter( +#' version <= time_value + as.difftime(60, units = "days"), #' .format_aware = TRUE #' ) #' @@ -1041,25 +1044,37 @@ filter.epi_archive <- function(.data, ..., .by = NULL, .format_aware = FALSE) { out_tbl <- in_tbl %>% filter(..., .by = .by) } else { + measurement_colnames <- setdiff(names(.data$DT), key_colnames(.data)) + forbidden_colnames <- c("version", measurement_colnames) out_tbl <- in_tbl %>% filter( # Add our own fake filter arg to the user's ..., to update the data mask # to prevent `version` column usage. { # We should be evaluating inside the data mask. To disable both - # `version` and `.data$version`, we need to go to the data mask's - # ------ + # `version` and `.data$version` etc., we need to go to the ancestor + # environment containing the data mask's column bindings. 
This is + # likely just the parent env, but search to make sure, in a way akin + # to `<<-`: e <- environment() while (!identical(e, globalenv()) && !identical(e, emptyenv())) { if ("version" %in% names(e)) { - # "version" is expected to be an active binding, and directly - # assigning over it has issues; explicitly `rm` first. - rm(list = "version", envir = e) + # This is where the column bindings are. Replace the forbidden ones. + # They are expected to be active bindings, so directly + # assigning has issues; `rm` first. + rm(list = forbidden_colnames, envir = e) delayedAssign("version", cli::cli_abort(c( - "Using `version` in `filter` may produce unexpected results.", + "Using `version` in `filter.epi_archive` may produce unexpected results.", ">" = "See if `epix_as_of` or `epix_slide` would work instead.", ">" = "If not, see `?filter.epi_archive` details for how to proceed." )), assign.env = e) + for (measurement_colname in measurement_colnames) { + delayedAssign(measurement_colname, cli::cli_abort(c( + "Using `{format_varname(measurement_colname)}` + in `filter.epi_archive` may produce unexpected results.", + ">" = "See `?filter.epi_archive` details for how to proceed." + )), assign.env = e) + } break } e <- parent.env(e) diff --git a/man/filter.epi_archive.Rd b/man/filter.epi_archive.Rd index b0012bd21..a007bdc5e 100644 --- a/man/filter.epi_archive.Rd +++ b/man/filter.epi_archive.Rd @@ -21,14 +21,16 @@ details.} \code{\link[dplyr:filter]{dplyr::filter}} for \code{epi_archive}s } \details{ -By default, using the \code{version} column is disabled as it's easy to -get unexpected results. See if either \code{\link{epix_as_of}} or \code{\link{epix_slide}} -works as an alternative. If they don't cover your use case, then you can -set \code{.format_aware = TRUE} to enable usage of the \code{version} column, but be -careful to: +By default, using the \code{version} column or measurement columns is disabled as +it's easy to get unexpected results. 
See if either \code{\link{epix_as_of}} or
+\code{\link{epix_slide}} works as an alternative. If they don't cover your use case,
+then you can set \code{.format_aware = TRUE} to enable usage of these columns, but
+be careful to:
 \itemize{
-\item Factor in that \code{.data$DT} may be using a "compact" format based on diffing
-consecutive versions; see details of \code{\link{as_epi_archive}}
+\item Factor in that \code{.data$DT} may have been converted into a compact format
+based on diffing consecutive versions, and the last version of each
+observation in \code{.data$DT} will always be carried forward to future
+\code{version}s; see details of \code{\link{as_epi_archive}}.
 \item Set \code{clobberable_versions_start} and \code{versions_end} of the result
 appropriately after the \code{filter} call. They will be initialized with the
 same values as in \code{.data}.
@@ -49,13 +51,15 @@ archive_cases_dv_subset \%>\%
 filter(as.POSIXlt(time_value)$wday == 6L)
 
-# Filtering involving versions requires extra care. See epix_as_of and
-# epix_slide instead for some common operations. One semi-common operation
-# that ends up being fairly simple is treating observations as finalized
-# after some amount of time, and ignoring any revisions that were made after
-# that point:
+# Filtering involving the `version` column or measurement columns requires
+# extra care. See epix_as_of and epix_slide instead for some common
+# operations.
One semi-common operation that ends up being fairly simple is +# treating observations as finalized after some amount of time, and ignoring +# any revisions that were made after that point: archive_cases_dv_subset \%>\% - filter(version <= time_value + as.difftime(60, units = "days"), - .format_aware = TRUE) + filter( + version <= time_value + as.difftime(60, units = "days"), + .format_aware = TRUE + ) } From 7180b93ae3e8c5a711076f27f47fa8da301523b6 Mon Sep 17 00:00:00 2001 From: Dmitry Shemetov Date: Thu, 3 Apr 2025 15:01:51 -0700 Subject: [PATCH 072/117] ci: replace auto-ci with commands (#650) * style, document, preview-docs all in pr-commands and can now be triggered with PR comments * update PR template to remind user to run these commands --- .github/pull_request_template.md | 5 + .github/workflows/doc-preview.yaml | 65 ------------- .github/workflows/document.yaml | 58 ----------- .github/workflows/lint.yaml | 12 +-- .github/workflows/pr-commands.yaml | 149 +++++++++++++++++++++++++++++ .github/workflows/style.yml | 87 ----------------- 6 files changed, 158 insertions(+), 218 deletions(-) delete mode 100644 .github/workflows/doc-preview.yaml delete mode 100644 .github/workflows/document.yaml create mode 100644 .github/workflows/pr-commands.yaml delete mode 100644 .github/workflows/style.yml diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md index c8841c356..93b00f150 100644 --- a/.github/pull_request_template.md +++ b/.github/pull_request_template.md @@ -14,6 +14,11 @@ Please: (backwards-incompatible changes to the documented interface) are noted. Collect the changes under the next release number (e.g. if you are on 1.7.2, then write your changes under the 1.8 heading). +- [ ] Styling and documentation checks. Make a PR comment with: + - `/document` to check the package documentation and fix any issues. + - `/style` to check the style and fix any issues. + - `/preview-docs` to preview the docs. 
+ - See Actions GitHub tab to track progress of these commands. - See [DEVELOPMENT.md](DEVELOPMENT.md) for more information on the development process. diff --git a/.github/workflows/doc-preview.yaml b/.github/workflows/doc-preview.yaml deleted file mode 100644 index 068184225..000000000 --- a/.github/workflows/doc-preview.yaml +++ /dev/null @@ -1,65 +0,0 @@ -on: - issue_comment: - types: [created] - -name: doc-preview.yaml - -permissions: read-all - -jobs: - preview: - if: ${{ github.event.issue.pull_request && (github.event.comment.author_association == 'COLLABORATOR' || github.event.comment.author_association == 'CONTRIBUTOR' || github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'OWNER') && startsWith(github.event.comment.body, '/preview-docs') }} - - runs-on: ubuntu-latest - permissions: - # Needed to write a comment on the PR - pull-requests: write - # Needed to read the PR branch - contents: read - steps: - - uses: actions/checkout@v4 - with: - # Checkout the PR branch - ref: refs/pull/${{ github.event.issue.number }}/head - - - uses: r-lib/actions/setup-pandoc@v2 - - - uses: r-lib/actions/setup-r@v2 - with: - use-public-rspm: true - - - uses: r-lib/actions/setup-r-dependencies@v2 - with: - extra-packages: any::pkgdown, local::. 
- needs: website - - - name: Build site - run: pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE) - shell: Rscript {0} - - - name: Deploy to Netlify - uses: nwtgck/actions-netlify@v3.0 - with: - # Standard config - github-token: ${{ secrets.GITHUB_TOKEN }} - deploy-message: "Deploy from GitHub Actions" - # 'docs/' is the default directory for pkgdown::build_site() - # we add 'dev' because _pkgdown.yml has 'development: mode: devel' - publish-dir: './docs/dev' - # Development deploys only - production-deploy: false - # Enable pull request comment (default) - enable-pull-request-comment: true - # Overwrite the pull request comment with updated link (default) - overwrites-pull-request-comment: true - # Don't deploy to GitHub - enable-github-deployment: false - # Don't update the status of the commit - enable-commit-status: false - # Don't comment on the commit - enable-commit-comment: false - env: - # Netlify credentials (currently from Dmitry's account) - NETLIFY_AUTH_TOKEN: ${{ secrets.NETLIFY_AUTH_TOKEN }} - NETLIFY_SITE_ID: ${{ secrets.NETLIFY_SITE_ID }} - timeout-minutes: 1 diff --git a/.github/workflows/document.yaml b/.github/workflows/document.yaml deleted file mode 100644 index bfc3c43da..000000000 --- a/.github/workflows/document.yaml +++ /dev/null @@ -1,58 +0,0 @@ -# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples -# Need help debugging build failures? 
Start at https://github.com/r-lib/actions#where-to-find-help -# -# Modifications: -# - devtools::build_readme() -# - workflow_dispatch added to allow manual triggering of the workflow -# - trigger branches changed -# - API key secrets.DELPHI_GITHUB_ACTIONS_EPIDATA_API_KEY -on: - push: - paths: ["R/**", "README.Rmd"] - workflow_dispatch: - -name: Document - -jobs: - document: - runs-on: ubuntu-latest - env: - GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} - DELPHI_EPIDATA_KEY: ${{ secrets.DELPHI_GITHUB_ACTIONS_EPIDATA_API_KEY }} - steps: - - name: Checkout repo - uses: actions/checkout@v3 - with: - fetch-depth: 0 - - - name: Setup R - uses: r-lib/actions/setup-r@v2 - with: - use-public-rspm: true - - - name: Install dependencies - uses: r-lib/actions/setup-r-dependencies@v2 - with: - extra-packages: | - any::devtools - any::roxygen2 - needs: | - devtools - roxygen2 - - - name: Document - run: roxygen2::roxygenise() - shell: Rscript {0} - - - name: Build README.md from README.Rmd - run: Rscript -e 'if (file.exists("README.Rmd")) devtools::build_readme()' - - - name: Commit and push changes - run: | - git config --local user.name "$GITHUB_ACTOR" - git config --local user.email "$GITHUB_ACTOR@users.noreply.github.com" - git add README.md - git add man/\* NAMESPACE DESCRIPTION - git commit -m "docs: document (GHA)" || echo "No changes to commit" - git pull --rebase - git push origin diff --git a/.github/workflows/lint.yaml b/.github/workflows/lint.yaml index 21d030c94..78eae465f 100644 --- a/.github/workflows/lint.yaml +++ b/.github/workflows/lint.yaml @@ -1,10 +1,5 @@ # Workflow derived from https://github.com/r-lib/actions/tree/v2/examples # Need help debugging build failures? 
Start at https://github.com/r-lib/actions#where-to-find-help -# -# Modifications: -# * workflow_dispatch added to allow manual triggering of the workflow -# * trigger branches changed -# * API key secrets.DELPHI_GITHUB_ACTIONS_EPIDATA_API_KEY on: push: branches: [main, dev] @@ -12,16 +7,17 @@ on: branches: [main, dev] workflow_dispatch: -name: Lint +name: lint.yaml + +permissions: read-all jobs: lint: runs-on: ubuntu-latest env: GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} - DELPHI_EPIDATA_KEY: ${{ secrets.DELPHI_GITHUB_ACTIONS_EPIDATA_API_KEY }} steps: - - uses: actions/checkout@v3 + - uses: actions/checkout@v4 - uses: r-lib/actions/setup-r@v2 with: diff --git a/.github/workflows/pr-commands.yaml b/.github/workflows/pr-commands.yaml new file mode 100644 index 000000000..6cf91e878 --- /dev/null +++ b/.github/workflows/pr-commands.yaml @@ -0,0 +1,149 @@ +# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples +# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help +# Modifications: +# - Allow more roles to trigger each PR command +# - Document builds README.md from README.Rmd with devtools::build_readme() +# - Include a doc-preview command (uses Netlify to preview the docs) +on: + issue_comment: + types: [created] + +name: pr-commands.yaml + +permissions: read-all + +jobs: + document: + if: ${{ github.event.issue.pull_request && (github.event.comment.author_association == 'COLLABORATOR' || github.event.comment.author_association == 'CONTRIBUTOR' || github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'OWNER') && startsWith(github.event.comment.body, '/document') }} + name: document + runs-on: ubuntu-latest + env: + GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} + permissions: + contents: write + steps: + - uses: actions/checkout@v4 + + - uses: r-lib/actions/pr-fetch@v2 + with: + repo-token: ${{ secrets.GITHUB_TOKEN }} + + - uses: r-lib/actions/setup-r@v2 + with: + 
use-public-rspm: true + + - uses: r-lib/actions/setup-r-dependencies@v2 + with: + extra-packages: any::roxygen2 + needs: pr-document + + - name: Document + run: roxygen2::roxygenise() + shell: Rscript {0} + + - name: Build README.md from README.Rmd + run: Rscript -e 'if (file.exists("README.Rmd")) devtools::build_readme()' + + - name: commit + run: | + git config --local user.name "$GITHUB_ACTOR" + git config --local user.email "$GITHUB_ACTOR@users.noreply.github.com" + git add man/\* NAMESPACE + git commit -m 'Document' + + - uses: r-lib/actions/pr-push@v2 + with: + repo-token: ${{ secrets.GITHUB_TOKEN }} + + style: + if: ${{ github.event.issue.pull_request && (github.event.comment.author_association == 'COLLABORATOR' || github.event.comment.author_association == 'CONTRIBUTOR' || github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'OWNER') && startsWith(github.event.comment.body, '/style') }} + name: style + runs-on: ubuntu-latest + env: + GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} + permissions: + contents: write + steps: + - uses: actions/checkout@v4 + + - uses: r-lib/actions/pr-fetch@v2 + with: + repo-token: ${{ secrets.GITHUB_TOKEN }} + + - uses: r-lib/actions/setup-r@v2 + + - name: Install dependencies + run: install.packages("styler") + shell: Rscript {0} + + - name: Style + run: styler::style_pkg() + shell: Rscript {0} + + - name: commit + run: | + git config --local user.name "$GITHUB_ACTOR" + git config --local user.email "$GITHUB_ACTOR@users.noreply.github.com" + git add \*.R + git commit -m 'Style' + + - uses: r-lib/actions/pr-push@v2 + with: + repo-token: ${{ secrets.GITHUB_TOKEN }} + + preview: + if: ${{ github.event.issue.pull_request && (github.event.comment.author_association == 'COLLABORATOR' || github.event.comment.author_association == 'CONTRIBUTOR' || github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'OWNER') && startsWith(github.event.comment.body, 
'/preview-docs') }} + + runs-on: ubuntu-latest + permissions: + # Needed to write a comment on the PR + pull-requests: write + # Needed to read the PR branch + contents: read + steps: + - uses: actions/checkout@v4 + with: + # Checkout the PR branch + ref: refs/pull/${{ github.event.issue.number }}/head + + - uses: r-lib/actions/setup-pandoc@v2 + + - uses: r-lib/actions/setup-r@v2 + with: + use-public-rspm: true + + - uses: r-lib/actions/setup-r-dependencies@v2 + with: + extra-packages: any::pkgdown, local::. + needs: website + + - name: Build site + run: pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE) + shell: Rscript {0} + + - name: Deploy to Netlify + uses: nwtgck/actions-netlify@v3.0 + with: + # Standard config + github-token: ${{ secrets.GITHUB_TOKEN }} + deploy-message: "Deploy from GitHub Actions" + # 'docs/' is the default directory for pkgdown::build_site() + # we add 'dev' because _pkgdown.yml has 'development: mode: devel' + publish-dir: './docs/dev' + # Development deploys only + production-deploy: false + # Enable pull request comment (default) + enable-pull-request-comment: true + # Overwrite the pull request comment with updated link (default) + overwrites-pull-request-comment: true + # Don't deploy to GitHub + enable-github-deployment: false + # Don't update the status of the commit + enable-commit-status: false + # Don't comment on the commit + enable-commit-comment: false + env: + # Netlify credentials (currently from Dmitry's account) + NETLIFY_AUTH_TOKEN: ${{ secrets.NETLIFY_AUTH_TOKEN }} + NETLIFY_SITE_ID: ${{ secrets.NETLIFY_SITE_ID }} + timeout-minutes: 1 diff --git a/.github/workflows/style.yml b/.github/workflows/style.yml deleted file mode 100644 index 396200a5f..000000000 --- a/.github/workflows/style.yml +++ /dev/null @@ -1,87 +0,0 @@ -# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples -# Need help debugging build failures? 
Start at https://github.com/r-lib/actions#where-to-find-help -# -# Modifications: -# * workflow_dispatch added to allow manual triggering of the workflow -# * trigger branches changed -# * API key secrets.DELPHI_GITHUB_ACTIONS_EPIDATA_API_KEY -on: - push: - paths: - [ - "**.[rR]", - "**.[qrR]md", - "**.[rR]markdown", - "**.[rR]nw", - "**.[rR]profile", - ] - workflow_dispatch: - -name: Style - -jobs: - style: - runs-on: ubuntu-latest - env: - GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} - DELPHI_EPIDATA_KEY: ${{ secrets.DELPHI_GITHUB_ACTIONS_EPIDATA_API_KEY }} - steps: - - name: Checkout repo - uses: actions/checkout@v3 - with: - fetch-depth: 0 - - - name: Setup R - uses: r-lib/actions/setup-r@v2 - with: - use-public-rspm: true - - - name: Install dependencies - uses: r-lib/actions/setup-r-dependencies@v2 - with: - extra-packages: any::styler, any::roxygen2 - needs: styler - - - name: Enable styler cache - run: styler::cache_activate() - shell: Rscript {0} - - - name: Determine cache location - id: styler-location - run: | - cat( - "location=", - styler::cache_info(format = "tabular")$location, - "\n", - file = Sys.getenv("GITHUB_OUTPUT"), - append = TRUE, - sep = "" - ) - shell: Rscript {0} - - - name: Cache styler - uses: actions/cache@v3 - with: - path: ${{ steps.styler-location.outputs.location }} - key: ${{ runner.os }}-styler-${{ github.sha }} - restore-keys: | - ${{ runner.os }}-styler- - ${{ runner.os }}- - - - name: Style - run: styler::style_pkg() - shell: Rscript {0} - - - name: Commit and push changes - run: | - if FILES_TO_COMMIT=($(git diff-index --name-only ${{ github.sha }} \ - | egrep --ignore-case '\.(R|[qR]md|Rmarkdown|Rnw|Rprofile)$')) - then - git config --local user.name "$GITHUB_ACTOR" - git config --local user.email "$GITHUB_ACTOR@users.noreply.github.com" - git commit ${FILES_TO_COMMIT[*]} -m "style: styler (GHA)" - git pull --rebase - git push origin - else - echo "No changes to commit." 
- fi From 319842952ddb4912d6a7650c505a324753838db4 Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Thu, 3 Apr 2025 15:19:35 -0700 Subject: [PATCH 073/117] Test filter.epi_archive, fix&tweak some behaviors --- R/methods-epi_archive.R | 19 ++-- tests/testthat/test-methods-epi_archive.R | 100 ++++++++++++++++++++++ 2 files changed, 113 insertions(+), 6 deletions(-) diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index 5357212f4..8595e3165 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -1042,7 +1042,7 @@ filter.epi_archive <- function(.data, ..., .by = NULL, .format_aware = FALSE) { in_tbl <- tibble::as_tibble(as.list(.data$DT), .name_repair = "minimal") if (.format_aware) { out_tbl <- in_tbl %>% - filter(..., .by = .by) + filter(..., .by = {{.by}}) } else { measurement_colnames <- setdiff(names(.data$DT), key_colnames(.data)) forbidden_colnames <- c("version", measurement_colnames) @@ -1067,13 +1067,13 @@ filter.epi_archive <- function(.data, ..., .by = NULL, .format_aware = FALSE) { "Using `version` in `filter.epi_archive` may produce unexpected results.", ">" = "See if `epix_as_of` or `epix_slide` would work instead.", ">" = "If not, see `?filter.epi_archive` details for how to proceed." - )), assign.env = e) + ), class = "epiprocess__filter_archive__used_version"), assign.env = e) for (measurement_colname in measurement_colnames) { delayedAssign(measurement_colname, cli::cli_abort(c( "Using `{format_varname(measurement_colname)}` in `filter.epi_archive` may produce unexpected results.", ">" = "See `?filter.epi_archive` details for how to proceed." 
- )), assign.env = e) + ), class = "epiprocess__filter_archive__used_measurement"), assign.env = e) } break } @@ -1082,7 +1082,7 @@ filter.epi_archive <- function(.data, ..., .by = NULL, .format_aware = FALSE) { TRUE }, ..., - .by = .by + .by = {{.by}} ) } # We could try to re-infer the geo_type, e.g., when filtering from @@ -1090,8 +1090,15 @@ filter.epi_archive <- function(.data, ..., .by = NULL, .format_aware = FALSE) { # "hrr" -> "hhs" from filtering to hrr 10, or "custom" -> USA-related when # working with non-USA data: out_geo_type <- .data$geo_type - # We might be going from daily to weekly; re-infer: - out_time_type <- guess_time_type(out_tbl$time_value) + if (.data$time_type == "day") { + # We might be going from daily to weekly; re-infer: + out_time_type <- guess_time_type(out_tbl$time_value) + } else { + # We might be filtering weekly to a single time_value; avoid re-inferring to + # stay "week". Or in other cases, just skip inferring, as re-inferring is + # expected to match the input time_type: + out_time_type <- .data$time_type + } # Even if they narrow down to just a single value of an other_keys column, # it's probably still better (& simpler) to treat it as an other_keys column # since it still exists in the result: diff --git a/tests/testthat/test-methods-epi_archive.R b/tests/testthat/test-methods-epi_archive.R index 45ba6ea12..9c80ffa1f 100644 --- a/tests/testthat/test-methods-epi_archive.R +++ b/tests/testthat/test-methods-epi_archive.R @@ -128,3 +128,103 @@ test_that("group_vars works as expected", { "geo_value" ) }) + +test_that("filter.epi_archive works as expected", { + + ea2 <- ea2_data %>% + as_epi_archive() + + # Some basic output value checks: + + expect_equal( + ea2 %>% filter(geo_value == "tn"), + new_epi_archive( + ea2$DT[FALSE], + ea2$geo_type, ea2$time_type, ea2$other_keys, + ea2$clobberable_versions_start, ea2$versions_end + ) + ) + + expect_equal( + ea2 %>% filter(geo_value == "ca", time_value == as.Date("2020-06-02")), + 
new_epi_archive( + data.table::data.table(geo_value = "ca", time_value = as.Date("2020-06-02"), + version = as.Date("2020-06-02") + 0:2, cases = 0:2), + ea2$geo_type, ea2$time_type, ea2$other_keys, + ea2$clobberable_versions_start, ea2$versions_end + ) + ) + + # Output geo_type and time_type behavior: + + hrr_day_ea <- tibble( + geo_value = c(rep(1, 14), 100), + time_value = as.Date("2020-01-01") - 1 + c(1:14, 14), + version = time_value + 3, + value = 1:15 + ) %>% + as_epi_archive() + + expect_equal(hrr_day_ea$geo_type, "hrr") + expect_equal(hrr_day_ea$time_type, "day") + + hrr_week_ea <- hrr_day_ea %>% + filter(geo_value == 1, as.POSIXlt(time_value)$wday == 6L) + + expect_equal(hrr_week_ea$geo_type, "hrr") + expect_equal(hrr_week_ea$time_type, "week") + + hrr_one_week_ea <- hrr_week_ea %>% + filter(time_value == time_value[[1]]) + + expect_equal(hrr_one_week_ea$time_type, "week") + + intcustom_day_ea <- hrr_day_ea + intcustom_day_ea$geo_type <- "custom" + + intcustom_week_ea <- intcustom_day_ea %>% + filter(geo_value == 1, as.POSIXlt(time_value)$wday == 6L) + + expect_equal(intcustom_week_ea$geo_type, "custom") + expect_equal(intcustom_week_ea$time_type, "week") + + # Error-raising: + expect_error( + ea2 %>% filter(version == as.Date("2020-06-02")), + class = "epiprocess__filter_archive__used_version" + ) + expect_error( + ea2 %>% filter(version <= as.Date("2020-06-02")), + class = "epiprocess__filter_archive__used_version" + ) + expect_snapshot( + ea2 %>% filter(version <= as.Date("2020-06-02")), + error = TRUE, cnd_class = TRUE + ) + expect_error( + ea2 %>% filter(time_value >= as.Date("2020-06-02"), cases >= 2L), + class = "epiprocess__filter_archive__used_measurement" + ) + expect_snapshot( + ea2 %>% filter(time_value >= as.Date("2020-06-02"), cases >= 2L), + error = TRUE, cnd_class = TRUE + ) + expect_error( + ea2 %>% filter(time_value >= as.Date("2020-06-02"), cases >= 2L), + class = "epiprocess__filter_archive__used_measurement" + ) + expect_error( + ea2 
%>% filter(cases >= median(cases), .by = geo_value), + class = "epiprocess__filter_archive__used_measurement" + ) + + # Escape hatch: + expect_equal( + ea2 %>% + filter(version <= time_value + as.difftime(1, units = "days"), + .format_aware = TRUE) %>% + .$DT, + ea2$DT[version <= time_value + as.difftime(1, units = "days"), ] + ) + +}) From 24bda523f98b6a3c487d90180f37a93853586504 Mon Sep 17 00:00:00 2001 From: brookslogan Date: Thu, 3 Apr 2025 22:22:53 +0000 Subject: [PATCH 074/117] style: styler (GHA) --- R/methods-epi_archive.R | 4 ++-- tests/testthat/test-methods-epi_archive.R | 11 ++++++----- 2 files changed, 8 insertions(+), 7 deletions(-) diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index 8595e3165..02cf2cf30 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -1042,7 +1042,7 @@ filter.epi_archive <- function(.data, ..., .by = NULL, .format_aware = FALSE) { in_tbl <- tibble::as_tibble(as.list(.data$DT), .name_repair = "minimal") if (.format_aware) { out_tbl <- in_tbl %>% - filter(..., .by = {{.by}}) + filter(..., .by = {{ .by }}) } else { measurement_colnames <- setdiff(names(.data$DT), key_colnames(.data)) forbidden_colnames <- c("version", measurement_colnames) @@ -1082,7 +1082,7 @@ filter.epi_archive <- function(.data, ..., .by = NULL, .format_aware = FALSE) { TRUE }, ..., - .by = {{.by}} + .by = {{ .by }} ) } # We could try to re-infer the geo_type, e.g., when filtering from diff --git a/tests/testthat/test-methods-epi_archive.R b/tests/testthat/test-methods-epi_archive.R index 9c80ffa1f..6e08c990a 100644 --- a/tests/testthat/test-methods-epi_archive.R +++ b/tests/testthat/test-methods-epi_archive.R @@ -130,7 +130,6 @@ test_that("group_vars works as expected", { }) test_that("filter.epi_archive works as expected", { - ea2 <- ea2_data %>% as_epi_archive() @@ -148,8 +147,10 @@ test_that("filter.epi_archive works as expected", { expect_equal( ea2 %>% filter(geo_value == "ca", time_value == as.Date("2020-06-02")), 
new_epi_archive( - data.table::data.table(geo_value = "ca", time_value = as.Date("2020-06-02"), - version = as.Date("2020-06-02") + 0:2, cases = 0:2), + data.table::data.table( + geo_value = "ca", time_value = as.Date("2020-06-02"), + version = as.Date("2020-06-02") + 0:2, cases = 0:2 + ), ea2$geo_type, ea2$time_type, ea2$other_keys, ea2$clobberable_versions_start, ea2$versions_end ) @@ -222,9 +223,9 @@ test_that("filter.epi_archive works as expected", { expect_equal( ea2 %>% filter(version <= time_value + as.difftime(1, units = "days"), - .format_aware = TRUE) %>% + .format_aware = TRUE + ) %>% .$DT, ea2$DT[version <= time_value + as.difftime(1, units = "days"), ] ) - }) From ec9feffad52b1f5a2b229dfdfb7ddfbf929735ac Mon Sep 17 00:00:00 2001 From: Dmitry Shemetov Date: Fri, 4 Apr 2025 15:38:22 -0700 Subject: [PATCH 075/117] feat: introduce epix_as_of_current() for convenience (#645) * feat+bug: add epix_as_of_current() and set_versions_end() * fix yearmonth breaking in epix_as_of() and epix_slide() --- DESCRIPTION | 2 +- NAMESPACE | 2 + NEWS.md | 1 + R/grouped_epi_archive.R | 6 +- R/methods-epi_archive.R | 78 +++++++++++++++++------ R/slide.R | 6 +- _pkgdown.yml | 2 + man/epix_as_of_current.Rd | 17 +++++ man/set_versions_end.Rd | 20 ++++++ tests/testthat/test-methods-epi_archive.R | 37 +++++++++++ 10 files changed, 145 insertions(+), 26 deletions(-) create mode 100644 man/epix_as_of_current.Rd create mode 100644 man/set_versions_end.Rd diff --git a/DESCRIPTION b/DESCRIPTION index 4f7bf7992..19bc82e01 100755 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Type: Package Package: epiprocess Title: Tools for basic signal processing in epidemiology -Version: 0.11.2 +Version: 0.11.3 Authors@R: c( person("Jacob", "Bien", role = "ctb"), person("Logan", "Brooks", , "lcbrooks+github@andrew.cmu.edu", role = c("aut", "cre")), diff --git a/NAMESPACE b/NAMESPACE index 2f7128c8c..3582fe9c7 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -76,6 +76,7 @@ export(epi_slide_mean) 
export(epi_slide_opt) export(epi_slide_sum) export(epix_as_of) +export(epix_as_of_current) export(epix_fill_through_version) export(epix_merge) export(epix_slide) @@ -99,6 +100,7 @@ export(new_epi_df) export(relocate) export(rename) export(revision_summary) +export(set_versions_end) export(slice) export(sum_groups_epi_df) export(time_column_names) diff --git a/NEWS.md b/NEWS.md index 91316843c..88acabee4 100644 --- a/NEWS.md +++ b/NEWS.md @@ -7,6 +7,7 @@ Pre-1.0.0 numbering scheme: 0.x will indicate releases, while 0.x.y will indicat ## New features - `is_epi_archive` function has been reintroduced. +- `epix_as_of_current()` introduced as an alias for `epix_as_of(.$versions_end)`. # epiprocess 0.11 diff --git a/R/grouped_epi_archive.R b/R/grouped_epi_archive.R index 378ea13bc..5fb9d081e 100644 --- a/R/grouped_epi_archive.R +++ b/R/grouped_epi_archive.R @@ -437,9 +437,13 @@ epix_slide.grouped_epi_archive <- function( out <- lapply(.versions, function(.version) { # Ungrouped as-of data; `epi_df` if `all_versions` is `FALSE`, # `epi_archive` if `all_versions` is `TRUE`: + min_time_value <- .version - .before + if (is.na(min_time_value)) { + min_time_value <- -Inf + } as_of_raw <- .x$private$ungrouped %>% epix_as_of( .version, - min_time_value = .version - .before, + min_time_value = min_time_value, all_versions = .all_versions ) diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index a8421efc1..8978bd62e 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -58,6 +58,7 @@ #' epix_as_of(archive_cases_dv_subset2, max(archive_cases_dv_subset$DT$version)) #' #' @importFrom data.table between key +#' @importFrom checkmate assert_scalar assert_logical assert_class #' @export epix_as_of <- function(x, version, min_time_value = -Inf, all_versions = FALSE, max_version = deprecated()) { @@ -88,6 +89,14 @@ epix_as_of <- function(x, version, min_time_value = -Inf, all_versions = FALSE, if (version > x$versions_end) { cli_abort("`version` must be at 
most `epi_archive$versions_end`.") } + assert_scalar(min_time_value, na.ok = FALSE) + min_time_value_inf <- is.infinite(min_time_value) && min_time_value < 0 + min_time_value_same_type <- typeof(min_time_value) == typeof(x$DT$time_value) & + class(min_time_value) == class(x$DT$time_value) + if (!min_time_value_inf && !min_time_value_same_type) { + cli_abort("`min_time_value` must be either -Inf or a time_value of the same type and + class as `epi_archive$time_value`.") + } assert_logical(all_versions, len = 1) if (!is.na(x$clobberable_versions_start) && version >= x$clobberable_versions_start) { cli_warn( @@ -100,39 +109,63 @@ epix_as_of <- function(x, version, min_time_value = -Inf, all_versions = FALSE, ) } - # We can't disable nonstandard evaluation nor use the `..` feature in the `i` - # argument of `[.data.table` below; try to avoid problematic names and abort - # if we fail to do so: - .min_time_value <- min_time_value - .version <- version - if (any(c(".min_time_value", ".version") %in% names(x$DT))) { - cli_abort("epi_archives can't contain a `.min_time_value` or `.version` column") - } - # Filter by version and return if (all_versions) { # epi_archive is copied into result, so we can modify result directly result <- epix_truncate_versions_after(x, version) - result$DT <- result$DT[time_value >= .min_time_value, ] # nolint: object_usage_linter + if (!min_time_value_inf) { + # See below for why we need this branch. 
+ filter_mask <- result$DT$time_value >= min_time_value + result$DT <- result$DT[filter_mask, ] # nolint: object_usage_linter + } return(result) } # Make sure to use data.table ways of filtering and selecting - as_of_epi_df <- x$DT[time_value >= .min_time_value & version <= .version, ] %>% # nolint: object_usage_linter - unique( - by = c("geo_value", "time_value", other_keys), - fromLast = TRUE - ) %>% + if (min_time_value_inf) { + # This branch is needed for `epix_as_of` to work with `yearmonth` time type + # to avoid time_value > .min_time_value, which is NA for `yearmonth`. + filter_mask <- x$DT$version <= version + } else { + filter_mask <- x$DT$time_value >= min_time_value & x$DT$version <= version + } + as_of_epi_df <- x$DT[filter_mask, ] %>% + unique(by = c("geo_value", "time_value", other_keys), fromLast = TRUE) %>% + as.data.frame() %>% tibble::as_tibble() %>% dplyr::select(-"version") %>% - as_epi_df( - as_of = version, - other_keys = other_keys - ) + as_epi_df(as_of = version, other_keys = other_keys) return(as_of_epi_df) } +#' Get the latest snapshot from an `epi_archive` object. +#' +#' The latest snapshot is the snapshot of the last known version. +#' +#' @param x An `epi_archive` object +#' @return The latest snapshot from an `epi_archive` object +#' @export +epix_as_of_current <- function(x) { + assert_class(x, "epi_archive") + x %>% epix_as_of(.$versions_end) +} + +#' Set the `versions_end` attribute of an `epi_archive` object +#' +#' An escape hatch for epix_as_of, which does not allow version > +#' `$versions_end`. 
+#' +#' @param x An `epi_archive` object +#' @param versions_end The new `versions_end` value +#' @return An `epi_archive` object with the updated `versions_end` attribute +#' @export +set_versions_end <- function(x, versions_end) { + assert_class(x, "epi_archive") + validate_version_bound(versions_end, x$DT, na_ok = FALSE) + x$versions_end <- versions_end + x +} #' Fill `epi_archive` unobserved history #' @@ -880,10 +913,13 @@ epix_slide.epi_archive <- function( #' @noRd epix_slide_versions_default <- function(ea) { versions_with_updates <- c(ea$DT$version, ea$versions_end) - tidyr::full_seq(versions_with_updates, guess_period(versions_with_updates)) + if (ea$time_type == "yearmonth") { + min(versions_with_updates) + seq(0, max(versions_with_updates) - min(versions_with_updates), by = 1) + } else { + tidyr::full_seq(versions_with_updates, guess_period(versions_with_updates)) + } } - #' Filter an `epi_archive` object to keep only older versions #' #' Generates a filtered `epi_archive` from an `epi_archive` object, keeping diff --git a/R/slide.R b/R/slide.R index abc7c3b77..be000f579 100644 --- a/R/slide.R +++ b/R/slide.R @@ -1108,7 +1108,7 @@ epi_slide_sum <- function( #' `before` and `after` args are assumed to have been validated by the calling #' function (using `validate_slide_window_arg`). 
#' -#' @importFrom checkmate assert_function +#' @importFrom checkmate assert_function anyInfinite #' @keywords internal full_date_seq <- function(x, before, after, time_type) { if (!time_type %in% c("day", "week", "yearmonth", "integer")) { @@ -1126,7 +1126,7 @@ full_date_seq <- function(x, before, after, time_type) { if (time_type %in% c("yearmonth", "integer")) { all_dates <- seq(min(x$time_value), max(x$time_value), by = 1L) - if (before != 0 && before != Inf) { + if (before != 0 && !anyInfinite(before)) { pad_early_dates <- all_dates[1L] - before:1 } if (after != 0) { @@ -1139,7 +1139,7 @@ full_date_seq <- function(x, before, after, time_type) { ) all_dates <- seq(min(x$time_value), max(x$time_value), by = by) - if (before != 0 && before != Inf) { + if (before != 0 && !anyInfinite(before)) { # The behavior is analogous to the branch with tsibble types above. For # more detail, note that the function `seq.Date(from, ..., length.out = # n)` returns `from + 0:n`. Since we want `from + 1:n`, we drop the first diff --git a/_pkgdown.yml b/_pkgdown.yml index 4dc391a9c..08a522579 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -76,11 +76,13 @@ reference: desc: Functions operating on `epi_archive` objects. 
- contents: - epix_as_of + - epix_as_of_current - epix_slide - epix_merge - revision_summary - epix_fill_through_version - epix_truncate_versions_after + - set_versions_end - title: Basic analysis and visualization - contents: diff --git a/man/epix_as_of_current.Rd b/man/epix_as_of_current.Rd new file mode 100644 index 000000000..8dcb53afc --- /dev/null +++ b/man/epix_as_of_current.Rd @@ -0,0 +1,17 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/methods-epi_archive.R +\name{epix_as_of_current} +\alias{epix_as_of_current} +\title{Get the latest snapshot from an \code{epi_archive} object.} +\usage{ +epix_as_of_current(x) +} +\arguments{ +\item{x}{An \code{epi_archive} object} +} +\value{ +The latest snapshot from an \code{epi_archive} object +} +\description{ +The latest snapshot is the snapshot of the last known version. +} diff --git a/man/set_versions_end.Rd b/man/set_versions_end.Rd new file mode 100644 index 000000000..3d38f23f2 --- /dev/null +++ b/man/set_versions_end.Rd @@ -0,0 +1,20 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/methods-epi_archive.R +\name{set_versions_end} +\alias{set_versions_end} +\title{Set the \code{versions_end} attribute of an \code{epi_archive} object} +\usage{ +set_versions_end(x, versions_end) +} +\arguments{ +\item{x}{An \code{epi_archive} object} + +\item{versions_end}{The new \code{versions_end} value} +} +\value{ +An \code{epi_archive} object with the updated \code{versions_end} attribute +} +\description{ +An escape hatch for epix_as_of, which does not allow version > +\verb{$versions_end}. 
+} diff --git a/tests/testthat/test-methods-epi_archive.R b/tests/testthat/test-methods-epi_archive.R index 45ba6ea12..26f0e7698 100644 --- a/tests/testthat/test-methods-epi_archive.R +++ b/tests/testthat/test-methods-epi_archive.R @@ -128,3 +128,40 @@ test_that("group_vars works as expected", { "geo_value" ) }) + +test_that("epix_as_of_now works as expected", { + expect_equal( + attr(ea2_data %>% as_epi_archive() %>% epix_as_of_current(), "metadata")$as_of, + as.Date("2020-06-04") + ) + time_value <- as.Date("2020-06-01") + df <- dplyr::tribble( + ~geo_value, ~time_value, ~version, ~cases, + "ca", time_value, time_value, 1, + "ca", time_value + 7, time_value + 7, 2, + ) + expect_equal( + attr(df %>% as_epi_archive() %>% epix_as_of_current(), "metadata")$as_of, + as.Date("2020-06-08") + ) + time_value <- tsibble::yearmonth(as.Date("2020-06-01") - lubridate::month(1)) + df <- dplyr::tribble( + ~geo_value, ~time_value, ~version, ~cases, + "ca", time_value, time_value, 1, + "ca", time_value + lubridate::month(1), time_value + lubridate::month(1), 2, + ) + expect_equal( + attr(df %>% as_epi_archive() %>% epix_as_of_current(), "metadata")$as_of, + tsibble::yearmonth("2020-06") + ) + time_value <- 2020 + df <- dplyr::tribble( + ~geo_value, ~time_value, ~version, ~cases, + "ca", time_value, time_value, 1, + "ca", time_value + 7, time_value + 7, 2, + ) + expect_equal( + attr(df %>% as_epi_archive() %>% epix_as_of_current(), "metadata")$as_of, + 2027 + ) +}) From d56f42702961c5bd3fecce77bf9021f5daf5b3d4 Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Tue, 8 Apr 2025 10:45:44 -0700 Subject: [PATCH 076/117] needs github.com for linking to source in pkgdown --- DESCRIPTION | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/DESCRIPTION b/DESCRIPTION index 19bc82e01..f75879e97 100755 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -35,7 +35,8 @@ Description: This package introduces common data structures for working package is designed to be used in conjunction with `epipredict` for building and evaluating epidemiological models. License: MIT + file LICENSE -URL: https://cmu-delphi.github.io/epiprocess/ +URL: https://github.com/cmu-delphi/epiprocess, + https://cmu-delphi.github.io/epiprocess/ Depends: epidatasets, R (>= 3.6) From afa4217773524d8c8c597e5ad95a16947840ecab Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Tue, 8 Apr 2025 13:36:32 -0700 Subject: [PATCH 077/117] Remove repo: setting to try to fix source links {epipredict} site has working GitHub and source links (to appropriate main/dev branches) while excluding the repo: setting from its _pkgdown.yml, so see if it works here as well. --- _pkgdown.yml | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/_pkgdown.yml b/_pkgdown.yml index 08a522579..2953039cb 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -37,12 +37,6 @@ articles: contents: - compactify -repo: - url: - home: https://github.com/cmu-delphi/epiprocess/ - source: https://github.com/cmu-delphi/epiprocess/ - issue: https://github.com/cmu-delphi/epiprocess/issues - reference: - title: "`epi_df` basics" desc: Constructors and information for `epi_df` objects. @@ -83,7 +77,7 @@ reference: - epix_fill_through_version - epix_truncate_versions_after - set_versions_end - + - title: Basic analysis and visualization - contents: - starts_with("autoplot") From 8d9f2de6d7791e3c063f2170cbd93c547887c1b2 Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Tue, 8 Apr 2025 16:10:19 -0700 Subject: [PATCH 078/117] switch revision_behavior to a better formatted print method --- NAMESPACE | 2 + R/revision_analysis.R | 160 ++++++++++-------- man/epiprocess-package.Rd | 1 + ...vision_summary.Rd => revision_analysis.Rd} | 29 +++- 4 files changed, 115 insertions(+), 77 deletions(-) rename man/{revision_summary.Rd => revision_analysis.Rd} (89%) diff --git a/NAMESPACE b/NAMESPACE index 3582fe9c7..1c548309e 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -53,6 +53,7 @@ S3method(print,epi_archive) S3method(print,epi_df) S3method(print,grouped_epi_archive) S3method(print,growth_rate_params) +S3method(print,revision_behavior) S3method(summary,epi_df) S3method(ungroup,epi_df) S3method(ungroup,grouped_epi_archive) @@ -99,6 +100,7 @@ export(new_epi_archive) export(new_epi_df) export(relocate) export(rename) +export(revision_analysis) export(revision_summary) export(set_versions_end) export(slice) diff --git a/R/revision_analysis.R b/R/revision_analysis.R index f36dcc16a..705d244b3 100644 --- a/R/revision_analysis.R +++ b/R/revision_analysis.R @@ -34,8 +34,6 @@ #' `NA`'s compactify is run again if `compactify` is `TRUE` to make #' sure there are no duplicate values from occasions when the signal is #' revised to `NA`, and then back to its immediately-preceding value. -#' @param print_inform bool, determines whether to print summary information, or -#' only return the full summary tibble #' @param min_waiting_period `difftime`, integer or `NULL`. Sets a cutoff: any #' time_values that have not had at least `min_waiting_period` to stabilize as #' of the `versions_end` are removed. `min_waiting_period` should characterize @@ -77,7 +75,7 @@ #' contains jumps at year boundaries. 
#' #' @examples -#' revision_example <- revision_summary(archive_cases_dv_subset, percent_cli) +#' revision_example <- revision_analysis(archive_cases_dv_subset, percent_cli) #' revision_example %>% arrange(desc(spread)) #' #' @export @@ -86,20 +84,20 @@ #' @importFrom vctrs vec_cast #' @importFrom dplyr mutate group_by arrange filter if_any all_of across pull pick c_across #' everything ungroup summarize if_else %>% -revision_summary <- function(epi_arch, - ..., - drop_nas = TRUE, - print_inform = TRUE, - min_waiting_period = as.difftime(60, units = "days") %>% - difftime_approx_ceiling_time_delta(epi_arch$time_type), - within_latest = 0.2, - quick_revision = as.difftime(3, units = "days") %>% - difftime_approx_ceiling_time_delta(epi_arch$time_type), - few_revisions = 3, - abs_spread_threshold = NULL, - rel_spread_threshold = 0.1, - compactify = TRUE, - compactify_abs_tol = 0) { +revision_analysis <- function(epi_arch, + ..., + drop_nas = TRUE, + min_waiting_period = as.difftime(60, units = "days") %>% + difftime_approx_ceiling_time_delta(epi_arch$time_type), + within_latest = 0.2, + quick_revision = as.difftime(3, units = "days") %>% + difftime_approx_ceiling_time_delta(epi_arch$time_type), + few_revisions = 3, + abs_spread_threshold = NULL, + rel_spread_threshold = 0.1, + compactify = TRUE, + compactify_abs_tol = 0, + return_only_tibble = FALSE) { assert_class(epi_arch, "epi_archive") # if the column to summarize isn't specified, use the only one if there is only one if (dots_n(...) 
== 0) { @@ -193,63 +191,85 @@ revision_summary <- function(epi_arch, time_value, geo_value, all_of(epikey_names), n_revisions, min_lag, max_lag, # nolint: object_usage_linter lag_near_latest, spread, rel_spread, min_value, max_value, median_value # nolint: object_usage_linter ) - if (print_inform) { - cli_inform("Min lag (time to first version):") - time_delta_summary(revision_behavior$min_lag, time_type) %>% print() - if (!drop_nas) { - total_na <- epi_arch$DT %>% - filter(is.na(c_across(!!arg))) %>% # nolint: object_usage_linter - nrow() - cli_inform("Fraction of all versions that are `NA`:") - cli_li(num_percent(total_na, nrow(epi_arch$DT), "")) - cli_inform("") - } - cli_inform("Fraction of epi_key+time_values with") - total_num <- nrow(revision_behavior) # nolint: object_usage_linter - total_num_unrevised <- sum(revision_behavior$n_revisions == 0) # nolint: object_usage_linter - cli_inform("No revisions:") - cli_li(num_percent(total_num_unrevised, total_num, "")) - total_quickly_revised <- sum( # nolint: object_usage_linter - time_delta_to_n_steps(revision_behavior$max_lag, time_type) <= - time_delta_to_n_steps(quick_revision, time_type) - ) - cli_inform("Quick revisions (last revision within {format_time_delta(quick_revision, time_type)} - of the `time_value`):") - cli_li(num_percent(total_quickly_revised, total_num, "")) - total_barely_revised <- sum( # nolint: object_usage_linter - revision_behavior$n_revisions <= - few_revisions - ) - cli_inform("Few revisions (At most {few_revisions} revisions for that `time_value`):") - cli_li(num_percent(total_barely_revised, total_num, "")) - cli_inform("") - cli_inform("Fraction of revised epi_key+time_values which have:") - - real_revisions <- revision_behavior %>% filter(n_revisions > 0) # nolint: object_usage_linter - n_real_revised <- nrow(real_revisions) # nolint: object_usage_linter - rel_spread <- sum( # nolint: object_usage_linter - real_revisions$rel_spread < - rel_spread_threshold, - na.rm = TRUE - ) + 
sum(is.na(real_revisions$rel_spread)) - cli_inform("Less than {rel_spread_threshold} spread in relative value:") - cli_li(num_percent(rel_spread, n_real_revised, "")) - abs_spread <- sum( # nolint: object_usage_linter - real_revisions$spread > - abs_spread_threshold - ) # nolint: object_usage_linter - cli_inform("Spread of more than {abs_spread_threshold} in actual value (when revised):") - cli_li(num_percent(abs_spread, n_real_revised, "")) - - # time_type_unit_pluralizer[[time_type]] is a format string controlled by us - # and/or downstream devs, so we can paste it onto our format string safely: - units_plural <- pluralize(paste0("{qty(2)}", time_type_unit_pluralizer[[time_type]])) # nolint: object_usage_linter - cli_inform("{toTitleCase(units_plural)} until within {within_latest*100}% of the latest value:") - time_delta_summary(revision_behavior[["lag_near_latest"]], time_type) %>% print() + total_na <- epi_arch$DT %>% + filter(is.na(c_across(!!arg))) %>% # nolint: object_usage_linter + nrow() + if (!return_only_tibble) { + revision_behavior <- structure(list( + revision_behavior = revision_behavior, + range_time_values = range(epi_arch$DT$time_value), + signal_variable = arg, + drop_nas = drop_nas, + time_type = time_type, + total_na = total_na, + n_obs = nrow(epi_arch$DT), + quick_revision = quick_revision, + few_revisions = few_revisions, + rel_spread_threshold = rel_spread_threshold, + abs_spread_threshold = abs_spread_threshold, + within_latest = within_latest + ), class = "revision_behavior") } return(revision_behavior) } +#' @export +print.revision_behavior <- function(x, ...) 
{ + cli::cli_h2("An epi_archive spanning {.val {x$range_time_values[1]}} to {.val {x$range_time_values[1]}}.") + cli::cli_h3("Min lag (time to first version):") + time_delta_summary(x$revision_behavior$min_lag, x$time_type) %>% print() + if (!x$drop_nas) { + cli_inform("Fraction of all versions that are `NA`:") + cli_li(num_percent(x$total_na, x$n_obs, "")) + cli_inform("") + } + cli::cli_h3("Fraction of epi_key + time_values with") + total_num <- nrow(x$revision_behavior) # nolint: object_usage_linter + total_num_unrevised <- sum(x$n_revisions == 0) # nolint: object_usage_linter + cli_inform("No revisions:") + cli_li(num_percent(total_num_unrevised, total_num, "")) + total_quickly_revised <- sum( # nolint: object_usage_linter + time_delta_to_n_steps(x$revision_behavior$max_lag, x$time_type) <= + time_delta_to_n_steps(x$quick_revision, x$time_type) + ) + cli_inform("Quick revisions (last revision within {format_time_delta(x$quick_revision, x$time_type)} + of the `time_value`):") + cli_li(num_percent(total_quickly_revised, total_num, "")) + total_barely_revised <- sum( # nolint: object_usage_linter + x$n_revisions <= x$few_revisions + ) + cli_inform("Few revisions (At most {x$few_revisions} revisions for that `time_value`):") + cli_li(num_percent(total_barely_revised, total_num, "")) + + cli::cli_h3("Fraction of revised epi_key + time_values which have:") + + real_revisions <- x$revision_behavior %>% filter(n_revisions > 0) # nolint: object_usage_linter + n_real_revised <- nrow(real_revisions) # nolint: object_usage_linter + rel_spread <- sum( # nolint: object_usage_linter + real_revisions$rel_spread < + x$rel_spread_threshold, + na.rm = TRUE + ) + sum(is.na(real_revisions$rel_spread)) + cli_inform("Less than {x$rel_spread_threshold} spread in relative value:") + cli_li(num_percent(rel_spread, n_real_revised, "")) + abs_spread <- sum( # nolint: object_usage_linter + real_revisions$spread > + x$abs_spread_threshold + ) # nolint: object_usage_linter + 
cli_inform("Spread of more than {x$abs_spread_threshold} in actual value (when revised):") + cli_li(num_percent(abs_spread, n_real_revised, "")) + + # time_type_unit_pluralizer[[time_type]] is a format string controlled by us + # and/or downstream devs, so we can paste it onto our format string safely: + units_plural <- pluralize(paste0("{qty(2)}", time_type_unit_pluralizer[[x$time_type]])) # nolint: object_usage_linter + cli::cli_h3("{toTitleCase(units_plural)} until within {x$within_latest*100}% of the latest value:") + time_delta_summary(x$revision_behavior[["lag_near_latest"]], x$time_type) %>% print() +} + +#' @export +#' @rdname revision_analysis +revision_summary <- revision_analysis + #' pull the value from lags when values starts indefinitely being within prop of its latest value. #' @param lags vector of lags; should be sorted #' @param values this should be a vector (e.g., a column) with length matching that of `lags` diff --git a/man/epiprocess-package.Rd b/man/epiprocess-package.Rd index fe79c01e8..12e8a3c42 100644 --- a/man/epiprocess-package.Rd +++ b/man/epiprocess-package.Rd @@ -11,6 +11,7 @@ This package introduces common data structures for working with epidemiological \seealso{ Useful links: \itemize{ + \item \url{https://github.com/cmu-delphi/epiprocess} \item \url{https://cmu-delphi.github.io/epiprocess/} } diff --git a/man/revision_summary.Rd b/man/revision_analysis.Rd similarity index 89% rename from man/revision_summary.Rd rename to man/revision_analysis.Rd index 54abf098c..e15aa92a5 100644 --- a/man/revision_summary.Rd +++ b/man/revision_analysis.Rd @@ -1,14 +1,31 @@ % Generated by roxygen2: do not edit by hand % Please edit documentation in R/revision_analysis.R -\name{revision_summary} +\name{revision_analysis} +\alias{revision_analysis} \alias{revision_summary} \title{A function to describe revision behavior for an archive.} \usage{ +revision_analysis( + epi_arch, + ..., + drop_nas = TRUE, + min_waiting_period = as.difftime(60, units = 
"days") \%>\% + difftime_approx_ceiling_time_delta(epi_arch$time_type), + within_latest = 0.2, + quick_revision = as.difftime(3, units = "days") \%>\% + difftime_approx_ceiling_time_delta(epi_arch$time_type), + few_revisions = 3, + abs_spread_threshold = NULL, + rel_spread_threshold = 0.1, + compactify = TRUE, + compactify_abs_tol = 0, + return_only_tibble = FALSE +) + revision_summary( epi_arch, ..., drop_nas = TRUE, - print_inform = TRUE, min_waiting_period = as.difftime(60, units = "days") \%>\% difftime_approx_ceiling_time_delta(epi_arch$time_type), within_latest = 0.2, @@ -18,7 +35,8 @@ revision_summary( abs_spread_threshold = NULL, rel_spread_threshold = 0.1, compactify = TRUE, - compactify_abs_tol = 0 + compactify_abs_tol = 0, + return_only_tibble = FALSE ) } \arguments{ @@ -34,9 +52,6 @@ If supplied, \code{...} must select exactly one column.} sure there are no duplicate values from occasions when the signal is revised to \code{NA}, and then back to its immediately-preceding value.} -\item{print_inform}{bool, determines whether to print summary information, or -only return the full summary tibble} - \item{min_waiting_period}{\code{difftime}, integer or \code{NULL}. Sets a cutoff: any time_values that have not had at least \code{min_waiting_period} to stabilize as of the \code{versions_end} are removed. \code{min_waiting_period} should characterize @@ -113,7 +128,7 @@ produce incorrect results for some calculations, since week numbering contains jumps at year boundaries. } \examples{ -revision_example <- revision_summary(archive_cases_dv_subset, percent_cli) +revision_example <- revision_analysis(archive_cases_dv_subset, percent_cli) revision_example \%>\% arrange(desc(spread)) } From cc64f70a62c625c31d6e1d12edac7dfa06247b87 Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Tue, 8 Apr 2025 16:24:19 -0700 Subject: [PATCH 079/117] adjust tests to match --- .../_snaps/revision-latency-functions.md | 132 +++++++++++++----- .../test-revision-latency-functions.R | 31 ++-- 2 files changed, 115 insertions(+), 48 deletions(-) diff --git a/tests/testthat/_snaps/revision-latency-functions.md b/tests/testthat/_snaps/revision-latency-functions.md index 4f2bfe269..e04e6c1f8 100644 --- a/tests/testthat/_snaps/revision-latency-functions.md +++ b/tests/testthat/_snaps/revision-latency-functions.md @@ -1,29 +1,41 @@ # revision_summary works for dummy datasets Code - dummy_ex %>% revision_summary() %>% print(n = 10, width = 300) + rs1 Message - Min lag (time to first version): + + -- An epi_archive spanning 2020-01-01 to 2020-01-01. -- + + -- Min lag (time to first version): Output min median mean max 0 days 1 days 1.6 days 4 days Message - Fraction of epi_key+time_values with + + -- Fraction of epi_key + time_values with No revisions: - * 3 out of 7 (42.86%) + * 0 out of 7 (0%) Quick revisions (last revision within 3 days of the `time_value`): * 4 out of 7 (57.14%) Few revisions (At most 3 revisions for that `time_value`): - * 6 out of 7 (85.71%) - Fraction of revised epi_key+time_values which have: + * 0 out of 7 (0%) + + -- Fraction of revised epi_key + time_values which have: Less than 0.1 spread in relative value: * 1 out of 4 (25%) Spread of more than 5.1 in actual value (when revised): * 3 out of 4 (75%) - Days until within 20% of the latest value: + + -- Days until within 20% of the latest value: Output min median mean max 0 days 3 days 6.9 days 19 days + +--- + + Code + rs1$revision_behavior %>% print(n = 10, width = 300) + Output # A tibble: 7 x 11 time_value geo_value n_revisions min_lag max_lag lag_near_latest spread @@ -47,31 +59,43 @@ --- Code - dummy_ex %>% revision_summary(drop_nas = FALSE) %>% print(n = 10, width = 300) + rs2 Message - Min lag (time to first version): + + -- An epi_archive spanning 2020-01-01 to 
2020-01-01. -- + + -- Min lag (time to first version): Output min median mean max 0 days 1 days 1.4 days 4 days Message Fraction of all versions that are `NA`: * 2 out of 19 (10.53%) - Fraction of epi_key+time_values with + + -- Fraction of epi_key + time_values with No revisions: - * 2 out of 7 (28.57%) + * 0 out of 7 (0%) Quick revisions (last revision within 3 days of the `time_value`): * 4 out of 7 (57.14%) Few revisions (At most 3 revisions for that `time_value`): - * 6 out of 7 (85.71%) - Fraction of revised epi_key+time_values which have: + * 0 out of 7 (0%) + + -- Fraction of revised epi_key + time_values which have: Less than 0.1 spread in relative value: * 2 out of 5 (40%) Spread of more than 5.1 in actual value (when revised): * 3 out of 5 (60%) - Days until within 20% of the latest value: + + -- Days until within 20% of the latest value: Output min median mean max 0 days 3 days 6.9 days 19 days + +--- + + Code + rs2$revision_behavior %>% print(n = 10, width = 300) + Output # A tibble: 7 x 11 time_value geo_value n_revisions min_lag max_lag lag_near_latest spread @@ -95,31 +119,43 @@ --- Code - dummy_ex_weekly %>% revision_summary(drop_nas = FALSE) %>% print(n = 10, width = 300) + rs3 Message - Min lag (time to first version): + + -- An epi_archive spanning 2020-01-01 to 2020-01-01. 
-- + + -- Min lag (time to first version): Output min median mean max 0 weeks 1 weeks 1.4 weeks 4 weeks Message Fraction of all versions that are `NA`: * 2 out of 19 (10.53%) - Fraction of epi_key+time_values with + + -- Fraction of epi_key + time_values with No revisions: - * 2 out of 7 (28.57%) + * 0 out of 7 (0%) Quick revisions (last revision within 1 week of the `time_value`): * 2 out of 7 (28.57%) Few revisions (At most 3 revisions for that `time_value`): - * 6 out of 7 (85.71%) - Fraction of revised epi_key+time_values which have: + * 0 out of 7 (0%) + + -- Fraction of revised epi_key + time_values which have: Less than 0.1 spread in relative value: * 2 out of 5 (40%) Spread of more than 5.1 in actual value (when revised): * 3 out of 5 (60%) - Weeks until within 20% of the latest value: + + -- Weeks until within 20% of the latest value: Output min median mean max 0 weeks 3 weeks 6.9 weeks 19 weeks + +--- + + Code + rs3$revision_behavior %>% print(n = 10, width = 300) + Output # A tibble: 7 x 11 time_value geo_value n_revisions min_lag max_lag lag_near_latest spread @@ -143,32 +179,43 @@ --- Code - dummy_ex_yearmonthly %>% revision_summary(drop_nas = FALSE) %>% print(n = 10, - width = 300) + rs4 Message - Min lag (time to first version): + + -- An epi_archive spanning 2020 Jan to 2020 Jan. 
-- + + -- Min lag (time to first version): Output min median mean max 0 1 1.4 4 Message Fraction of all versions that are `NA`: * 2 out of 19 (10.53%) - Fraction of epi_key+time_values with + + -- Fraction of epi_key + time_values with No revisions: - * 2 out of 7 (28.57%) + * 0 out of 7 (0%) Quick revisions (last revision within 1 month of the `time_value`): * 2 out of 7 (28.57%) Few revisions (At most 3 revisions for that `time_value`): - * 6 out of 7 (85.71%) - Fraction of revised epi_key+time_values which have: + * 0 out of 7 (0%) + + -- Fraction of revised epi_key + time_values which have: Less than 0.1 spread in relative value: * 2 out of 5 (40%) Spread of more than 5.1 in actual value (when revised): * 3 out of 5 (60%) - Months until within 20% of the latest value: + + -- Months until within 20% of the latest value: Output min median mean max 0 3 6.9 19 + +--- + + Code + rs4$revision_behavior %>% print(n = 10, width = 300) + Output # A tibble: 7 x 11 time_value geo_value n_revisions min_lag max_lag lag_near_latest spread @@ -192,32 +239,43 @@ --- Code - dummy_ex_integerly %>% revision_summary(min_waiting_period = 60, - quick_revision = 3, drop_nas = FALSE) %>% print(n = 10, width = 300) + rs5 Message - Min lag (time to first version): + + -- An epi_archive spanning 1 to 1. 
-- + + -- Min lag (time to first version): Output min median mean max 0 1 1.4 4 Message Fraction of all versions that are `NA`: * 2 out of 19 (10.53%) - Fraction of epi_key+time_values with + + -- Fraction of epi_key + time_values with No revisions: - * 2 out of 7 (28.57%) + * 0 out of 7 (0%) Quick revisions (last revision within 3 time steps of the `time_value`): * 4 out of 7 (57.14%) Few revisions (At most 3 revisions for that `time_value`): - * 6 out of 7 (85.71%) - Fraction of revised epi_key+time_values which have: + * 0 out of 7 (0%) + + -- Fraction of revised epi_key + time_values which have: Less than 0.1 spread in relative value: * 2 out of 5 (40%) Spread of more than 5.1 in actual value (when revised): * 3 out of 5 (60%) - Time Steps until within 20% of the latest value: + + -- Time Steps until within 20% of the latest value: Output min median mean max 0 3 6.9 19 + +--- + + Code + rs5$revision_behavior %>% print(n = 10, width = 300) + Output # A tibble: 7 x 11 time_value geo_value n_revisions min_lag max_lag lag_near_latest spread diff --git a/tests/testthat/test-revision-latency-functions.R b/tests/testthat/test-revision-latency-functions.R index b636bf32e..031cf4ac3 100644 --- a/tests/testthat/test-revision-latency-functions.R +++ b/tests/testthat/test-revision-latency-functions.R @@ -64,27 +64,36 @@ dummy_ex_integerly <- dummy_ex$DT %>% stopifnot(dummy_ex_integerly$time_type == "integer") test_that("revision_summary works for dummy datasets", { - expect_snapshot(dummy_ex %>% revision_summary() %>% print(n = 10, width = 300)) - expect_snapshot(dummy_ex %>% revision_summary(drop_nas = FALSE) %>% print(n = 10, width = 300)) + rs1 <- dummy_ex %>% revision_summary() + rs2 <- dummy_ex %>% revision_summary(drop_nas = FALSE) + expect_snapshot(rs1) + expect_snapshot(rs1$revision_behavior %>% print(n = 10, width = 300)) + expect_snapshot(rs2) + expect_snapshot(rs2$revision_behavior %>% print(n = 10, width = 300)) # Weekly dummy is mostly just "day" -> "week", 
but quick-revision summary changes: - expect_snapshot(dummy_ex_weekly %>% revision_summary(drop_nas = FALSE) %>% print(n = 10, width = 300)) + rs3 <- dummy_ex_weekly %>% revision_summary(drop_nas = FALSE) + expect_snapshot(rs3) + expect_snapshot(rs3$revision_behavior %>% print(n = 10, width = 300)) # Yearmonthly has the same story. It would have been close to encountering # min_waiting_period-based filtering but we actually set its versions_end to # sometime in 2080 rather than 2022: - expect_snapshot(dummy_ex_yearmonthly %>% revision_summary(drop_nas = FALSE) %>% print(n = 10, width = 300)) + rs4 <- dummy_ex_yearmonthly %>% revision_summary(drop_nas = FALSE) + expect_snapshot(rs4) + expect_snapshot(rs4$revision_behavior %>% print(n = 10, width = 300)) # Integer is very much like daily. We have to provide some of the # configuration arguments since we have no idea about what the integers # represent. If the possible integers being used have large jumps like # YYYYww-as-integer epiweek labels (e.g., 200053 jumps to 200101) or are # regularly spaced apart but by more than 1, we'll still be producing # something nonsensical, but we tried. 
- expect_snapshot(dummy_ex_integerly %>% + rs5 <- dummy_ex_integerly %>% revision_summary( min_waiting_period = 60, quick_revision = 3, drop_nas = FALSE - ) %>% - print(n = 10, width = 300)) + ) + expect_snapshot(rs5) + expect_snapshot(rs5$revision_behavior %>% print(n = 10, width = 300)) }) test_that("tidyselect is functional", { @@ -133,7 +142,7 @@ test_that("revision_summary default min_waiting_period works as expected", { value = 1:2 ) %>% as_epi_archive(versions_end = as.Date("2020-01-01") + 1 + 59) %>% - revision_summary(print_inform = FALSE) %>% + revision_summary(return_only_tibble = TRUE) %>% pull(time_value), as.Date("2020-01-01") ) @@ -146,7 +155,7 @@ test_that("revision_summary default min_waiting_period works as expected", { value = 1:2 ) %>% as_epi_archive(versions_end = as.Date("2020-01-01") + 7 + 56) %>% - revision_summary(print_inform = FALSE) %>% + revision_summary(return_only_tibble = TRUE) %>% pull(time_value), as.Date("2020-01-01") ) @@ -159,7 +168,7 @@ test_that("revision_summary default min_waiting_period works as expected", { value = 1:2 ) %>% as_epi_archive(versions_end = tsibble::make_yearmonth(2000, 3)) %>% - revision_summary(print_inform = FALSE) %>% + revision_summary(return_only_tibble = TRUE) %>% pull(time_value), tsibble::make_yearmonth(2000, 1) ) @@ -172,7 +181,7 @@ test_that("revision_summary default min_waiting_period works as expected", { value = 1:2 ) %>% as_epi_archive(versions_end = 1 + 1 + 59) %>% - revision_summary(print_inform = FALSE), + revision_summary(return_only_tibble = TRUE), regexp = "Unsupported time_type" ) }) From 03cf6e3967891c1ca29d5c6058ec4350e6e23333 Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Tue, 8 Apr 2025 16:33:25 -0700 Subject: [PATCH 080/117] describe changes in news and redocument --- DESCRIPTION | 2 +- NEWS.md | 9 +++++++-- R/revision_analysis.R | 6 ++++++ man/revision_analysis.Rd | 7 +++++++ 4 files changed, 21 insertions(+), 3 deletions(-) diff --git a/DESCRIPTION b/DESCRIPTION index f75879e97..1b39d8b5b 100755 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Type: Package Package: epiprocess Title: Tools for basic signal processing in epidemiology -Version: 0.11.3 +Version: 0.11.5 Authors@R: c( person("Jacob", "Bien", role = "ctb"), person("Logan", "Brooks", , "lcbrooks+github@andrew.cmu.edu", role = c("aut", "cre")), diff --git a/NEWS.md b/NEWS.md index 88acabee4..49685e7c2 100644 --- a/NEWS.md +++ b/NEWS.md @@ -22,7 +22,7 @@ Pre-1.0.0 numbering scheme: 0.x will indicate releases, while 0.x.y will indicat for users without a compiler, we have placed `{trendfilter}` in Suggests:; if you want to use `method = "trendfilter"` you will need to manually install this dependency (e.g., with `remotes::install_github("glmgen/trendfilter")`). -- In `revision_summary()`: +- In `revision_summary()` (aliased to `revision_analysis()`): - The `should_compactify` argument is now called `compactify`. To migrate, change any calls with `should_compactfiy =` to `compactify =`. - Output now uses the name `lag_near_latest` instead of `time_near_latest`. To @@ -36,6 +36,11 @@ Pre-1.0.0 numbering scheme: 0.x will indicate releases, while 0.x.y will indicat - `min_waiting_period` now defines a nonstrict inequality instead of a strict one. To obtain the old bounds, bump the `min_waiting_period` up to the next possible value for your `time_type`. + - Added a `print()` method to take the place of `print_inform`. + - Removed the argument `print_inform`; the function now behaves as if it were + always `FALSE`. Replaced with an option, `return_only_tibble`, to return only + the tibble of results rather than the full S3 object. 
+ - In `key_colnames()`: - On regular (non-`epi_df`) data frames, now requires manual specification of `geo_keys`, `other_keys`, and `time_keys`. @@ -53,7 +58,7 @@ Pre-1.0.0 numbering scheme: 0.x will indicate releases, while 0.x.y will indicat `.facet_filter` which allows explicit selection of facets to show. ## Improvements -- `revision_summary()` now supports all `time_type`s. +- `revision_summary()` now supports all `time_type`s. Printed summary is more attractive. - The compactification tolerance setting now works with integer-type columns. - Various functions are now faster, using faster variants of core operations and avoiding reconstructing grouped `epi_df`s when unnecessary. diff --git a/R/revision_analysis.R b/R/revision_analysis.R index 705d244b3..49d5b3d30 100644 --- a/R/revision_analysis.R +++ b/R/revision_analysis.R @@ -73,6 +73,12 @@ #' Using a `time_type` of `"integer"` with week numbers like 202501 will #' produce incorrect results for some calculations, since week numbering #' contains jumps at year boundaries. +#' +#' @return An S3 object with class `revision_behavior`. This function is typically +#' called for the purposes of inspecting the printed output. The +#' results of the computations are available in +#' `revision_analysis(...)$revision_behavior`. If you only want to access +#' the internal computations, use `return_only_tibble = TRUE`. #' #' @examples #' revision_example <- revision_analysis(archive_cases_dv_subset, percent_cli) diff --git a/man/revision_analysis.Rd b/man/revision_analysis.Rd index e15aa92a5..fc8ef217b 100644 --- a/man/revision_analysis.Rd +++ b/man/revision_analysis.Rd @@ -91,6 +91,13 @@ requested signal. The default is \code{TRUE}.} \item{compactify_abs_tol}{length-1 double, used if \code{compactify} is \code{TRUE}, it determines the threshold for when two doubles are considered identical.} } +\value{ +An S3 object with class \code{revision_behavior}. 
This function is typically +called for the purposes of inspecting the printed output. The +results of the computations are available in +\code{revision_analysis(...)$revision_behavior}. If you only want to access +the internal computations, use \code{return_only_tibble = TRUE}. +} \description{ \code{revision_summary} removes all missing values (if requested), and then computes some basic statistics about the revision behavior of an archive, From 226ba492475e0476ac5038ae714ae598a2111a30 Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Tue, 8 Apr 2025 17:05:56 -0700 Subject: [PATCH 081/117] fix: doc missing arg, fix vignette --- R/revision_analysis.R | 2 ++ man/revision_analysis.Rd | 3 +++ vignettes/epi_archive.Rmd | 7 ++++--- 3 files changed, 9 insertions(+), 3 deletions(-) diff --git a/R/revision_analysis.R b/R/revision_analysis.R index 49d5b3d30..25e45bbeb 100644 --- a/R/revision_analysis.R +++ b/R/revision_analysis.R @@ -65,6 +65,8 @@ #' requested signal. The default is `TRUE`. #' @param compactify_abs_tol length-1 double, used if `compactify` is `TRUE`, it #' determines the threshold for when two doubles are considered identical. +#' @param return_only_tibble boolean to return only the simple `tibble` of +#' computational results rather than the complete S3 object. #' #' @details Applies to `epi_archive`s with `time_type`s of `"day"`, `"week"`, #' and `"yearmonth"`. It can also work with a `time_type` of `"integer"` if diff --git a/man/revision_analysis.Rd b/man/revision_analysis.Rd index fc8ef217b..89683afb8 100644 --- a/man/revision_analysis.Rd +++ b/man/revision_analysis.Rd @@ -90,6 +90,9 @@ requested signal. 
The default is \code{TRUE}.} \item{compactify_abs_tol}{length-1 double, used if \code{compactify} is \code{TRUE}, it determines the threshold for when two doubles are considered identical.} + +\item{return_only_tibble}{boolean to return only the simple \code{tibble} of +computational results rather than the complete S3 object.} } \value{ An S3 object with class \code{revision_behavior}. This function is typically diff --git a/vignettes/epi_archive.Rmd b/vignettes/epi_archive.Rmd index f87ea2915..cae40dfa6 100644 --- a/vignettes/epi_archive.Rmd +++ b/vignettes/epi_archive.Rmd @@ -162,7 +162,8 @@ addition to the per key summary, it also returns an overall summary. Here is an a sample of the output: ```{r} -revision_details <- revision_summary(dv_archive, print_inform = TRUE) +revision_details <- revision_summary(dv_archive) +revision_details ``` We can see from the output that, as mentioned above, this data set has a lot of @@ -174,9 +175,9 @@ inspect the returned `revision_details` tibble. Here we collect a number of statistics for each state: ```{r} -revision_details %>% +revision_details$revision_behavior %>% group_by(geo_value) %>% - summarise( + summarize( n_rev = mean(n_revisions), min_lag = min(min_lag), max_lag = max(max_lag), From 91bbde3ef4c362d59b1e5b90c4c96fa53fd52a6f Mon Sep 17 00:00:00 2001 From: "Daniel J. 
McDonald" Date: Tue, 8 Apr 2025 17:31:53 -0700 Subject: [PATCH 082/117] pass local checks --- R/revision_analysis.R | 2 +- man/revision_analysis.Rd | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/R/revision_analysis.R b/R/revision_analysis.R index 25e45bbeb..77f5ebfae 100644 --- a/R/revision_analysis.R +++ b/R/revision_analysis.R @@ -84,7 +84,7 @@ #' #' @examples #' revision_example <- revision_analysis(archive_cases_dv_subset, percent_cli) -#' revision_example %>% arrange(desc(spread)) +#' revision_example$revision_behavior %>% arrange(desc(spread)) #' #' @export #' @importFrom cli cli_inform cli_abort cli_li diff --git a/man/revision_analysis.Rd b/man/revision_analysis.Rd index 89683afb8..737e5ec0a 100644 --- a/man/revision_analysis.Rd +++ b/man/revision_analysis.Rd @@ -139,6 +139,6 @@ contains jumps at year boundaries. } \examples{ revision_example <- revision_analysis(archive_cases_dv_subset, percent_cli) -revision_example \%>\% arrange(desc(spread)) +revision_example$revision_behavior \%>\% arrange(desc(spread)) } From 68f8c500c93781ad80572ee73ceef32eb165d763 Mon Sep 17 00:00:00 2001 From: dsweber2 Date: Wed, 9 Apr 2025 10:56:54 -0500 Subject: [PATCH 083/117] formatting --- R/revision_analysis.R | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/R/revision_analysis.R b/R/revision_analysis.R index 77f5ebfae..f33cdd827 100644 --- a/R/revision_analysis.R +++ b/R/revision_analysis.R @@ -75,10 +75,10 @@ #' Using a `time_type` of `"integer"` with week numbers like 202501 will #' produce incorrect results for some calculations, since week numbering #' contains jumps at year boundaries. -#' +#' #' @return An S3 object with class `revision_behavior`. This function is typically #' called for the purposes of inspecting the printed output. The -#' results of the computations are available in +#' results of the computations are available in #' `revision_analysis(...)$revision_behavior`. 
If you only want to access #' the internal computations, use `return_only_tibble = TRUE`. #' @@ -205,7 +205,7 @@ revision_analysis <- function(epi_arch, if (!return_only_tibble) { revision_behavior <- structure(list( revision_behavior = revision_behavior, - range_time_values = range(epi_arch$DT$time_value), + range_time_values = range(epi_arch$DT$time_value), signal_variable = arg, drop_nas = drop_nas, time_type = time_type, @@ -248,9 +248,9 @@ print.revision_behavior <- function(x, ...) { ) cli_inform("Few revisions (At most {x$few_revisions} revisions for that `time_value`):") cli_li(num_percent(total_barely_revised, total_num, "")) - + cli::cli_h3("Fraction of revised epi_key + time_values which have:") - + real_revisions <- x$revision_behavior %>% filter(n_revisions > 0) # nolint: object_usage_linter n_real_revised <- nrow(real_revisions) # nolint: object_usage_linter rel_spread <- sum( # nolint: object_usage_linter @@ -266,7 +266,7 @@ print.revision_behavior <- function(x, ...) { ) # nolint: object_usage_linter cli_inform("Spread of more than {x$abs_spread_threshold} in actual value (when revised):") cli_li(num_percent(abs_spread, n_real_revised, "")) - + # time_type_unit_pluralizer[[time_type]] is a format string controlled by us # and/or downstream devs, so we can paste it onto our format string safely: units_plural <- pluralize(paste0("{qty(2)}", time_type_unit_pluralizer[[x$time_type]])) # nolint: object_usage_linter From 496178a49fdfd9ec288471a660abed7dc329ebd2 Mon Sep 17 00:00:00 2001 From: dsweber2 Date: Wed, 9 Apr 2025 11:29:25 -0500 Subject: [PATCH 084/117] accessing wrong object, renaming x, fixed snapshot --- R/revision_analysis.R | 41 ++++++++++--------- .../_snaps/revision-latency-functions.md | 20 ++++----- .../test-revision-latency-functions.R | 4 +- 3 files changed, 33 insertions(+), 32 deletions(-) diff --git a/R/revision_analysis.R b/R/revision_analysis.R index f33cdd827..941b6fb2d 100644 --- a/R/revision_analysis.R +++ 
b/R/revision_analysis.R @@ -222,56 +222,57 @@ revision_analysis <- function(epi_arch, } #' @export -print.revision_behavior <- function(x, ...) { - cli::cli_h2("An epi_archive spanning {.val {x$range_time_values[1]}} to {.val {x$range_time_values[1]}}.") +print.revision_behavior <- function(rev_beh, ...) { + revision_behavior <- rev_beh$revision_behavior + cli::cli_h2("An epi_archive spanning {.val {rev_beh$range_time_values[1]}} to {.val {rev_beh$range_time_values[2]}}.") cli::cli_h3("Min lag (time to first version):") - time_delta_summary(x$revision_behavior$min_lag, x$time_type) %>% print() - if (!x$drop_nas) { + time_delta_summary(revision_behavior$min_lag, rev_beh$time_type) %>% print() + if (!rev_beh$drop_nas) { cli_inform("Fraction of all versions that are `NA`:") - cli_li(num_percent(x$total_na, x$n_obs, "")) + cli_li(num_percent(rev_beh$total_na, rev_beh$n_obs, "")) cli_inform("") } cli::cli_h3("Fraction of epi_key + time_values with") - total_num <- nrow(x$revision_behavior) # nolint: object_usage_linter - total_num_unrevised <- sum(x$n_revisions == 0) # nolint: object_usage_linter + total_num <- nrow(revision_behavior) # nolint: object_usage_linter + total_num_unrevised <- sum(revision_behavior$n_revisions == 0) # nolint: object_usage_linter cli_inform("No revisions:") cli_li(num_percent(total_num_unrevised, total_num, "")) total_quickly_revised <- sum( # nolint: object_usage_linter - time_delta_to_n_steps(x$revision_behavior$max_lag, x$time_type) <= - time_delta_to_n_steps(x$quick_revision, x$time_type) + time_delta_to_n_steps(revision_behavior$max_lag, rev_beh$time_type) <= + time_delta_to_n_steps(rev_beh$quick_revision, rev_beh$time_type) ) - cli_inform("Quick revisions (last revision within {format_time_delta(x$quick_revision, x$time_type)} + cli_inform("Quick revisions (last revision within {format_time_delta(rev_beh$quick_revision, rev_beh$time_type)} of the `time_value`):") cli_li(num_percent(total_quickly_revised, total_num, "")) 
total_barely_revised <- sum( # nolint: object_usage_linter - x$n_revisions <= x$few_revisions + revision_behavior$n_revisions <= rev_beh$few_revisions ) - cli_inform("Few revisions (At most {x$few_revisions} revisions for that `time_value`):") + cli_inform("Few revisions (At most {rev_beh$few_revisions} revisions for that `time_value`):") cli_li(num_percent(total_barely_revised, total_num, "")) cli::cli_h3("Fraction of revised epi_key + time_values which have:") - real_revisions <- x$revision_behavior %>% filter(n_revisions > 0) # nolint: object_usage_linter + real_revisions <- revision_behavior %>% filter(n_revisions > 0) # nolint: object_usage_linter n_real_revised <- nrow(real_revisions) # nolint: object_usage_linter rel_spread <- sum( # nolint: object_usage_linter real_revisions$rel_spread < - x$rel_spread_threshold, + rev_beh$rel_spread_threshold, na.rm = TRUE ) + sum(is.na(real_revisions$rel_spread)) - cli_inform("Less than {x$rel_spread_threshold} spread in relative value:") + cli_inform("Less than {rev_beh$rel_spread_threshold} spread in relative value:") cli_li(num_percent(rel_spread, n_real_revised, "")) abs_spread <- sum( # nolint: object_usage_linter real_revisions$spread > - x$abs_spread_threshold + rev_beh$abs_spread_threshold ) # nolint: object_usage_linter - cli_inform("Spread of more than {x$abs_spread_threshold} in actual value (when revised):") + cli_inform("Spread of more than {rev_beh$abs_spread_threshold} in actual value (when revised):") cli_li(num_percent(abs_spread, n_real_revised, "")) # time_type_unit_pluralizer[[time_type]] is a format string controlled by us # and/or downstream devs, so we can paste it onto our format string safely: - units_plural <- pluralize(paste0("{qty(2)}", time_type_unit_pluralizer[[x$time_type]])) # nolint: object_usage_linter - cli::cli_h3("{toTitleCase(units_plural)} until within {x$within_latest*100}% of the latest value:") - time_delta_summary(x$revision_behavior[["lag_near_latest"]], x$time_type) %>% print() 
+ units_plural <- pluralize(paste0("{qty(2)}", time_type_unit_pluralizer[[rev_beh$time_type]])) # nolint: object_usage_linter + cli::cli_h3("{toTitleCase(units_plural)} until within {rev_beh$within_latest*100}% of the latest value:") + time_delta_summary(revision_behavior[["lag_near_latest"]], rev_beh$time_type) %>% print() } #' @export diff --git a/tests/testthat/_snaps/revision-latency-functions.md b/tests/testthat/_snaps/revision-latency-functions.md index e04e6c1f8..af4f5c20c 100644 --- a/tests/testthat/_snaps/revision-latency-functions.md +++ b/tests/testthat/_snaps/revision-latency-functions.md @@ -14,11 +14,11 @@ -- Fraction of epi_key + time_values with No revisions: - * 0 out of 7 (0%) + * 3 out of 7 (42.86%) Quick revisions (last revision within 3 days of the `time_value`): * 4 out of 7 (57.14%) Few revisions (At most 3 revisions for that `time_value`): - * 0 out of 7 (0%) + * 6 out of 7 (85.71%) -- Fraction of revised epi_key + time_values which have: Less than 0.1 spread in relative value: @@ -74,11 +74,11 @@ -- Fraction of epi_key + time_values with No revisions: - * 0 out of 7 (0%) + * 2 out of 7 (28.57%) Quick revisions (last revision within 3 days of the `time_value`): * 4 out of 7 (57.14%) Few revisions (At most 3 revisions for that `time_value`): - * 0 out of 7 (0%) + * 6 out of 7 (85.71%) -- Fraction of revised epi_key + time_values which have: Less than 0.1 spread in relative value: @@ -134,11 +134,11 @@ -- Fraction of epi_key + time_values with No revisions: - * 0 out of 7 (0%) + * 2 out of 7 (28.57%) Quick revisions (last revision within 1 week of the `time_value`): * 2 out of 7 (28.57%) Few revisions (At most 3 revisions for that `time_value`): - * 0 out of 7 (0%) + * 6 out of 7 (85.71%) -- Fraction of revised epi_key + time_values which have: Less than 0.1 spread in relative value: @@ -194,11 +194,11 @@ -- Fraction of epi_key + time_values with No revisions: - * 0 out of 7 (0%) + * 2 out of 7 (28.57%) Quick revisions (last revision within 1 
month of the `time_value`): * 2 out of 7 (28.57%) Few revisions (At most 3 revisions for that `time_value`): - * 0 out of 7 (0%) + * 6 out of 7 (85.71%) -- Fraction of revised epi_key + time_values which have: Less than 0.1 spread in relative value: @@ -254,11 +254,11 @@ -- Fraction of epi_key + time_values with No revisions: - * 0 out of 7 (0%) + * 2 out of 7 (28.57%) Quick revisions (last revision within 3 time steps of the `time_value`): * 4 out of 7 (57.14%) Few revisions (At most 3 revisions for that `time_value`): - * 0 out of 7 (0%) + * 6 out of 7 (85.71%) -- Fraction of revised epi_key + time_values which have: Less than 0.1 spread in relative value: diff --git a/tests/testthat/test-revision-latency-functions.R b/tests/testthat/test-revision-latency-functions.R index 031cf4ac3..86d98f8eb 100644 --- a/tests/testthat/test-revision-latency-functions.R +++ b/tests/testthat/test-revision-latency-functions.R @@ -78,7 +78,7 @@ test_that("revision_summary works for dummy datasets", { # Yearmonthly has the same story. It would have been close to encountering # min_waiting_period-based filtering but we actually set its versions_end to # sometime in 2080 rather than 2022: - rs4 <- dummy_ex_yearmonthly %>% revision_summary(drop_nas = FALSE) + rs4 <- dummy_ex_yearmonthly %>% revision_summary(drop_nas = FALSE) expect_snapshot(rs4) expect_snapshot(rs4$revision_behavior %>% print(n = 10, width = 300)) # Integer is very much like daily. 
We have to provide some of the @@ -91,7 +91,7 @@ test_that("revision_summary works for dummy datasets", { revision_summary( min_waiting_period = 60, quick_revision = 3, drop_nas = FALSE - ) + ) expect_snapshot(rs5) expect_snapshot(rs5$revision_behavior %>% print(n = 10, width = 300)) }) From 1592fd8d397084eb1f0e0bbe07abb3a8c8f9a2f4 Mon Sep 17 00:00:00 2001 From: dsweber2 Date: Wed, 9 Apr 2025 11:37:08 -0500 Subject: [PATCH 085/117] print needs name x --- R/revision_analysis.R | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/R/revision_analysis.R b/R/revision_analysis.R index 941b6fb2d..785a18f1a 100644 --- a/R/revision_analysis.R +++ b/R/revision_analysis.R @@ -222,14 +222,14 @@ revision_analysis <- function(epi_arch, } #' @export -print.revision_behavior <- function(rev_beh, ...) { - revision_behavior <- rev_beh$revision_behavior +print.revision_behavior <- function(x, ...) { + revision_behavior <- x$revision_behavior cli::cli_h2("An epi_archive spanning {.val {rev_beh$range_time_values[1]}} to {.val {rev_beh$range_time_values[1]}}.") cli::cli_h3("Min lag (time to first version):") - time_delta_summary(revision_behavior$min_lag, rev_beh$time_type) %>% print() - if (!rev_beh$drop_nas) { + time_delta_summary(revision_behavior$min_lag, x$time_type) %>% print() + if (!x$drop_nas) { cli_inform("Fraction of all versions that are `NA`:") - cli_li(num_percent(rev_beh$total_na, rev_beh$n_obs, "")) + cli_li(num_percent(x$total_na, x$n_obs, "")) cli_inform("") } cli::cli_h3("Fraction of epi_key + time_values with") @@ -238,14 +238,14 @@ print.revision_behavior <- function(rev_beh, ...) 
{ cli_inform("No revisions:") cli_li(num_percent(total_num_unrevised, total_num, "")) total_quickly_revised <- sum( # nolint: object_usage_linter - time_delta_to_n_steps(revision_behavior$max_lag, rev_beh$time_type) <= - time_delta_to_n_steps(rev_beh$quick_revision, rev_beh$time_type) + time_delta_to_n_steps(revision_behavior$max_lag, x$time_type) <= + time_delta_to_n_steps(x$quick_revision, x$time_type) ) cli_inform("Quick revisions (last revision within {format_time_delta(rev_beh$quick_revision, rev_beh$time_type)} of the `time_value`):") cli_li(num_percent(total_quickly_revised, total_num, "")) total_barely_revised <- sum( # nolint: object_usage_linter - revision_behavior$n_revisions <= rev_beh$few_revisions + revision_behavior$n_revisions <= x$few_revisions ) cli_inform("Few revisions (At most {rev_beh$few_revisions} revisions for that `time_value`):") cli_li(num_percent(total_barely_revised, total_num, "")) @@ -256,23 +256,23 @@ print.revision_behavior <- function(rev_beh, ...) { n_real_revised <- nrow(real_revisions) # nolint: object_usage_linter rel_spread <- sum( # nolint: object_usage_linter real_revisions$rel_spread < - rev_beh$rel_spread_threshold, + x$rel_spread_threshold, na.rm = TRUE ) + sum(is.na(real_revisions$rel_spread)) cli_inform("Less than {rev_beh$rel_spread_threshold} spread in relative value:") cli_li(num_percent(rel_spread, n_real_revised, "")) abs_spread <- sum( # nolint: object_usage_linter real_revisions$spread > - rev_beh$abs_spread_threshold + x$abs_spread_threshold ) # nolint: object_usage_linter cli_inform("Spread of more than {rev_beh$abs_spread_threshold} in actual value (when revised):") cli_li(num_percent(abs_spread, n_real_revised, "")) # time_type_unit_pluralizer[[time_type]] is a format string controlled by us # and/or downstream devs, so we can paste it onto our format string safely: - units_plural <- pluralize(paste0("{qty(2)}", time_type_unit_pluralizer[[rev_beh$time_type]])) # nolint: object_usage_linter + units_plural <- 
pluralize(paste0("{qty(2)}", time_type_unit_pluralizer[[x$time_type]])) # nolint: object_usage_linter cli::cli_h3("{toTitleCase(units_plural)} until within {rev_beh$within_latest*100}% of the latest value:") - time_delta_summary(revision_behavior[["lag_near_latest"]], rev_beh$time_type) %>% print() + time_delta_summary(revision_behavior[["lag_near_latest"]], x$time_type) %>% print() } #' @export From 48aa65ec65f2f8d462d2b0e2e513a615081457aa Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Wed, 9 Apr 2025 10:13:06 -0700 Subject: [PATCH 086/117] move print options to print method --- R/revision_analysis.R | 72 ++++++++++++++++++++++--------------------- 1 file changed, 37 insertions(+), 35 deletions(-) diff --git a/R/revision_analysis.R b/R/revision_analysis.R index 77f5ebfae..dd4a1dd99 100644 --- a/R/revision_analysis.R +++ b/R/revision_analysis.R @@ -43,20 +43,6 @@ #' to `NULL` or 0. #' @param within_latest double between 0 and 1. Determines the threshold #' used for the `lag_to` -#' @param quick_revision difftime or integer (integer is treated as days), for -#' the printed summary, the amount of time between the final revision and the -#' actual time_value to consider the revision quickly resolved. Default of 3 -#' days -#' @param few_revisions integer, for the printed summary, the upper bound on the -#' number of revisions to consider "few". Default is 3. -#' @param abs_spread_threshold length-1 numeric, for the printed summary, the -#' maximum spread used to characterize revisions which don't actually change -#' very much. Default is 5% of the maximum value in the dataset, but this is -#' the most unit dependent of values, and likely needs to be chosen -#' appropriate for the scale of the dataset. -#' @param rel_spread_threshold length-1 double between 0 and 1, for the printed -#' summary, the relative spread fraction used to characterize revisions which -#' don't actually change very much. 
Default is .1, or 10% of the final value #' @param compactify bool. If `TRUE`, we will compactify after the signal #' requested in `...` has been selected on its own and the `drop_nas` step. #' This helps, for example, to give similar results when called on @@ -98,11 +84,6 @@ revision_analysis <- function(epi_arch, min_waiting_period = as.difftime(60, units = "days") %>% difftime_approx_ceiling_time_delta(epi_arch$time_type), within_latest = 0.2, - quick_revision = as.difftime(3, units = "days") %>% - difftime_approx_ceiling_time_delta(epi_arch$time_type), - few_revisions = 3, - abs_spread_threshold = NULL, - rel_spread_threshold = 0.1, compactify = TRUE, compactify_abs_tol = 0, return_only_tibble = FALSE) { @@ -211,18 +192,41 @@ revision_analysis <- function(epi_arch, time_type = time_type, total_na = total_na, n_obs = nrow(epi_arch$DT), - quick_revision = quick_revision, - few_revisions = few_revisions, - rel_spread_threshold = rel_spread_threshold, - abs_spread_threshold = abs_spread_threshold, within_latest = within_latest - ), class = "revision_behavior") + ), class = "revision_analysis") } return(revision_behavior) } + + + +#' Print a `revision_analysis` object +#' +#' @param x a `revision_analysis` object +#' @param quick_revision Difftime or integer (integer is treated as days). +#' The amount of time between the final revision and the +#' actual time_value to consider the revision quickly resolved. Default of 3 +#' days +#' @param few_revisions Integer. The upper bound on the +#' number of revisions to consider "few". Default is 3. +#' @param abs_spread_threshold Scalar numeric. The +#' maximum spread used to characterize revisions which don't actually change +#' very much. Default is 5% of the maximum value in the dataset, but this is +#' the most unit dependent of values, and likely needs to be chosen +#' appropriate for the scale of the dataset. +#' @param rel_spread_threshold Scalar between 0 and 1. 
The relative spread fraction used to characterize revisions which +#' don't actually change very much. Default is .1, or 10% of the final value +#' +#' @rdname revision_analysis #' @export -print.revision_behavior <- function(x, ...) { +print.revision_analysis <- function(x, + quick_revision = as.difftime(3, units = "days") %>% + difftime_approx_ceiling_time_delta(x$time_type), + few_revisions = 3, + abs_spread_threshold = NULL, + rel_spread_threshold = 0.1, + ...) { cli::cli_h2("An epi_archive spanning {.val {x$range_time_values[1]}} to {.val {x$range_time_values[1]}}.") cli::cli_h3("Min lag (time to first version):") time_delta_summary(x$revision_behavior$min_lag, x$time_type) %>% print() @@ -238,15 +242,15 @@ print.revision_behavior <- function(x, ...) { cli_li(num_percent(total_num_unrevised, total_num, "")) total_quickly_revised <- sum( # nolint: object_usage_linter time_delta_to_n_steps(x$revision_behavior$max_lag, x$time_type) <= - time_delta_to_n_steps(x$quick_revision, x$time_type) + time_delta_to_n_steps(quick_revision, x$time_type) ) - cli_inform("Quick revisions (last revision within {format_time_delta(x$quick_revision, x$time_type)} + cli_inform("Quick revisions (last revision within {format_time_delta(quick_revision, x$time_type)} of the `time_value`):") cli_li(num_percent(total_quickly_revised, total_num, "")) total_barely_revised <- sum( # nolint: object_usage_linter - x$n_revisions <= x$few_revisions + x$n_revisions <= few_revisions ) - cli_inform("Few revisions (At most {x$few_revisions} revisions for that `time_value`):") + cli_inform("Few revisions (At most {few_revisions} revisions for that `time_value`):") cli_li(num_percent(total_barely_revised, total_num, "")) cli::cli_h3("Fraction of revised epi_key + time_values which have:") @@ -254,17 +258,15 @@ print.revision_behavior <- function(x, ...) 
{ real_revisions <- x$revision_behavior %>% filter(n_revisions > 0) # nolint: object_usage_linter n_real_revised <- nrow(real_revisions) # nolint: object_usage_linter rel_spread <- sum( # nolint: object_usage_linter - real_revisions$rel_spread < - x$rel_spread_threshold, + real_revisions$rel_spread < rel_spread_threshold, na.rm = TRUE ) + sum(is.na(real_revisions$rel_spread)) - cli_inform("Less than {x$rel_spread_threshold} spread in relative value:") + cli_inform("Less than {rel_spread_threshold} spread in relative value:") cli_li(num_percent(rel_spread, n_real_revised, "")) abs_spread <- sum( # nolint: object_usage_linter - real_revisions$spread > - x$abs_spread_threshold + real_revisions$spread > abs_spread_threshold ) # nolint: object_usage_linter - cli_inform("Spread of more than {x$abs_spread_threshold} in actual value (when revised):") + cli_inform("Spread of more than {abs_spread_threshold} in actual value (when revised):") cli_li(num_percent(abs_spread, n_real_revised, "")) # time_type_unit_pluralizer[[time_type]] is a format string controlled by us From 32403fbb3e55d04e3eaea06109ffd66d5ead4e1b Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Wed, 9 Apr 2025 10:39:25 -0700 Subject: [PATCH 087/117] rebuild docs and test --- R/revision_analysis.R | 21 +++++++++---------- .../_snaps/revision-latency-functions.md | 2 +- .../test-revision-latency-functions.R | 4 ++-- 3 files changed, 13 insertions(+), 14 deletions(-) diff --git a/R/revision_analysis.R b/R/revision_analysis.R index 55be566d6..aac77bdde 100644 --- a/R/revision_analysis.R +++ b/R/revision_analysis.R @@ -113,11 +113,6 @@ revision_analysis <- function(epi_arch, cli_abort("Not currently implementing more than one column at a time. 
Run each separately.") } } - if (is.null(abs_spread_threshold)) { - abs_spread_threshold <- .05 * epi_arch$DT %>% - pull(!!arg) %>% - max(na.rm = TRUE) - } # for each time_value, get # the number of revisions # the maximum spread in value (both absolute and relative) @@ -191,6 +186,7 @@ revision_analysis <- function(epi_arch, drop_nas = drop_nas, time_type = time_type, total_na = total_na, + max_val = max(epi_arch$DT[[arg]], na.rm = TRUE), n_obs = nrow(epi_arch$DT), within_latest = within_latest ), class = "revision_analysis") @@ -227,8 +223,9 @@ print.revision_analysis <- function(x, abs_spread_threshold = NULL, rel_spread_threshold = 0.1, ...) { + if (is.null(abs_spread_threshold)) abs_spread_threshold <- .05 * x$max_val rev_beh <- x$revision_behavior - cli::cli_h2("An epi_archive spanning {.val {rev_beh$range_time_values[1]}} to {.val {rev_beh$range_time_values[1]}}.") + cli::cli_h2("An epi_archive spanning {.val {x$range_time_values[1]}} to {.val {x$range_time_values[1]}}.") cli::cli_h3("Min lag (time to first version):") time_delta_summary(rev_beh$min_lag, x$time_type) %>% print() if (!x$drop_nas) { @@ -248,8 +245,8 @@ print.revision_analysis <- function(x, cli_inform("Quick revisions (last revision within {format_time_delta(quick_revision, x$time_type)} of the `time_value`):") cli_li(num_percent(total_quickly_revised, total_num, "")) - total_barely_revised <- sum(x$n_revisions <= few_revisions) - cli_inform("Few revisions (At most {few_revisions} revisions for that `time_value`):") + total_barely_revised <- sum(rev_beh$n_revisions <= few_revisions) + cli_inform("Few revisions (At most {.val {few_revisions}} revisions for that `time_value`):") cli_li(num_percent(total_barely_revised, total_num, "")) cli::cli_h3("Fraction of revised epi_key + time_values which have:") @@ -260,18 +257,20 @@ print.revision_analysis <- function(x, real_revisions$rel_spread < rel_spread_threshold, na.rm = TRUE ) + sum(is.na(real_revisions$rel_spread)) - cli_inform("Less than 
{rel_spread_threshold} spread in relative value:") + cli_inform("Less than {.val {rel_spread_threshold}} spread in relative value:") cli_li(num_percent(rel_spread, n_real_revised, "")) abs_spread <- sum( # nolint: object_usage_linter real_revisions$spread > abs_spread_threshold ) # nolint: object_usage_linter - cli_inform("Spread of more than {abs_spread_threshold} in actual value (when revised):") + divid <- cli::cli_div(theme = list(.val = list(digits = 3))) + cli_inform("Spread of more than {.val {abs_spread_threshold}} in actual value (when revised):") + cli::cli_end(divid) cli_li(num_percent(abs_spread, n_real_revised, "")) # time_type_unit_pluralizer[[time_type]] is a format string controlled by us # and/or downstream devs, so we can paste it onto our format string safely: units_plural <- pluralize(paste0("{qty(2)}", time_type_unit_pluralizer[[x$time_type]])) # nolint: object_usage_linter - cli::cli_h3("{toTitleCase(units_plural)} until within {x$within_latest*100}% of the latest value:") + cli::cli_h3("{toTitleCase(units_plural)} until within {.val {x$within_latest*100}}% of the latest value:") time_delta_summary(rev_beh[["lag_near_latest"]], x$time_type) %>% print() } diff --git a/tests/testthat/_snaps/revision-latency-functions.md b/tests/testthat/_snaps/revision-latency-functions.md index af4f5c20c..1fbbbe06e 100644 --- a/tests/testthat/_snaps/revision-latency-functions.md +++ b/tests/testthat/_snaps/revision-latency-functions.md @@ -239,7 +239,7 @@ --- Code - rs5 + print(rs5, quick_revision = 3) Message -- An epi_archive spanning 1 to 1. -- diff --git a/tests/testthat/test-revision-latency-functions.R b/tests/testthat/test-revision-latency-functions.R index 86d98f8eb..1a61e8d88 100644 --- a/tests/testthat/test-revision-latency-functions.R +++ b/tests/testthat/test-revision-latency-functions.R @@ -89,10 +89,10 @@ test_that("revision_summary works for dummy datasets", { # something nonsensical, but we tried. 
rs5 <- dummy_ex_integerly %>% revision_summary( - min_waiting_period = 60, quick_revision = 3, + min_waiting_period = 60, drop_nas = FALSE ) - expect_snapshot(rs5) + expect_snapshot(print(rs5, quick_revision = 3)) expect_snapshot(rs5$revision_behavior %>% print(n = 10, width = 300)) }) From e0304410c4dc64a651f3090a6423d832712f4648 Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Wed, 9 Apr 2025 10:45:18 -0700 Subject: [PATCH 088/117] appease the lintr --- R/revision_analysis.R | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/R/revision_analysis.R b/R/revision_analysis.R index aac77bdde..b1793a8e2 100644 --- a/R/revision_analysis.R +++ b/R/revision_analysis.R @@ -213,7 +213,7 @@ revision_analysis <- function(epi_arch, #' appropriate for the scale of the dataset. #' @param rel_spread_threshold Scalar between 0 and 1. The relative spread fraction used to characterize revisions which #' don't actually change very much. Default is .1, or 10% of the final value -#' +#' #' @rdname revision_analysis #' @export print.revision_analysis <- function(x, From 412e1a12aa5580fc29b0b0669faa5f004bf1eadd Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Wed, 9 Apr 2025 12:32:47 -0700 Subject: [PATCH 089/117] move internal function out of default args --- R/revision_analysis.R | 16 ++++++++++------ man/revision_analysis.Rd | 15 +++++++-------- 2 files changed, 17 insertions(+), 14 deletions(-) diff --git a/R/revision_analysis.R b/R/revision_analysis.R index b1793a8e2..3518d974f 100644 --- a/R/revision_analysis.R +++ b/R/revision_analysis.R @@ -40,7 +40,8 @@ #' the typical time during which most significant revisions occur. The default #' of 60 days corresponds to a typical near-final value for case counts as #' reported in the context of insurance. To avoid this filtering, either set -#' to `NULL` or 0. +#' to `NULL` or 0. 
This will be rounded up to the appropriate `time_type` if +#' necessary (that is 5 days will be rounded to 1 week if the data is weekly). #' @param within_latest double between 0 and 1. Determines the threshold #' used for the `lag_to` #' @param compactify bool. If `TRUE`, we will compactify after the signal @@ -81,13 +82,14 @@ revision_analysis <- function(epi_arch, ..., drop_nas = TRUE, - min_waiting_period = as.difftime(60, units = "days") %>% - difftime_approx_ceiling_time_delta(epi_arch$time_type), + min_waiting_period = as.difftime(60, units = "days"), within_latest = 0.2, compactify = TRUE, compactify_abs_tol = 0, return_only_tibble = FALSE) { assert_class(epi_arch, "epi_archive") + min_waiting_period <- min_waiting_period %>% + difftime_approx_ceiling_time_delta(epi_arch$time_type) # if the column to summarize isn't specified, use the only one if there is only one if (dots_n(...) == 0) { # Choose the first column that's not a key: @@ -203,7 +205,8 @@ revision_analysis <- function(epi_arch, #' @param quick_revision Difftime or integer (integer is treated as days). #' The amount of time between the final revision and the #' actual time_value to consider the revision quickly resolved. Default of 3 -#' days +#' days. This will be rounded up to the appropriate `time_type` if +#' necessary (that is 5 days will be rounded to 1 week if the data is weekly). #' @param few_revisions Integer. The upper bound on the #' number of revisions to consider "few". Default is 3. #' @param abs_spread_threshold Scalar numeric. The @@ -217,12 +220,13 @@ revision_analysis <- function(epi_arch, #' @rdname revision_analysis #' @export print.revision_analysis <- function(x, - quick_revision = as.difftime(3, units = "days") %>% - difftime_approx_ceiling_time_delta(x$time_type), + quick_revision = as.difftime(3, units = "days"), few_revisions = 3, abs_spread_threshold = NULL, rel_spread_threshold = 0.1, ...) 
{ + quick_revision <- quick_revision %>% + difftime_approx_ceiling_time_delta(x$time_type) if (is.null(abs_spread_threshold)) abs_spread_threshold <- .05 * x$max_val rev_beh <- x$revision_behavior cli::cli_h2("An epi_archive spanning {.val {x$range_time_values[1]}} to {.val {x$range_time_values[1]}}.") diff --git a/man/revision_analysis.Rd b/man/revision_analysis.Rd index 947af17d8..23ddf0212 100644 --- a/man/revision_analysis.Rd +++ b/man/revision_analysis.Rd @@ -10,8 +10,7 @@ revision_analysis( epi_arch, ..., drop_nas = TRUE, - min_waiting_period = as.difftime(60, units = "days") \%>\% - difftime_approx_ceiling_time_delta(epi_arch$time_type), + min_waiting_period = as.difftime(60, units = "days"), within_latest = 0.2, compactify = TRUE, compactify_abs_tol = 0, @@ -20,8 +19,7 @@ revision_analysis( \method{print}{revision_analysis}( x, - quick_revision = as.difftime(3, units = "days") \%>\% - difftime_approx_ceiling_time_delta(x$time_type), + quick_revision = as.difftime(3, units = "days"), few_revisions = 3, abs_spread_threshold = NULL, rel_spread_threshold = 0.1, @@ -32,8 +30,7 @@ revision_summary( epi_arch, ..., drop_nas = TRUE, - min_waiting_period = as.difftime(60, units = "days") \%>\% - difftime_approx_ceiling_time_delta(epi_arch$time_type), + min_waiting_period = as.difftime(60, units = "days"), within_latest = 0.2, compactify = TRUE, compactify_abs_tol = 0, @@ -59,7 +56,8 @@ of the \code{versions_end} are removed. \code{min_waiting_period} should charact the typical time during which most significant revisions occur. The default of 60 days corresponds to a typical near-final value for case counts as reported in the context of insurance. To avoid this filtering, either set -to \code{NULL} or 0.} +to \code{NULL} or 0. This will be rounded up to the appropriate \code{time_type} if +necessary (that is 5 days will be rounded to 1 week if the data is weekly).} \item{within_latest}{double between 0 and 1. 
Determines the threshold used for the \code{lag_to}} @@ -82,7 +80,8 @@ computational results rather than the complete S3 object.} \item{quick_revision}{Difftime or integer (integer is treated as days). The amount of time between the final revision and the actual time_value to consider the revision quickly resolved. Default of 3 -days} +days. This will be rounded up to the appropriate \code{time_type} if +necessary (that is 5 days will be rounded to 1 week if the data is weekly).} \item{few_revisions}{Integer. The upper bound on the number of revisions to consider "few". Default is 3.} From 788734cc804cdd29d3863beae3ed060d03010045 Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Wed, 9 Apr 2025 12:37:13 -0700 Subject: [PATCH 090/117] appease lintr --- R/revision_analysis.R | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/R/revision_analysis.R b/R/revision_analysis.R index 3518d974f..a1d31ed4f 100644 --- a/R/revision_analysis.R +++ b/R/revision_analysis.R @@ -225,7 +225,7 @@ print.revision_analysis <- function(x, abs_spread_threshold = NULL, rel_spread_threshold = 0.1, ...) { - quick_revision <- quick_revision %>% + quick_revision <- quick_revision %>% difftime_approx_ceiling_time_delta(x$time_type) if (is.null(abs_spread_threshold)) abs_spread_threshold <- .05 * x$max_val rev_beh <- x$revision_behavior From 6872e0271814ccaf3c66cc05bf9e266013c2356b Mon Sep 17 00:00:00 2001 From: "Daniel J. McDonald" Date: Wed, 9 Apr 2025 12:46:35 -0700 Subject: [PATCH 091/117] pass local checks --- R/revision_analysis.R | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/R/revision_analysis.R b/R/revision_analysis.R index a1d31ed4f..86210effc 100644 --- a/R/revision_analysis.R +++ b/R/revision_analysis.R @@ -40,7 +40,7 @@ #' the typical time during which most significant revisions occur. The default #' of 60 days corresponds to a typical near-final value for case counts as #' reported in the context of insurance. 
To avoid this filtering, either set -#' to `NULL` or 0. This will be rounded up to the appropriate `time_type` if +#' to `NULL` or 0. A `difftime` will be rounded up to the appropriate `time_type` if #' necessary (that is 5 days will be rounded to 1 week if the data is weekly). #' @param within_latest double between 0 and 1. Determines the threshold #' used for the `lag_to` @@ -88,8 +88,10 @@ revision_analysis <- function(epi_arch, compactify_abs_tol = 0, return_only_tibble = FALSE) { assert_class(epi_arch, "epi_archive") - min_waiting_period <- min_waiting_period %>% - difftime_approx_ceiling_time_delta(epi_arch$time_type) + if (methods::is(min_waiting_period, "difftime")) { + min_waiting_period <- min_waiting_period %>% + difftime_approx_ceiling_time_delta(epi_arch$time_type) + } # if the column to summarize isn't specified, use the only one if there is only one if (dots_n(...) == 0) { # Choose the first column that's not a key: @@ -225,8 +227,10 @@ print.revision_analysis <- function(x, abs_spread_threshold = NULL, rel_spread_threshold = 0.1, ...) { - quick_revision <- quick_revision %>% - difftime_approx_ceiling_time_delta(x$time_type) + if (methods::is(quick_revision, "difftime")) { + quick_revision <- quick_revision %>% + difftime_approx_ceiling_time_delta(x$time_type) + } if (is.null(abs_spread_threshold)) abs_spread_threshold <- .05 * x$max_val rev_beh <- x$revision_behavior cli::cli_h2("An epi_archive spanning {.val {x$range_time_values[1]}} to {.val {x$range_time_values[1]}}.") From e08a6937a9f3ab0e4baf0788b17e3293cbda4a21 Mon Sep 17 00:00:00 2001 From: Daniel McDonald Date: Wed, 9 Apr 2025 23:01:54 -0700 Subject: [PATCH 092/117] feat: supply tidyselection for sum_groups_epi_df (#657) Co-authored-by: Logan C. 
Brooks --- DESCRIPTION | 3 +- NEWS.md | 1 + R/inline-roxygen.R | 16 ++++++++++ R/methods-epi_df.R | 44 ++++++++++++++++------------ R/slide.R | 7 +---- man/sum_groups_epi_df.Rd | 18 ++++++++++-- man/tidyselect_arg_roxygen.Rd | 16 ++++++++++ tests/testthat/test-methods-epi_df.R | 9 ++++-- 8 files changed, 84 insertions(+), 30 deletions(-) create mode 100644 R/inline-roxygen.R create mode 100644 man/tidyselect_arg_roxygen.Rd diff --git a/DESCRIPTION b/DESCRIPTION index f75879e97..c7b12c3e4 100755 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Type: Package Package: epiprocess Title: Tools for basic signal processing in epidemiology -Version: 0.11.3 +Version: 0.11.4 Authors@R: c( person("Jacob", "Bien", role = "ctb"), person("Logan", "Brooks", , "lcbrooks+github@andrew.cmu.edu", role = c("aut", "cre")), @@ -105,6 +105,7 @@ Collate: 'methods-epi_archive.R' 'grouped_epi_archive.R' 'growth_rate.R' + 'inline-roxygen.R' 'key_colnames.R' 'methods-epi_df.R' 'outliers.R' diff --git a/NEWS.md b/NEWS.md index 88acabee4..2734419bf 100644 --- a/NEWS.md +++ b/NEWS.md @@ -58,6 +58,7 @@ Pre-1.0.0 numbering scheme: 0.x will indicate releases, while 0.x.y will indicat - Various functions are now faster, using faster variants of core operations and avoiding reconstructing grouped `epi_df`s when unnecessary. - Add `autoplot.epi_archive()` to display revision patterns. +- `sum_groups_epi_df()` now supports tidyselect syntax in its second argument (#655). ## Bug fixes diff --git a/R/inline-roxygen.R b/R/inline-roxygen.R new file mode 100644 index 000000000..ae2ce66c1 --- /dev/null +++ b/R/inline-roxygen.R @@ -0,0 +1,16 @@ +# Helpers here are meant to be used inside inline R expressions within roxygen2 +# documentation when @template is inappropriate. + +#' Description of a single arg that tidyselects value variables +#' +#' Not meant for use when describing tidyselect `...`.
+#' +#' @keywords internal +tidyselect_arg_roxygen <- ' + <[`tidy-select`][dplyr_tidy_select]> An unquoted column + name (e.g., `cases`), multiple column names (e.g., `c(cases, deaths)`), + [other tidy-select expression][tidyselect::language], or a vector of + characters (e.g. `c("cases", "deaths")`). Variable names can be used as if + they were positions in the data frame, so expressions like `x:y` can be + used to select a range of variables. +' diff --git a/R/methods-epi_df.R b/R/methods-epi_df.R index 1191521c9..7870dede8 100644 --- a/R/methods-epi_df.R +++ b/R/methods-epi_df.R @@ -499,34 +499,40 @@ group_epi_df <- function(x, exclude = character()) { #' the resulting `epi_df` will have `geo_value` set to `"total"`. #' #' @param .x an `epi_df` -#' @param sum_cols character vector of the columns to aggregate +#' @param sum_cols `r tidyselect_arg_roxygen` #' @param group_cols character vector of column names to group by. "time_value" is -#' included by default. +#' included by default. 
#' @return an `epi_df` object #' +#' @examples +#' # This data has other_keys age_group and edu_qual: +#' grad_employ_subset +#' +#' # Aggregate num_graduates within each geo_value (and time_value): +#' grad_employ_subset %>% +#' sum_groups_epi_df(num_graduates, group_cols = "geo_value") +#' #' @export -sum_groups_epi_df <- function(.x, sum_cols = "value", group_cols = character()) { +sum_groups_epi_df <- function(.x, sum_cols, group_cols = "time_value") { assert_class(.x, "epi_df") - assert_character(sum_cols) assert_character(group_cols) - checkmate::assert_subset(sum_cols, setdiff(names(.x), key_colnames(.x))) checkmate::assert_subset(group_cols, key_colnames(.x)) if (!"time_value" %in% group_cols) { group_cols <- c("time_value", group_cols) } - - out <- .x %>% - group_by(across(all_of(group_cols))) %>% - dplyr::summarize(across(all_of(sum_cols), sum), .groups = "drop") + # Attempt tidyselection ourselves to get "Error in `sum_groups_epi_df()`" + # rather than "in `dplyr::summarize()`", before forwarding: + sum_cols <- rlang::enquo(sum_cols) + tidyselect::eval_select(sum_cols, .x) + out <- group_by(.x, across(all_of(group_cols))) %>% + dplyr::summarize(across(!!sum_cols, sum), .groups = "drop") # To preserve epi_df-ness, we need to ensure that the `geo_value` column is # present. - out <- if (!"geo_value" %in% group_cols) { - out %>% + if (!"geo_value" %in% group_cols) { + out <- out %>% mutate(geo_value = "total") %>% - relocate(geo_value, .before = 1) - } else { - out + relocate(.data$geo_value, .before = 1) } # The `geo_type` will be correctly inherited here by the following logic: @@ -535,10 +541,10 @@ sum_groups_epi_df <- function(.x, sum_cols = "value", group_cols = character()) # - if `geo_value` is not in `group_cols`, then the constructor will see # the unrecognizeable "total" value and will correctly infer the "custom" # geo_type. 
- out %>% - as_epi_df( - as_of = attr(.x, "metadata")$as_of, - other_keys = intersect(attr(.x, "metadata")$other_keys, group_cols) - ) %>% + as_epi_df( + out, + as_of = attr(.x, "metadata")$as_of, + other_keys = intersect(attr(.x, "metadata")$other_keys, group_cols) + ) %>% arrange_canonical() } diff --git a/R/slide.R b/R/slide.R index be000f579..6be54baa6 100644 --- a/R/slide.R +++ b/R/slide.R @@ -557,12 +557,7 @@ get_before_after_from_window <- function(window_size, align, time_type) { #' `vignette("epi_df")` for more examples. #' #' @template basic-slide-params -#' @param .col_names <[`tidy-select`][dplyr_tidy_select]> An unquoted column -#' name (e.g., `cases`), multiple column names (e.g., `c(cases, deaths)`), -#' [other tidy-select expression][tidyselect::language], or a vector of -#' characters (e.g. `c("cases", "deaths")`). Variable names can be used as if -#' they were positions in the data frame, so expressions like `x:y` can be -#' used to select a range of variables. +#' @param .col_names `r tidyselect_arg_roxygen` #' #' The tidy-selection renaming interface is not supported, and cannot be used #' to provide output column names; if you want to customize the output column diff --git a/man/sum_groups_epi_df.Rd b/man/sum_groups_epi_df.Rd index f1ba84745..34ec99930 100644 --- a/man/sum_groups_epi_df.Rd +++ b/man/sum_groups_epi_df.Rd @@ -4,12 +4,17 @@ \alias{sum_groups_epi_df} \title{Aggregate an \code{epi_df} object} \usage{ -sum_groups_epi_df(.x, sum_cols = "value", group_cols = character()) +sum_groups_epi_df(.x, sum_cols, group_cols = "time_value") } \arguments{ \item{.x}{an \code{epi_df}} -\item{sum_cols}{character vector of the columns to aggregate} +\item{sum_cols}{<\code{\link[=dplyr_tidy_select]{tidy-select}}> An unquoted column +name (e.g., \code{cases}), multiple column names (e.g., \code{c(cases, deaths)}), +\link[tidyselect:language]{other tidy-select expression}, or a vector of +characters (e.g. \code{c("cases", "deaths")}). 
Variable names can be used as if +they were positions in the data frame, so expressions like \code{x:y} can be +used to select a range of variables.} \item{group_cols}{character vector of column names to group by. "time_value" is included by default.} @@ -22,3 +27,12 @@ Aggregates an \code{epi_df} object by the specified group columns, summing the \code{value} column, and returning an \code{epi_df}. If aggregating over \code{geo_value}, the resulting \code{epi_df} will have \code{geo_value} set to \code{"total"}. } +\examples{ +# This data has other_keys age_group and edu_qual: +grad_employ_subset + +# Aggregate num_graduates within each geo_value (and time_value): +grad_employ_subset \%>\% + sum_groups_epi_df(num_graduates, group_cols = "geo_value") + +} diff --git a/man/tidyselect_arg_roxygen.Rd b/man/tidyselect_arg_roxygen.Rd new file mode 100644 index 000000000..27cb264d9 --- /dev/null +++ b/man/tidyselect_arg_roxygen.Rd @@ -0,0 +1,16 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/inline-roxygen.R +\docType{data} +\name{tidyselect_arg_roxygen} +\alias{tidyselect_arg_roxygen} +\title{Description of a single arg that tidyselects value variables} +\format{ +An object of class \code{character} of length 1. +} +\usage{ +tidyselect_arg_roxygen +} +\description{ +Not meant for when describing tidyselect \code{...}. 
+} +\keyword{internal} diff --git a/tests/testthat/test-methods-epi_df.R b/tests/testthat/test-methods-epi_df.R index 3e5c180b0..bc9f1e35d 100644 --- a/tests/testthat/test-methods-epi_df.R +++ b/tests/testthat/test-methods-epi_df.R @@ -311,20 +311,25 @@ test_that("complete.epi_df works", { }) test_that("sum_groups_epi_df works", { - out <- toy_epi_df %>% sum_groups_epi_df(sum_cols = "x") + out <- toy_epi_df %>% sum_groups_epi_df("x") expected_out <- toy_epi_df %>% group_by(time_value) %>% summarize(x = sum(x)) %>% mutate(geo_value = "total") %>% as_epi_df(as_of = attr(toy_epi_df, "metadata")$as_of) expect_equal(out, expected_out) + out <- toy_epi_df %>% sum_groups_epi_df(x) + expect_equal(out, expected_out) out <- toy_epi_df %>% - sum_groups_epi_df(sum_cols = c("x", "y"), group_cols = c("time_value", "geo_value", "indic_var1")) + sum_groups_epi_df(c(x, y), group_cols = c("time_value", "geo_value", "indic_var1")) expected_out <- toy_epi_df %>% group_by(time_value, geo_value, indic_var1) %>% summarize(x = sum(x), y = sum(y), .groups = "drop") %>% as_epi_df(as_of = attr(toy_epi_df, "metadata")$as_of, other_keys = "indic_var1") %>% arrange_canonical() expect_equal(out, expected_out) + out <- toy_epi_df %>% + sum_groups_epi_df(x:y, group_cols = c("time_value", "geo_value", "indic_var1")) + expect_equal(out, expected_out) }) From 115251c5acc93dae14a2d9693ef641136bfa8f24 Mon Sep 17 00:00:00 2001 From: nmdefries <42820733+nmdefries@users.noreply.github.com> Date: Thu, 10 Apr 2025 14:30:26 -0400 Subject: [PATCH 093/117] wording --- R/methods-epi_archive.R | 4 ++-- R/slide.R | 10 +++++----- man/epi_slide.Rd | 4 ++-- man/epi_slide_opt.Rd | 6 +++--- man/epix_slide.Rd | 34 ++++++++++++++++++---------------- 5 files changed, 30 insertions(+), 28 deletions(-) diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index a33c91c9a..42afbf46a 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -640,7 +640,7 @@ epix_detailed_restricted_mutate <- 
function(.data, ...) { #' computation. The computation will be run on each requested group-version #' combination, with a time window filter applied if `.before` is supplied. #' -#' - If `.f` is a function must have the form `function(x, g, v)` or +#' If `.f` is a function, it must have the form `function(x, g, v)` or #' `function(x, g, v, )`, where #' #' - `x` is an `epi_df` with the same column names as the archive's `DT`, @@ -685,7 +685,7 @@ epix_detailed_restricted_mutate <- function(.data, ...) { #' to reporting latency. Unlike `epi_slide()`, `epix_slide()` won't fill in #' any missing `time_values` in this window. #' @param .versions Requested versions on which to run the computation. Each -#' requested `.version` also serves as the anchor point around which for which +#' requested `.version` also serves as the anchor point from which #' the `time_value` window specified by `.before` is drawn. If `.versions` is #' missing, it will be set to a regularly-spaced sequence of values set to #' cover the range of `version`s in the `DT` plus the `versions_end`; the diff --git a/R/slide.R b/R/slide.R index be112d070..c0032691e 100644 --- a/R/slide.R +++ b/R/slide.R @@ -28,11 +28,11 @@ #' @template basic-slide-params #' @param .f,... The computation to slide. The input will be a time window of #' the data for a single subpopulation (i.e., a single `geo_value` and single -#' value for any [`other_keys`][as_epi_df] you set up for age groups, etc.). +#' value for any [`other_keys`][as_epi_df] you set up, such as age groups, race, etc.). #' The input will always have the same size, determined by `.window_size`, and #' will fill in any missing `time_values`, using `NA` values for missing #' measurements. The output should be a scalar value or a 1-row data frame; -#' these outputs will be collected and form a new column or columns in the +#' these outputs will be collected into a new column or columns in the #' `epi_slide()` result.
Data frame outputs will be unpacked into multiple #' columns in the result by default, or [`tidyr::pack`]ed into a single #' data-frame-type column if you provide a name for such a column (e.g., via @@ -624,11 +624,11 @@ get_before_after_from_window <- function(window_size, align, time_type) { #' `.window_size = Inf`) on the requested columns. Explicit `NA` measurements #' are temporarily added to fill in any time gaps, and, for rolling #' computations, to pad the time series to ensure that the first & last -#' computations are over exactly `.window_size` values. +#' computations use exactly `.window_size` values. #' #' `epi_slide_opt` allows you to use any [data.table::froll] or -#' [slider::summary-slide] function. If none of the specialized functions here -#' work, you can use `data.table::frollapply` together with a non-rolling +#' [slider::summary-slide] function. If none of those specialized functions fit +#' your use case, you can use `data.table::frollapply` together with a non-rolling #' function (e.g., `median`). See [`epi_slide`] if you need to work with #' multiple columns at once or output a custom type. #' diff --git a/man/epi_slide.Rd b/man/epi_slide.Rd index 8f86466a7..734f66d27 100644 --- a/man/epi_slide.Rd +++ b/man/epi_slide.Rd @@ -22,11 +22,11 @@ and any columns in \code{other_keys}. If grouped, we make sure the grouping is b \item{.f, ...}{The computation to slide. The input will be a time window of the data for a single subpopulation (i.e., a single \code{geo_value} and single -value for any \code{\link[=as_epi_df]{other_keys}} you set up for age groups, etc.). +value for any \code{\link[=as_epi_df]{other_keys}} you set up, such as age groups, race, etc.). The input will always have the same size, determined by \code{.window_size}, and will fill in any missing \code{time_values}, using \code{NA} values for missing measurements.
The output should be a scalar value or a 1-row data frame; -these outputs will be collected and form a new column or columns in the +these outputs will be collected into a new column or columns in the \code{epi_slide()} result. Data frame outputs will be unpacked into multiple columns in the result by default, or \code{\link[tidyr:pack]{tidyr::pack}}ed into a single data-frame-type column if you provide a name for such a column (e.g., via diff --git a/man/epi_slide_opt.Rd b/man/epi_slide_opt.Rd index 687d2ac0a..9b39bdb58 100644 --- a/man/epi_slide_opt.Rd +++ b/man/epi_slide_opt.Rd @@ -141,11 +141,11 @@ computation, or alternatively, a running/cumulative computation (with \code{.window_size = Inf}) on the requested columns. Explicit \code{NA} measurements are temporarily added to fill in any time gaps, and, for rolling computations, to pad the time series to ensure that the first & last -computations are over exactly \code{.window_size} values. +computations use exactly \code{.window_size} values. \code{epi_slide_opt} allows you to use any \link[data.table:froll]{data.table::froll} or -\link[slider:summary-slide]{slider::summary-slide} function. If none of the specialized functions here -work, you can use \code{data.table::frollapply} together with a non-rolling +\link[slider:summary-slide]{slider::summary-slide} function. If none of those specialized functions fit +your use case, you can use \code{data.table::frollapply} together with a non-rolling function (e.g., \code{median}). See \code{\link{epi_slide}} if you need to work with multiple columns at once or output a custom type. diff --git a/man/epix_slide.Rd b/man/epix_slide.Rd index a72e2c685..c0eb3f2bd 100644 --- a/man/epix_slide.Rd +++ b/man/epix_slide.Rd @@ -43,22 +43,24 @@ all data in \code{x} will be treated as part of a single data group.} \item{.f}{Function, formula, or missing; together with \code{...} specifies the computation.
The computation will be run on each requested group-version combination, with a time window filter applied if \code{.before} is supplied. -\itemize{ -\item If \code{.f} is a function must have the form \verb{function(x, g, v)} or + +If \code{.f} is a function, it must have the form \verb{function(x, g, v)} or \verb{function(x, g, v, )}, where -\itemize{ -\item \code{x} is an \code{epi_df} with the same column names as the archive's \code{DT}, -minus the \code{version} column. (Or, if \code{.all_versions = TRUE}, an -\code{epi_archive} with the requested partial version history.) -\item \code{g} is a one-row tibble containing the values of the grouping variables -for the associated group. -\item \code{v} (length-1) is the associated \code{version} (one of the requested -\code{.versions}) -\item \verb{} are optional; you can add such -arguments to your function and set them by passing them through the -\code{...} argument to \code{epix_slide()}. -} -} + +\if{html}{\out{<div class="sourceCode">
}}\preformatted{- `x` is an `epi_df` with the same column names as the archive's `DT`, + minus the `version` column. (Or, if `.all_versions = TRUE`, an + `epi_archive` with the requested partial version history.) + +- `g` is a one-row tibble containing the values of the grouping variables + for the associated group. + +- `v` (length-1) is the associated `version` (one of the requested + `.versions`) + +- `` are optional; you can add such + arguments to your function and set them by passing them through the + `...` argument to `epix_slide()`. +}\if{html}{\out{</div>
}} If a formula, \code{.f} can operate directly on columns accessed via \code{.x$var} or \code{.$var}, as in \code{~ mean (.x$var)} to compute a mean of a column \code{var} for @@ -91,7 +93,7 @@ to reporting latency. Unlike \code{epi_slide()}, \code{epix_slide()} won't fill any missing \code{time_values} in this window.} \item{.versions}{Requested versions on which to run the computation. Each -requested \code{.version} also serves as the anchor point around which for which +requested \code{.version} also serves as the anchor point from which the \code{time_value} window specified by \code{.before} is drawn. If \code{.versions} is missing, it will be set to a regularly-spaced sequence of values set to cover the range of \code{version}s in the \code{DT} plus the \code{versions_end}; the From 17a369cb6f6ab52ed1c4f765c0996a74dcd7d2e6 Mon Sep 17 00:00:00 2001 From: Dmitry Shemetov Date: Tue, 15 Apr 2025 10:41:07 -0700 Subject: [PATCH 094/117] test: add environment variable test --- tests/testthat/test-methods-epi_archive.R | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/tests/testthat/test-methods-epi_archive.R b/tests/testthat/test-methods-epi_archive.R index 6e08c990a..2418af4e2 100644 --- a/tests/testthat/test-methods-epi_archive.R +++ b/tests/testthat/test-methods-epi_archive.R @@ -189,6 +189,10 @@ test_that("filter.epi_archive works as expected", { expect_equal(intcustom_week_ea$geo_type, "custom") expect_equal(intcustom_week_ea$time_type, "week") + # Environment variables should be fine: + version <- as.Date("2020-06-02") + 1 + expect_no_error(ea2 %>% filter(geo_value == "ca", .env$version <= time_value)) + # Error-raising: expect_error( ea2 %>% filter(version == as.Date("2020-06-02")), From 80d787b9919b323787e05e00994dc349e762af91 Mon Sep 17 00:00:00 2001 From: "Logan C. 
Brooks" Date: Wed, 16 Apr 2025 11:16:12 -0700 Subject: [PATCH 095/117] fix(filter.epi_archive): for+lazy, left-behind bindings --- R/methods-epi_archive.R | 18 +++++++--- tests/testthat/_snaps/methods-epi_archive.md | 35 ++++++++++++++++++++ tests/testthat/test-methods-epi_archive.R | 16 +++++++-- 3 files changed, 62 insertions(+), 7 deletions(-) create mode 100644 tests/testthat/_snaps/methods-epi_archive.md diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index 02cf2cf30..4b443afac 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -1069,16 +1069,24 @@ filter.epi_archive <- function(.data, ..., .by = NULL, .format_aware = FALSE) { ">" = "If not, see `?filter.epi_archive` details for how to proceed." ), class = "epiprocess__filter_archive__used_version"), assign.env = e) for (measurement_colname in measurement_colnames) { - delayedAssign(measurement_colname, cli::cli_abort(c( - "Using `{format_varname(measurement_colname)}` - in `filter.epi_archive` may produce unexpected results.", - ">" = "See `?filter.epi_archive` details for how to proceed." - ), class = "epiprocess__filter_archive__used_measurement"), assign.env = e) + # Record current `measurement_colname` and set up delayed + # binding for error in a child environment, so that `for` loop + # updating its value and `rm` cleanup don't mess things up: + local({ + local_measurement_colname <- measurement_colname + delayedAssign(measurement_colname, cli::cli_abort(c( + "Using `{format_varname(local_measurement_colname)}` + in `filter.epi_archive` may produce unexpected results.", + ">" = "See `?filter.epi_archive` details for how to proceed." 
+ ), class = "epiprocess__filter_archive__used_measurement"), assign.env = e) + }) } break } e <- parent.env(e) } + # Don't mask similarly-named user objects: + rm(list = c("e", "measurement_colname")) TRUE }, ..., diff --git a/tests/testthat/_snaps/methods-epi_archive.md b/tests/testthat/_snaps/methods-epi_archive.md new file mode 100644 index 000000000..200e4202e --- /dev/null +++ b/tests/testthat/_snaps/methods-epi_archive.md @@ -0,0 +1,35 @@ +# filter.epi_archive works as expected + + Code + ea2 %>% filter(version <= as.Date("2020-06-02")) + Condition + Error in `filter()`: + i In argument: `version <= as.Date("2020-06-02")`. + Caused by error: + ! Using `version` in `filter.epi_archive` may produce unexpected results. + > See if `epix_as_of` or `epix_slide` would work instead. + > If not, see `?filter.epi_archive` details for how to proceed. + +--- + + Code + ea2 %>% filter(time_value >= as.Date("2020-06-02"), cases >= 2L) + Condition + Error in `filter()`: + i In argument: `cases >= 2L`. + Caused by error: + ! Using `cases` in `filter.epi_archive` may produce unexpected results. + > See `?filter.epi_archive` details for how to proceed. + +--- + + Code + ea2p %>% filter(cases >= median(cases), .by = geo_value) + Condition + Error in `filter()`: + i In argument: `cases >= median(cases)`. + i In group 1: `geo_value = "ca"`. + Caused by error: + ! Using `cases` in `filter.epi_archive` may produce unexpected results. + > See `?filter.epi_archive` details for how to proceed. 
+ diff --git a/tests/testthat/test-methods-epi_archive.R b/tests/testthat/test-methods-epi_archive.R index 2418af4e2..f9299d0f2 100644 --- a/tests/testthat/test-methods-epi_archive.R +++ b/tests/testthat/test-methods-epi_archive.R @@ -191,7 +191,11 @@ test_that("filter.epi_archive works as expected", { # Environment variables should be fine: version <- as.Date("2020-06-02") + 1 - expect_no_error(ea2 %>% filter(geo_value == "ca", .env$version <= time_value)) + e <- version + expected <- ea2 %>% filter(geo_value == "ca", as.Date("2020-06-02") + 1 <= time_value) + expect_equal(ea2 %>% filter(geo_value == "ca", .env$version <= time_value), expected) + expect_equal(ea2 %>% filter(geo_value == "ca", e <= time_value), expected) + expect_equal(ea2 %>% filter(geo_value == "ca", .env$e <= time_value), expected) # Error-raising: expect_error( @@ -218,10 +222,18 @@ test_that("filter.epi_archive works as expected", { ea2 %>% filter(time_value >= as.Date("2020-06-02"), cases >= 2L), class = "epiprocess__filter_archive__used_measurement" ) + ea2p <- ea2_data %>% + # to check for `for` + `delayedAssign` mishap in expect_snapshot + mutate(deaths = 0) %>% + as_epi_archive() expect_error( - ea2 %>% filter(cases >= median(cases), .by = geo_value), + ea2p %>% filter(cases >= median(cases), .by = geo_value), class = "epiprocess__filter_archive__used_measurement" ) + expect_snapshot( + ea2p %>% filter(cases >= median(cases), .by = geo_value), + error = TRUE, cnd_class = TRUE + ) # Escape hatch: expect_equal( From 07e54f40c340c57b4f956cf5e491963c408ac5a1 Mon Sep 17 00:00:00 2001 From: "Logan C. 
Brooks" Date: Wed, 16 Apr 2025 12:20:48 -0700 Subject: [PATCH 096/117] fix(filter.epi_archive): avoid other lazy eval traps --- R/methods-epi_archive.R | 46 +++++++++++++++-------- tests/testthat/test-methods-epi_archive.R | 19 +++++++++- 2 files changed, 49 insertions(+), 16 deletions(-) diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index 4b443afac..9f244aa91 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -1063,30 +1063,46 @@ filter.epi_archive <- function(.data, ..., .by = NULL, .format_aware = FALSE) { # They are expected to be active bindings, so directly # assigning has issues; `rm` first. rm(list = forbidden_colnames, envir = e) - delayedAssign("version", cli::cli_abort(c( - "Using `version` in `filter.epi_archive` may produce unexpected results.", - ">" = "See if `epix_as_of` or `epix_slide` would work instead.", - ">" = "If not, see `?filter.epi_archive` details for how to proceed." - ), class = "epiprocess__filter_archive__used_version"), assign.env = e) + eval_env <- new.env(parent = asNamespace("epiprocess")) # see (2) below + delayedAssign( + "version", + cli_abort(c( + "Using `version` in `filter.epi_archive` may produce unexpected results.", + ">" = "See if `epix_as_of` or `epix_slide` would work instead.", + ">" = "If not, see `?filter.epi_archive` details for how to proceed." 
+ ), class = "epiprocess__filter_archive__used_version"), + eval.env = eval_env, + assign.env = e + ) for (measurement_colname in measurement_colnames) { - # Record current `measurement_colname` and set up delayed - # binding for error in a child environment, so that `for` loop - # updating its value and `rm` cleanup don't mess things up: - local({ - local_measurement_colname <- measurement_colname - delayedAssign(measurement_colname, cli::cli_abort(c( + # Record current `measurement_colname` and set up execution for + # the promise for the error in its own dedicated environment, so + # that (1) `for` loop updating its value and `rm` cleanup don't + # mess things up. We can also (2) prevent changes to data mask + # ancestry (to involve user's quosure env rather than our + # quosure env) or contents (from edge case of user binding + # functions inside the mask) from potentially interfering by + # setting the promise's execution environment to skip over the + # data mask. + eval_env <- new.env(parent = asNamespace("epiprocess")) + eval_env[["local_measurement_colname"]] <- measurement_colname + delayedAssign( + measurement_colname, + cli_abort(c( "Using `{format_varname(local_measurement_colname)}` in `filter.epi_archive` may produce unexpected results.", ">" = "See `?filter.epi_archive` details for how to proceed." 
- ), class = "epiprocess__filter_archive__used_measurement"), assign.env = e) - }) + ), class = "epiprocess__filter_archive__used_measurement"), + eval.env = eval_env, + assign.env = e + ) } break } e <- parent.env(e) } - # Don't mask similarly-named user objects: - rm(list = c("e", "measurement_colname")) + # Don't mask similarly-named user objects in ancestor envs: + rm(list = c("e", "measurement_colname", "eval_env")) TRUE }, ..., diff --git a/tests/testthat/test-methods-epi_archive.R b/tests/testthat/test-methods-epi_archive.R index f9299d0f2..dafa5521e 100644 --- a/tests/testthat/test-methods-epi_archive.R +++ b/tests/testthat/test-methods-epi_archive.R @@ -222,8 +222,10 @@ test_that("filter.epi_archive works as expected", { ea2 %>% filter(time_value >= as.Date("2020-06-02"), cases >= 2L), class = "epiprocess__filter_archive__used_measurement" ) + # Check for `for` + `delayedAssign` mishap in `expect_snapshot` (we should say + # something about `cases` (the relevant colname), not `deaths` (the last + # measurement colname)): ea2p <- ea2_data %>% - # to check for `for` + `delayedAssign` mishap in expect_snapshot mutate(deaths = 0) %>% as_epi_archive() expect_error( @@ -234,6 +236,21 @@ test_that("filter.epi_archive works as expected", { ea2p %>% filter(cases >= median(cases), .by = geo_value), error = TRUE, cnd_class = TRUE ) + # Check that we are insulated from other lazy eval traps: + expected <- rlang::catch_cnd(ea2p %>% filter(cases >= median(cases), .by = geo_value)) + expect_class(expected$parent, "epiprocess__filter_archive__used_measurement") + with(list(cli_abort = function(...) stop("now, pretend user didn't have cli attached")), { + expect_equal(rlang::catch_cnd(ea2p %>% filter(cases >= median(cases), .by = geo_value))$parent$message, + expected$parent$message) + }) + expect_equal( + rlang::catch_cnd(ea2p %>% filter({ + c <- function(...) 
stop("and that they overwrote `c` to try to debug their own code") + cases >= median(cases) + }, .by = geo_value))$parent$message, + expected$parent$message + ) + # Escape hatch: expect_equal( From 751f547c177f00cf0e4bba82e4cc887dd318bebc Mon Sep 17 00:00:00 2001 From: brookslogan Date: Wed, 16 Apr 2025 19:24:52 +0000 Subject: [PATCH 097/117] style: styler (GHA) --- tests/testthat/test-methods-epi_archive.R | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/tests/testthat/test-methods-epi_archive.R b/tests/testthat/test-methods-epi_archive.R index dafa5521e..a073594f9 100644 --- a/tests/testthat/test-methods-epi_archive.R +++ b/tests/testthat/test-methods-epi_archive.R @@ -240,14 +240,19 @@ test_that("filter.epi_archive works as expected", { expected <- rlang::catch_cnd(ea2p %>% filter(cases >= median(cases), .by = geo_value)) expect_class(expected$parent, "epiprocess__filter_archive__used_measurement") with(list(cli_abort = function(...) stop("now, pretend user didn't have cli attached")), { - expect_equal(rlang::catch_cnd(ea2p %>% filter(cases >= median(cases), .by = geo_value))$parent$message, - expected$parent$message) + expect_equal( + rlang::catch_cnd(ea2p %>% filter(cases >= median(cases), .by = geo_value))$parent$message, + expected$parent$message + ) }) expect_equal( - rlang::catch_cnd(ea2p %>% filter({ - c <- function(...) stop("and that they overwrote `c` to try to debug their own code") - cases >= median(cases) - }, .by = geo_value))$parent$message, + rlang::catch_cnd(ea2p %>% filter( + { + c <- function(...) stop("and that they overwrote `c` to try to debug their own code") + cases >= median(cases) + }, + .by = geo_value + ))$parent$message, expected$parent$message ) From 7fb25fb32c49693eb74c5808bd35c7ed63bf47d2 Mon Sep 17 00:00:00 2001 From: "Logan C. 
Brooks" Date: Wed, 16 Apr 2025 12:28:37 -0700 Subject: [PATCH 098/117] Bump DESCRIPTION version --- DESCRIPTION | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/DESCRIPTION b/DESCRIPTION index c7a651d8e..b008956d3 100755 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Type: Package Package: epiprocess Title: Tools for basic signal processing in epidemiology -Version: 0.11.5 +Version: 0.11.6 Authors@R: c( person("Jacob", "Bien", role = "ctb"), person("Logan", "Brooks", , "lcbrooks+github@andrew.cmu.edu", role = c("aut", "cre")), From 8f63ad8e11ebed25f71c947e121a26302789735e Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Wed, 16 Apr 2025 12:39:01 -0700 Subject: [PATCH 099/117] Address failing checks --- R/methods-epi_archive.R | 2 +- _pkgdown.yml | 3 ++- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index 3f2c987aa..cff16aede 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -1093,7 +1093,7 @@ filter.epi_archive <- function(.data, ..., .by = NULL, .format_aware = FALSE) { # likely just the parent env, but search to make sure, in a way akin # to `<<-`: e <- environment() - while (!identical(e, globalenv()) && !identical(e, emptyenv())) { + while (!identical(e, globalenv()) && !identical(e, emptyenv())) { # nolint:vector_logic_linter if ("version" %in% names(e)) { # This is where the column bindings are. Replace the forbidden ones. # They are expected to be active bindings, so directly diff --git a/_pkgdown.yml b/_pkgdown.yml index 2953039cb..ebd73dc84 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -72,8 +72,9 @@ reference: - epix_as_of - epix_as_of_current - epix_slide - - epix_merge - revision_summary + - epix_merge + - filter.epi_archive - epix_fill_through_version - epix_truncate_versions_after - set_versions_end From 8b569ff96a9cdb6e006b700d0a15f54430c0fd09 Mon Sep 17 00:00:00 2001 From: "Logan C. 
Brooks" Date: Wed, 16 Apr 2025 12:50:04 -0700 Subject: [PATCH 100/117] docs(filter.epi_archive): suggest alternative for meas col filtering --- R/methods-epi_archive.R | 9 ++++++--- man/filter.epi_archive.Rd | 9 ++++++--- man/revision_analysis.Rd | 2 +- 3 files changed, 13 insertions(+), 7 deletions(-) diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index cff16aede..33f731417 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -1036,9 +1036,12 @@ dplyr_col_modify.col_modify_recorder_df <- function(data, cols) { #' #' By default, using the `version` column or measurement columns is disabled as #' it's easy to get unexpected results. See if either [`epix_as_of`] or -#' [`epix_slide`] works as an alternative. If they don't cover your use case, -#' then you can set `.format_aware = TRUE` to enable usage of these columns, but -#' be careful to: +#' [`epix_slide`] works for any version selection you have in mind: for version +#' selection, see the `version` or `.versions` args, respectively; for +#' measurement column-based filtering, try `filter`ing after `epix_as_of` or +#' inside the `.f` in `epix_slide()`. If they don't cover your use case, then +#' you can set `.format_aware = TRUE` to enable usage of these columns, but be +#' careful to: #' * Factor in that `.data$DT` may have been converted into a compact format #' based on diffing consecutive versions, and the last version of each #' observation in `.data$DT` will always be carried forward to future diff --git a/man/filter.epi_archive.Rd b/man/filter.epi_archive.Rd index a007bdc5e..5f9d72db4 100644 --- a/man/filter.epi_archive.Rd +++ b/man/filter.epi_archive.Rd @@ -23,9 +23,12 @@ details.} \details{ By default, using the \code{version} column or measurement columns is disabled as it's easy to get unexpected results. See if either \code{\link{epix_as_of}} or -\code{\link{epix_slide}} works as an alternative. 
If they don't cover your use case, -then you can set \code{.format_aware = TRUE} to enable usage of these columns, but -be careful to: +\code{\link{epix_slide}} works for any version selection you have in mind: for version +selection, see the \code{version} or \code{.versions} args, respectively; for +measurement column-based filtering, try \code{filter}ing after \code{epix_as_of} or +inside the \code{.f} in \code{epix_slide()}. If they don't cover your use case, then +you can set \code{.format_aware = TRUE} to enable usage of these columns, but be +careful to: \itemize{ \item Factor in that \code{.data$DT} may have been converted into a compact format based on diffing consecutive versions, and the last version of each diff --git a/man/revision_analysis.Rd b/man/revision_analysis.Rd index 23ddf0212..1c7336b3c 100644 --- a/man/revision_analysis.Rd +++ b/man/revision_analysis.Rd @@ -56,7 +56,7 @@ of the \code{versions_end} are removed. \code{min_waiting_period} should charact the typical time during which most significant revisions occur. The default of 60 days corresponds to a typical near-final value for case counts as reported in the context of insurance. To avoid this filtering, either set -to \code{NULL} or 0. This will be rounded up to the appropriate \code{time_type} if +to \code{NULL} or 0. A \code{difftime} will be rounded up to the appropriate \code{time_type} if necessary (that is 5 days will be rounded to 1 week if the data is weekly).} \item{within_latest}{double between 0 and 1. Determines the threshold From 26721facbfec803d8b97b03cd1d20b838632b894 Mon Sep 17 00:00:00 2001 From: "Logan C. 
Brooks" Date: Wed, 16 Apr 2025 13:33:44 -0700 Subject: [PATCH 101/117] tests: `expect_equal` when comparing Dates due to inconsistent backing typeof --- tests/testthat/test-epi_slide.R | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/tests/testthat/test-epi_slide.R b/tests/testthat/test-epi_slide.R index 0aa4aca7f..b9be125df 100644 --- a/tests/testthat/test-epi_slide.R +++ b/tests/testthat/test-epi_slide.R @@ -610,7 +610,7 @@ test_that("epi_slide_opt helper `full_date_seq` returns expected date values", { before <- 2L after <- 1L - expect_identical( + expect_equal( full_date_seq( epi_data_missing %>% mutate(time_value = days) %>% @@ -627,7 +627,7 @@ test_that("epi_slide_opt helper `full_date_seq` returns expected date values", { pad_late_dates = as.Date(c("2022-01-08")) ) ) - expect_identical( + expect_equal( full_date_seq( epi_data_missing %>% mutate(time_value = weeks) %>% @@ -677,7 +677,7 @@ test_that("epi_slide_opt helper `full_date_seq` returns expected date values", { before <- 5L after <- 0L - expect_identical( + expect_equal( full_date_seq( epi_data_missing %>% mutate(time_value = days) %>% @@ -701,7 +701,7 @@ test_that("epi_slide_opt helper `full_date_seq` returns expected date values", { before <- 0L after <- 3L - expect_identical( + expect_equal( full_date_seq( epi_data_missing %>% mutate(time_value = days) %>% From e4177567771a1135b97db42f98d69aa40082e7af Mon Sep 17 00:00:00 2001 From: "Logan C. 
Brooks" Date: Wed, 16 Apr 2025 14:06:46 -0700 Subject: [PATCH 102/117] Relax typeof checks due to Date backing typeof inconsistencies --- R/archive.R | 41 ++++++++------------ R/methods-epi_archive.R | 11 +----- man/epi_archive.Rd | 34 ++++++++-------- tests/testthat/test-archive-version-bounds.R | 15 +++++-- tests/testthat/test-time-utils.R | 4 +- 5 files changed, 48 insertions(+), 57 deletions(-) diff --git a/R/archive.R b/R/archive.R index 922371f1c..88f8d7049 100644 --- a/R/archive.R +++ b/R/archive.R @@ -52,13 +52,6 @@ validate_version_bound <- function(version_bound, x, na_ok = FALSE, class = "epiprocess__version_bound_mismatched_class" ) } - if (!identical(typeof(version_bound), typeof(x[["version"]]))) { - cli_abort( - "{version_bound_arg} must have the same `typeof` as x$version, - which has a `typeof` of {typeof(x$version)}", - class = "epiprocess__version_bound_mismatched_typeof" - ) - } } return(invisible(NULL)) @@ -207,23 +200,23 @@ next_after.Date <- function(x) x + 1L #' undergo tiny nonmeaningful revisions and the archive object with the #' default setting is too large. #' @param clobberable_versions_start Optional; `length`-1; either a value of the -#' same `class` and `typeof` as `x$version`, or an `NA` of any `class` and -#' `typeof`: specifically, either (a) the earliest version that could be -#' subject to "clobbering" (being overwritten with different update data, but -#' using the *same* version tag as the old update data), or (b) `NA`, to -#' indicate that no versions are clobberable. There are a variety of reasons -#' why versions could be clobberable under routine circumstances, such as (a) -#' today's version of one/all of the columns being published after initially -#' being filled with `NA` or LOCF, (b) a buggy version of today's data being -#' published but then fixed and republished later in the day, or (c) data -#' pipeline delays (e.g., publisher uploading, periodic scraping, database -#' syncing, periodic fetching, etc.) 
that make events (a) or (b) reflected -#' later in the day (or even on a different day) than expected; potential -#' causes vary between different data pipelines. The default value is `NA`, -#' which doesn't consider any versions to be clobberable. Another setting that -#' may be appropriate for some pipelines is `max_version_with_row_in(x)`. -#' @param versions_end Optional; length-1, same `class` and `typeof` as -#' `x$version`: what is the last version we have observed? The default is +#' same `class` as `x$version`, or an `NA` of any `class`: specifically, +#' either (a) the earliest version that could be subject to "clobbering" +#' (being overwritten with different update data, but using the *same* version +#' tag as the old update data), or (b) `NA`, to indicate that no versions are +#' clobberable. There are a variety of reasons why versions could be +#' clobberable under routine circumstances, such as (a) today's version of +#' one/all of the columns being published after initially being filled with +#' `NA` or LOCF, (b) a buggy version of today's data being published but then +#' fixed and republished later in the day, or (c) data pipeline delays (e.g., +#' publisher uploading, periodic scraping, database syncing, periodic +#' fetching, etc.) that make events (a) or (b) reflected later in the day (or +#' even on a different day) than expected; potential causes vary between +#' different data pipelines. The default value is `NA`, which doesn't consider +#' any versions to be clobberable. Another setting that may be appropriate for +#' some pipelines is `max_version_with_row_in(x)`. +#' @param versions_end Optional; length-1, same `class` as `x$version`: what is +#' the last version we have observed? The default is #' `max_version_with_row_in(x)`, but values greater than this could also be #' valid, and would indicate that we observed additional versions of the data #' beyond `max(x$version)`, but they all contained empty updates. 
(The default diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index 33f731417..faf3c1289 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -80,19 +80,13 @@ epix_as_of <- function(x, version, min_time_value = -Inf, all_versions = FALSE, "`version` must have the same `class` vector as `epi_archive$DT$version`." ) } - if (!identical(typeof(version), typeof(x$DT$version))) { - cli_abort( - "`version` must have the same `typeof` as `epi_archive$DT$version`." - ) - } assert_scalar(version, na.ok = FALSE) if (version > x$versions_end) { cli_abort("`version` must be at most `epi_archive$versions_end`.") } assert_scalar(min_time_value, na.ok = FALSE) min_time_value_inf <- is.infinite(min_time_value) && min_time_value < 0 - min_time_value_same_type <- typeof(min_time_value) == typeof(x$DT$time_value) & - class(min_time_value) == class(x$DT$time_value) + min_time_value_same_type <- identical(class(min_time_value), class(x$DT$time_value)) if (!min_time_value_inf && !min_time_value_same_type) { cli_abort("`min_time_value` must be either -Inf or a time_value of the same type and class as `epi_archive$time_value`.") @@ -941,9 +935,6 @@ epix_truncate_versions_after.epi_archive <- function(x, max_version) { if (!identical(class(max_version), class(x$DT$version))) { cli_abort("`max_version` must have the same `class` as `epi_archive$DT$version`.") } - if (!identical(typeof(max_version), typeof(x$DT$version))) { - cli_abort("`max_version` must have the same `typeof` as `epi_archive$DT$version`.") - } assert_scalar(max_version, na.ok = FALSE) if (max_version > x$versions_end) { cli_abort("`max_version` must be at most `epi_archive$versions_end`.") diff --git a/man/epi_archive.Rd b/man/epi_archive.Rd index b92cd5057..f91834f37 100644 --- a/man/epi_archive.Rd +++ b/man/epi_archive.Rd @@ -63,21 +63,21 @@ undergo tiny nonmeaningful revisions and the archive object with the default setting is too large.} \item{clobberable_versions_start}{Optional; 
\code{length}-1; either a value of the -same \code{class} and \code{typeof} as \code{x$version}, or an \code{NA} of any \code{class} and -\code{typeof}: specifically, either (a) the earliest version that could be -subject to "clobbering" (being overwritten with different update data, but -using the \emph{same} version tag as the old update data), or (b) \code{NA}, to -indicate that no versions are clobberable. There are a variety of reasons -why versions could be clobberable under routine circumstances, such as (a) -today's version of one/all of the columns being published after initially -being filled with \code{NA} or LOCF, (b) a buggy version of today's data being -published but then fixed and republished later in the day, or (c) data -pipeline delays (e.g., publisher uploading, periodic scraping, database -syncing, periodic fetching, etc.) that make events (a) or (b) reflected -later in the day (or even on a different day) than expected; potential -causes vary between different data pipelines. The default value is \code{NA}, -which doesn't consider any versions to be clobberable. Another setting that -may be appropriate for some pipelines is \code{max_version_with_row_in(x)}.} +same \code{class} as \code{x$version}, or an \code{NA} of any \code{class}: specifically, +either (a) the earliest version that could be subject to "clobbering" +(being overwritten with different update data, but using the \emph{same} version +tag as the old update data), or (b) \code{NA}, to indicate that no versions are +clobberable. There are a variety of reasons why versions could be +clobberable under routine circumstances, such as (a) today's version of +one/all of the columns being published after initially being filled with +\code{NA} or LOCF, (b) a buggy version of today's data being published but then +fixed and republished later in the day, or (c) data pipeline delays (e.g., +publisher uploading, periodic scraping, database syncing, periodic +fetching, etc.) 
that make events (a) or (b) reflected later in the day (or +even on a different day) than expected; potential causes vary between +different data pipelines. The default value is \code{NA}, which doesn't consider +any versions to be clobberable. Another setting that may be appropriate for +some pipelines is \code{max_version_with_row_in(x)}.} \item{.versions_end}{location based versions_end, used to avoid prefix \code{version = issue} from being assigned to \code{versions_end} instead of being @@ -86,8 +86,8 @@ used to rename columns.} \item{...}{used for specifying column names, as in \code{\link[dplyr:rename]{dplyr::rename}}. For example \code{version = release_date}} -\item{versions_end}{Optional; length-1, same \code{class} and \code{typeof} as -\code{x$version}: what is the last version we have observed? The default is +\item{versions_end}{Optional; length-1, same \code{class} as \code{x$version}: what is +the last version we have observed? The default is \code{max_version_with_row_in(x)}, but values greater than this could also be valid, and would indicate that we observed additional versions of the data beyond \code{max(x$version)}, but they all contained empty updates. 
(The default diff --git a/tests/testthat/test-archive-version-bounds.R b/tests/testthat/test-archive-version-bounds.R index ee500d308..878cde1c0 100644 --- a/tests/testthat/test-archive-version-bounds.R +++ b/tests/testthat/test-archive-version-bounds.R @@ -71,14 +71,21 @@ test_that("`validate_version_bound` validate and class checks together allow and # Bad: expect_error(validate_version_bound(3.5, x_int, TRUE, "vb"), regexp = "must have the same `class`") expect_error(validate_version_bound(.Machine$integer.max, x_dbl, TRUE, "vb"), regexp = "must have the same `class`") - expect_error(validate_version_bound( - `class<-`(list(2), "clazz"), - tibble::tibble(version = `class<-`(5L, "clazz")), TRUE, "vb" - ), regexp = "must have the same `typeof`", class = "epiprocess__version_bound_mismatched_typeof") # Maybe questionable: expect_error(validate_version_bound(3, x_int, TRUE, "vb")) expect_error(validate_version_bound(3L, x_dbl, TRUE, "vb")) + # Maybe questionable, but accept to relax things a bit, as this is happening + # with Dates in some R(?) versions. 
Might need to turn some things into + # vec_cast_common, but idea is just make Date stuff work for now: + validate_version_bound( + `class<-`(list(2), "clazz"), + tibble::tibble(version = `class<-`(5L, "clazz")), TRUE, "vb" + ) # Good: + validate_version_bound( + `class<-`(2, "Date"), + tibble::tibble(version = `class<-`(5L, "Date")), TRUE, "vb" + ) validate_version_bound(my_int, x_int, TRUE, "vb") validate_version_bound(my_dbl, x_dbl, TRUE, "vb") validate_version_bound(my_list, x_list, TRUE, "vb") diff --git a/tests/testthat/test-time-utils.R b/tests/testthat/test-time-utils.R index 6fe8d78ad..7ddd70c04 100644 --- a/tests/testthat/test-time-utils.R +++ b/tests/testthat/test-time-utils.R @@ -17,11 +17,11 @@ test_that("guess_period works", { # On Dates: daily_dates <- seq(as.Date("2020-01-01"), as.Date("2020-01-15"), by = "day") weekly_dates <- seq(as.Date("2020-01-01"), as.Date("2020-01-15"), by = "week") - expect_identical( + expect_equal( daily_dates[[1L]] + guess_period(daily_dates) * (seq_along(daily_dates) - 1L), daily_dates ) - expect_identical( + expect_equal( weekly_dates[[1L]] + guess_period(weekly_dates) * (seq_along(weekly_dates) - 1L), weekly_dates ) From 332362bd17505cbe35ad5523f1094fe17b170fd7 Mon Sep 17 00:00:00 2001 From: "Logan C. 
Brooks" Date: Wed, 16 Apr 2025 14:08:56 -0700 Subject: [PATCH 103/117] Address tidyselect warnings on .data$ --- R/methods-epi_df.R | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/R/methods-epi_df.R b/R/methods-epi_df.R index 7870dede8..51f6cf33f 100644 --- a/R/methods-epi_df.R +++ b/R/methods-epi_df.R @@ -532,7 +532,7 @@ sum_groups_epi_df <- function(.x, sum_cols, group_cols = "time_value") { if (!"geo_value" %in% group_cols) { out <- out %>% mutate(geo_value = "total") %>% - relocate(.data$geo_value, .before = 1) + relocate("geo_value", .before = 1) } # The `geo_type` will be correctly inherited here by the following logic: From 65f907ea5e95608704ab38603dbbb43d807fac57 Mon Sep 17 00:00:00 2001 From: dsweber2 Date: Fri, 18 Apr 2025 14:38:40 -0500 Subject: [PATCH 104/117] printing for revision analysis off --- R/revision_analysis.R | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/R/revision_analysis.R b/R/revision_analysis.R index 86210effc..fc057c560 100644 --- a/R/revision_analysis.R +++ b/R/revision_analysis.R @@ -233,7 +233,7 @@ print.revision_analysis <- function(x, } if (is.null(abs_spread_threshold)) abs_spread_threshold <- .05 * x$max_val rev_beh <- x$revision_behavior - cli::cli_h2("An epi_archive spanning {.val {x$range_time_values[1]}} to {.val {x$range_time_values[1]}}.") + cli::cli_h2("An epi_archive spanning {.val {x$range_time_values[1]}} to {.val {x$range_time_values[2]}}.") cli::cli_h3("Min lag (time to first version):") time_delta_summary(rev_beh$min_lag, x$time_type) %>% print() if (!x$drop_nas) { From eb5c30d3bd366dae299c88f335c2aad1e2f5ead6 Mon Sep 17 00:00:00 2001 From: dsweber2 Date: Fri, 18 Apr 2025 15:55:58 -0500 Subject: [PATCH 105/117] test snapshot Ref: --- tests/testthat/_snaps/revision-latency-functions.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/tests/testthat/_snaps/revision-latency-functions.md b/tests/testthat/_snaps/revision-latency-functions.md index 
1fbbbe06e..424dff11e 100644
--- a/tests/testthat/_snaps/revision-latency-functions.md
+++ b/tests/testthat/_snaps/revision-latency-functions.md
@@ -4,7 +4,7 @@
       rs1
     Message

-    -- An epi_archive spanning 2020-01-01 to 2020-01-01. --
+    -- An epi_archive spanning 2020-01-01 to 2020-01-04. --

     -- Min lag (time to first version):
     Output
@@ -62,7 +62,7 @@
       rs2
     Message

-    -- An epi_archive spanning 2020-01-01 to 2020-01-01. --
+    -- An epi_archive spanning 2020-01-01 to 2020-01-04. --

     -- Min lag (time to first version):
     Output
@@ -122,7 +122,7 @@
       rs3
     Message

-    -- An epi_archive spanning 2020-01-01 to 2020-01-01. --
+    -- An epi_archive spanning 2020-01-01 to 2020-01-22. --

     -- Min lag (time to first version):
     Output
@@ -182,7 +182,7 @@
       rs4
     Message

-    -- An epi_archive spanning 2020 Jan to 2020 Jan. --
+    -- An epi_archive spanning 2020 Jan to 2020 Apr. --

     -- Min lag (time to first version):
     Output
@@ -242,7 +242,7 @@
       print(rs5, quick_revision = 3)
     Message

-    -- An epi_archive spanning 1 to 1. --
+    -- An epi_archive spanning 1 to 4. --

     -- Min lag (time to first version):
     Output

From 80413a22f73f2442a1f48107c2ac718d2c17d8d2 Mon Sep 17 00:00:00 2001
From: "Logan C. Brooks"
Date: Wed, 21 May 2025 16:40:13 -0700
Subject: [PATCH 106/117] docs(epix_slide): remove unnecessary S3 impl
 prototype boilerplate

---
 R/grouped_epi_archive.R |  3 +--
 R/methods-epi_archive.R |  1 -
 man/epix_slide.Rd       | 24 +-----------------------
 3 files changed, 2 insertions(+), 26 deletions(-)

diff --git a/R/grouped_epi_archive.R b/R/grouped_epi_archive.R
index 378ea13bc..77101ea05 100644
--- a/R/grouped_epi_archive.R
+++ b/R/grouped_epi_archive.R
@@ -203,13 +203,12 @@ ungroup.grouped_epi_archive <- function(x, ...) {
 }


-#' @rdname epix_slide
-#'
 #' @importFrom data.table key address rbindlist setDF copy
 #' @importFrom tibble as_tibble new_tibble validate_tibble
 #' @importFrom dplyr group_by groups
 #' @importFrom rlang !! !!! enquo quo_is_missing enquos is_quosure sym syms
 #'   env missing_arg
+#'
 #' @export
 epix_slide.grouped_epi_archive <- function(
     .x,
diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R
index 42afbf46a..01b4de849 100644
--- a/R/methods-epi_archive.R
+++ b/R/methods-epi_archive.R
@@ -842,7 +842,6 @@ epix_slide <- function(
 }


-#' @rdname epix_slide
 #' @export
 epix_slide.epi_archive <- function(
     .x,
diff --git a/man/epix_slide.Rd b/man/epix_slide.Rd
index c0eb3f2bd..ce0ddefc5 100644
--- a/man/epix_slide.Rd
+++ b/man/epix_slide.Rd
@@ -1,9 +1,7 @@
 % Generated by roxygen2: do not edit by hand
-% Please edit documentation in R/methods-epi_archive.R, R/grouped_epi_archive.R
+% Please edit documentation in R/methods-epi_archive.R
 \name{epix_slide}
 \alias{epix_slide}
-\alias{epix_slide.epi_archive}
-\alias{epix_slide.grouped_epi_archive}
 \title{Take each requested (group and) version in an archive, run a computation
 (e.g., forecast)}
 \usage{
 epix_slide(
   .x,
   .f,
   ...,
   .before = Inf,
   .versions = NULL,
   .new_col_name = NULL,
   .all_versions = FALSE
 )
-
-\method{epix_slide}{epi_archive}(
-  .x,
-  .f,
-  ...,
-  .before = Inf,
-  .versions = NULL,
-  .new_col_name = NULL,
-  .all_versions = FALSE
-)
-
-\method{epix_slide}{grouped_epi_archive}(
-  .x,
-  .f,
-  ...,
-  .before = Inf,
-  .versions = NULL,
-  .new_col_name = NULL,
-  .all_versions = FALSE
-)
 }
 \arguments{
 \item{.x}{An \code{\link{epi_archive}} or \code{\link{grouped_epi_archive}} object. If ungrouped,

From 675af5b825388cbddec3bab1797ede92693750e4 Mon Sep 17 00:00:00 2001
From: "Logan C.
Brooks" Date: Wed, 21 May 2025 17:10:33 -0700 Subject: [PATCH 107/117] Bump version --- DESCRIPTION | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/DESCRIPTION b/DESCRIPTION index b008956d3..183097a82 100755 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Type: Package Package: epiprocess Title: Tools for basic signal processing in epidemiology -Version: 0.11.6 +Version: 0.11.7 Authors@R: c( person("Jacob", "Bien", role = "ctb"), person("Logan", "Brooks", , "lcbrooks+github@andrew.cmu.edu", role = c("aut", "cre")), From 30b2dc7082ca3afc3cd0cac10847d4efdda0a20c Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Thu, 22 May 2025 09:30:30 -0700 Subject: [PATCH 108/117] perf: fix temp key in filter.epi_archive that has to be reset --- R/methods-epi_archive.R | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index faf3c1289..3cfe4a8da 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -1159,7 +1159,7 @@ filter.epi_archive <- function(.data, ..., .by = NULL, .format_aware = FALSE) { out_other_keys <- .data$other_keys # `filter` makes no guarantees about not aliasing columns in its result when # the filter condition is all TRUE, so don't setDT. - out_dtbl <- as.data.table(out_tbl, key = out_other_keys) + out_dtbl <- as.data.table(out_tbl, key = c("geo_value", "time_value", out_other_keys, "version")) result <- new_epi_archive( out_dtbl, out_geo_type, out_time_type, out_other_keys, From 81c76d91e14cfe8812fc47abbb9aa4541e0e55e0 Mon Sep 17 00:00:00 2001 From: brookslogan Date: Wed, 28 May 2025 10:52:46 -0700 Subject: [PATCH 109/117] List /style before /doc in PR template so that following in order will style roxygen `@examples` and /doc will propagate styled examples to the .Rd files, rather than leaving .Rd files out of date. 
--- .github/pull_request_template.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md index 93b00f150..e7e5fa890 100644 --- a/.github/pull_request_template.md +++ b/.github/pull_request_template.md @@ -15,8 +15,8 @@ Please: Collect the changes under the next release number (e.g. if you are on 1.7.2, then write your changes under the 1.8 heading). - [ ] Styling and documentation checks. Make a PR comment with: - - `/document` to check the package documentation and fix any issues. - `/style` to check the style and fix any issues. + - `/document` to check the package documentation and fix any issues. - `/preview-docs` to preview the docs. - See Actions GitHub tab to track progress of these commands. - See [DEVELOPMENT.md](DEVELOPMENT.md) for more information on the development From 76aac9629b6f073a5203b5839e0ac96a03baf038 Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Wed, 28 May 2025 11:04:46 -0700 Subject: [PATCH 110/117] docs: fix partial variable rename in example --- R/methods-epi_archive.R | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/R/methods-epi_archive.R b/R/methods-epi_archive.R index 7bbcb7fbc..f3a8d0318 100644 --- a/R/methods-epi_archive.R +++ b/R/methods-epi_archive.R @@ -850,7 +850,7 @@ epix_detailed_restricted_mutate <- function(.data, ...) { #' ) #' }, #' .before = 5, .all_versions = TRUE, -#' .versions = versions +#' .versions = requested_versions #' ) %>% #' ungroup() %>% #' # Focus on one geo_value so we can better see the columns above: From b59cd8fd0f463407ac8424af2f9ca0c8ea7d33de Mon Sep 17 00:00:00 2001 From: "Logan C. 
Brooks" Date: Wed, 28 May 2025 11:23:14 -0700 Subject: [PATCH 111/117] docs: document() --- man/epix_slide.Rd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/man/epix_slide.Rd b/man/epix_slide.Rd index ce0ddefc5..ad6be18b5 100644 --- a/man/epix_slide.Rd +++ b/man/epix_slide.Rd @@ -225,7 +225,7 @@ archive_cases_dv_subset \%>\% ) }, .before = 5, .all_versions = TRUE, - .versions = versions + .versions = requested_versions ) \%>\% ungroup() \%>\% # Focus on one geo_value so we can better see the columns above: From a50eaef9143683b0996aa94e933a41346999ede8 Mon Sep 17 00:00:00 2001 From: "Logan C. Brooks" Date: Wed, 28 May 2025 11:38:17 -0700 Subject: [PATCH 112/117] docs: address unqualified-cross-reference check issues --- R/autoplot.R | 4 ++-- R/inline-roxygen.R | 2 +- R/revision_analysis.R | 2 +- R/slide.R | 6 +++--- man/autoplot-epi.Rd | 4 ++-- man/epi_slide.Rd | 6 +++--- man/epi_slide_opt.Rd | 2 +- man/revision_analysis.Rd | 2 +- man/sum_groups_epi_df.Rd | 2 +- 9 files changed, 15 insertions(+), 15 deletions(-) diff --git a/R/autoplot.R b/R/autoplot.R index 15d577ec4..2644f3ba4 100644 --- a/R/autoplot.R +++ b/R/autoplot.R @@ -1,7 +1,7 @@ #' Automatically plot an epi_df or epi_archive #' #' @param object,x An `epi_df` or `epi_archive` -#' @param ... <[`tidy-select`][dplyr_tidy_select]> One or more unquoted +#' @param ... <[`tidy-select`][dplyr::dplyr_tidy_select]> One or more unquoted #' expressions separated by commas. Variable names can be used as if they #' were positions in the data frame, so expressions like `x:y` can #' be used to select a range of variables. @@ -22,7 +22,7 @@ #' @param .max_facets `r lifecycle::badge("deprecated")` #' @param .facet_filter Select which facets will be displayed. Especially #' useful for when there are many `geo_value`'s or keys. This is a -#' <[`rlang`][args_data_masking]> expression along the lines of [dplyr::filter()]. 
+#' <[`rlang`][rlang::args_data_masking]> expression along the lines of [dplyr::filter()]. #' However, it must be a single expression combined with the `&` operator. This #' contrasts to the typical use case which allows multiple comma-separated expressions #' which are implicitly combined with `&`. When multiple variables are selected diff --git a/R/inline-roxygen.R b/R/inline-roxygen.R index ae2ce66c1..e6fb4cd11 100644 --- a/R/inline-roxygen.R +++ b/R/inline-roxygen.R @@ -7,7 +7,7 @@ #' #' @keywords internal tidyselect_arg_roxygen <- ' - <[`tidy-select`][dplyr_tidy_select]> An unquoted column + <[`tidy-select`][dplyr::dplyr_tidy_select]> An unquoted column name (e.g., `cases`), multiple column names (e.g., `c(cases, deaths)`), [other tidy-select expression][tidyselect::language], or a vector of characters (e.g. `c("cases", "deaths")`). Variable names can be used as if diff --git a/R/revision_analysis.R b/R/revision_analysis.R index fc057c560..4e12fb77c 100644 --- a/R/revision_analysis.R +++ b/R/revision_analysis.R @@ -26,7 +26,7 @@ #' the window afterwards at 150. #' #' @param epi_arch an epi_archive to be analyzed -#' @param ... <[`tidyselect`][dplyr_tidy_select]>, used to choose the column to +#' @param ... <[`tidyselect`][dplyr::dplyr_tidy_select]>, used to choose the column to #' summarize. If empty and there is only one value/measurement column (i.e., #' not in [`key_colnames`]) in the archive, it will automatically select it. #' If supplied, `...` must select exactly one column. 
diff --git a/R/slide.R b/R/slide.R index 78a14251e..17b4c45c5 100644 --- a/R/slide.R +++ b/R/slide.R @@ -108,12 +108,12 @@ #' #' ## Advanced uses of `.f` via tidy evaluation #' -#' If specifying `.f` via tidy evaluation, in addition to the standard [`.data`] -#' and [`.env`], we make some additional "pronoun"-like bindings available: +#' If specifying `.f` via tidy evaluation, in addition to the standard [`.data`][rlang::.data] +#' and [`.env`][rlang::.env], we make some additional "pronoun"-like bindings available: #' #' - .x, which is like `.x` in [`dplyr::group_modify`]; an ordinary object #' like an `epi_df` rather than an rlang [pronoun][rlang::as_data_pronoun] -#' like [`.data`]; this allows you to use additional `dplyr`, `tidyr`, and +#' like `.data`; this allows you to use additional `dplyr`, `tidyr`, and #' `epiprocess` operations. If you have multiple expressions in `...`, this #' won't let you refer to the output of the earlier expressions, but `.data` #' will. diff --git a/man/autoplot-epi.Rd b/man/autoplot-epi.Rd index 6ed9ddafb..c7c8a7c4f 100644 --- a/man/autoplot-epi.Rd +++ b/man/autoplot-epi.Rd @@ -34,7 +34,7 @@ \arguments{ \item{object, x}{An \code{epi_df} or \code{epi_archive}} -\item{...}{<\code{\link[=dplyr_tidy_select]{tidy-select}}> One or more unquoted +\item{...}{<\code{\link[dplyr:dplyr_tidy_select]{tidy-select}}> One or more unquoted expressions separated by commas. Variable names can be used as if they were positions in the data frame, so expressions like \code{x:y} can be used to select a range of variables.} @@ -60,7 +60,7 @@ locations would share the same color line.} \item{.facet_filter}{Select which facets will be displayed. Especially useful for when there are many \code{geo_value}'s or keys. This is a -<\code{\link[=args_data_masking]{rlang}}> expression along the lines of \code{\link[dplyr:filter]{dplyr::filter()}}. 
+<\code{\link[rlang:args_data_masking]{rlang}}> expression along the lines of \code{\link[dplyr:filter]{dplyr::filter()}}. However, it must be a single expression combined with the \code{&} operator. This contrasts to the typical use case which allows multiple comma-separated expressions which are implicitly combined with \code{&}. When multiple variables are selected diff --git a/man/epi_slide.Rd b/man/epi_slide.Rd index 734f66d27..e8fbb3acc 100644 --- a/man/epi_slide.Rd +++ b/man/epi_slide.Rd @@ -158,12 +158,12 @@ stage if necessary. \subsection{Advanced uses of \code{.f} via tidy evaluation}{ -If specifying \code{.f} via tidy evaluation, in addition to the standard \code{\link{.data}} -and \code{\link{.env}}, we make some additional "pronoun"-like bindings available: +If specifying \code{.f} via tidy evaluation, in addition to the standard \code{\link[rlang:dot-data]{.data}} +and \code{\link[rlang:dot-data]{.env}}, we make some additional "pronoun"-like bindings available: \itemize{ \item .x, which is like \code{.x} in \code{\link[dplyr:group_map]{dplyr::group_modify}}; an ordinary object like an \code{epi_df} rather than an rlang \link[rlang:as_data_mask]{pronoun} -like \code{\link{.data}}; this allows you to use additional \code{dplyr}, \code{tidyr}, and +like \code{.data}; this allows you to use additional \code{dplyr}, \code{tidyr}, and \code{epiprocess} operations. If you have multiple expressions in \code{...}, this won't let you refer to the output of the earlier expressions, but \code{.data} will. diff --git a/man/epi_slide_opt.Rd b/man/epi_slide_opt.Rd index 9b39bdb58..e1cb20924 100644 --- a/man/epi_slide_opt.Rd +++ b/man/epi_slide_opt.Rd @@ -51,7 +51,7 @@ epi_slide_sum( and any columns in \code{other_keys}. 
 If grouped, we make sure the grouping is by \code{geo_value} and \code{other_keys}.}
-\item{.col_names}{<\code{\link[=dplyr_tidy_select]{tidy-select}}> An unquoted column
+\item{.col_names}{<\code{\link[dplyr:dplyr_tidy_select]{tidy-select}}> An unquoted column
 name (e.g., \code{cases}), multiple column names (e.g., \code{c(cases, deaths)}),
 \link[tidyselect:language]{other tidy-select expression}, or a vector of
 characters (e.g. \code{c("cases", "deaths")}). Variable names can be used as if
diff --git a/man/revision_analysis.Rd b/man/revision_analysis.Rd
index 1c7336b3c..59f0c9593 100644
--- a/man/revision_analysis.Rd
+++ b/man/revision_analysis.Rd
@@ -40,7 +40,7 @@ revision_summary(
 \arguments{
 \item{epi_arch}{an epi_archive to be analyzed}
 
-\item{...}{<\code{\link[=dplyr_tidy_select]{tidyselect}}>, used to choose the column to
+\item{...}{<\code{\link[dplyr:dplyr_tidy_select]{tidyselect}}>, used to choose the column to
 summarize. If empty and there is only one value/measurement column (i.e.,
 not in \code{\link{key_colnames}}) in the archive, it will automatically select
 it. If supplied, \code{...} must select exactly one column.}
diff --git a/man/sum_groups_epi_df.Rd b/man/sum_groups_epi_df.Rd
index 34ec99930..62eecf29d 100644
--- a/man/sum_groups_epi_df.Rd
+++ b/man/sum_groups_epi_df.Rd
@@ -9,7 +9,7 @@ sum_groups_epi_df(.x, sum_cols, group_cols = "time_value")
 \arguments{
 \item{.x}{an \code{epi_df}}
 
-\item{sum_cols}{<\code{\link[=dplyr_tidy_select]{tidy-select}}> An unquoted column
+\item{sum_cols}{<\code{\link[dplyr:dplyr_tidy_select]{tidy-select}}> An unquoted column
 name (e.g., \code{cases}), multiple column names (e.g., \code{c(cases, deaths)}),
 \link[tidyselect:language]{other tidy-select expression}, or a vector of
 characters (e.g. \code{c("cases", "deaths")}). Variable names can be used as if

From dd8ae17673ea78a717c8adb71b05ce60e5f4de66 Mon Sep 17 00:00:00 2001
From: "Logan C. Brooks"
Date: Wed, 28 May 2025 11:40:21 -0700
Subject: [PATCH 113/117] docs: fix doubled link label

---
 R/slide.R | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/R/slide.R b/R/slide.R
index 17b4c45c5..eadcd4117 100644
--- a/R/slide.R
+++ b/R/slide.R
@@ -20,8 +20,8 @@
 #' (Here, the value 22 was selected using `epi_cor()` and averaging across
 #' `geo_value`s. See
 #' \href{https://www.medrxiv.org/content/10.1101/2024.12.27.24319518v1}{this
-#' manuscript}{this manuscript} for some warnings & information using similar
-#' types of CFR estimators.)
+#' manuscript} for some warnings & information using similar types of CFR
+#' estimators.)
 #'
 #' See `vignette("epi_df")` for more examples.
 #'

From 30c9c143f774cb41af200574280d41bede05107a Mon Sep 17 00:00:00 2001
From: "Logan C. Brooks"
Date: Wed, 28 May 2025 11:40:46 -0700
Subject: [PATCH 114/117] docs: document()

---
 man/epi_slide.Rd | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/man/epi_slide.Rd b/man/epi_slide.Rd
index e8fbb3acc..2a1f7aef6 100644
--- a/man/epi_slide.Rd
+++ b/man/epi_slide.Rd
@@ -132,8 +132,8 @@ columns:
 (Here, the value 22 was selected using \code{epi_cor()} and averaging across
 \code{geo_value}s. See
 \href{https://www.medrxiv.org/content/10.1101/2024.12.27.24319518v1}{this
-manuscript}{this manuscript} for some warnings & information using similar
-types of CFR estimators.)
+manuscript} for some warnings & information using similar types of CFR
+estimators.)
 
 See \code{vignette("epi_df")} for more examples.
 \subsection{Motivation and lower-level alternatives}{

From a79361a7d5611078f829d215ab5f7fe7e2caf409 Mon Sep 17 00:00:00 2001
From: "Logan C. Brooks"
Date: Wed, 28 May 2025 11:43:08 -0700
Subject: [PATCH 115/117] docs(epi_slide): remove redundant `...` documentation

---
 R/slide.R        | 4 ----
 man/epi_slide.Rd | 5 -----
 2 files changed, 9 deletions(-)

diff --git a/R/slide.R b/R/slide.R
index eadcd4117..e52e12c8f 100644
--- a/R/slide.R
+++ b/R/slide.R
@@ -75,10 +75,6 @@
 #' respectively, as additional names for these same quantities (similar to
 #' [`dplyr::group_modify`]).
 #'
-#' @param ... Additional arguments to pass to the function or formula specified
-#'   via `.f`. Alternatively, if `.f` is missing, then the `...` is interpreted
-#'   as a ["data-masking"][rlang::args_data_masking] expression or expressions
-#'   for tidy evaluation.
 #' @param .new_col_name Name for the new column that will contain the computed
 #'   values. The default is "slide_value" unless your slide computations output
 #'   data frames, in which case they will be unpacked (as in `tidyr::unpack()`)
diff --git a/man/epi_slide.Rd b/man/epi_slide.Rd
index 2a1f7aef6..53b0a518b 100644
--- a/man/epi_slide.Rd
+++ b/man/epi_slide.Rd
@@ -61,11 +61,6 @@ respectively, as additional names for these same quantities (similar to
 \code{\link[dplyr:group_map]{dplyr::group_modify}}).
 }}
 
-\item{...}{Additional arguments to pass to the function or formula specified
-via \code{.f}. Alternatively, if \code{.f} is missing, then the \code{...} is interpreted
-as a \link[rlang:args_data_masking]{"data-masking"} expression or expressions
-for tidy evaluation.}
-
 \item{.window_size}{The size of the sliding window. The accepted values depend
 on the type of the \code{time_value} column in \code{.x}:
 \itemize{

From d2459bc2457929baddd600419ddd81946e5850ea Mon Sep 17 00:00:00 2001
From: "Logan C. Brooks"
Date: Wed, 28 May 2025 12:02:20 -0700
Subject: [PATCH 116/117] docs(epi_slide): have function intro sentence in intro paragraph

---
 R/slide.R        | 14 ++++++------
 man/epi_slide.Rd | 14 ++++++------
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/R/slide.R b/R/slide.R
index e52e12c8f..bc993bf4a 100644
--- a/R/slide.R
+++ b/R/slide.R
@@ -1,12 +1,14 @@
 #' More general form of [`epi_slide_opt`] for rolling/running computations
 #'
 #' Most rolling/running computations can be handled by [`epi_slide_mean`],
-#' [`epi_slide_sum`], or the medium-generality [`epi_slide_opt`] functions
-#' instead, which are much faster. You typically only need to consider
-#' `epi_slide()` if you have a computation that depends on multiple columns
-#' simultaneously, outputs multiple columns simultaneously, or produces
-#' non-numeric output. For example, this computation depends on multiple
-#' columns:
+#' [`epi_slide_sum`], or the medium-generality [`epi_slide_opt`] functions,
+#' which have been specialized to run more quickly. `epi_slide()` is a slower
+#' but even more general function for rolling/running computations, and uses a
+#' different interface to specify the computations; you typically only need to
+#' consider using `epi_slide()` if you have a computation that depends on
+#' multiple columns simultaneously, outputs multiple columns simultaneously, or
+#' produces non-numeric output. For example, this computation depends on
+#' multiple columns:
 #'
 #' ```
 #' cases_deaths_subset %>%
diff --git a/man/epi_slide.Rd b/man/epi_slide.Rd
index 53b0a518b..92c1fa914 100644
--- a/man/epi_slide.Rd
+++ b/man/epi_slide.Rd
@@ -108,12 +108,14 @@ as \code{.x} if \code{.x} was grouped.
 }
 \description{
 Most rolling/running computations can be handled by \code{\link{epi_slide_mean}},
-\code{\link{epi_slide_sum}}, or the medium-generality \code{\link{epi_slide_opt}} functions
-instead, which are much faster. You typically only need to consider
-\code{epi_slide()} if you have a computation that depends on multiple columns
-simultaneously, outputs multiple columns simultaneously, or produces
-non-numeric output. For example, this computation depends on multiple
-columns:
+\code{\link{epi_slide_sum}}, or the medium-generality \code{\link{epi_slide_opt}} functions,
+which have been specialized to run more quickly. \code{epi_slide()} is a slower
+but even more general function for rolling/running computations, and uses a
+different interface to specify the computations; you typically only need to
+consider using \code{epi_slide()} if you have a computation that depends on
+multiple columns simultaneously, outputs multiple columns simultaneously, or
+produces non-numeric output. For example, this computation depends on
+multiple columns:
 }
 \details{
 \if{html}{\out{<div class="sourceCode">}}\preformatted{cases_deaths_subset \%>\%

From 3d9164522d053e5e677e335d6669c12da06beede Mon Sep 17 00:00:00 2001
From: "Logan C. Brooks"
Date: Thu, 5 Jun 2025 15:31:50 -0700
Subject: [PATCH 117/117] Release 0.12.0

---
 DESCRIPTION | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/DESCRIPTION b/DESCRIPTION
index 183097a82..decdb2802 100755
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,7 +1,7 @@
 Type: Package
 Package: epiprocess
 Title: Tools for basic signal processing in epidemiology
-Version: 0.11.7
+Version: 0.12.0
 Authors@R: c(
     person("Jacob", "Bien", role = "ctb"),
     person("Logan", "Brooks", , "lcbrooks+github@andrew.cmu.edu", role = c("aut", "cre")),