|
1 | | -#' Subset of JHU daily state cases and deaths |
| 1 | +#' JHU daily COVID-19 case and death rates from all states |
2 | 2 | #' |
3 | | -#' This data source of confirmed COVID-19 cases and deaths |
4 | | -#' is based on reports made available by the Center for |
5 | | -#' Systems Science and Engineering at Johns Hopkins University. |
6 | | -#' This example data ranges from Dec 31, 2020 to Dec 31, 2021, |
7 | | -#' and includes all states. |
| 3 | +#' This data source of confirmed COVID-19 cases and deaths is based on reports |
| 4 | +#' made available by the Center for Systems Science and Engineering at Johns |
| 5 | +#' Hopkins University, as downloaded from the CMU Delphi COVIDcast Epidata |
| 6 | +#' API. This example data is a snapshot as of March 20, 2024, and |
| 7 | +#' ranges from December 31, 2020 to December 31, 2021. It |
| 8 | +#' includes all states. It is used in the {epiprocess} correlation vignette. |
8 | 9 | #' |
9 | | -#' @format A tibble with 20,496 rows and 4 variables: |
| 10 | +#' @format An [`epiprocess::epi_df`] (object of class `c("epi_df", "tbl_df", "tbl", "data.frame")`) with 37576 rows and 4 columns. |
| 11 | +#' @section Data dictionary: |
| 12 | +#' The data has columns: |
10 | 13 | #' \describe{ |
11 | 14 | #' \item{geo_value}{the geographic value associated with each row |
12 | 15 | #' of measurements.} |
|
38 | 41 | #' |
39 | 42 | #' Data set on state populations, from the 2019 US Census. |
40 | 43 | #' |
41 | | -#' @format Data frame with 57 rows (including one for the United States as a |
42 | | -#' whole, plus the District of Columbia, Puerto Rico Commonwealth, |
43 | | -#' American Samoa, Guam, the U.S. Virgin Islands, and the Northern Mariana, |
44 | | -#' Islands). |
| 44 | +#' @format A [`tibble::tibble`] (object of class `c("tbl_df", "tbl", "data.frame")`) with 57 rows and 4 columns. |
| 45 | +#' @section Data dictionary: |
| 46 | +#' The data includes 57 regions (all US states, the United |
| 47 | +#' States as a whole, the District of Columbia, Puerto Rico Commonwealth, |
| 48 | +#' American Samoa, Guam, the U.S. Virgin Islands, and the Northern Mariana |
| 49 | +#' Islands) with columns: |
45 | 50 | #' |
46 | 51 | #' \describe{ |
47 | | -#' \item{fips}{FIPS code} |
| 52 | +#' \item{fips}{2-digit FIPS code} |
48 | 53 | #' \item{name}{Full name of the state or territory} |
49 | 54 | #' \item{pop}{Estimate of the location's resident population in |
50 | 55 | #' 2019.} |
51 | 56 | #' \item{abbr}{Postal abbreviation for the location} |
52 | 57 | #' } |
53 | 58 | #' |
54 | | -#' @source United States Census Bureau, at |
| 59 | +#' @source |
| 60 | +#' This object is derived from several datasets from the United States |
| 61 | +#' Census Bureau, Population Division, at |
55 | 62 | #' \url{https://www2.census.gov/programs-surveys/popest/datasets/2010-2019/counties/totals/co-est2019-alldata.pdf}, |
56 | 63 | #' \url{https://www.census.gov/data/tables/time-series/demo/popest/2010s-total-puerto-rico-municipios.html}, |
57 | | -#' and \url{https://www.census.gov/data/tables/2010/dec/2010-island-areas.html} |
| 64 | +#' and \url{https://www.census.gov/data/tables/2010/dec/2010-island-areas.html}. |
| 65 | +#' It is made available through the `covidcast` package. This data is |
| 66 | +#' public domain. |
58 | 67 | "state_census" |
59 | 68 |
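As a rough illustration of how a population table shaped like `state_census` might be used to turn daily counts into per-100,000 rates, here is a minimal base R sketch. The toy values, the `counts` table, and the choice to join on lower-cased `abbr` are assumptions made for illustration and are not taken from the package.

```r
# Toy stand-in for `state_census`; the population figures are made up.
census <- data.frame(
  fips = c("06", "36"),
  name = c("California", "New York"),
  pop  = c(40000000, 20000000),
  abbr = c("CA", "NY")
)

# Hypothetical daily counts keyed by lower-case state abbreviation.
counts <- data.frame(
  geo_value  = c("ca", "ny"),
  time_value = as.Date(c("2021-06-04", "2021-06-04")),
  cases      = c(1000, 400)
)

# Join on the abbreviation, then scale counts to a rate per 100,000 people.
census$geo_value <- tolower(census$abbr)
rates <- merge(counts, census[, c("geo_value", "pop")], by = "geo_value")
rates$case_rate <- rates$cases / rates$pop * 1e5
```

With real data, the join key should be checked first, since the documentation above does not state how `geo_value` is encoded.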
|
60 | 69 | # Epipredict Vignette Data ---------------------------------------------------- |
61 | 70 |
|
62 | | -#' CTIS COVID Behaviours |
| 71 | +#' Subset of CTIS COVID-19-related behaviours from 5 states |
63 | 72 | #' |
64 | 73 | #' Data set for a handful of states on masking and distancing behaviours |
65 | | -#' during the COVID-19 Pandemic and downloaded from the CMU Delphi COVIDcast |
66 | | -#' Epidata API. This data set covers the period from |
67 | | -#' June to December 2021. |
| 74 | +#' during the COVID-19 Pandemic, and downloaded from the CMU Delphi COVIDcast |
| 75 | +#' Epidata API. This example data is a snapshot as of March 20, 2024, and |
| 76 | +#' ranges from June 4, 2021 to December 31, 2021. |
| 77 | +#' It is limited to California, Florida, Texas, New Jersey, and New York. |
| 78 | +#' |
| 79 | +#' @format A [`tibble::tibble`] (object of class `c("tbl_df", "tbl", "data.frame")`) with 1055 rows and 4 columns. |
| 80 | +#' @section Data dictionary: |
| 81 | +#' The data has columns: |
| 82 | +#' \describe{ |
| 83 | +#' \item{geo_value}{the geographic value associated with each row |
| 84 | +#' of measurements.} |
| 85 | +#' \item{time_value}{the time value associated with each row of measurements.} |
| 86 | +#' \item{masking}{Estimated percentage of people who wore a mask for most or all of the time while in public in the past 7 days; those not in public in the past 7 days are not counted.} |
| 87 | +#' \item{distancing}{Estimated percentage of respondents who reported that all or most people they encountered in public in the past 7 days maintained a distance of at least 6 feet. Respondents who said that they have not been in public for the past 7 days are excluded.} |
| 88 | +#' } |
| 89 | +#' |
| 90 | +#' @source |
| 91 | +#' This object contains a modified part of the |
| 92 | +#' \href{https://cmu-delphi.github.io/delphi-epidata/symptom-survey/#covid-19-trends-and-impact-survey}{data |
| 93 | +#' aggregations in the API} that are prepared from the |
| 94 | +#' \href{https://www.pnas.org/doi/full/10.1073/pnas.2111454118}{COVID-19 |
| 95 | +#' Trends and Impact Survey}; see the first link for more information on |
| 96 | +#' citing in publications. |
| 97 | +#' The data is made available via the |
| 98 | +#' \href{https://cmu-delphi.github.io/delphi-epidata/}{Delphi Epidata API}. |
| 99 | +#' |
| 100 | +#' These aggregations are licensed under the terms of |
| 101 | +#' the \href{https://creativecommons.org/licenses/by/4.0/}{Creative Commons |
| 102 | +#' Attribution license}. |
| 103 | +#' |
| 104 | +#' Modifications: |
| 105 | +#' * The data has been limited to a very small number of rows, the |
| 106 | +#' signal names slightly altered, and formatted into an `epi_df`. |
68 | 107 | "ctis_covid_behaviours" |
69 | 108 |
|
70 | | -#' COVID-19 Incident Cases and Deaths |
| 109 | +#' Subset of COVID-19 incident cases and deaths from 5 states |
71 | 110 | #' |
72 | 111 | #' Data set for 5 states containing COVID-19 Incident Cases and Deaths as |
73 | | -#' reported |
74 | | -#' by JHU-CSSE and downloaded from the CMU Delphi COVIDcast Epidata API. |
75 | | -#' This data set covers the period from June 2021 to December 2021, and is |
76 | | -#' used in the epipredict Vignette on ... . |
| 112 | +#' reported by JHU-CSSE and downloaded from the CMU Delphi COVIDcast Epidata |
| 113 | +#' API. This example data is a snapshot as of March 20, 2024, and |
| 114 | +#' ranges from June 4, 2021 to December 31, 2021. It |
| 115 | +#' is limited to California, Florida, Texas, New Jersey, and New York. |
77 | 116 | #' |
78 | | -#' @source This object contains a modified part of the \href{https://github.com/CSSEGISandData/COVID-19}{COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University} as \href{https://cmu-delphi.github.io/delphi-epidata/api/covidcast-signals/jhu-csse.html}{republished in the COVIDcast Epidata API}. This data set is licensed under the terms of the |
| 117 | +#' @format An [`epiprocess::epi_df`] (object of class `c("epi_df", "tbl_df", "tbl", "data.frame")`) with 1055 rows and 4 columns. |
| 118 | +#' @section Data dictionary: |
| 119 | +#' The data has columns: |
| 120 | +#' \describe{ |
| 121 | +#' \item{geo_value}{the geographic value associated with each row |
| 122 | +#' of measurements.} |
| 123 | +#' \item{time_value}{the time value associated with each row of measurements.} |
| 124 | +#' \item{cases}{Number of new confirmed COVID-19 cases, daily} |
| 125 | +#' \item{deaths}{Number of new confirmed COVID-19 deaths, daily} |
| 126 | +#' } |
| 127 | +#' |
| 128 | +#' @source This object contains a modified part of the \href{https://github.com/CSSEGISandData/COVID-19}{COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University} |
| 129 | +#' as \href{https://cmu-delphi.github.io/delphi-epidata/api/covidcast-signals/jhu-csse.html}{republished in the COVIDcast Epidata API}. |
| 130 | +#' This data set is licensed under the terms of the |
79 | 131 | #' \href{https://creativecommons.org/licenses/by/4.0/}{Creative Commons Attribution 4.0 International license} |
80 | 132 | #' by the Johns Hopkins University on behalf of its Center for Systems Science in Engineering. |
81 | 133 | #' Copyright Johns Hopkins University 2020. |
| 134 | +#' |
| 135 | +#' Modifications: |
| 136 | +#' * \href{https://cmu-delphi.github.io/delphi-epidata/api/covidcast-signals/jhu-csse.html}{From the COVIDcast Epidata API}: |
| 137 | +#' The signals are taken directly from the JHU CSSE |
| 138 | +#' \href{https://github.com/CSSEGISandData/COVID-19}{COVID-19 GitHub repository} |
| 139 | +#' without changes. |
| 140 | +#' * Furthermore, the data has been limited to a very small number of rows, the |
| 141 | +#' signal names slightly altered, and formatted into an `epi_df`. |
82 | 142 | "counts_subset" |
83 | 143 |
|
84 | 144 | #' Canadian COVID-19 case rates |
|
93 | 153 | #' \href{https://github.com/ccodwg/CovidTimelineCanada}{ccodwg/CovidTimelineCanada GitHub repository}, |
94 | 154 | #' which also reports vaccine-related signals. |
95 | 155 | #' |
96 | | -#' This dataset contains versioned data covering the period from April 2020 to |
97 | | -#' December 2021 and is used in the epipredict slide vignette. |
| 156 | +#' This dataset contains versioned data snapshots from February 1, 2021 to December |
| 157 | +#' 1, 2021, covering the period from April 2, 2020 to December 1, 2021. It is |
| 158 | +#' used in the epipredict slide vignette. |
98 | 159 | #' |
| 160 | +#' @format An [`epiprocess::epi_archive`]. The DT attribute contains the data formatted as a [`data.table::data.table`] (object of class `c("data.table", "data.frame")`) with 65299 rows and 4 columns. |
| 161 | +#' @section Data dictionary: |
| 162 | +#' The data in the `epi_archive$DT` attribute has columns: |
| 163 | +#' \describe{ |
| 164 | +#' \item{version}{the time value specifying the version for each row of measurements.} |
| 165 | +#' \item{geo_value}{the province or territory associated with each row of measurements.} |
| 166 | +#' \item{time_value}{the time value associated with each row of measurements.} |
| 167 | +#' \item{case_rate}{number of new confirmed cases due to COVID-19 per 100,000 population, daily} |
| 168 | +#' } |
99 | 169 | #' @source This object contains a modified part of the COVID-19 Canada Open |
100 | 170 | #' Data Working Group's |
101 | 171 | #' \href{https://github.com/ccodwg/Covid19Canada}{Covid19Canada data repository} (archived). |
102 | 172 | #' This data set is licensed under the terms of the |
103 | 173 | #' \href{https://creativecommons.org/licenses/by/4.0/}{Creative Commons Attribution 4.0 International license} |
104 | | -#' by the COVID-19 Canada Open Data Working Group. |
| 174 | +#' by the COVID-19 Canada Open Data Working Group. The COVID-19 Canada Open |
| 175 | +#' Data Working Group collected the data from publicly available sources such |
| 176 | +#' as government datasets and news releases. |
| 177 | +#' |
| 178 | +#' Modifications: |
| 179 | +#' * The case rate signal is calculated using the case count taken directly from the CCODWG |
| 180 | +#' \href{https://github.com/ccodwg/Covid19Canada}{ccodwg/Covid19Canada GitHub repository} |
| 181 | +#' and population data. |
| 182 | +#' * Furthermore, the data has been limited to a very small number of rows, the |
| 183 | +#' signal names slightly altered, some province names replaced with abbreviations, and |
| 184 | +#' formatted into an `epi_archive`. |
| 185 | +#' |
| 186 | +#' The population data used (but not included in the dataset itself) is from the |
| 187 | +#' \href{https://github.com/mountainMath/BCCovidSnippets/}{mountainMath/BCCovidSnippets GitHub repository}. |
105 | 188 | "can_prov_cases" |
| 189 | + |
| 190 | +#' Subset of Statistics Canada median employment income for postsecondary graduates |
| 191 | +#' |
| 192 | +#' Data set for all territories (aggregated) and all 10 provinces containing |
| 193 | +#' yearly income data for postsecondary graduates as reported by Statistics |
| 194 | +#' Canada, downloaded from the Statistics Canada website at |
| 195 | +#' www.statcan.gc.ca. This example data is a snapshot as of September 18, |
| 196 | +#' 2024, and ranges from 2010 to 2017 (yearly). |
| 197 | +#' |
| 198 | +#' @format An [`epiprocess::epi_df`] (object of class `c("epi_df", "tbl_df", "tbl", "data.frame")`) with 10193 rows and 8 columns. |
| 199 | +#' @section Data dictionary: |
| 200 | +#' The data has columns: |
| 201 | +#' \describe{ |
| 202 | +#' \item{geo_value}{The province in Canada associated with each |
| 203 | +#' row of measurements.} |
| 204 | +#' \item{time_value}{The time value, a year integer in YYYY format} |
| 205 | +#' \item{edu_qual}{The education qualification} |
| 206 | +#' \item{fos}{The field of study} |
| 207 | +#' \item{age_group}{The age group; either 15 to 34 or 35 to 64} |
| 208 | +#' \item{num_graduates}{The number of graduates for the given row of characteristics} |
| 209 | +#' \item{med_income_2y}{The median employment income two years after graduation} |
| 210 | +#' \item{med_income_5y}{The median employment income five years after graduation} |
| 211 | +#' } |
| 212 | +#' @source This object contains modified data adapted from |
| 213 | +#' Statistics Canada, \href{https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=3710011501}{ |
| 214 | +#' Table 37-10-0115-01 Characteristics and median employment income of |
| 215 | +#' longitudinal cohorts of postsecondary graduates two and five years after |
| 216 | +#' graduation, by educational qualification and field of study |
| 217 | +#' (primary groupings)}. This does not constitute an endorsement by Statistics Canada of this product. |
| 218 | +#' |
| 219 | +#' The data is licensed under the terms of the |
| 220 | +#' \href{https://www.statcan.gc.ca/en/reference/licence}{Statistics Canada Open License}. |
| 221 | +#' |
| 222 | +#' Modifications: |
| 223 | +#' * Only provincial and territorial regions are kept. |
| 224 | +#' * Only age group, field of study, and educational qualification are kept as |
| 225 | +#' covariates. For the remaining covariates, we keep aggregated values and |
| 226 | +#' drop the level-specific rows. |
| 227 | +#' * No modifications were made to the time range of the data. |
| 228 | +"grad_employ_subset" |
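The `can_prov_cases` documentation describes versioned data with `version`, `geo_value`, `time_value`, and `case_rate` columns. One way to read such a table is that a snapshot "as of" a date keeps, for each `geo_value` and `time_value`, the row with the latest `version` on or before that date. The base R sketch below illustrates that idea on made-up values. `snapshot_as_of` is a hypothetical helper written for this example, not a function from epiprocess.

```r
# Toy versioned table with the same column names as `epi_archive$DT`;
# all values are made up.
dt <- data.frame(
  geo_value  = c("bc", "bc", "bc", "on"),
  time_value = as.Date(c("2021-01-01", "2021-01-01", "2021-01-02", "2021-01-01")),
  version    = as.Date(c("2021-02-01", "2021-03-01", "2021-02-01", "2021-02-01")),
  case_rate  = c(5.0, 5.5, 6.0, 3.0)
)

# Hypothetical helper: for each (geo_value, time_value), keep the row with
# the latest version that is not after `as_of`.
snapshot_as_of <- function(dt, as_of) {
  known <- dt[dt$version <= as_of, ]
  known <- known[order(known$geo_value, known$time_value, known$version), ]
  # After sorting, the last row of each key has the most recent version.
  is_latest <- !duplicated(known[, c("geo_value", "time_value")], fromLast = TRUE)
  known[is_latest, c("geo_value", "time_value", "case_rate")]
}

early <- snapshot_as_of(dt, as.Date("2021-02-15"))
late  <- snapshot_as_of(dt, as.Date("2021-03-15"))
```

Here `early` still reports the first value for "bc" on 2021-01-01, while `late` picks up the revision published on 2021-03-01. In practice the archive tooling in the package itself is likely the better choice; this sketch only shows the underlying selection rule.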