PSU researchers shall select an optimal model design from Task 2, based on TAC/OSA input, and implement it using R as a package for the unified RSPM framework. The result will be a new functionality in RSPM that can be applied across several Oregon and national tools. The implementation of the new functionality will be checked in by the PSU researchers to a shared code repository and estimation data properly documented
All sub-tasks in Task 3 have been accomplished and listed below:
#=======================================================
#SECTION 3: DEFINE FUNCTIONS THAT IMPLEMENT THE SUBMODEL
#=======================================================
#Main module function that predicts AADVMT for households
#------------------------------------------------------
#' Main module function
#'
#' \code{PredictAADVMT} predicts AADVMT for each household in the households
#' dataset using independent variables including household characteristics
#' and 5D built environment variables.
#'
#' This function predicts AADVMT for each hosuehold in the model region where
#' each household is assigned an AADVMT. The model objects as a part of the
#' inputs are stored in data frame with two columns: a column for segmentation
#' (e.g., metro, non-metro) and a 'model' column for model object (list-column
#' data structure). The function "nests" the households data frame into a
#' list-column data frame by segments and applies the generic predict() function
#' for each segment to predict AADVMT for each household. The vectors of HhId
#' and AADVMT produced by the PredictAADVMT function are to be stored in the
#' "Household" table.
#'
#' If this table does not exist, the function calculates a LENGTH value for
#' the table and returns that as well. The framework uses this information to
#' initialize the Households table. The function also computes the maximum
#' numbers of characters in the HhId and Azone datasets and assigns these to a
#' SIZE vector. This is necessary so that the framework can initialize these
#' datasets in the datastore. All the results are returned in a list.
#'
#' @param L A list containing the components listed in the Get specifications
#' for the module.
#' @return A list containing the components specified in the Set
#' specifications for the module along with:
#' LENGTH: A named integer vector having a single named element, "Household",
#' which identifies the length (number of rows) of the Household table to be
#' created in the datastore.
#' SIZE: A named integer vector having two elements. The first element, "Azone",
#' identifies the size of the longest Azone name. The second element, "HhId",
#' identifies the size of the longest HhId.
#' @import visioneval dplyr purrr tidyr pscl
#' @importFrom "splines" "ns"
#' @export
PredictAADVMT <- function(L) {
#TODO: get id_name from L or specification?
dataset_name <- "Household"
id_name <- "HhId"
y_name <- "AADVMT"
Bzone_df <- data.frame(L$Year[["Bzone"]])
stopifnot("data.frame" %in% class(Bzone_df))
Marea_df <- data.frame(L$Year[["Marea"]])
stopifnot("data.frame" %in% class(Marea_df))
D_df <- data.frame(L$Year[[dataset_name]])
stopifnot("data.frame" %in% class(D_df))
D_df <- D_df %>%
mutate(LogIncome=log1p(Income),
DrvAgePop=HhSize - Age0to14,
VehPerDriver=ifelse(Drivers==0 || is.na(Drivers), 0, Vehicles/Drivers),
LifeCycle = as.character(LifeCycle),
LifeCycle = ifelse(LifeCycle=="01", "Single", LifeCycle),
LifeCycle = ifelse(LifeCycle %in% c("02"), "Couple w/o children", LifeCycle),
LifeCycle = ifelse(LifeCycle %in% c("00", "03", "04", "05", "06", "07", "08"), "Couple w/ children", LifeCycle),
LifeCycle = ifelse(LifeCycle %in% c("09", "10"), "Empty Nester", LifeCycle)
) %>%
left_join(Bzone_df, by="Bzone") %>%
crossing(Marea_df)
D_df <- D_df %>% mutate_if(is.factor, as.character)
#D_df <- D_df %>%
# crossing(Marea_df, by="Marea")
#load("data/AADVMTModel_df.rda")
Model_df <- AADVMTModel_df
# find cols used for segmenting households ("metro" by default)
SegmentCol_vc <- setdiff(names(Model_df), c("model", "step", "post_func", "bias_adj"))
# segmenting columns must appear in D_df
stopifnot(all(SegmentCol_vc %in% names(D_df)))
Preds <- do_predictions(Model_df, D_df,
dataset_name, id_name, y_name, SegmentCol_vc)
# fill NA with 0s - produced with negative predictions before inversing power transformation
Preds <- Preds %>%
mutate(y=ifelse(is.na(y) | y < 0, 0, y))
Out_ls <- initDataList()
Out_ls$Year$Household <-
list(
AADVMT = -1
)
Out_ls$Year$Household$AADVMT <- Preds[["y"]]
#Return the list
Out_ls
}