Fit multiple models and select the best fit

Selects the best model by log-likelihood, AIC, or BIC.

Usage

model_select(
  x,
  models = univariateML_models,
  criterion = c("AIC", "BIC", "logLik"),
  na.rm = FALSE,
  type = c("both", "discrete", "continuous"),
  return = c("best", "all"),
  ...
)

Arguments

x: a (non-empty) numeric vector of data values.
models: a character vector containing the distribution models to select from; see print(univariateML_models). Defaults to all implemented models.
criterion: the model selection criterion. Must be one of "AIC", "BIC", or "logLik", ignoring case. Defaults to "AIC".
na.rm: logical. Should missing values be removed?
type: Either "both", "discrete", or "continuous". The supplied models vector is restricted to the desired class.
return: character of length 1. "best" (default) if the function should return the single best model; "all" if a tibble data frame of all results should be returned, sorted by decreasing model performance.
...: unused.

Value

The return value depends on the return argument. For return = "best" (default), model_select returns an object of class univariateML.

For return = "all", model_select returns a tibble data frame with the following columns:

model: The name of the model.
d_logLik, d_AIC, d_BIC: See logLik, AIC, BIC.
p: Number of parameters fitted.
logLik, AIC, BIC: The log-likelihood at the maximum, the AIC, and the BIC, respectively. The delta versions d_logLik, d_AIC, and d_BIC measure each model's distance from the best model under that criterion: the minimum AIC and BIC are subtracted from each AIC and BIC, and each log-likelihood is subtracted from the maximum log-likelihood. So, the model with the lowest AIC has a d_AIC of 0, and the d_AIC of every other model shows how far its AIC lies above the minimum. The same goes for d_logLik and d_BIC.

ml: The internal code name for the model.
univariateML: The univariateML object for the model. When return = "all", this object is returned for every tested model.
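
The delta columns can be illustrated with a minimal base R sketch. It does not call model_select; instead it assumes two hand-fitted models (normal and exponential, using their closed-form maximum likelihood estimates) on the built-in precip data, and computes the criteria with the standard formulas AIC = 2p - 2 logLik and BIC = p log(n) - 2 logLik. The data frame `fits` and its construction are illustrative assumptions, not the package's internals.

```r
# Assumption: two hand-fitted models on the built-in `precip` data,
# using closed-form maximum likelihood estimates.
x <- as.numeric(precip)
n <- length(x)

# Normal model: the MLEs are the sample mean and the (biased) standard deviation.
mu <- mean(x)
sigma <- sqrt(mean((x - mu)^2))
ll_norm <- sum(dnorm(x, mean = mu, sd = sigma, log = TRUE))

# Exponential model: the MLE of the rate is 1 / mean(x).
ll_exp <- sum(dexp(x, rate = 1 / mean(x), log = TRUE))

fits <- data.frame(
  model  = c("Normal", "Exponential"),
  logLik = c(ll_norm, ll_exp),
  p      = c(2L, 1L)
)
fits$AIC <- 2 * fits$p - 2 * fits$logLik
fits$BIC <- log(n) * fits$p - 2 * fits$logLik

# Deltas: distance from the best model under each criterion.
fits$d_logLik <- max(fits$logLik) - fits$logLik
fits$d_AIC    <- fits$AIC - min(fits$AIC)
fits$d_BIC    <- fits$BIC - min(fits$BIC)

# Sort so that the best model by AIC comes first.
fits[order(fits$d_AIC), ]
```

The best model under each criterion has a delta of exactly 0, and all deltas are non-negative, which matches the pattern in the return = "all" output.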

Examples

# Select among all possible continuous models.
model_select(precip, type = "continuous")
#> Maximum likelihood estimates for the Gompertz model 
#>        a         b  
#> 0.004121  0.071110  

# View possible models to fit.
print(univariateML_models)
#>  [1] "beta"       "betapr"     "binom"      "burr"       "cauchy"    
#>  [6] "dunif"      "exp"        "fatigue"    "gamma"      "ged"       
#> [11] "geom"       "gompertz"   "gumbel"     "invburr"    "invgamma"  
#> [16] "invgauss"   "invweibull" "kumar"      "laplace"    "lgamma"    
#> [21] "lgser"      "llogis"     "lnorm"      "logis"      "logitnorm" 
#> [26] "lomax"      "naka"       "nbinom"     "norm"       "paralogis" 
#> [31] "pareto"     "pois"       "power"      "rayleigh"   "sged"      
#> [36] "snorm"      "sstd"       "std"        "unif"       "weibull"   
#> [41] "zip"        "zipf"      

# Try out only gamma, Weibull, and exponential.
model_select(precip, c("gamma", "weibull", "exp"))
#> Maximum likelihood estimates for the Weibull model 
#>  shape   scale  
#>  2.829  39.084  

# Fit the discrete `corbet` data to all available discrete models
model_select(corbet, type = "discrete", return = "all")
#> # A tibble: 7 × 10
#>   model         d_logLik d_AIC d_BIC logLik     p   AIC   BIC ml    univariateML
#>   <chr>            <dbl> <dbl> <dbl>  <dbl> <int> <dbl> <dbl> <chr> <named list>
#> 1 Zipf               0      0     0  -1363.     2 2730. 2738. zipf  <univrtML>  
#> 2 Logarithmic …     54.2  106.  102. -1417.     1 2836. 2840. lgser <univrtML>  
#> 3 Negative bin…    110.   220.  220. -1473.     2 2949. 2958. nbin… <univrtML>  
#> 4 Geometric        120.   237.  233. -1483.     1 2967. 2971. geom  <univrtML>  
#> 5 Discrete uni…    229.   459.  459. -1592.     2 3188. 3197. dunif <univrtML>  
#> 6 Poisson          818.  1634. 1630. -2181.     1 4364. 4368. pois  <univrtML>  
#> 7 Zero-inflate…    818.  1636. 1636. -2181.     2 4366. 4374. zip   <univrtML>
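
The remaining arguments can be combined in the same way. The following is a sketch only: it assumes the univariateML package is installed, and the input vector (precip with one NA appended) is a hypothetical example chosen to exercise na.rm. No output is shown because it has not been verified here.

```r
library(univariateML)

# Hypothetical input: the built-in `precip` data with one missing value appended.
x <- c(as.numeric(precip), NA)

# Select among three candidate models by BIC, dropping the missing value first.
fit <- model_select(
  x,
  models = c("gamma", "weibull", "exp"),
  criterion = "BIC",
  na.rm = TRUE
)
fit
```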
