
Selects the best model by log-likelihood, AIC, or BIC.

Usage

model_select(
  x,
  models = univariateML_models,
  criterion = c("AIC", "BIC", "logLik"),
  na.rm = FALSE,
  type = c("both", "discrete", "continuous"),
  return = c("best", "all"),
  ...
)

Arguments

x

a (non-empty) numeric vector of data values.

models

a character vector containing the distribution models to select from; see print(univariateML_models). Defaults to all implemented models.

criterion

the model selection criterion. Must be one of "AIC", "BIC", or "logLik", ignoring case. Defaults to "AIC".

na.rm

logical. Should missing values be removed?

type

Either "both", "discrete", or "continuous". The supplied models vector is restricted to the desired class.

return

a character string. "best" (default) if the function should return the single best model; "all" if a tibble data frame of all results should be returned, sorted by decreasing model performance.

...

unused.
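As a rough illustration of how the arguments above combine, the following sketch restricts the candidates to three continuous models, ranks them by BIC, and drops a missing value. It assumes the univariateML package is installed; the appended NA and the choice of the three models are made up for the example.

```r
# Sketch only: assumes the univariateML package is installed.
library(univariateML)

# `precip` ships with base R; an NA is appended purely to illustrate na.rm.
x <- c(precip, NA)

# Restrict the candidates to three continuous models and select by BIC.
fit <- model_select(
  x,
  models = c("gamma", "weibull", "lnorm"),
  criterion = "BIC",
  na.rm = TRUE,
  type = "continuous"
)

# With the default return = "best", the result is a single fitted model.
print(fit)
```

Leaving na.rm at its default of FALSE with this input is the case the na.rm argument exists for, so setting it explicitly is the safer habit when the data may contain missing values.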

Value

The return value depends on the return argument. For return = "best" (default), model_select returns an object of class univariateML.

For return = "all", model_select returns a tibble data frame with the following columns:

model

The name of the model.

d_logLik, d_AIC, d_BIC

See logLik, AIC, BIC.

p

Number of parameters fitted.

logLik, AIC, BIC

The negative log-likelihood at the maximum, the AIC, and the BIC, respectively. The minimum of each is subtracted from every value in that column to give the delta versions d_logLik, d_AIC, and d_BIC. The model with the lowest AIC therefore has a d_AIC of 0, and the d_AIC of every other model shows how far its AIC lies above the minimum. The same holds for d_logLik and d_BIC.

ml

The internal code name for the model.

univariateML

The univariateML object for the model. When return = "all", this object is returned for every tested model.
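The delta columns can be reproduced by hand. The sketch below uses base R only and makes no univariateML calls: it fits an exponential and a normal model to precip with their closed-form maximum likelihood estimates, then applies the standard definitions AIC = 2 * nll + 2 * p and BIC = 2 * nll + p * log(n), where nll is the negative log-likelihood at the maximum. The two models and the variable names are chosen for illustration; the column names follow the printed tibble.

```r
# Base R only; `precip` ships with the datasets package.
x <- as.numeric(precip)
n <- length(x)

# Closed-form maximum likelihood estimates for two candidate models.
rate_hat <- 1 / mean(x)                 # exponential rate
mu_hat <- mean(x)                       # normal mean
sd_hat <- sqrt(mean((x - mu_hat)^2))    # ML estimate divides by n, not n - 1

# Negative log-likelihood at the maximum, and number of fitted parameters.
nll <- c(
  exp  = -sum(dexp(x, rate = rate_hat, log = TRUE)),
  norm = -sum(dnorm(x, mean = mu_hat, sd = sd_hat, log = TRUE))
)
p <- c(exp = 1, norm = 2)

aic <- 2 * nll + 2 * p
bic <- 2 * nll + p * log(n)

# Delta versions: subtract the column minimum, so the best model gets 0.
deltas <- data.frame(
  d_logLik = nll - min(nll),
  d_AIC = aic - min(aic),
  d_BIC = bic - min(bic)
)
deltas
```

Because each column has its own minimum subtracted, the row holding the 0 can differ between d_logLik, d_AIC, and d_BIC when models with different numbers of parameters fit almost equally well.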

See also

Johnson, N. L., Kotz, S. and Balakrishnan, N. (1995) Continuous Univariate Distributions, Volume 1, Chapter 17. Wiley, New York.

Examples

# Select among all possible continuous models.
model_select(precip, type = "continuous")
#> Maximum likelihood estimates for the Skew Normal model 
#>    mean       sd       xi  
#> 34.6957  13.5471   0.8088  

# View possible models to fit.
print(univariateML_models)
#>  [1] "beta"       "betapr"     "binom"      "cauchy"     "exp"       
#>  [6] "gamma"      "ged"        "geom"       "gumbel"     "invgamma"  
#> [11] "invgauss"   "invweibull" "kumar"      "laplace"    "lgamma"    
#> [16] "lgser"      "llogis"     "lnorm"      "logis"      "logitnorm" 
#> [21] "lomax"      "naka"       "nbinom"     "norm"       "pareto"    
#> [26] "pois"       "power"      "rayleigh"   "sged"       "snorm"     
#> [31] "sstd"       "std"        "unif"       "weibull"    "zip"       
#> [36] "zipf"      

# Try out only gamma, Weibull, and exponential.
model_select(precip, c("gamma", "weibull", "exp"))
#> Maximum likelihood estimates for the Weibull model 
#>  shape   scale  
#>  2.829  39.084  

# Fit the discrete `corbet` data to all available discrete models
model_select(corbet, type = "discrete", return = "all")
#> # A tibble: 6 × 10
#>   model         d_logLik d_AIC d_BIC logLik     p   AIC   BIC ml    univariateML
#>   <chr>            <dbl> <dbl> <dbl>  <dbl> <int> <dbl> <dbl> <chr> <named list>
#> 1 Zipf               0      0     0   1363.     2 2730. 2738. zipf  <univrtML>  
#> 2 Logarithmic …     54.2  106.  102.  1417.     1 2836. 2840. lgser <univrtML>  
#> 3 Negative bin…    110.   220.  220.  1473.     2 2949. 2958. nbin… <univrtML>  
#> 4 Geometric        120.   237.  233.  1483.     1 2967. 2971. geom  <univrtML>  
#> 5 Poisson          818.  1634. 1630.  2181.     1 4364. 4368. pois  <univrtML>  
#> 6 Zero-inflate…    818.  1636. 1636.  2181.     2 4366. 4374. zip   <univrtML>
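
The tibble returned by return = "all" can be filtered and indexed like any data frame. The following is a minimal sketch, assuming the univariateML package is installed and the column names shown in the output above; the cutoff of 150 is arbitrary.

```r
# Sketch only: assumes the univariateML package is installed.
library(univariateML)

fits <- model_select(corbet, type = "discrete", return = "all")

# Keep the models whose AIC lies within 150 of the best one.
fits[fits$d_AIC < 150, c("model", "d_AIC", "p")]

# Rows are sorted by decreasing performance, so the first list element
# holds the fitted object for the top-ranked model.
best <- fits$univariateML[[1]]
print(best)
```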