Estimator

(class from pyomo.contrib.parmest.parmest)

class pyomo.contrib.parmest.parmest.Estimator(experiment_list, obj_function=None, tee=False, diagnostic_mode=False, solver_options=None, regularization=None, prior_FIM=None, theta_ref=None, regularization_weight=None)[source]

Bases: object

Parameter estimation class

Parameters:

experiment_list (list of Experiments) – A list of experiment objects which creates one labeled model for each experiment
obj_function (string or function (optional)) – Built-in objective (“SSE” or “SSE_weighted”) or custom function used to formulate parameter estimation objective. If no function is specified, the model is used “as is” and should be defined with a “FirstStageCost” and “SecondStageCost” expression that are used to build an objective. Default is None.
tee (bool, optional) – If True, print the solver output to the screen. Default is False.
diagnostic_mode (bool, optional) – If True, print diagnostics from the solver. Default is False.
solver_options (dict, optional) – Provides options to the solver (also the name of an attribute). Default is None.
regularization (string, optional) – Built-in regularization type (“L2”). If no regularization is specified, no regularization term is added to the objective. Default is None.
prior_FIM (pd.DataFrame, optional) – Prior Fisher Information Matrix from previous experimental design to be added to the FIM of the current experiments for regularization. The prior_FIM should be a square matrix with parameter names as both row and column labels.
theta_ref (dict, optional) – Reference parameter values used in regularization. If None, defaults to the current parameter values in the model.
regularization_weight (float, optional) – Weighting factor for the regularization term. Used with regularization="L2". Default is 1.0.
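The built-in objective names can be illustrated with a minimal sketch in plain Python. This is not parmest's implementation: the helper names `sse` and `sse_weighted`, the sample data, and in particular the 1/sigma scaling used for the weighted variant are assumptions made for illustration.

```python
def sse(measured, predicted):
    """Sum of squared errors between measured and model-predicted outputs."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(measured, predicted))


def sse_weighted(measured, predicted, sigma):
    """Squared errors scaled by each measurement's standard deviation.

    Assumed form for illustration; parmest's exact scaling may differ.
    """
    return sum(((y - y_hat) / s) ** 2 for y, y_hat, s in zip(measured, predicted, sigma))


# Hypothetical measurements, model predictions, and measurement errors.
measured = [8.3, 10.3, 19.0]
predicted = [8.0, 10.0, 20.0]
sigma = [0.5, 0.5, 2.0]

plain = sse(measured, predicted)
weighted = sse_weighted(measured, predicted, sigma)
```

In the weighted form, the third point contributes less despite its larger residual because its assumed measurement error is larger.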

__init__(experiment_list, obj_function=None, tee=False, diagnostic_mode=False, solver_options=None, regularization=None, prior_FIM=None, theta_ref=None, regularization_weight=None)[source]
__init__(model_function: Callable, data, theta_names, obj_function=None, tee=False, diagnostic_mode=False, solver_options=None)

Methods

`__init__`()
`confidence_region_test`(theta_values, ...[, ...])	Confidence region test to determine if theta values are within a rectangular, multivariate normal, or Gaussian kernel density distribution for a range of alpha values
`cov_est`([method, solver, step])	Covariance matrix calculation using all scenarios in the data
`leaveNout_bootstrap_test`(lNo, lNo_samples, ...)	Leave-N-out bootstrap test to compare theta values where N data points are left out to a bootstrap analysis using the remaining data, results indicate if theta is within a confidence region determined by the bootstrap analysis
`likelihood_ratio_test`(obj_at_theta, ...[, ...])	Likelihood ratio test to identify theta values within a confidence region using the \(\chi^2\) distribution
`objective_at_theta`([theta_values, ...])	Objective value for each theta, solving extensive form problem with fixed theta values.
`theta_est`([solver, return_values, calc_cov, ...])	Parameter estimation using all scenarios in the data
`theta_est_bootstrap`(bootstrap_samples[, ...])	Parameter estimation using bootstrap resampling of the data
`theta_est_leaveNout`(lNo[, lNo_samples, ...])	Parameter estimation where N data points are left out of each sample

Member Documentation

confidence_region_test(theta_values, distribution, alphas, test_theta_values=None, seed=None)[source]

Confidence region test to determine if theta values are within a rectangular, multivariate normal, or Gaussian kernel density distribution for a range of alpha values

Parameters:

theta_values (pd.DataFrame, columns = theta_names) – Theta values used to generate a confidence region (generally returned by theta_est_bootstrap)
distribution (string) – Statistical distribution used to define a confidence region, options = ‘MVN’ for multivariate_normal, ‘KDE’ for gaussian_kde, and ‘Rect’ for rectangular.
alphas (list) – List of alpha values used to determine if theta values are inside or outside the region.
test_theta_values (pd.Series or pd.DataFrame, keys/columns = theta_names, optional) – Additional theta values that are compared to the confidence region to determine if they are inside or outside.
seed (int or None, optional) – Random seed

Returns:

training_results (pd.DataFrame) – Theta values used to generate the confidence region along with True (inside) or False (outside) for each alpha
test_results (pd.DataFrame) – If test_theta_values is not None, the test theta values along with True (inside) or False (outside) for each alpha
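As a rough illustration of the ‘Rect’ option, the sketch below builds a rectangular region from a set of theta estimates and tests whether a point lies inside it for several alpha values. It assumes the rectangle comes from per-parameter empirical quantiles, which may not be how parmest constructs it; `rect_bounds`, `inside`, and the sample values are hypothetical.

```python
def rect_bounds(samples, alpha):
    """Per-parameter (lower, upper) bounds covering the central `alpha` fraction."""
    bounds = {}
    tail = (1.0 - alpha) / 2.0
    for name in samples[0]:
        values = sorted(row[name] for row in samples)
        lo_idx = int(tail * (len(values) - 1))
        hi_idx = len(values) - 1 - lo_idx
        bounds[name] = (values[lo_idx], values[hi_idx])
    return bounds


def inside(theta, bounds):
    """True if every parameter of `theta` lies within its bounds."""
    return all(lo <= theta[name] <= hi for name, (lo, hi) in bounds.items())


# Hypothetical bootstrap estimates for two parameters, k1 and k2.
samples = [{"k1": 0.80 + 0.01 * i, "k2": 1.50 + 0.02 * i} for i in range(21)]

alphas = [0.5, 0.8, 0.95]
results = {a: inside({"k1": 0.9, "k2": 1.7}, rect_bounds(samples, a)) for a in alphas}
outlier = inside({"k1": 1.5, "k2": 1.7}, rect_bounds(samples, 0.95))
```

A point near the center of the estimates is inside for every alpha, while a point with one parameter far outside the sampled range is rejected even at the widest region.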

cov_est(method='finite_difference', solver='ipopt', step=0.001)[source]

Covariance matrix calculation using all scenarios in the data

Parameters:

method (str, optional) – Covariance calculation method. Options - ‘finite_difference’, ‘reduced_hessian’, and ‘automatic_differentiation_kaug’. Default is ‘finite_difference’
solver (str, optional) – Solver name, e.g., ‘ipopt’. Default is ‘ipopt’
step (float, optional) – Float used for relative perturbation of the parameters, e.g., step=0.02 is a 2% perturbation. Default is 1e-3

Returns:

cov – Covariance matrix of the estimated parameters

Return type:

pd.DataFrame
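The relative meaning of step can be sketched with a one-parameter central difference. This only illustrates the perturbation idea. Whether parmest uses central or forward differences is not stated here, and `relative_central_difference` and `model` are hypothetical names.

```python
def relative_central_difference(f, theta, step=1e-3):
    """Approximate df/dtheta by perturbing theta by +/- step * theta.

    Requires theta != 0, since the perturbation is relative.
    """
    h = step * theta  # e.g. step=0.02 moves theta by 2% of its value
    return (f(theta * (1 + step)) - f(theta * (1 - step))) / (2 * h)


def model(theta):
    # Hypothetical model output as a function of a single parameter.
    return theta ** 3


grad = relative_central_difference(model, 2.0, step=1e-3)
```

For this cubic the exact derivative at theta = 2 is 12, and the approximation agrees to within about 1e-5.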

leaveNout_bootstrap_test(lNo, lNo_samples, bootstrap_samples, distribution, alphas, seed=None)[source]

Leave-N-out bootstrap test to compare theta values where N data points are left out to a bootstrap analysis using the remaining data, results indicate if theta is within a confidence region determined by the bootstrap analysis

Parameters:

lNo (int) – Number of data points to leave out for parameter estimation
lNo_samples (int) – Leave-N-out sample size. If lNo_samples=None, the maximum number of combinations will be used
bootstrap_samples (int) – Bootstrap sample size
distribution (string) – Statistical distribution used to define a confidence region, options = ‘MVN’ for multivariate_normal, ‘KDE’ for gaussian_kde, and ‘Rect’ for rectangular.
alphas (list) – List of alpha values used to determine if theta values are inside or outside the region.
seed (int or None, optional) – Random seed

Returns:

List of tuples with one entry per lNo_sample:
* The first item in each tuple is the list of N samples that are left out.
* The second item in each tuple is a DataFrame of theta estimated using the N samples.
* The third item in each tuple is a DataFrame containing results from the bootstrap analysis using the remaining samples.
For each DataFrame a column is added for each value of alpha which indicates if the theta estimate is in (True) or out (False) of the alpha region for a given distribution (based on the bootstrap results).

likelihood_ratio_test(obj_at_theta, obj_value, alphas, return_thresholds=False)[source]

Likelihood ratio test to identify theta values within a confidence region using the \(\chi^2\) distribution

Parameters:

obj_at_theta (pd.DataFrame, columns = theta_names + 'obj') – Objective values for each theta value (returned by objective_at_theta)
obj_value (int or float) – Objective value from parameter estimation using all data
alphas (list) – List of alpha values to use in the chi2 test
return_thresholds (bool, optional) – Return the threshold value for each alpha. Default is False.

Returns:

LR (pd.DataFrame) – Objective values for each theta value along with True or False for each alpha
thresholds (pd.Series) – If return_thresholds=True, the thresholds are also returned.
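The thresholding step can be sketched in plain Python. The threshold form used here, obj_value * (1 + chi2 / (n_samples - 2)) with two degrees of freedom, is an assumption for illustration and should be checked against the parmest source. The helper names and data are hypothetical.

```python
import math


def chi2_ppf_2dof(alpha):
    """Quantile of the chi-squared distribution with 2 degrees of freedom.

    With 2 degrees of freedom the CDF is 1 - exp(-x / 2), so the inverse is closed form.
    """
    return -2.0 * math.log(1.0 - alpha)


def lr_thresholds(obj_value, alphas, n_samples):
    """Assumed threshold form: obj_value * (1 + chi2 / (n_samples - 2))."""
    return {a: obj_value * (1.0 + chi2_ppf_2dof(a) / (n_samples - 2)) for a in alphas}


# Hypothetical (theta, objective) pairs of the kind objective_at_theta reports.
obj_at_theta = [({"k1": 0.9}, 4.1), ({"k1": 1.0}, 4.0), ({"k1": 1.4}, 9.5)]

thresholds = lr_thresholds(obj_value=4.0, alphas=[0.8, 0.95], n_samples=12)
# One dict per theta: True where the objective falls below the alpha threshold.
lr = [{a: obj < t for a, t in thresholds.items()} for _, obj in obj_at_theta]
```

Each theta is flagged True for an alpha when its objective is below that alpha's threshold, mirroring the True/False columns described in the return value.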

objective_at_theta(theta_values=None, initialize_parmest_model=False)[source]

Objective value for each theta, solving extensive form problem with fixed theta values.

theta_est(solver='ef_ipopt', return_values=[], calc_cov=NOTSET, cov_n=NOTSET)[source]

Parameter estimation using all scenarios in the data

Parameters:

solver (str, optional) – Currently only “ef_ipopt” is supported. Default is “ef_ipopt”.
return_values (list, optional) – List of Variable names, used to return values from the model for data reconciliation
calc_cov (bool, optional) – DEPRECATED. If True, calculate and return the covariance matrix (only for “ef_ipopt” solver). Default is NOTSET.
cov_n (int, optional) – DEPRECATED. If calc_cov=True, the user must supply the number of data points used in the objective function. Default is NOTSET.

Returns:

obj_val (float) – The objective function value
theta_vals (pd.Series) – Estimated values for theta
var_values (pd.DataFrame) – Variable values for each variable name in return_values (only for solver=’ef_ipopt’)

theta_est_bootstrap(bootstrap_samples, samplesize=None, replacement=True, seed=None, return_samples=False)[source]

Parameter estimation using bootstrap resampling of the data

Parameters:

bootstrap_samples (int) – Number of bootstrap samples to draw from the data
samplesize (int or None, optional) – Size of each bootstrap sample. If samplesize=None, samplesize will be set to the number of samples in the data
replacement (bool, optional) – Sample with or without replacement. Default is True.
seed (int or None, optional) – Random seed
return_samples (bool, optional) – Return a list of sample numbers used in each bootstrap estimation. Default is False.

Returns:

bootstrap_theta – Theta values for each sample and (if return_samples = True) the sample numbers used in each estimation

Return type:

pd.DataFrame
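The resampling behind the bootstrap_samples, samplesize, replacement, and seed arguments can be sketched with the standard library. This shows only how sample indices might be drawn. It is not parmest's code, and `bootstrap_indices` is a hypothetical helper; in the real method each index set would drive one parameter estimation.

```python
import random


def bootstrap_indices(n_data, bootstrap_samples, samplesize=None, replacement=True, seed=None):
    """Return one sorted list of data indices per bootstrap sample."""
    rng = random.Random(seed)  # a fixed seed makes the draws reproducible
    size = n_data if samplesize is None else samplesize
    indices = range(n_data)
    if replacement:
        return [sorted(rng.choices(indices, k=size)) for _ in range(bootstrap_samples)]
    return [sorted(rng.sample(indices, k=size)) for _ in range(bootstrap_samples)]


# Four bootstrap samples drawn from six hypothetical experiments.
samples = bootstrap_indices(n_data=6, bootstrap_samples=4, seed=0)
no_repeat = bootstrap_indices(n_data=6, bootstrap_samples=4, samplesize=3, replacement=False, seed=0)
```

With replacement an index may appear more than once in a sample. Without replacement, samplesize must not exceed the number of data points.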

theta_est_leaveNout(lNo, lNo_samples=None, seed=None, return_samples=False)[source]

Parameter estimation where N data points are left out of each sample

Parameters:

lNo (int) – Number of data points to leave out for parameter estimation
lNo_samples (int or None, optional) – Number of leave-N-out samples. If lNo_samples=None, the maximum number of combinations will be used. Default is None.
seed (int or None, optional) – Random seed
return_samples (bool, optional) – Return a list of sample numbers that were left out. Default is False.

Returns:

lNo_theta – Theta values for each sample and (if return_samples = True) the sample numbers left out of each estimation

Return type:

pd.DataFrame
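The phrase "maximum number of combinations" can be made concrete with a short sketch that enumerates which data points are left out. It is an illustration under assumptions, not parmest's implementation; `leave_n_out_samples` is a hypothetical helper and the data size is made up.

```python
import itertools
import math
import random


def leave_n_out_samples(n_data, lNo, lNo_samples=None, seed=None):
    """Return tuples of indices to leave out, one tuple per estimation."""
    combos = list(itertools.combinations(range(n_data), lNo))
    if lNo_samples is None:
        return combos  # use every possible combination
    # Otherwise draw a reproducible random subset of the combinations.
    return random.Random(seed).sample(combos, lNo_samples)


# Leaving 2 of 5 hypothetical data points out gives C(5, 2) = 10 combinations.
all_left_out = leave_n_out_samples(n_data=5, lNo=2)
subset = leave_n_out_samples(n_data=5, lNo=2, lNo_samples=3, seed=1)
max_combinations = math.comb(5, 2)
```

The number of combinations grows quickly with the data size, which is why capping it with lNo_samples is useful for larger data sets.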