API
parmest
- class pyomo.contrib.parmest.parmest.Estimator(experiment_list, obj_function=None, tee=False, diagnostic_mode=False, solver_options=None)[source]
Bases: object
Parameter estimation class
- Parameters:
experiment_list (list of Experiments) – A list of experiment objects which creates one labeled model for each experiment
obj_function (string or function (optional)) – Built-in objective (currently only “SSE”) or custom function used to formulate the parameter estimation objective. If no function is specified, the model is used “as is” and should be defined with “FirstStageCost” and “SecondStageCost” expressions that are used to build an objective. Default is None.
tee (bool, optional) – If True, print the solver output to the screen. Default is False.
diagnostic_mode (bool, optional) – If True, print diagnostics from the solver. Default is False.
solver_options (dict, optional) – Provides options to the solver (also the name of an attribute). Default is None.
- confidence_region_test(theta_values, distribution, alphas, test_theta_values=None)[source]
Confidence region test to determine if theta values are within a rectangular, multivariate normal, or Gaussian kernel density distribution for a range of alpha values
- Parameters:
theta_values (pd.DataFrame, columns = theta_names) – Theta values used to generate a confidence region (generally returned by theta_est_bootstrap)
distribution (string) – Statistical distribution used to define a confidence region, options = ‘MVN’ for multivariate_normal, ‘KDE’ for gaussian_kde, and ‘Rect’ for rectangular.
alphas (list) – List of alpha values used to determine if theta values are inside or outside the region.
test_theta_values (pd.Series or pd.DataFrame, keys/columns = theta_names, optional) – Additional theta values that are compared to the confidence region to determine if they are inside or outside.
- Returns:
training_results (pd.DataFrame) – Theta values used to generate the confidence region along with True (inside) or False (outside) for each alpha
test_results (pd.DataFrame) – If test_theta_values is not None, returns the test theta values along with True (inside) or False (outside) for each alpha
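The True/False columns described above can be illustrated for the rectangular case with a small standard-library sketch. The helper names (inside_rectangle, region_test) and the bounds are hypothetical and are not part of parmest; the sketch assumes the lower and upper bounds for each alpha have already been computed, and it uses plain dictionaries in place of DataFrames.

```python
def inside_rectangle(theta, lower, upper):
    """Return True if every theta component lies within its [lower, upper] bound."""
    return all(lower[name] <= theta[name] <= upper[name] for name in lower)


def region_test(theta_rows, bounds_by_alpha):
    """Append one True/False entry per alpha to a copy of each theta row."""
    results = []
    for theta in theta_rows:
        row = dict(theta)
        for alpha, (lower, upper) in bounds_by_alpha.items():
            row[alpha] = inside_rectangle(theta, lower, upper)
        results.append(row)
    return results


# Made-up bounds for two alpha levels (illustrative values only)
bounds_by_alpha = {
    0.8: ({"k1": 0.9, "k2": 1.8}, {"k1": 1.1, "k2": 2.2}),
    0.95: ({"k1": 0.8, "k2": 1.5}, {"k1": 1.2, "k2": 2.5}),
}
theta_rows = [
    {"k1": 1.0, "k2": 2.0},
    {"k1": 1.15, "k2": 2.0},
    {"k1": 1.5, "k2": 2.0},
]
region_results = region_test(theta_rows, bounds_by_alpha)
```

Each resulting row keeps its theta values and gains one Boolean entry keyed by alpha, which mirrors the layout of training_results.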
- leaveNout_bootstrap_test(lNo, lNo_samples, bootstrap_samples, distribution, alphas, seed=None)[source]
Leave-N-out bootstrap test to compare theta values where N data points are left out to a bootstrap analysis using the remaining data, results indicate if theta is within a confidence region determined by the bootstrap analysis
- Parameters:
lNo (int) – Number of data points to leave out for parameter estimation
lNo_samples (int) – Leave-N-out sample size. If lNo_samples=None, the maximum number of combinations will be used
bootstrap_samples (int) – Bootstrap sample size
distribution (string) – Statistical distribution used to define a confidence region, options = ‘MVN’ for multivariate_normal, ‘KDE’ for gaussian_kde, and ‘Rect’ for rectangular.
alphas (list) – List of alpha values used to determine if theta values are inside or outside the region.
seed (int or None, optional) – Random seed
- Returns:
List of tuples with one entry per lNo_sample
* The first item in each tuple is the list of N samples that are left out.
* The second item in each tuple is a DataFrame of theta estimated using the N samples.
* The third item in each tuple is a DataFrame containing results from the bootstrap analysis using the remaining samples.
For each DataFrame a column is added for each value of alpha which indicates if the theta estimate is in (True) or out (False) of the alpha region for a given distribution (based on the bootstrap results)
- likelihood_ratio_test(obj_at_theta, obj_value, alphas, return_thresholds=False)[source]
Likelihood ratio test to identify theta values within a confidence region using the \(\chi^2\) distribution
- Parameters:
obj_at_theta (pd.DataFrame, columns = theta_names + 'obj') – Objective values for each theta value (returned by objective_at_theta)
obj_value (int or float) – Objective value from parameter estimation using all data
alphas (list) – List of alpha values to use in the chi2 test
return_thresholds (bool, optional) – Return the threshold value for each alpha. Default is False.
- Returns:
LR (pd.DataFrame) – Objective values for each theta value along with True or False for each alpha
thresholds (pd.Series) – If return_thresholds=True, the thresholds are also returned.
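One way to turn alpha values into objective thresholds is to scale the best objective value by a \(\chi^2\) quantile. The sketch below is an illustration under stated assumptions, not a statement of how parmest implements the test: it assumes two degrees of freedom (for which the quantile has the closed form \(-2\ln(1-\alpha)\)), a scaling by the number of data points n, and hypothetical helper names (lr_thresholds, lr_test). Rows are plain dictionaries standing in for DataFrame rows.

```python
import math


def lr_thresholds(obj_value, alphas, n):
    """Assumed threshold form: obj_value * (chi2_quantile / (n - 2) + 1)."""
    thresholds = {}
    for a in alphas:
        chi2_val = -2.0 * math.log(1.0 - a)  # chi-squared quantile, 2 degrees of freedom
        thresholds[a] = obj_value * (chi2_val / (n - 2) + 1.0)
    return thresholds


def lr_test(obj_at_theta, obj_value, alphas, n):
    """Flag each row (a dict with an 'obj' key) as inside (True) or outside (False)."""
    thresholds = lr_thresholds(obj_value, alphas, n)
    results = []
    for row in obj_at_theta:
        flagged = dict(row)
        for a, t in thresholds.items():
            flagged[a] = row["obj"] < t
        results.append(flagged)
    return results, thresholds


# Made-up objective values for two candidate thetas
rows = [{"k": 1.0, "obj": 10.5}, {"k": 2.0, "obj": 30.0}]
lr_results, lr_thresh = lr_test(rows, obj_value=10.0, alphas=[0.8, 0.95], n=12)
```

A theta whose objective falls below the threshold for a given alpha is treated as inside that confidence region.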
- objective_at_theta(theta_values=None, initialize_parmest_model=False)[source]
Objective value for each theta
- Parameters:
theta_values (pd.DataFrame, columns=theta_names) – Values of theta used to compute the objective
initialize_parmest_model (boolean) – If True: Solve square problem instance, build extensive form of the model for parameter estimation, and set flag model_initialized to True. Default is False.
- Returns:
obj_at_theta – Objective value for each theta (infeasible solutions are omitted).
- Return type:
pd.DataFrame
- theta_est(solver='ef_ipopt', return_values=[], calc_cov=False, cov_n=None)[source]
Parameter estimation using all scenarios in the data
- Parameters:
solver (string, optional) – Currently only “ef_ipopt” is supported. Default is “ef_ipopt”.
return_values (list, optional) – List of Variable names, used to return values from the model for data reconciliation
calc_cov (boolean, optional) – If True, calculate and return the covariance matrix (only for “ef_ipopt” solver). Default is False.
cov_n (int, optional) – If calc_cov=True, then the user needs to supply the number of datapoints that are used in the objective function.
- Returns:
objectiveval (float) – The objective function value
thetavals (pd.Series) – Estimated values for theta
variable values (pd.DataFrame) – Variable values for each variable name in return_values (only for solver=’ef_ipopt’)
cov (pd.DataFrame) – Covariance matrix of the fitted parameters (only for solver=’ef_ipopt’)
- theta_est_bootstrap(bootstrap_samples, samplesize=None, replacement=True, seed=None, return_samples=False)[source]
Parameter estimation using bootstrap resampling of the data
- Parameters:
bootstrap_samples (int) – Number of bootstrap samples to draw from the data
samplesize (int or None, optional) – Size of each bootstrap sample. If samplesize=None, samplesize will be set to the number of samples in the data
replacement (bool, optional) – Sample with or without replacement. Default is True.
seed (int or None, optional) – Random seed
return_samples (bool, optional) – Return a list of sample numbers used in each bootstrap estimation. Default is False.
- Returns:
bootstrap_theta – Theta values for each sample and (if return_samples = True) the sample numbers used in each estimation
- Return type:
pd.DataFrame
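The resampling step behind a bootstrap estimate can be sketched with the standard library alone. The helper bootstrap_sample_indices is hypothetical and is not part of parmest; it only illustrates how samplesize, replacement, and seed interact, assuming each bootstrap sample is a list of experiment indices.

```python
import random


def bootstrap_sample_indices(n_data, bootstrap_samples, samplesize=None,
                             replacement=True, seed=None):
    """Hypothetical helper: draw one list of data indices per bootstrap sample."""
    if samplesize is None:
        samplesize = n_data  # default to the number of samples in the data
    rng = random.Random(seed)
    population = range(n_data)
    samples = []
    for _ in range(bootstrap_samples):
        if replacement:
            draw = rng.choices(population, k=samplesize)
        else:
            draw = rng.sample(population, k=samplesize)
        samples.append(sorted(draw))
    return samples


boot_samples = bootstrap_sample_indices(n_data=6, bootstrap_samples=3, seed=0)
```

In an actual workflow, one parameter estimation would be run per index list, and the resulting theta values collected row by row.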
- theta_est_leaveNout(lNo, lNo_samples=None, seed=None, return_samples=False)[source]
Parameter estimation where N data points are left out of each sample
- Parameters:
lNo (int) – Number of data points to leave out for parameter estimation
lNo_samples (int) – Number of leave-N-out samples. If lNo_samples=None, the maximum number of combinations will be used
seed (int or None, optional) – Random seed
return_samples (bool, optional) – Return a list of sample numbers that were left out. Default is False.
- Returns:
lNo_theta – Theta values for each sample and (if return_samples = True) the sample numbers left out of each estimation
- Return type:
pd.DataFrame
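The relationship between lNo, lNo_samples, and the maximum number of combinations can be sketched as follows. The helper leave_n_out_samples is hypothetical and is not part of parmest; it assumes data points are identified by integer index.

```python
import itertools
import random


def leave_n_out_samples(n_data, lNo, lNo_samples=None, seed=None):
    """Hypothetical helper: tuples of data indices to leave out of each estimation."""
    combos = list(itertools.combinations(range(n_data), lNo))
    if lNo_samples is None or lNo_samples >= len(combos):
        return combos  # use every possible combination
    rng = random.Random(seed)
    return rng.sample(combos, lNo_samples)


left_out = leave_n_out_samples(n_data=5, lNo=2)
# Each estimation would use the data points that are NOT in the tuple.
kept = [[i for i in range(5) if i not in combo] for combo in left_out]
```

With 5 data points and lNo=2 there are 10 combinations, so passing a smaller lNo_samples selects a random subset of them.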
- pyomo.contrib.parmest.parmest.SSE(model)[source]
Sum of squared error between experiment_output model and data values
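As a plain-Python illustration of the quantity being computed (not of the Pyomo expression itself), the sum of squared errors over matching outputs can be sketched as below. The function name and the dictionary layout are assumptions made for the example.

```python
def sum_of_squared_errors(predicted, measured):
    """Sum of squared differences over the keys present in the measured data."""
    return sum((predicted[k] - measured[k]) ** 2 for k in measured)


# Made-up model outputs and data values
measured = {"y1": 1.0, "y2": 2.0, "y3": 4.0}
predicted = {"y1": 1.5, "y2": 2.0, "y3": 3.0}
sse_value = sum_of_squared_errors(predicted, measured)
```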
- pyomo.contrib.parmest.parmest.group_data(data, groupby_column_name, use_mean=None)[source]
DEPRECATED.
Group data by scenario
- Parameters:
data (DataFrame) – Data
groupby_column_name (string) – Name of data column which contains scenario numbers
use_mean (list of column names or None, optional) – Names of data columns which should be reduced to a single value per scenario by taking the mean
- Returns:
grouped_data (list of dictionaries) – Grouped data
Deprecated since version 6.7.2: This function (group_data) has been deprecated and may be removed in a future release.
scenariocreator
- class pyomo.contrib.parmest.scenariocreator.ParmestScen(name, ThetaVals, probability)[source]
Bases: object
A little container for scenarios; the Args are the attributes.
- class pyomo.contrib.parmest.scenariocreator.ScenarioCreator(pest, solvername)[source]
Bases: object
Create scenarios from parmest.
- Parameters:
- ScenariosFromBootstrap(addtoSet, numtomake, seed=None)[source]
Creates new scenarios from bootstrap samples of the experiments and adds them to addtoSet.
- Parameters:
addtoSet (ScenarioSet) – the scenarios will be added to this set
numtomake (int) – number of scenarios to create
- ScenariosFromExperiments(addtoSet)[source]
Creates new self.Scenarios list using the experiments only.
- Parameters:
addtoSet (ScenarioSet) – the scenarios will be added to this set
- Returns:
a ScenarioSet
- class pyomo.contrib.parmest.scenariocreator.ScenarioSet(name)[source]
Bases: object
Class to hold scenario sets
- Parameters:
name (str) – name of the set (might be “”)
- addone(scen)[source]
Add a scenario to the set
- Parameters:
scen (ParmestScen) – the scenario to add
graphics
- pyomo.contrib.parmest.graphics.fit_kde_dist(theta_values)[source]
Fit a Gaussian kernel-density distribution to theta values
- Parameters:
theta_values (DataFrame) – Theta values, columns = variable names
- Return type:
scipy.stats.gaussian_kde distribution
- pyomo.contrib.parmest.graphics.fit_mvn_dist(theta_values)[source]
Fit a multivariate normal distribution to theta values
- Parameters:
theta_values (DataFrame) – Theta values, columns = variable names
- Return type:
scipy.stats.multivariate_normal distribution
- pyomo.contrib.parmest.graphics.fit_rect_dist(theta_values, alpha)[source]
Fit an alpha-level rectangular distribution to theta values
- Parameters:
theta_values (DataFrame) – Theta values, columns = variable names
alpha (float) – Confidence interval value
- Return type:
tuple containing lower bound and upper bound for each variable
- pyomo.contrib.parmest.graphics.grouped_boxplot(data1, data2, normalize=False, group_names=['data1', 'data2'], filename=None)[source]
Plot a grouped boxplot to compare two datasets
The datasets can be normalized by the median and standard deviation of data1.
- Parameters:
data1 (DataFrame) – Data set, columns = variable names
data2 (DataFrame) – Data set, columns = variable names
normalize (bool, optional) – Normalize both datasets by the median and standard deviation of data1
group_names (list, optional) – Names used in the legend
filename (string, optional) – Filename used to save the figure
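The normalization described above, where both datasets are shifted by the median of data1 and scaled by its standard deviation, can be sketched without any plotting. The helper normalize_by_reference is hypothetical, uses dictionaries of lists in place of DataFrames, and assumes the sample standard deviation.

```python
import statistics


def normalize_by_reference(data1, data2):
    """Scale both datasets column-wise by the median and standard deviation of data1."""
    norm1, norm2 = {}, {}
    for col, values in data1.items():
        med = statistics.median(values)
        std = statistics.stdev(values)  # sample standard deviation (an assumption)
        norm1[col] = [(v - med) / std for v in values]
        norm2[col] = [(v - med) / std for v in data2[col]]
    return norm1, norm2


# Made-up values for a single variable
data1 = {"k1": [1.0, 2.0, 3.0]}
data2 = {"k1": [2.0, 4.0]}
norm1, norm2 = normalize_by_reference(data1, data2)
```

Because data2 is scaled with the statistics of data1, the two groups stay directly comparable on the same axis.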
- pyomo.contrib.parmest.graphics.grouped_violinplot(data1, data2, normalize=False, group_names=['data1', 'data2'], filename=None)[source]
Plot a grouped violinplot to compare two datasets
The datasets can be normalized by the median and standard deviation of data1.
- Parameters:
data1 (DataFrame) – Data set, columns = variable names
data2 (DataFrame) – Data set, columns = variable names
normalize (bool, optional) – Normalize both datasets by the median and standard deviation of data1
group_names (list, optional) – Names used in the legend
filename (string, optional) – Filename used to save the figure
- pyomo.contrib.parmest.graphics.pairwise_plot(theta_values, theta_star=None, alpha=None, distributions=[], axis_limits=None, title=None, add_obj_contour=True, add_legend=True, filename=None)[source]
Plot pairwise relationship for theta values, and optionally alpha-level confidence intervals and objective value contours
- Parameters:
theta_values (DataFrame or tuple) –
If theta_values is a DataFrame, then it contains one column for each theta variable and (optionally) an objective value column (‘obj’) and columns that contain Boolean results from confidence interval tests (labeled using the alpha value). Each row is a sample.
Theta variables can be computed from theta_est_bootstrap, theta_est_leaveNout, and leaveNout_bootstrap_test.
The objective value can be computed using the likelihood_ratio_test.
Results from confidence interval tests can be computed using the leaveNout_bootstrap_test, likelihood_ratio_test, and confidence_region_test.
If theta_values is a tuple, then it contains a mean, covariance, and number of samples (mean, cov, n) where mean is a dictionary or Series (indexed by variable name), covariance is a DataFrame (indexed by variable name, one column per variable name), and n is an integer. The mean and covariance are used to create a multivariate normal sample of n theta values. The covariance can be computed using theta_est(calc_cov=True).
theta_star (dict or Series, optional) – Estimated value of theta. The dictionary or Series is indexed by variable name. Theta_star is used to slice higher dimensional contour intervals in 2D
alpha (float, optional) – Confidence interval value. If an alpha value is given and the distributions list is empty, the data will be filtered by True/False values using the column name whose value equals alpha (see results from leaveNout_bootstrap_test, likelihood_ratio_test, and confidence_region_test)
distributions (list of strings, optional) – Statistical distribution used to define a confidence region, options = ‘MVN’ for multivariate_normal, ‘KDE’ for gaussian_kde, and ‘Rect’ for rectangular. Confidence interval is a 2D slice, using linear interpolation at theta_star.
axis_limits (dict, optional) – Axis limits in the format {variable: [min, max]}
title (string, optional) – Plot title
add_obj_contour (bool, optional) – Add a contour plot using the column ‘obj’ in theta_values. Contour plot is a 2D slice, using linear interpolation at theta_star.
add_legend (bool, optional) – Add a legend to the plot
filename (string, optional) – Filename used to save the figure