API

parmest

class pyomo.contrib.parmest.parmest.Estimator(model_function, data, theta_names, obj_function=None, tee=False, diagnostic_mode=False, solver_options=None)[source]

Bases: object

Parameter estimation class

Parameters:
  • model_function (function) – Function that generates an instance of the Pyomo model using ‘data’ as the input argument
  • data (pandas DataFrame, list of dictionaries, or list of json file names) – Data that is used to build an instance of the Pyomo model and build the objective function
  • theta_names (list of strings) – List of Var names to estimate
  • obj_function (function, optional) – Function used to formulate the parameter estimation objective, generally the sum of squared error between measurements and model variables. If no function is specified, the model is used “as is” and should define “FirstStageCost” and “SecondStageCost” expressions, which are used to build an objective for PySP.
  • tee (bool, optional) – If True, the output of the extensive form (ef) solver is printed (teed) to the screen
  • diagnostic_mode (bool, optional) – If True, print diagnostics from the solver
  • solver_options (dict, optional) – Provides options to the solver (also the name of an attribute)
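
To make the roles of data and obj_function more concrete, the sketch below evaluates a sum-of-squared-error objective in plain Python, without Pyomo. The response model, the record keys hour and y, and the parameter names asymptote and rate_constant are hypothetical choices for illustration only. In parmest the objective would instead be written as a Pyomo expression over model variables.

```python
import math

# Hypothetical measurements in the "list of dictionaries" style of data.
data = [
    {"hour": 1.0, "y": 8.3},
    {"hour": 2.0, "y": 10.3},
    {"hour": 3.0, "y": 19.0},
]


def response(theta, hour):
    """Hypothetical model: asymptote * (1 - exp(-rate_constant * hour))."""
    return theta["asymptote"] * (1.0 - math.exp(-theta["rate_constant"] * hour))


def sse(theta, records):
    """Sum of squared errors between measured y and the model response."""
    return sum((rec["y"] - response(theta, rec["hour"])) ** 2 for rec in records)


theta_guess = {"asymptote": 15.0, "rate_constant": 0.5}
print(sse(theta_guess, data))
```

Parameter estimation then amounts to searching for the theta that minimizes this quantity, which is the job the Estimator hands to the solver.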
confidence_region_test(theta_values, distribution, alphas, test_theta_values=None)[source]

Confidence region test to determine if theta values are within a rectangular, multivariate normal, or Gaussian kernel density distribution for a range of alpha values

Parameters:
  • theta_values (DataFrame, columns = theta_names) – Theta values used to generate a confidence region (generally returned by theta_est_bootstrap)
  • distribution (string) – Statistical distribution used to define a confidence region, options = ‘MVN’ for multivariate_normal, ‘KDE’ for gaussian_kde, and ‘Rect’ for rectangular.
  • alphas (list) – List of alpha values used to determine if theta values are inside or outside the region.
  • test_theta_values (dictionary or DataFrame, keys/columns = theta_names, optional) – Additional theta values that are compared to the confidence region to determine if they are inside or outside.
Returns:

  • training_results (DataFrame) – Theta values used to generate the confidence region along with True (inside) or False (outside) for each alpha
  • test_results (DataFrame) – If test_theta_values is not None, the test theta values along with True (inside) or False (outside) for each alpha
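
The True/False columns described above can be pictured with a simplified rectangular region. The sketch below is a stand-in, not parmest's computation: it assumes the region for each alpha is the mean plus or minus z standard deviations per parameter, with z taken from the standard normal distribution. The parameter names k1 and k2 and all values are made up.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical bootstrap estimates, one dictionary per sample.
theta_values = [
    {"k1": 0.82, "k2": 1.9},
    {"k1": 0.85, "k2": 2.1},
    {"k1": 0.80, "k2": 2.0},
    {"k1": 0.88, "k2": 2.2},
    {"k1": 0.84, "k2": 1.8},
]


def rect_bounds(samples, alpha):
    """Per-parameter (lower, upper) bounds of a two-sided alpha-level interval.

    ASSUMPTION: a normal quantile is used here for simplicity.
    """
    z = NormalDist().inv_cdf(1.0 - (1.0 - alpha) / 2.0)
    bounds = {}
    for name in samples[0]:
        column = [s[name] for s in samples]
        m, sd = mean(column), stdev(column)
        bounds[name] = (m - z * sd, m + z * sd)
    return bounds


def inside(theta, bounds):
    """True if every parameter lies within its interval."""
    return all(lo <= theta[name] <= hi for name, (lo, hi) in bounds.items())


def region_test(samples, alphas, test_points):
    """Return one row per test point with a True/False entry for each alpha."""
    rows = []
    for theta in test_points:
        row = dict(theta)
        for alpha in alphas:
            row[alpha] = inside(theta, rect_bounds(samples, alpha))
        rows.append(row)
    return rows


test_points = [{"k1": 0.84, "k2": 2.0}, {"k1": 2.0, "k2": 2.0}]
results = region_test(theta_values, [0.8, 0.95], test_points)
for row in results:
    print(row)
```

Each output row mirrors the layout described for test_results: the theta values followed by one boolean per alpha.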

leaveNout_bootstrap_test(lNo, lNo_samples, bootstrap_samples, distribution, alphas, seed=None)[source]

Leave-N-out bootstrap test to compare theta values where N data points are left out to a bootstrap analysis using the remaining data. The results indicate whether theta is within a confidence region determined by the bootstrap analysis.

Parameters:
  • lNo (int) – Number of data points to leave out for parameter estimation
  • lNo_samples (int or None) – Leave-N-out sample size. If lNo_samples=None, the maximum number of combinations will be used
  • bootstrap_samples (int) – Bootstrap sample size
  • distribution (string) – Statistical distribution used to define a confidence region, options = ‘MVN’ for multivariate_normal, ‘KDE’ for gaussian_kde, and ‘Rect’ for rectangular.
  • alphas (list) – List of alpha values used to determine if theta values are inside or outside the region.
  • seed (int or None, optional) – Random seed
Returns:

  • List of tuples with one entry per lNo_sample:
      • The first item in each tuple is the list of N samples that are left out.
      • The second item in each tuple is a DataFrame of theta estimated using the N samples.
      • The third item in each tuple is a DataFrame containing results from the bootstrap analysis using the remaining samples.
  • For each DataFrame, a column is added for each value of alpha which indicates if the theta estimate is in (True) or out (False) of the alpha region for a given distribution (based on the bootstrap results)

likelihood_ratio_test(obj_at_theta, obj_value, alphas, return_thresholds=False)[source]

Likelihood ratio test to identify theta values within a confidence region using the \(\chi^2\) distribution

Parameters:
  • obj_at_theta (DataFrame, columns = theta_names + 'obj') – Objective values for each theta value (returned by objective_at_theta)
  • obj_value (int or float) – Objective value from parameter estimation using all data
  • alphas (list) – List of alpha values to use in the chi2 test
  • return_thresholds (bool, optional) – Return the threshold value for each alpha
Returns:

  • LR (DataFrame) – Objective values for each theta value along with True or False for each alpha
  • thresholds (dictionary) – If return_thresholds = True, the thresholds are also returned.
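
As a rough illustration of how a chi-square threshold can turn objective values into True/False flags, consider the sketch below. It rests on two assumptions that are not stated in the text above: the chi-square distribution has 2 degrees of freedom, and the threshold is obj_value * (chi2 / (S - 2) + 1), where S is the number of data points. The helper names and the numbers are hypothetical.

```python
import math


def chi2_quantile_2dof(alpha):
    """Quantile of the chi-square distribution with 2 degrees of freedom.

    For 2 degrees of freedom the CDF is 1 - exp(-x / 2), so the quantile
    has the closed form -2 * ln(1 - alpha).
    """
    return -2.0 * math.log(1.0 - alpha)


def likelihood_ratio_rows(obj_at_theta, obj_value, alphas, num_data_points):
    """Flag each theta row as inside (True) or outside (False) per alpha.

    ASSUMPTION: threshold = obj_value * (chi2 / (S - 2) + 1), with S the
    number of data points.
    """
    thresholds = {
        a: obj_value * (chi2_quantile_2dof(a) / (num_data_points - 2) + 1.0)
        for a in alphas
    }
    rows = []
    for row in obj_at_theta:
        flagged = dict(row)
        for a in alphas:
            flagged[a] = row["obj"] < thresholds[a]
        rows.append(flagged)
    return rows, thresholds


# Hypothetical objective values for three candidate thetas.
obj_at_theta = [
    {"k1": 0.80, "obj": 4.4},
    {"k1": 0.84, "obj": 4.1},
    {"k1": 1.50, "obj": 30.0},
]
rows, thresholds = likelihood_ratio_rows(
    obj_at_theta, obj_value=4.0, alphas=[0.8, 0.95], num_data_points=20
)
for row in rows:
    print(row)
```

A larger alpha gives a larger threshold, so more theta values fall inside the region.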

objective_at_theta(theta_values)[source]

Objective value for each theta

Parameters:theta_values (DataFrame, columns=theta_names) – Values of theta used to compute the objective
Returns:obj_at_theta – Objective value for each theta (infeasible solutions are omitted).
Return type:DataFrame
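
One common way to prepare theta_values is a full grid over candidate values for each parameter. The sketch below builds such a grid as a list of dictionaries using only the standard library; the parameter names k1 and k2 and their ranges are hypothetical.

```python
import itertools

# Hypothetical parameter names and candidate values.
theta_ranges = {
    "k1": [0.7, 0.8, 0.9],
    "k2": [1.5, 2.0],
}

names = list(theta_ranges)
# One dictionary per grid point; each would become one row of theta_values.
theta_grid = [
    dict(zip(names, combo))
    for combo in itertools.product(*(theta_ranges[n] for n in names))
]
print(len(theta_grid))
print(theta_grid[0])
```

Passing theta_grid to pandas.DataFrame yields a DataFrame with one column per parameter name, which is the layout objective_at_theta expects.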
theta_est(solver='ef_ipopt', return_values=[], bootlist=None, calc_cov=False)[source]

Parameter estimation using all scenarios in the data

Parameters:
  • solver (string, optional) – “ef_ipopt” or “k_aug”. Default is “ef_ipopt”.
  • return_values (list, optional) – List of Variable names used to return values from the model
  • bootlist (list, optional) – List of bootstrap sample numbers, used internally when calling theta_est_bootstrap
  • calc_cov (boolean, optional) – If True, calculate and return the covariance matrix (only for “ef_ipopt” solver)
Returns:

  • objectiveval (float) – The objective function value
  • thetavals (dict) – A dictionary of all values for theta
  • variable values (pd.DataFrame) – Variable values for each variable name in return_values (only for ef_ipopt)
  • Hessian (dict) – A dictionary of dictionaries for the Hessian. The Hessian is not returned if the solver is ef_ipopt.
  • cov (numpy.array) – Covariance matrix of the fitted parameters (only for ef_ipopt)

theta_est_bootstrap(bootstrap_samples, samplesize=None, replacement=True, seed=None, return_samples=False)[source]

Parameter estimation using bootstrap resampling of the data

Parameters:
  • bootstrap_samples (int) – Number of bootstrap samples to draw from the data
  • samplesize (int or None, optional) –
    Size of each bootstrap sample. If samplesize=None, samplesize will be
    set to the number of samples in the data
  • replacement (bool, optional) – Sample with or without replacement
  • seed (int or None, optional) – Random seed
  • return_samples (bool, optional) – Return a list of sample numbers used in each bootstrap estimation
Returns:

bootstrap_theta – Theta values for each sample and (if return_samples = True) the sample numbers used in each estimation

Return type:

DataFrame
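
The resampling step behind a bootstrap analysis can be sketched as follows. This is a standalone illustration of how samplesize, replacement, and seed interact, not the code parmest runs; the function name bootstrap_indices is hypothetical.

```python
import random


def bootstrap_indices(num_data, bootstrap_samples, samplesize=None,
                      replacement=True, seed=None):
    """Draw lists of data indices for bootstrap resampling."""
    rng = random.Random(seed)
    if samplesize is None:
        # Default: each bootstrap sample is as large as the data set.
        samplesize = num_data
    population = range(num_data)
    draws = []
    for _ in range(bootstrap_samples):
        if replacement:
            sample = rng.choices(population, k=samplesize)
        else:
            sample = rng.sample(population, k=samplesize)
        # Sorted only to make the printed samples easier to read.
        draws.append(sorted(sample))
    return draws


draws = bootstrap_indices(num_data=6, bootstrap_samples=4, seed=42)
for sample in draws:
    print(sample)
```

Parameters would then be estimated once per index list, giving one row of theta values per bootstrap sample.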

theta_est_leaveNout(lNo, lNo_samples=None, seed=None, return_samples=False)[source]

Parameter estimation where N data points are left out of each sample

Parameters:
  • lNo (int) – Number of data points to leave out for parameter estimation
  • lNo_samples (int or None, optional) – Number of leave-N-out samples. If lNo_samples=None, the maximum number of combinations will be used
  • seed (int or None, optional) – Random seed
  • return_samples (bool, optional) – Return a list of sample numbers that were left out
Returns:

lNo_theta – Theta values for each sample and (if return_samples = True) the sample numbers left out of each estimation

Return type:

DataFrame
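
The "maximum number of combinations" mentioned for lNo_samples is the number of ways to choose N points out of the data set. The sketch below enumerates those leave-out sets with the standard library; the function name and the random subsampling strategy are assumptions for illustration.

```python
import itertools
import math
import random


def leave_n_out_sets(num_data, lNo, lNo_samples=None, seed=None):
    """Return tuples of data indices to leave out of each estimation."""
    all_sets = list(itertools.combinations(range(num_data), lNo))
    if lNo_samples is None:
        # Use every possible combination.
        return all_sets
    # ASSUMPTION: otherwise pick a random subset of the combinations.
    rng = random.Random(seed)
    return rng.sample(all_sets, lNo_samples)


every_set = leave_n_out_sets(num_data=6, lNo=2)
print(len(every_set), math.comb(6, 2))

subset = leave_n_out_sets(num_data=6, lNo=2, lNo_samples=4, seed=1)
for left_out in subset:
    remaining = [i for i in range(6) if i not in left_out]
    print(left_out, remaining)
```

The combination count grows quickly with the size of the data set, which is why a smaller lNo_samples is often practical.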

pyomo.contrib.parmest.parmest.group_data(data, groupby_column_name, use_mean=None)[source]

Group data by scenario

Parameters:
  • data (DataFrame) – Data
  • groupby_column_name (string) – Name of data column which contains scenario numbers
  • use_mean (list of column names or None, optional) – Name of data columns which should be reduced to a single value per scenario by taking the mean
Returns:

grouped_data – Grouped data

Return type:

list of dictionaries
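
The grouping described above can be sketched on plain records as follows. This version works on a list of dictionaries rather than a DataFrame, and it assumes that columns not listed in use_mean are kept as lists of the values in each scenario; the function name group_records and the column names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def group_records(records, groupby_column_name, use_mean=None):
    """Group records by scenario and return one dictionary per scenario."""
    use_mean = set(use_mean or [])
    groups = defaultdict(list)
    for rec in records:
        groups[rec[groupby_column_name]].append(rec)

    grouped = []
    for key in sorted(groups):
        rows = groups[key]
        entry = {}
        for col in rows[0]:
            values = [r[col] for r in rows]
            # ASSUMPTION: columns that are not averaged stay as lists.
            entry[col] = mean(values) if col in use_mean else values
        grouped.append(entry)
    return grouped


records = [
    {"experiment": 1, "temperature": 300.0, "conc": 0.9},
    {"experiment": 1, "temperature": 300.0, "conc": 0.7},
    {"experiment": 2, "temperature": 320.0, "conc": 0.8},
]
grouped = group_records(records, "experiment", use_mean=["temperature"])
for entry in grouped:
    print(entry)
```

Averaging is useful for quantities such as operating conditions that are constant within a scenario, while repeated measurements stay available individually.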

scenariocreator

class pyomo.contrib.parmest.scenariocreator.ParmestScen(name, ThetaVals, probability)[source]

Bases: object

A little container for scenarios; the Args are the attributes.

Parameters:
  • name (str) – name for reporting; might be “”
  • ThetaVals (dict) – ThetaVals[name]=val
  • probability (float) – probability of occurrence “near” these ThetaVals
class pyomo.contrib.parmest.scenariocreator.ScenarioCreator(pest, solvername)[source]

Bases: object

Create scenarios from parmest.

Parameters:
  • pest (Estimator) – the parmest object
  • solvername (str) – name of the solver (e.g. “ipopt”)
ScenariosFromBoostrap(addtoSet, numtomake, seed=None)[source]

Creates new self.Scenarios list using bootstrap estimates of theta from the experiments.

Parameters:
  • addtoSet (ScenarioSet) – the scenarios will be added to this set
  • numtomake (int) – number of scenarios to create
  • seed (int or None, optional) – Random seed
ScenariosFromExperiments(addtoSet)[source]

Creates new self.Scenarios list using the experiments only.

Parameters:addtoSet (ScenarioSet) – the scenarios will be added to this set
Returns:a ScenarioSet
class pyomo.contrib.parmest.scenariocreator.ScenarioSet(name)[source]

Bases: object

Class to hold scenario sets

Parameters:name (str) – name of the set (might be “”)

ScenarioNumber(scennum)[source]

Returns the scenario with the given, zero-based number

ScensIterator()[source]

Usage: for scenario in ScensIterator()

addone(scen)[source]

Add a scenario to the set

Parameters:scen (ParmestScen) – the scenario to add
append_bootstrap(bootstrap_theta)[source]

Append a bootstrap theta DataFrame to the scenario set; the resulting scenarios are equally likely

Parameters:bootstrap_theta (DataFrame) – created by the bootstrap
Note: this can be cleaned up considerably once the scenario list becomes a DataFrame, which is why it is in the ScenarioSet class.
write_csv(filename)[source]

Write a CSV file with the scenarios in the set

Parameters:filename (str) – full path and full name of file
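
The flow from bootstrap estimates to equally likely scenarios to a CSV file can be sketched without parmest. The Scenario class below is a hypothetical stand-in that only borrows the three attribute names of ParmestScen, and the CSV column layout is an assumption rather than the format write_csv produces.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Scenario:
    """Hypothetical stand-in with the same three attributes as ParmestScen."""
    name: str
    ThetaVals: dict
    probability: float


def scenarios_from_bootstrap(bootstrap_rows):
    """Turn rows of bootstrap theta estimates into equally likely scenarios."""
    prob = 1.0 / len(bootstrap_rows)
    return [
        Scenario(name=f"bootstrap{i}", ThetaVals=dict(row), probability=prob)
        for i, row in enumerate(bootstrap_rows)
    ]


def write_scenarios_csv(scenarios, handle):
    """Write one row per scenario to an open text handle."""
    # ASSUMPTION: columns are name, probability, then one column per theta.
    theta_names = list(scenarios[0].ThetaVals)
    writer = csv.writer(handle)
    writer.writerow(["Name", "Probability"] + theta_names)
    for s in scenarios:
        writer.writerow([s.name, s.probability] + [s.ThetaVals[n] for n in theta_names])


# Hypothetical bootstrap estimates of two parameters.
rows = [
    {"k1": 0.82, "k2": 1.9},
    {"k1": 0.85, "k2": 2.1},
    {"k1": 0.80, "k2": 2.0},
    {"k1": 0.88, "k2": 2.2},
]
scens = scenarios_from_bootstrap(rows)
buffer = io.StringIO()
write_scenarios_csv(scens, buffer)
print(buffer.getvalue())
```

Writing to an in-memory buffer keeps the example free of file system side effects; an open file handle works the same way.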

graphics

pyomo.contrib.parmest.graphics.fit_kde_dist(theta_values)[source]

Fit a Gaussian kernel-density distribution to theta values

Parameters:theta_values (DataFrame, columns = variable names) – Theta values
Returns:
Return type:scipy.stats.gaussian_kde distribution
pyomo.contrib.parmest.graphics.fit_mvn_dist(theta_values)[source]

Fit a multivariate normal distribution to theta values

Parameters:theta_values (DataFrame, columns = variable names) – Theta values
Returns:
Return type:scipy.stats.multivariate_normal distribution
pyomo.contrib.parmest.graphics.fit_rect_dist(theta_values, alpha)[source]

Fit an alpha-level rectangular distribution to theta values

Parameters:
  • theta_values (DataFrame, columns = variable names) – Theta values
  • alpha (float) – Confidence interval value
Returns:

Return type:

tuple containing lower bound and upper bound for each variable

pyomo.contrib.parmest.graphics.grouped_boxplot(data1, data2, normalize=False, group_names=['data1', 'data2'], filename=None)[source]

Plot a grouped boxplot to compare two datasets

The datasets can be normalized by the median and standard deviation of data1.

Parameters:
  • data1 (DataFrame, columns = variable names) – Data set
  • data2 (DataFrame, columns = variable names) – Data set
  • normalize (bool, optional) – Normalize both datasets by the median and standard deviation of data1
  • group_names (list, optional) – Names used in the legend
  • filename (string, optional) – Filename used to save the figure
pyomo.contrib.parmest.graphics.grouped_violinplot(data1, data2, normalize=False, group_names=['data1', 'data2'], filename=None)[source]

Plot a grouped violinplot to compare two datasets

The datasets can be normalized by the median and standard deviation of data1.

Parameters:
  • data1 (DataFrame, columns = variable names) – Data set
  • data2 (DataFrame, columns = variable names) – Data set
  • normalize (bool, optional) – Normalize both datasets by the median and standard deviation of data1
  • group_names (list, optional) – Names used in the legend
  • filename (string, optional) – Filename used to save the figure
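
The normalization option shared by grouped_boxplot and grouped_violinplot can be sketched as follows, assuming it means subtracting the median of data1 and dividing by the standard deviation of data1, column by column. The function name and the dictionary-of-lists layout are hypothetical simplifications of the DataFrame inputs.

```python
from statistics import median, stdev


def normalize_by_reference(data1, data2):
    """Shift by the median and scale by the standard deviation of data1."""
    normalized1, normalized2 = {}, {}
    for name, reference in data1.items():
        center, scale = median(reference), stdev(reference)
        normalized1[name] = [(x - center) / scale for x in reference]
        # data2 is scaled with the statistics of data1, not its own.
        normalized2[name] = [(x - center) / scale for x in data2[name]]
    return normalized1, normalized2


# Hypothetical values for a single variable.
data1 = {"k1": [1.0, 2.0, 3.0, 4.0, 5.0]}
data2 = {"k1": [3.0, 6.0]}
norm1, norm2 = normalize_by_reference(data1, data2)
print(norm1)
print(norm2)
```

Using the statistics of data1 for both sets puts them on a common scale, so differences in location and spread remain visible in the plot.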
pyomo.contrib.parmest.graphics.pairwise_plot(theta_values, theta_star=None, alpha=None, distributions=[], axis_limits=None, title=None, add_obj_contour=True, add_legend=True, filename=None)[source]

Plot pairwise relationship for theta values, and optionally alpha-level confidence intervals and objective value contours

Parameters:
  • theta_values (DataFrame, columns = variable names and (optionally) 'obj' and alpha values) – Theta values and (optionally) an objective value and results from leaveNout_bootstrap_test, likelihood_ratio_test, or confidence_region_test
  • theta_star (dict or Series, keys = variable names, optional) – Theta* (or other individual values of theta, also used to slice higher dimensional contour intervals in 2D)
  • alpha (float, optional) – Confidence interval value, if an alpha value is given and the distributions list is empty, the data will be filtered by True/False values using the column name whose value equals alpha (see results from leaveNout_bootstrap_test, likelihood_ratio_test, or confidence_region_test)
  • distributions (list of strings, optional) – Statistical distribution used to define a confidence region, options = ‘MVN’ for multivariate_normal, ‘KDE’ for gaussian_kde, and ‘Rect’ for rectangular. Confidence interval is a 2D slice, using linear interpolation at theta*.
  • axis_limits (dict, optional) – Axis limits in the format {variable: [min, max]}
  • title (string, optional) – Plot title
  • add_obj_contour (bool, optional) – Add a contour plot using the column ‘obj’ in theta_values. Contour plot is a 2D slice, using linear interpolation at theta*.
  • add_legend (bool, optional) – Add a legend to the plot
  • filename (string, optional) – Filename used to save the figure