Cost

class empulse.metrics.Cost

Strategy for the Expected Cost metric.

build(tp_benefit, tn_benefit, fp_cost, fn_cost)

Build the metric strategy.

gradient_boost_objective(y_true, y_score, **kwargs)

Compute the gradient and hessian of the metric loss for use as a gradient boosting objective.

Parameters:
y_true: array-like of shape (n_samples,)

The ground truth labels.

y_score: array-like of shape (n_samples,)

The predicted labels, probabilities, or decision scores (based on the chosen metric).

parameters: float or array-like of shape (n_samples,)

The parameter values for the costs and benefits defined in the metric. If any parameter is a stochastic variable, you should pass values for its distribution parameters. You can set the parameter values for either the symbol names or their aliases.

  • If float, the same value is used for all samples (class-dependent).

  • If array-like, the values are used for each sample (instance-dependent).

Returns:
gradient: NDArray of shape (n_samples,)

The gradient of the metric loss with respect to the gradient boosting weights.

hessian: NDArray of shape (n_samples,)

The hessian of the metric loss with respect to the gradient boosting weights.
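As a rough illustration of what a per-sample gradient and hessian of an expected cost loss can look like, the sketch below differentiates the loss with respect to a raw (pre-sigmoid) score. It is not empulse's implementation: the function name `cost_gradient_hessian`, the sigmoid link, and the convention that benefits count as negative costs are all assumptions made for this example.

```python
import math


def sigmoid(z):
    """Map a raw score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def cost_gradient_hessian(y_true, raw_score, tp_benefit=0.0, tn_benefit=0.0,
                          fp_cost=0.0, fn_cost=0.0):
    """Hypothetical helper: per-sample derivatives of the expected cost.

    Assumed per-sample loss, with s = sigmoid(z):
        y * (s * -tp_benefit + (1 - s) * fn_cost)
        + (1 - y) * (s * fp_cost + (1 - s) * -tn_benefit)
    """
    gradient, hessian = [], []
    for y, z in zip(y_true, raw_score):
        s = sigmoid(z)
        # Derivative of the loss with respect to the probability s.
        d_cost_d_s = -y * (tp_benefit + fn_cost) + (1 - y) * (fp_cost + tn_benefit)
        # Chain rule through the sigmoid: ds/dz = s * (1 - s).
        gradient.append(d_cost_d_s * s * (1 - s))
        # Second derivative: d/dz [s * (1 - s)] = s * (1 - s) * (1 - 2 * s).
        hessian.append(d_cost_d_s * s * (1 - s) * (1 - 2 * s))
    return gradient, hessian


gradient, hessian = cost_gradient_hessian([1, 0], [0.0, 0.0], fp_cost=1.0, fn_cost=5.0)
print(gradient)  # [-1.25, 0.25]
```

The exact second derivative used here can be negative, so a real boosting objective may substitute a different curvature term; this sketch makes no claim about which choice the library uses.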

logit_objective(features, weights, y_true, **parameters)

Compute the metric value and the gradient of the metric with respect to logistic regression coefficients.

Parameters:
features: NDArray of shape (n_samples, n_features)

The features of the samples.

weights: NDArray of shape (n_features,)

The weights of the logistic regression model.

y_true: NDArray of shape (n_samples,)

The ground truth labels.

parameters: float or NDArray of shape (n_samples,)

The parameter values for the costs and benefits defined in the metric. If any parameter is a stochastic variable, you should pass values for its distribution parameters. You can set the parameter values for either the symbol names or their aliases.

  • If float, the same value is used for all samples (class-dependent).

  • If array-like, the values are used for each sample (instance-dependent).

Returns:
value: float

The metric loss to be minimized.

gradient: NDArray of shape (n_features,)

The gradient of the metric loss with respect to the logistic regression weights.
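To make the (value, gradient) pair concrete, here is a minimal sketch of an average expected cost and its gradient with respect to logistic regression weights. The function name `logit_cost_objective`, the absence of an intercept or regularization term, and the treatment of benefits as negative costs are assumptions for illustration, not a description of the library's internals.

```python
import math


def logit_cost_objective(features, weights, y_true, tp_benefit=0.0, tn_benefit=0.0,
                         fp_cost=0.0, fn_cost=0.0):
    """Hypothetical helper: average expected cost and its gradient w.r.t. weights."""
    n_samples = len(features)
    value = 0.0
    gradient = [0.0] * len(weights)
    for x, y in zip(features, y_true):
        # Linear score and predicted probability for this sample.
        z = sum(w * x_j for w, x_j in zip(weights, x))
        s = 1.0 / (1.0 + math.exp(-z))
        # Expected cost: benefits are counted as negative costs (assumed convention).
        value += (y * (s * -tp_benefit + (1 - s) * fn_cost)
                  + (1 - y) * (s * fp_cost + (1 - s) * -tn_benefit))
        # Chain rule: d(cost)/ds * ds/dz, then dz/dw_j = x_j.
        d_cost_d_s = -y * (tp_benefit + fn_cost) + (1 - y) * (fp_cost + tn_benefit)
        d_cost_d_z = d_cost_d_s * s * (1 - s)
        for j, x_j in enumerate(x):
            gradient[j] += d_cost_d_z * x_j
    return value / n_samples, [g / n_samples for g in gradient]


features = [[1.0, 2.0], [1.0, -1.0]]
print(logit_cost_objective(features, [0.0, 0.0], [1, 0], fp_cost=1.0, fn_cost=5.0))
# (1.5, [-0.5, -1.375])
```

Returning the value together with the gradient suits optimizers that accept a single callable for both, which is presumably why the method is shaped this way.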

optimal_rate(y_true, y_score, **parameters)

Compute the predicted positive rate that optimizes the metric value.

Parameters:
y_true: array-like of shape (n_samples,)

The ground truth labels.

y_score: array-like of shape (n_samples,)

The predicted labels, probabilities, or decision scores (based on the chosen metric).

parameters: float or array-like of shape (n_samples,)

The parameter values for the costs and benefits defined in the metric. If any parameter is a stochastic variable, you should pass values for its distribution parameters. You can set the parameter values for either the symbol names or their aliases.

  • If float, the same value is used for all samples (class-dependent).

  • If array-like, the values are used for each sample (instance-dependent).

Returns:
optimal_rate: float

The optimal predicted positive rate.
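One way to picture an optimal predicted positive rate is an empirical search: rank samples by score, flag the top k as positive, and keep the k with the lowest total cost. The sketch below does this for class-dependent costs. The name `best_positive_rate` and the search strategy are assumptions for illustration and may differ from how the library computes the rate; tied scores are not treated specially.

```python
def best_positive_rate(y_true, y_score, tp_benefit=0.0, tn_benefit=0.0,
                       fp_cost=0.0, fn_cost=0.0):
    """Hypothetical helper: fraction of top-ranked samples to flag as positive."""
    n = len(y_true)
    # Rank sample indices from highest to lowest score.
    ranked = sorted(range(n), key=lambda i: y_score[i], reverse=True)
    # Total cost when no sample is predicted positive.
    cost = sum(fn_cost if y_true[i] == 1 else -tn_benefit for i in range(n))
    best_cost, best_k = cost, 0
    for k, i in enumerate(ranked, start=1):
        # Moving sample i to the positive side swaps its outcome:
        # false negative -> true positive, or true negative -> false positive.
        if y_true[i] == 1:
            cost -= tp_benefit + fn_cost
        else:
            cost += fp_cost + tn_benefit
        if cost < best_cost:
            best_cost, best_k = cost, k
    return best_k / n


y_true = [1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.3, 0.1]
print(best_positive_rate(y_true, y_score, fp_cost=1.0, fn_cost=5.0))  # 0.6
```

In this toy data, flagging the top three samples catches both positives at the price of one false positive, which is cheaper than missing a positive at a cost of 5.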

optimal_threshold(y_true, y_score, **parameters)

Compute the classification threshold(s) that optimize the metric value.

This is the score threshold at which an observation should be classified as positive to optimize the metric. For instance-dependent costs and benefits, this returns an array of thresholds, one for each sample. For class-dependent costs and benefits, this returns a single threshold value.

Parameters:
y_true: array-like of shape (n_samples,)

The ground truth labels.

y_score: array-like of shape (n_samples,)

The predicted labels, probabilities, or decision scores (based on the chosen metric).

parameters: float or array-like of shape (n_samples,)

The parameter values for the costs and benefits defined in the metric. If any parameter is a stochastic variable, you should pass values for its distribution parameters. You can set the parameter values for either the symbol names or their aliases.

  • If float, the same value is used for all samples (class-dependent).

  • If array-like, the values are used for each sample (instance-dependent).

Returns:
optimal_threshold: float | FloatNDArray

The optimal classification threshold(s).
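The distinction between a single threshold and one threshold per sample can be sketched with the standard minimum-expected-cost decision rule: predict positive when the expected cost of doing so is lower than that of predicting negative. The helper name `cost_threshold` is hypothetical, the rule assumes calibrated probabilities and a positive denominator, and it is not confirmed to be the formula the library applies.

```python
def cost_threshold(tp_benefit=0.0, tn_benefit=0.0, fp_cost=0.0, fn_cost=0.0):
    """Hypothetical helper: probability above which predicting positive is cheaper.

    Predicting positive costs  p * -tp_benefit + (1 - p) * fp_cost.
    Predicting negative costs  p * fn_cost + (1 - p) * -tn_benefit.
    Setting the two equal and solving for p gives the threshold below.
    """
    loss_from_wrong_positive = fp_cost + tn_benefit
    loss_from_wrong_negative = fn_cost + tp_benefit
    return loss_from_wrong_positive / (loss_from_wrong_positive + loss_from_wrong_negative)


# Class-dependent: one set of costs, so one threshold.
print(round(cost_threshold(fp_cost=1.0, fn_cost=5.0), 4))  # 0.1667

# Instance-dependent: per-sample false negative costs give per-sample thresholds.
fn_costs = [5.0, 9.0, 1.0]
print([round(cost_threshold(fp_cost=1.0, fn_cost=c), 4) for c in fn_costs])
# [0.1667, 0.1, 0.5]
```

The more expensive a missed positive is relative to a false alarm, the lower the threshold drops, so costly samples are flagged even at modest predicted probabilities.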

score(y_true, y_score, **parameters)

Compute the expected cost loss.

Parameters:
y_true: array-like of shape (n_samples,)

The ground truth labels.

y_score: array-like of shape (n_samples,)

The predicted labels, probabilities, or decision scores (based on the chosen metric).

parameters: float or array-like of shape (n_samples,)

The parameter values for the costs and benefits defined in the metric. If any parameter is a stochastic variable, you should pass values for its distribution parameters. You can set the parameter values for either the symbol names or their aliases.

  • If float, the same value is used for all samples (class-dependent).

  • If array-like, the values are used for each sample (instance-dependent).

Returns:
score: float

The expected cost loss.
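To illustrate the difference between class-dependent (float) and instance-dependent (per-sample) parameters, the sketch below computes an average expected cost from predicted probabilities. The function `expected_cost` is a hypothetical stand-in, not the library's code; it assumes `y_score` holds probabilities and that benefits are counted as negative costs.

```python
def expected_cost(y_true, y_score, tp_benefit=0.0, tn_benefit=0.0,
                  fp_cost=0.0, fn_cost=0.0):
    """Hypothetical helper: average expected cost over all samples."""

    def per_sample(value, i):
        # A float applies to every sample; a sequence supplies one value per sample.
        return value[i] if isinstance(value, (list, tuple)) else value

    total = 0.0
    for i, (y, s) in enumerate(zip(y_true, y_score)):
        tp_b = per_sample(tp_benefit, i)
        tn_b = per_sample(tn_benefit, i)
        fp_c = per_sample(fp_cost, i)
        fn_c = per_sample(fn_cost, i)
        # Expected cost if the sample is truly positive / truly negative.
        cost_when_actual_positive = s * -tp_b + (1 - s) * fn_c
        cost_when_actual_negative = s * fp_c + (1 - s) * -tn_b
        total += y * cost_when_actual_positive + (1 - y) * cost_when_actual_negative
    return total / len(y_true)


y_true = [1, 0, 1, 0]
y_score = [0.9, 0.2, 0.4, 0.7]

# Class-dependent: the same costs for every sample.
print(round(expected_cost(y_true, y_score, fp_cost=1.0, fn_cost=5.0), 3))  # 1.1

# Instance-dependent: the third sample is twice as costly to miss.
print(round(expected_cost(y_true, y_score, fp_cost=1.0,
                          fn_cost=[5.0, 5.0, 10.0, 5.0]), 3))  # 1.85
```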

to_latex(tp_benefit, tn_benefit, fp_cost, fn_cost)

Return the LaTeX representation of the metric.