CostMatrix

class empulse.metrics.CostMatrix

Class to create a custom value/cost-sensitive cost matrix.

You add the costs and benefits that make up the cost matrix for each case (true positive, true negative, false positive, false negative). The costs and benefits are specified using sympy symbols or expressions. Stochastic variables are supported and can be specified using sympy.stats random variables. Stochastic variables are assumed to be independent of each other.

Read more in the User Guide.

Attributes:
tp_benefit: sympy.Expr

The benefit of a true positive. See add_tp_benefit for more details.

tn_benefit: sympy.Expr

The benefit of a true negative. See add_tn_benefit for more details.

fp_benefit: sympy.Expr

The benefit of a false positive. See add_fp_benefit for more details.

fn_benefit: sympy.Expr

The benefit of a false negative. See add_fn_benefit for more details.

tp_cost: sympy.Expr

The cost of a true positive. See add_tp_cost for more details.

tn_cost: sympy.Expr

The cost of a true negative. See add_tn_cost for more details.

fp_cost: sympy.Expr

The cost of a false positive. See add_fp_cost for more details.

fn_cost: sympy.Expr

The cost of a false negative. See add_fn_cost for more details.
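
To make the role of the eight attributes concrete, the sketch below combines a cost and a benefit per outcome into a net cost, then averages the expected cost over a few predictions. It uses plain Python numbers instead of sympy expressions, all values are hypothetical, and the expected-cost formula is one common formulation; whether empulse evaluates the matrix in exactly this way is an assumption.

```python
# Hypothetical numeric values standing in for the eight attributes.
costs = {'tp': 0.0, 'tn': 0.0, 'fp': 6.0, 'fn': 0.0}
benefits = {'tp': 27.5, 'tn': 0.0, 'fp': 0.0, 'fn': 0.0}

# Net cost per outcome: cost minus benefit (a negative value is a net gain).
net_cost = {case: costs[case] - benefits[case] for case in costs}


def expected_cost(y_true, y_proba, net):
    """Average expected cost, weighting each outcome by the predicted probability."""
    total = 0.0
    for y, p in zip(y_true, y_proba):
        if y == 1:
            # A true positive with probability p, otherwise a false negative.
            total += p * net['tp'] + (1 - p) * net['fn']
        else:
            # A false positive with probability p, otherwise a true negative.
            total += p * net['fp'] + (1 - p) * net['tn']
    return total / len(y_true)


y_true = [1, 0, 1, 0, 1]
y_proba = [0.9, 0.1, 0.8, 0.2, 0.7]
avg = expected_cost(y_true, y_proba, net_cost)
```

A negative average means that, under these illustrative numbers, the predictions yield a net gain per instance.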

Examples

Reimplementing the empc_score cost matrix.

import sympy as sp
import sympy.stats  # load the stats subpackage explicitly so that sp.stats is available
from empulse.metrics import CostMatrix

clv, d, f, alpha, beta = sp.symbols(
    'clv d f alpha beta'
)  # define deterministic variables
gamma = sp.stats.Beta('gamma', alpha, beta)  # define gamma to follow a Beta distribution

cost_matrix = (
    CostMatrix()
    .add_tp_benefit(gamma * (clv - d - f))  # when churner accepts offer
    .add_tp_benefit((1 - gamma) * -f)  # when churner does not accept offer
    .add_fp_cost(d + f)  # when you send an offer to a non-churner
    .alias({'incentive_cost': 'd', 'contact_cost': 'f'})
)
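
As a rough check on the two add_tp_benefit terms above, the sketch below replaces gamma by the mean of its Beta distribution and evaluates the terms with plain numbers. All parameter values are hypothetical. This only illustrates the expectation of the true positive benefit in isolation; it is not how empc_score itself is computed.

```python
# Hypothetical parameter values, chosen only for illustration.
alpha, beta = 6.0, 14.0
clv, d, f = 200.0, 10.0, 1.0

# Mean of a Beta(alpha, beta) random variable.
gamma_mean = alpha / (alpha + beta)

# The two add_tp_benefit terms from the example, with gamma replaced by its mean.
# Substituting the mean is valid here because both terms are linear in gamma.
accepts = gamma_mean * (clv - d - f)  # churner accepts the offer
declines = (1 - gamma_mean) * -f  # churner does not accept the offer
expected_tp_benefit = accepts + declines

# Algebraically, the sum simplifies to E[gamma] * (clv - d) - f.
simplified = gamma_mean * (clv - d) - f
```

The simplified form shows that the contact cost f is paid regardless of the outcome, while the incentive d is only paid when the offer is accepted.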
add_fn_benefit(term)

Add a term to the benefit of classifying a false negative.

Parameters:
term: sympy.Expr | str

The term to add to the benefit of classifying a false negative.

Returns:
CostMatrix
add_fn_cost(term)

Add a term to the cost of classifying a false negative.

Parameters:
term: sympy.Expr | str

The term to add to the cost of classifying a false negative.

Returns:
CostMatrix
add_fp_benefit(term)

Add a term to the benefit of classifying a false positive.

Parameters:
term: sympy.Expr | str

The term to add to the benefit of classifying a false positive.

Returns:
CostMatrix
add_fp_cost(term)

Add a term to the cost of classifying a false positive.

Parameters:
term: sympy.Expr | str

The term to add to the cost of classifying a false positive.

Returns:
CostMatrix
add_tn_benefit(term)

Add a term to the benefit of classifying a true negative.

Parameters:
term: sympy.Expr | str

The term to add to the benefit of classifying a true negative.

Returns:
CostMatrix
add_tn_cost(term)

Add a term to the cost of classifying a true negative.

Parameters:
term: sympy.Expr | str

The term to add to the cost of classifying a true negative.

Returns:
CostMatrix
add_tp_benefit(term)

Add a term to the benefit of classifying a true positive.

Parameters:
term: sympy.Expr | str

The term to add to the benefit of classifying a true positive.

Returns:
CostMatrix
add_tp_cost(term)

Add a term to the cost of classifying a true positive.

Parameters:
term: sympy.Expr | str

The term to add to the cost of classifying a true positive.

Returns:
CostMatrix
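
Each add_* method returns a CostMatrix, which is what allows the chained calls in the examples on this page. The toy class below is a hypothetical stand-in, not empulse code, that mimics only this behaviour with plain numbers: repeated calls accumulate terms as a sum, and every call returns the object so the next call can be chained. Whether empulse returns the same instance or a copy is not stated here.

```python
class ToyCostMatrix:
    """Hypothetical stand-in that mimics only the chaining behaviour."""

    def __init__(self):
        self.tp_benefit = 0.0
        self.fp_cost = 0.0

    def add_tp_benefit(self, term):
        self.tp_benefit += term  # repeated calls accumulate as a sum
        return self  # returning the object enables method chaining

    def add_fp_cost(self, term):
        self.fp_cost += term
        return self


toy = (
    ToyCostMatrix()
    .add_tp_benefit(28.2)  # first scenario contributing to the benefit
    .add_tp_benefit(-0.7)  # second scenario, added to the first
    .add_fp_cost(6.0)
)
```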
alias(alias, symbol=None)

Add an alias for a symbol.

Parameters:
alias: str | MutableMapping[str, sympy.Symbol | str]

The alias to add. If a MutableMapping (e.g., a dictionary) is passed, the keys are the aliases and the values are the symbols.

symbol: sympy.Symbol, optional

The symbol to alias to.

Returns:
CostMatrix

Examples

import sympy as sp
from empulse.metrics import Metric, Cost, CostMatrix

clv, delta, f, gamma = sp.symbols('clv delta f gamma')
cost_matrix = (
    CostMatrix()
    .add_tp_benefit(gamma * (clv - delta * clv - f))  # when churner accepts offer
    .add_tp_benefit((1 - gamma) * -f)  # when churner does not accept offer
    .add_fp_cost(delta * clv + f)  # when you send an offer to a non-churner
    .alias({'incentive_fraction': 'delta', 'contact_cost': 'f', 'accept_rate': 'gamma'})
)
cost_loss = Metric(cost_matrix, Cost())

y_true = [1, 0, 1, 0, 1]
y_proba = [0.9, 0.1, 0.8, 0.2, 0.7]
cost_loss(
    y_true, y_proba, clv=100, incentive_fraction=0.05, contact_cost=1, accept_rate=0.3
)
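
The effect of the aliases in the call above can be pictured as a renaming step followed by substitution. The sketch below is a conceptual illustration in plain Python, not the library's implementation: it maps the keyword arguments back to symbol names and evaluates the two expressions from the example by hand.

```python
# Alias -> symbol name, as passed to .alias() in the example above.
aliases = {'incentive_fraction': 'delta', 'contact_cost': 'f', 'accept_rate': 'gamma'}

# Keyword arguments as passed to cost_loss(...) in the example above.
call_kwargs = {'clv': 100, 'incentive_fraction': 0.05, 'contact_cost': 1, 'accept_rate': 0.3}

# Translate aliases back to symbol names; names without an alias pass through unchanged.
values = {aliases.get(name, name): value for name, value in call_kwargs.items()}

clv, delta, f, gamma = (values[s] for s in ('clv', 'delta', 'f', 'gamma'))

# The expressions from the example, evaluated with the resolved values.
tp_benefit = gamma * (clv - delta * clv - f) + (1 - gamma) * -f
fp_cost = delta * clv + f
```

With these inputs the true positive benefit evaluates to about 27.5 and the false positive cost to about 6.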
mark_outlier_sensitive(symbol)

Mark a symbol as outlier-sensitive.

This is used to indicate that the symbol is sensitive to outliers. When the metric is used as a loss function or criterion for training a model, RobustCSClassifier will impute outliers for this symbol’s value. This is ignored when not using a RobustCSClassifier model.

Parameters:
symbol: str | sympy.Symbol

The symbol to mark as outlier-sensitive.

Returns:
CostMatrix

Examples

import numpy as np
import sympy as sp
from empulse.metrics import Metric, Cost, CostMatrix
from empulse.models import CSLogitClassifier, RobustCSClassifier
from sklearn.datasets import make_classification

X, y = make_classification()
a, b = sp.symbols('a b')
cost_matrix = CostMatrix().add_fp_cost(a).add_fn_cost(b).mark_outlier_sensitive(a)
cost_loss = Metric(cost_matrix, Cost())
fp_cost = np.random.rand(y.size)  # per-instance false positive costs for symbol a

model = RobustCSClassifier(CSLogitClassifier(loss=cost_loss))
model.fit(X, y, a=fp_cost, b=5)
set_default(**defaults)

Set default values for symbols or their aliases.

Parameters:
defaults: float

Default values for symbols or their aliases. A default is used whenever the corresponding value is not passed to __call__.

Returns:
CostMatrix

Examples

import sympy as sp
from empulse.metrics import Metric, Cost, CostMatrix

clv, delta, f, gamma = sp.symbols('clv delta f gamma')
cost_matrix = (
    CostMatrix()
    .add_tp_benefit(gamma * (clv - delta * clv - f))  # when churner accepts offer
    .add_tp_benefit((1 - gamma) * -f)  # when churner does not accept offer
    .add_fp_cost(delta * clv + f)  # when you send an offer to a non-churner
    .alias({'incentive_fraction': 'delta', 'contact_cost': 'f', 'accept_rate': 'gamma'})
    .set_default(incentive_fraction=0.05, contact_cost=1, accept_rate=0.3)
)
cost_loss = Metric(cost_matrix, Cost())

y_true = [1, 0, 1, 0, 1]
y_proba = [0.9, 0.1, 0.8, 0.2, 0.7]
cost_loss(y_true, y_proba, clv=100, incentive_fraction=0.1)
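
The precedence described above, where a value passed at call time overrides the default, can be sketched with a plain dictionary merge. This is only an illustration of the documented behaviour, not the library's implementation.

```python
# Defaults as passed to .set_default() in the example above.
defaults = {'incentive_fraction': 0.05, 'contact_cost': 1, 'accept_rate': 0.3}

# Keyword arguments as passed to cost_loss(...) in the example above.
call_kwargs = {'clv': 100, 'incentive_fraction': 0.1}

# Later entries win, so call-time values override the defaults.
resolved = {**defaults, **call_kwargs}

# False positive cost from the example (delta * clv + f), using the resolved values.
fp_cost = resolved['incentive_fraction'] * resolved['clv'] + resolved['contact_cost']
```

Here incentive_fraction resolves to 0.1 from the call, while contact_cost and accept_rate fall back to their defaults.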