empa_score

empulse.metrics.empa_score(y_true, y_score, *, alpha=12, beta=0.0015, contact_cost=50, sales_cost=500, direct_selling=1, commission=0.1, check_input=True)

Equivalent to empa, but returns only the EMPA score.

EMPA assumes a setting in which leads are targeted either directly or indirectly. Directly targeted leads are contacted and handled by the internal sales team. Indirectly targeted leads are contacted and then referred to intermediaries, who receive a commission. The contribution of a successful acquisition is modeled as a random variable following a \(Gamma(\alpha, \beta)\) distribution with shape \(\alpha\) and rate \(\beta\).
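As a rough illustration of the contribution model, the sketch below uses only the Python standard library to compute the mean and standard deviation of a Gamma distribution with the signature defaults (alpha=12, beta=0.0015), treating beta as a rate parameter as stated in the parameter list. The helper name gamma_mean_std is hypothetical and is not part of empulse.

```python
import math
import random


def gamma_mean_std(alpha, beta):
    """Mean and standard deviation of Gamma(shape=alpha, rate=beta).

    Hypothetical helper for illustration; not part of empulse.
    """
    mean = alpha / beta
    std = math.sqrt(alpha) / beta
    return mean, std


mean, std = gamma_mean_std(alpha=12, beta=0.0015)
print(f"mean contribution: {mean:.0f}, std: {std:.0f}")

# Monte Carlo cross-check. random.gammavariate takes a *scale*
# parameter, so the rate beta is converted with scale = 1 / beta.
rng = random.Random(0)
draws = [rng.gammavariate(12, 1 / 0.0015) for _ in range(20_000)]
sample_mean = sum(draws) / len(draws)
print(f"sample mean: {sample_mean:.0f}")
```

With these defaults the expected contribution is alpha / beta = 8000, which gives a sense of scale relative to the default contact and sales costs.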

See also

empa : to also return the fraction of the leads that should be targeted to maximize profit.

mpa_score : for a deterministic version of this metric.

Parameters:

y_true : 1D array-like, shape=(n_samples,)
    Binary target values (‘acquisition’: 1, ‘no acquisition’: 0).
y_score : 1D array-like, shape=(n_samples,)
    Target scores, can either be probability estimates or non-thresholded decision values.
alpha : float, default=12
    Shape parameter of the gamma distribution of the average contribution of a new customer (alpha > 0).
beta : float, default=0.0015
    Rate parameter of the gamma distribution of the average contribution of a new customer (beta > 0).
sales_cost : float, default=500
    Average sale conversion cost of targeted leads handled by the company (sales_cost ≥ 0).
contact_cost : float, default=50
    Average contact cost of targeted leads (contact_cost ≥ 0).
direct_selling : float, default=1
    Fraction of leads sold to directly (0 ≤ direct_selling ≤ 1). Use direct_selling = 0 for a purely indirect channel and direct_selling = 1 for a purely direct channel.
commission : float, default=0.1
    Fraction of contribution paid to the intermediaries (0 ≤ commission ≤ 1).

Note

The commission is only relevant when there is an indirect channel (direct_selling < 1).
check_input : bool, default=True
    Perform input validation. Turning it off improves performance, which is useful when using this metric as a loss function.
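When check_input is turned off, the caller becomes responsible for keeping the parameters within the documented ranges. The sketch below shows one way such a pre-check might look. The function validate_empa_params is hypothetical and only mirrors the ranges listed above; the validation that empulse actually performs may differ.

```python
def validate_empa_params(alpha, beta, contact_cost, sales_cost,
                         direct_selling, commission):
    """Raise ValueError if a parameter violates its documented range.

    Hypothetical helper; empulse's own check_input logic may differ.
    """
    if not alpha > 0:
        raise ValueError("alpha must be > 0")
    if not beta > 0:
        raise ValueError("beta must be > 0")
    if contact_cost < 0:
        raise ValueError("contact_cost must be >= 0")
    if sales_cost < 0:
        raise ValueError("sales_cost must be >= 0")
    if not 0 <= direct_selling <= 1:
        raise ValueError("direct_selling must be in [0, 1]")
    if not 0 <= commission <= 1:
        raise ValueError("commission must be in [0, 1]")


# The signature defaults pass every check.
validate_empa_params(12, 0.0015, 50, 500, 1, 0.1)

# An out-of-range fraction is rejected.
try:
    validate_empa_params(12, 0.0015, 50, 500,
                         direct_selling=1.5, commission=0.1)
    rejected = None
except ValueError as err:
    rejected = str(err)
print(rejected)
```

Running the checks once up front, rather than on every call, keeps the per-call cost low when the metric is evaluated repeatedly.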

Returns:

empa : float
    Expected Maximum Profit measure for customer Acquisition.

Notes

The EMPA is defined as:

\[\int_{R} \left[ \left[ \rho(R - c - S) + (1 - \rho)(\gamma R - c) \right] \pi_0 F_0(t) - c \pi_1 F_1(t) \right] \cdot g(R) \, dR\]
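To make the inner bracket of the formula concrete, the following sketch transcribes the term ρ(R − c − S) + (1 − ρ)(γR − c) literally. The function name and the argument values are illustrative assumptions. ρ corresponds to direct_selling, as in the examples below; mapping c and S to contact_cost and sales_cost is assumed; and because this section does not spell out how γ relates to the commission parameter, gamma is simply passed through as the multiplier on R.

```python
def benefit_per_acquisition(R, c, S, rho, gamma):
    """Literal transcription of rho*(R - c - S) + (1 - rho)*(gamma*R - c).

    Hypothetical helper for illustration; not part of empulse.
    """
    direct = R - c - S          # lead handled by the internal sales team
    indirect = gamma * R - c    # lead referred to an intermediary
    return rho * direct + (1 - rho) * indirect


# Illustrative values only: R = 8000, with c and S set to the
# signature defaults contact_cost=50 and sales_cost=500 (assumed mapping).
direct_only = benefit_per_acquisition(R=8000, c=50, S=500, rho=1, gamma=0.9)
indirect_only = benefit_per_acquisition(R=8000, c=50, S=500, rho=0, gamma=0.9)
mixed = benefit_per_acquisition(R=8000, c=50, S=500, rho=0.5, gamma=0.9)
print(direct_only, indirect_only, mixed)
```

With these inputs the direct channel yields about 7450 per acquisition, the indirect channel about 7150, and an even mix about 7300, which shows how ρ interpolates linearly between the two channels.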

The EMPA formula above requires that the acquisition class be encoded as 0, and the two classes are NOT interchangeable. This implementation, however, expects the standard notation (‘acquisition’: 1, ‘no acquisition’: 0).
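If a dataset stores labels with acquisition encoded as 0, they would need to be flipped before being passed to empa_score. A minimal sketch of that conversion follows; to_standard_labels is a hypothetical helper, not an empulse function.

```python
def to_standard_labels(y, acquisition_label=0):
    """Map labels so that 'acquisition' becomes 1 and everything else 0.

    Hypothetical helper for illustration; not part of empulse.
    """
    return [1 if label == acquisition_label else 0 for label in y]


# Labels stored with 'acquisition' encoded as 0.
y_acquisition_as_zero = [1, 0, 1, 0, 1, 0, 1, 0]
y_true = to_standard_labels(y_acquisition_as_zero, acquisition_label=0)
print(y_true)  # [0, 1, 0, 1, 0, 1, 0, 1]
```

The resulting y_true follows the (‘acquisition’: 1, ‘no acquisition’: 0) convention expected by this implementation.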

Examples

Direct channel (rho = 1):

>>> from empulse.metrics import empa_score
>>>
>>> y_true = [0, 1, 0, 1, 0, 1, 0, 1]
>>> y_score = [0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 0.8, 0.9]
>>> empa_score(y_true, y_score, direct_selling=1)
3706.2500000052773

Indirect channel using scorer (rho = 0):

>>> import numpy as np
>>> from sklearn.datasets import make_classification
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import cross_val_score, StratifiedKFold
>>> from sklearn.metrics import make_scorer
>>> from empulse.metrics import empa_score
>>>
>>> X, y = make_classification(random_state=42)
>>> model = LogisticRegression()
>>> cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
>>> scorer = make_scorer(
...     empa_score,
...     response_method='predict_proba',
...     alpha=10,
...     beta=0.001,
...     sales_cost=2_000,
...     contact_cost=100,
...     direct_selling=0,
... )
>>> np.mean(cross_val_score(model, X, y, cv=cv, scoring=scorer))
4449.0