empc_score
- empulse.metrics.empc_score(y_true, y_score, *, alpha=6, beta=14, clv=200, incentive_cost=10, contact_cost=1, check_input=True)
Like empc, but returns only the EMPC score.
EMPC presumes a situation where identified churners are contacted and offered an incentive to remain customers. Only a fraction of churners accepts the incentive offer; this fraction is described by a \(Beta(\alpha, \beta)\) distribution. As opposed to empb, the incentive cost is a fixed value rather than a fraction of the customer lifetime value. For detailed information, consult the paper [1].
See also
empc: to also return the fraction of the customer base that should be targeted to maximize profit.
mpc_score: for a deterministic version of this metric.
empb_score: for a similar metric, but with a variable incentive cost.
- Parameters:
- y_true : 1D array-like, shape=(n_samples,)
Binary target values (‘churn’: 1, ‘no churn’: 0).
- y_score : 1D array-like, shape=(n_samples,)
Target scores; these can be either probability estimates or non-thresholded decision values.
- alpha : float, default=6
Shape parameter of the beta distribution of the probability that a churner accepts the incentive (alpha > 1).
- beta : float, default=14
Shape parameter of the beta distribution of the probability that a churner accepts the incentive (beta > 1).
- clv : float or 1D array-like, shape=(n_samples,), default=200
If float: average customer lifetime value of retained customers (clv > incentive_cost). If array: customer lifetime value of each customer when retained (mean(clv) > incentive_cost).
Note
Passing a CLV array is equivalent to passing a float with the average CLV of that array.
- incentive_cost : float, default=10
Cost of the incentive offered to a customer (incentive_cost > 0).
- contact_cost : float, default=1
Cost of contacting a customer (contact_cost > 0).
- check_input : bool, default=True
Perform input validation. Turning this off improves performance, which is useful when using this metric as a loss function.
- Returns:
- empc : float
Expected Maximum Profit Measure for Customer Churn.
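The parameter constraints listed above can be sketched in plain Python. The helper name `check_empc_params` is hypothetical and not part of empulse; it only mirrors the documented conditions, including the note that a CLV array is treated like its average. The validation the library performs when check_input=True may differ.

```python
from statistics import fmean


def check_empc_params(alpha=6, beta=14, clv=200, incentive_cost=10, contact_cost=1):
    """Hypothetical helper: check the documented constraints, return the mean CLV."""
    if alpha <= 1 or beta <= 1:
        raise ValueError("alpha and beta must both be greater than 1")
    if incentive_cost <= 0 or contact_cost <= 0:
        raise ValueError("incentive_cost and contact_cost must be positive")
    # A CLV array is documented as equivalent to passing its average as a float.
    mean_clv = float(clv) if isinstance(clv, (int, float)) else fmean(clv)
    if mean_clv <= incentive_cost:
        raise ValueError("(mean) clv must exceed incentive_cost")
    return mean_clv


print(check_empc_params())                     # defaults pass: 200.0
print(check_empc_params(clv=[150, 200, 250]))  # array reduced to its mean: 200.0
```

Reducing the array to its mean up front keeps the remaining checks identical for the float and array cases.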
Notes
The EMPC is defined as [1]:
\[\int_\gamma CLV (\gamma (1 - \delta) - \phi) \pi_0 F_0(T) - CLV (\delta + \phi) \pi_1 F_1(T) \, d\gamma\]
The EMPC requires that the churn class is encoded as 0, and it is NOT interchangeable (see [3], p. 37). However, this implementation assumes the standard notation (‘churn’: 1, ‘no churn’: 0).
An equivalent R implementation is available in [2].
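As a rough illustration of the expression in the Notes, the sketch below evaluates the integrand for a single acceptance rate \(\gamma\) at a fixed threshold. It is not the library's implementation. It assumes, following [1], that \(\delta\) and \(\phi\) are the incentive and contact costs divided by the CLV, and it reads \(F_0(T)\) and \(F_1(T)\) as the fractions of churners and non-churners that are targeted at threshold \(T\). The function name `churn_profit` and the class priors and fractions passed to it are made up for the example.

```python
def churn_profit(gamma, pi0, f0, pi1, f1, clv=200.0, incentive_cost=10.0, contact_cost=1.0):
    """Integrand of the EMPC expression for one acceptance rate gamma.

    Uses the formula's convention where class 0 is the churn class:
    pi0 / pi1 are the churner / non-churner priors, and f0 / f1 are the
    fractions of each class targeted at the chosen threshold (assumed reading).
    """
    delta = incentive_cost / clv  # incentive cost as a fraction of CLV (assumption from [1])
    phi = contact_cost / clv      # contact cost as a fraction of CLV (assumption from [1])
    benefit = clv * (gamma * (1 - delta) - phi) * pi0 * f0
    cost = clv * (delta + phi) * pi1 * f1
    return benefit - cost


# Mean of a Beta(alpha, beta) distribution is alpha / (alpha + beta).
alpha, beta = 6, 14
mean_gamma = alpha / (alpha + beta)  # 0.3 for the default shape parameters

# Illustrative inputs only: balanced classes, 75% of churners and
# 25% of non-churners targeted.
profit = churn_profit(mean_gamma, pi0=0.5, f0=0.75, pi1=0.5, f1=0.25)
print(round(profit, 3))  # 19.625
```

For a fixed threshold the expression is linear in \(\gamma\), so evaluating it at the mean acceptance rate gives the expected profit at that threshold. The EMPC itself picks the best threshold for each \(\gamma\) before averaging, so it is at least as large as this value.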
References
[1] Verbraken, T., Verbeke, W. and Baesens, B. (2013). A Novel Profit Maximizing Metric for Measuring Classification Performance of Customer Churn Prediction Models. IEEE Transactions on Knowledge and Data Engineering, 25(5), 961-973. Available Online: http://ieeexplore.ieee.org/iel5/69/6486492/06165289.pdf?arnumber=6165289
[2] Bravo, C., Vanden Broucke, S. and Verbraken, T. (2019). EMP: Expected Maximum Profit Classification Performance Measure. R package version 2.0.5. Available Online: http://cran.r-project.org/web/packages/EMP/index.html
[3] Verbraken, T. (2013). Business-Oriented Data Analytics: Theory and Case Studies. Ph.D. dissertation, Dept. LIRIS, KU Leuven, Leuven, Belgium.
Examples
>>> from empulse.metrics import empc_score
>>>
>>> y_true = [0, 1, 0, 1, 0, 1, 0, 1]
>>> y_score = [0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 0.8, 0.9]
>>> empc_score(y_true, y_score)
23.875593418348124
Using scorer:
>>> import numpy as np
>>> from sklearn.datasets import make_classification
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import cross_val_score, StratifiedKFold
>>> from sklearn.metrics import make_scorer
>>> from empulse.metrics import empc_score
>>>
>>> X, y = make_classification(random_state=42)
>>> model = LogisticRegression()
>>> cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
>>> scorer = make_scorer(
...     empc_score,
...     response_method='predict_proba',
...     clv=300,
...     incentive_cost=15,
... )
>>> np.mean(cross_val_score(model, X, y, cv=cv, scoring=scorer))
42.09000050753503