empcs_score

empulse.metrics.empcs_score(y_true, y_score, *, success_rate=0.55, default_rate=0.1, roi=0.2644, check_input=True)

Expected Maximum Profit measure for Credit Scoring (EMPCS). Equivalent to empcs, but returns only the EMPCS score.

EMPCS presumes a situation where a company is considering whether to grant a loan to a customer. Correctly identifying a defaulter avoids losing (part of) the loan amount, while incorrectly classifying a non-defaulter as a defaulter forfeits the return on investment (ROI) that the loan would have earned. The fraction of the loan that is lost follows a mixed distribution: with probability default_rate the entire loan is lost, with probability success_rate the entire loan is paid back, and with the remaining probability (1 - default_rate - success_rate) a partial loss is drawn uniformly between these two extremes. For detailed information, consult the paper [1].
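The loss distribution described above can be sketched in a few lines of NumPy. This is only an illustration under the stated assumptions, not part of empulse: the helper names `expected_loss_fraction` and `sample_loss_fraction` are hypothetical, and the loss fraction is assumed to be 0 when the loan is fully repaid and 1 when it is fully lost.

```python
import numpy as np


def expected_loss_fraction(success_rate=0.55, default_rate=0.1):
    """Mean of the mixed loss distribution (hypothetical helper)."""
    # Probability mass left over for partial losses, spread uniformly on (0, 1).
    partial_mass = 1.0 - success_rate - default_rate
    # Full repayment contributes 0, full loss contributes 1,
    # and a uniform partial loss contributes 0.5 on average.
    return default_rate * 1.0 + partial_mass * 0.5


def sample_loss_fraction(n, success_rate=0.55, default_rate=0.1, seed=0):
    """Draw n loss fractions from the mixed distribution (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    component = rng.random(n)   # selects which mixture component applies
    loss = rng.random(n)        # uniform partial loss by default
    loss[component < success_rate] = 0.0          # entire loan paid back
    loss[component >= 1.0 - default_rate] = 1.0   # entire loan lost
    return loss


samples = sample_loss_fraction(100_000)
print(expected_loss_fraction(), samples.mean())
```

With the default parameters, 55% of the draws are 0, 10% are 1, and the remaining 35% fall uniformly in between.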

See also

empcs : to also return the fraction of loan applications that should be accepted to maximize profit.

mpcs_score : for a deterministic version of this metric.

Parameters:
y_true : 1D array-like, shape=(n_samples,)

Binary target values (‘default’: 1, ‘no default’: 0).

y_score : 1D array-like, shape=(n_samples,)

Target scores, can either be probability estimates or non-thresholded decision values.

success_rate : float, default=0.55

Probability that the entire loan is paid back (0 ≤ success_rate ≤ 1).

default_rate : float, default=0.1

Probability that the entire loan is lost (0 ≤ default_rate ≤ 1).

roi : float, default=0.2644

Return on investment on the loan (roi ≥ 0).

check_input : bool, default=True

Whether to perform input validation. Disabling validation improves performance, which is useful when this metric is used as a loss function.

Returns:
empcs : float

Expected Maximum Profit measure for customer Credit Scoring.

Notes

The EMP measure for Credit Scoring is defined as [1]:

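\[\int_0^1 \left[ \lambda \pi_0 F_0(T) - \mathrm{ROI} \, \pi_1 F_1(T) \right] \cdot h(\lambda) \, d\lambda\]

Here \(\lambda\) is the fraction of the loan that is lost, \(h(\lambda)\) is its probability density, \(\pi_0\) and \(\pi_1\) are the prior probabilities of defaulters and non-defaulters, \(F_0\) and \(F_1\) are the cumulative score distributions of the two classes, and \(T\) is the profit-maximizing threshold for a given \(\lambda\).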

In [1], the EMP measure for Credit Scoring is defined with the default class encoded as 0, and the two classes are NOT interchangeable. This implementation, however, assumes the standard notation (‘default’: 1, ‘no default’: 0).

Code adapted from [2].
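As a rough illustration of the definition in these Notes, the integral can be evaluated by brute force: for every loss fraction on a grid, pick the threshold that maximizes profit, then weight the results by the mixed loss distribution. The sketch below is not the library's implementation. The function name `empcs_sketch` and the `n_grid` parameter are assumptions introduced here, and it follows the implementation's label convention (‘default’: 1).

```python
import numpy as np


def empcs_sketch(y_true, y_score, success_rate=0.55, default_rate=0.1,
                 roi=0.2644, n_grid=10_001):
    """Brute-force numerical evaluation of the EMPCS integral (illustrative only)."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)

    prior_default = np.mean(y_true == 1)   # pi_0 in the notation of [1]
    prior_good = 1.0 - prior_default       # pi_1 in the notation of [1]

    # Candidate thresholds: reject every applicant with score >= t.
    # The extra np.inf threshold corresponds to rejecting nobody.
    thresholds = np.append(np.unique(y_score), np.inf)
    frac_default_rejected = np.array(
        [np.mean(y_score[y_true == 1] >= t) for t in thresholds])
    frac_good_rejected = np.array(
        [np.mean(y_score[y_true == 0] >= t) for t in thresholds])

    def max_profit(loss_fraction):
        # Profit of every threshold for every loss fraction, then the best threshold.
        profit = (loss_fraction[:, None] * prior_default * frac_default_rejected[None, :]
                  - roi * prior_good * frac_good_rejected[None, :])
        return profit.max(axis=1)

    # Uniform part of the loss distribution, integrated with the trapezoidal rule.
    grid = np.linspace(0.0, 1.0, n_grid)
    values = max_profit(grid)
    uniform_part = np.sum((values[1:] + values[:-1]) / 2.0 * np.diff(grid))

    # Point masses at "fully repaid" (0) and "fully lost" (1).
    at_zero = max_profit(np.array([0.0]))[0]
    at_one = max_profit(np.array([1.0]))[0]
    partial_mass = 1.0 - success_rate - default_rate
    return success_rate * at_zero + default_rate * at_one + partial_mass * uniform_part


y_true = [0, 1, 0, 1, 0, 1, 0, 1]
y_score = [0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 0.8, 0.9]
print(empcs_sketch(y_true, y_score))
```

For this small data set the result should agree with the value shown in the Examples section up to numerical integration error. The sketch scales poorly because it evaluates every threshold for every grid point, so it is meant for understanding the definition rather than for production use.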

References

[1] Verbraken, T., Bravo, C., Weber, R., & Baesens, B. (2014). Development and application of consumer credit scoring models using profit-based classification measures. European Journal of Operational Research, 238(2), 505-513.

Examples

>>> from empulse.metrics import empcs_score
>>>
>>> y_true = [0, 1, 0, 1, 0, 1, 0, 1]
>>> y_score = [0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 0.8, 0.9]
>>> empcs_score(y_true, y_score)
0.09747017050000001

Using scorer:

>>> import numpy as np
>>> from sklearn.datasets import make_classification
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import cross_val_score, StratifiedKFold
>>> from sklearn.metrics import make_scorer
>>> from empulse.metrics import empcs_score
>>>
>>> X, y = make_classification(random_state=42)
>>> model = LogisticRegression()
>>> cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
>>> scorer = make_scorer(
...     empcs_score,
...     response_method='predict_proba',
...     roi=0.2,
...     success_rate=0.5,
...     default_rate=0.1,
... )
>>> np.mean(cross_val_score(model, X, y, cv=cv, scoring=scorer))
0.14904