B2BoostClassifier
- class empulse.models.B2BoostClassifier(estimator=None, *, accept_rate=0.3, clv=200, incentive_fraction=0.05, contact_cost=15)
Cost-sensitive gradient boosting classifier for B2B customer churn.
B2BoostClassifier supports xgboost.XGBClassifier, lightgbm.LGBMClassifier and catboost.CatBoostClassifier. By default, it uses an XGBoost classifier with default hyperparameters.
Read more in the User Guide.
- Parameters:
- estimator : xgboost.XGBClassifier, lightgbm.LGBMClassifier or catboost.CatBoostClassifier, optional
XGBoost, LightGBM or CatBoost classifier to be fit with the desired hyperparameters. If not provided, an XGBoost classifier with default hyperparameters is used.
- accept_rate : float, default=0.3
Probability of a customer responding to the retention offer (0 < accept_rate < 1). Is overwritten if another accept_rate is passed to the fit method.
- clv : float or 1D array-like, shape=(n_samples), default=200
If float: constant customer lifetime value per retained customer (clv > incentive_cost).
If array: individualized customer lifetime value of each customer when retained (mean(clv) > incentive_cost).
Is overwritten if another clv is passed to the fit method.
Note
It is not recommended to pass instance-dependent costs to the __init__ method. Instead, pass them to the fit method.
- incentive_fraction : float, default=0.05
Cost of incentive offered to a customer, as a fraction of customer lifetime value (0 < incentive_fraction < 1). Is overwritten if another incentive_fraction is passed to the fit method.
- contact_cost : float, default=15
Constant cost of contact (contact_cost > 0). Is overwritten if another contact_cost is passed to the fit method.
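The ranges documented for the scalar parameters can be checked before a model is constructed. The following is a minimal sketch only: `check_b2boost_params` is a hypothetical helper that is not part of empulse, and it deliberately leaves out the `clv > incentive_cost` condition because the source does not define how `incentive_cost` is computed.

```python
def check_b2boost_params(accept_rate=0.3, incentive_fraction=0.05, contact_cost=15):
    """Hypothetical helper (not an empulse function): validate documented ranges."""
    # Documented constraint: 0 < accept_rate < 1
    if not 0 < accept_rate < 1:
        raise ValueError(f"accept_rate must be in (0, 1), got {accept_rate}")
    # Documented constraint: 0 < incentive_fraction < 1
    if not 0 < incentive_fraction < 1:
        raise ValueError(f"incentive_fraction must be in (0, 1), got {incentive_fraction}")
    # Documented constraint: contact_cost > 0
    if not contact_cost > 0:
        raise ValueError(f"contact_cost must be positive, got {contact_cost}")
    return {
        "accept_rate": accept_rate,
        "incentive_fraction": incentive_fraction,
        "contact_cost": contact_cost,
    }


# The documented defaults satisfy every constraint.
validated = check_b2boost_params()
```

The returned dictionary could then be unpacked into the constructor, for example `B2BoostClassifier(**validated)`.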
- Attributes:
- classes_ : numpy.ndarray, shape=(n_classes,)
Unique classes in the target.
- estimator_ : xgboost.XGBClassifier
Fitted XGBoost classifier.
Notes
The instance-specific cost function for customer churn is defined as [1]:
\[C(s_i) = y_i[s_i(f - \gamma (1-\delta) CLV_i)] + (1-y_i)[s_i(\delta CLV_i + f)]\]
The measure requires that the churn class is encoded as 0, and it is NOT interchangeable. However, this implementation assumes the standard notation (‘churn’: 1, ‘no churn’: 0).
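The cost function can be evaluated directly with NumPy. This is an illustrative sketch, not the library's internal objective: `churn_cost` is a hypothetical name, and the mapping of symbols to parameters (f = contact_cost, γ = accept_rate, δ = incentive_fraction, s_i = predicted churn score) is an assumption inferred from the parameter descriptions. It uses the encoding ‘churn’: 1.

```python
import numpy as np


def churn_cost(y_true, y_score, clv, accept_rate=0.3,
               incentive_fraction=0.05, contact_cost=15):
    """Hypothetical sketch of C(s_i); assumes f, gamma, delta map to
    contact_cost, accept_rate and incentive_fraction respectively."""
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)
    clv = np.asarray(clv, dtype=float)
    # Churners (y = 1): contact cost minus the expected retained value.
    churner_cost = y_score * (contact_cost - accept_rate * (1 - incentive_fraction) * clv)
    # Non-churners (y = 0): the incentive is given away on top of the contact cost.
    non_churner_cost = y_score * (incentive_fraction * clv + contact_cost)
    return y_true * churner_cost + (1 - y_true) * non_churner_cost


# One churner and one non-churner, both targeted with score 1.0 and clv = 200.
costs = churn_cost(y_true=[1, 0], y_score=[1.0, 1.0], clv=200)
```

With the default parameters, targeting the churner yields a negative cost (a profit), while targeting the non-churner yields a positive cost.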
See also
create_objective_churn: Creates the instance-dependent cost function for customer churn.
References
[1] Janssens, B., Bogaert, M., Bagué, A., & Van den Poel, D. (2022). B2Boost: Instance-dependent profit-driven modelling of B2B churn. Annals of Operations Research, 1-27.
Examples
# Fit with instance-dependent customer lifetime values.
import numpy as np
from empulse.models import B2BoostClassifier
from sklearn.datasets import make_classification

X, y = make_classification()
clv = np.random.rand(y.size) * 100

model = B2BoostClassifier()
model.fit(X, y, clv=clv, incentive_fraction=0.1)

# Cross-validate inside a pipeline, routing clv to fit via metadata routing.
import numpy as np
from empulse.models import B2BoostClassifier
from sklearn import set_config
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

set_config(enable_metadata_routing=True)

X, y = make_classification(n_samples=50)
clv = np.random.rand(y.size) * 100

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('model', B2BoostClassifier(contact_cost=10).set_fit_request(clv=True))
])

cross_val_score(pipeline, X, y, params={'clv': clv})

# Grid search over the wrapped estimator's learning rate, scored with empb_score.
import numpy as np
from empulse.metrics import empb_score
from empulse.models import B2BoostClassifier
from sklearn import set_config
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import make_scorer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from xgboost import XGBClassifier

set_config(enable_metadata_routing=True)

X, y = make_classification()
clv = np.random.rand(y.size) * 100
contact_cost = 10

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('model', B2BoostClassifier(
        XGBClassifier(n_jobs=2, n_estimators=10),
        contact_cost=contact_cost
    ).set_fit_request(clv=True))
])
param_grid = {
    'model__estimator__learning_rate': np.logspace(-5, 0, 5),
}
scorer = make_scorer(
    empb_score,
    response_method='predict_proba',
    contact_cost=contact_cost
)
scorer = scorer.set_score_request(clv=True)

grid_search = GridSearchCV(pipeline, param_grid=param_grid, scoring=scorer)
grid_search.fit(X, y, clv=clv)
- fit(X, y, *, accept_rate=Parameter.UNCHANGED, clv=Parameter.UNCHANGED, incentive_fraction=Parameter.UNCHANGED, contact_cost=Parameter.UNCHANGED, fit_params=None, **loss_params)
Fit the model.
- Parameters:
- X : array-like of shape (n_samples, n_features)
- y : array-like of shape (n_samples,)
- accept_rate : float, default=0.3
Probability of a customer responding to the retention offer (0 < accept_rate < 1).
- clv : float or 1D array-like, shape=(n_samples), default=200
If float: constant customer lifetime value per retained customer (clv > incentive_cost).
If array: individualized customer lifetime value of each customer when retained (mean(clv) > incentive_cost).
- incentive_fraction : float, default=0.05
Cost of incentive offered to a customer, as a fraction of customer lifetime value (0 < incentive_fraction < 1).
- contact_cost : float, default=15
Constant cost of contact (contact_cost > 0).
- fit_params : dict, optional
Additional parameters to pass to the estimator’s fit method.
- loss_params : dict
Additional keyword arguments to pass to the loss function.
- Returns:
- self : B2BoostClassifier
Fitted B2Boost model.
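The `Parameter.UNCHANGED` defaults in the signature suggest that values passed to fit take precedence over the constructor values, and that omitted ones fall back to them. The sketch below illustrates that fallback pattern under stated assumptions: the `Parameter` enum here is a stand-in that only mirrors the name in the signature, and `resolve` is a hypothetical helper, not empulse's internal logic.

```python
from enum import Enum


class Parameter(Enum):
    """Stand-in sentinel; only mirrors the name shown in the fit signature."""
    UNCHANGED = "unchanged"


def resolve(fit_value, init_value):
    """Hypothetical helper: keep the constructor value unless fit overrides it."""
    return init_value if fit_value is Parameter.UNCHANGED else fit_value


# Values given to the constructor (here, the documented defaults).
init_params = {"accept_rate": 0.3, "clv": 200, "incentive_fraction": 0.05, "contact_cost": 15}
# Values given to fit; anything omitted stays at the sentinel.
fit_kwargs = {"clv": [120.0, 80.0], "incentive_fraction": 0.1}

effective = {
    name: resolve(fit_kwargs.get(name, Parameter.UNCHANGED), init_value)
    for name, init_value in init_params.items()
}
```

A dedicated sentinel, rather than `None`, lets a caller distinguish "not passed" from an explicitly passed value.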
- get_metadata_routing()
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routing : MetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
Parameter names mapped to their values.
- predict(X)
Predict class labels for samples in X.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Features.
- Returns:
- y_pred : ndarray of shape (n_samples,)
Predicted labels for each sample.
- predict_proba(X)
Predict class probabilities for X.
- Parameters:
- X : 2D numpy.ndarray, shape=(n_samples, n_features)
- Returns:
- y_pred : 2D numpy.ndarray, shape=(n_samples, n_classes)
Predicted class probabilities.
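The relationship between the `(n_samples, n_classes)` probability array and class labels can be illustrated as follows. This sketch assumes labels are taken as the most probable class, which is a common convention for scikit-learn classifiers; the source does not state how predict derives its labels. The arrays are made-up values, and `classes` stands in for the fitted `classes_` attribute.

```python
import numpy as np

# Stand-in for the fitted classes_ attribute ('no churn': 0, 'churn': 1).
classes = np.array([0, 1])

# Made-up output in the shape returned by predict_proba: one row per sample,
# one column per class, each row summing to 1.
proba = np.array([
    [0.80, 0.20],
    [0.35, 0.65],
    [0.50, 0.50],
])

# Assumption: pick the most probable class per row (ties go to the first class).
labels = classes[np.argmax(proba, axis=1)]

# Churn scores alone are the column belonging to class 1.
churn_scores = proba[:, 1]
```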
- score(X, y, sample_weight=None)
Return accuracy on provided data and labels.
In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Test samples.
- y : array-like of shape (n_samples,) or (n_samples, n_outputs)
True labels for X.
- sample_weight : array-like of shape (n_samples,), default=None
Sample weights.
- Returns:
- score : float
Mean accuracy of self.predict(X) w.r.t. y.
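Mean accuracy with optional sample weights can be written in a few lines of NumPy. The function below is a hypothetical stand-alone sketch of the quantity described above for single-label targets, not the method's actual implementation; the labels are made-up values.

```python
import numpy as np


def accuracy(y_true, y_pred, sample_weight=None):
    """Hypothetical sketch: (weighted) fraction of correctly predicted labels."""
    correct = (np.asarray(y_true) == np.asarray(y_pred)).astype(float)
    # With weights=None, np.average reduces to the plain mean.
    return float(np.average(correct, weights=sample_weight))


y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]  # third sample is misclassified

unweighted = accuracy(y_true, y_pred)
weighted = accuracy(y_true, y_pred, sample_weight=[1, 1, 2, 1])
```

Giving the misclassified sample a larger weight lowers the weighted score relative to the unweighted one.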
- set_fit_request(*, accept_rate='$UNCHANGED$', clv='$UNCHANGED$', contact_cost='$UNCHANGED$', fit_params='$UNCHANGED$', incentive_fraction='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- accept_rate : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for accept_rate parameter in fit.
- clv : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for clv parameter in fit.
- contact_cost : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for contact_cost parameter in fit.
- fit_params : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for fit_params parameter in fit.
- incentive_fraction : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for incentive_fraction parameter in fit.
- Returns:
- self : object
The updated object.
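The four request options can be pictured with a toy routing function. This is a simplified model written for illustration, not scikit-learn's routing code: `route_metadata` is a hypothetical name, and a plain `ValueError` stands in for whatever error the meta-estimator actually raises.

```python
def route_metadata(request, name, provided):
    """Toy model of the request options for a single fit parameter.

    request  -- True, False, None, or a string alias
    name     -- the parameter name expected by fit (e.g. "clv")
    provided -- metadata the user passed to the meta-estimator
    Returns the keyword arguments that would be forwarded to fit.
    """
    if isinstance(request, str):
        # Alias: look the metadata up under the alias, forward it under `name`.
        return {name: provided[request]} if request in provided else {}
    if request is True:
        # Requested: forwarded if provided, silently ignored otherwise.
        return {name: provided[name]} if name in provided else {}
    if request is None and name in provided:
        # Not requested, yet the user supplied it: treated as an error.
        raise ValueError(f"{name!r} was provided but not requested")
    # False, or None without provided metadata: nothing is forwarded.
    return {}


forwarded = route_metadata(True, "clv", {"clv": [120.0, 80.0]})
aliased = route_metadata("customer_value", "clv", {"customer_value": [120.0, 80.0]})
```

In the pipeline examples, `set_fit_request(clv=True)` corresponds to the `True` branch for the `clv` parameter.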
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
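The `<component>__<parameter>` naming can be illustrated by splitting a key at its first double underscore. The helper `split_param` is hypothetical and only mimics the naming convention; it is not how scikit-learn or empulse is implemented.

```python
def split_param(key):
    """Hypothetical helper: split a nested parameter name at the first '__'.

    Returns (component, remainder); component is None for a top-level name.
    """
    component, separator, remainder = key.partition("__")
    return (component, remainder) if separator else (None, component)


# A key as used in the grid search example: pipeline step -> wrapped estimator -> parameter.
nested = split_param("model__estimator__learning_rate")
# A parameter that belongs to the estimator itself.
top_level = split_param("contact_cost")
# Splitting on every '__' gives the full path through the nested objects.
path = "model__estimator__learning_rate".split("__")
```

Each component receives the remainder of the key and applies the same split again, which is how a single name can reach a parameter several levels deep.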
- set_score_request(*, sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for sample_weight parameter in score.
- Returns:
- self : object
The updated object.