load_upsell_bank_telemarketing
- empulse.datasets.load_upsell_bank_telemarketing(*, as_frame=False, return_X_y_costs=False, interest_rate=0.02463333, term_deposit_fraction=0.25, contact_cost=1)
Load the bank telemarketing dataset (binary classification).
The goal is to predict whether a client will subscribe to a term deposit after being called by the bank. The target variable indicates whether the client subscribed to the term deposit (‘yes’ = 1, ‘no’ = 0).
The dataset relates to the direct marketing campaigns of a Portuguese banking institution, which were conducted by phone. Often, more than one contact with the same client was required to assess whether the product (a bank term deposit) would be subscribed to.
Features recorded before the contact event are removed from the original dataset [1] to avoid data leakage. Only clients with a positive balance are considered, since clients in debt are not eligible for term deposits.
For a full data description and additional information about the dataset, consult the User Guide.
| Classes | 2 |
| Subscribers | 4787 |
| Non-subscribers | 33144 |
| Samples | 37931 |
| Features | 10 |
- Parameters:
- as_frame : bool, default=False
If True, the output will be pandas DataFrames or Series instead of numpy arrays.
- return_X_y_costs : bool, default=False
If True, return (data, target, tp_cost, fp_cost, tn_cost, fn_cost) instead of a Dataset object.
- interest_rate : float, default=0.02463333
Interest rate of the term deposit.
- term_deposit_fraction : float, default=0.25
Fraction of the client’s balance that is deposited in the term deposit.
- contact_cost : float, default=1
Cost of contacting the client.
- Returns:
- dataset : Dataset or tuple of (data, target, tp_cost, fp_cost, tn_cost, fn_cost)
Returns a Dataset object if return_X_y_costs=False (default), otherwise a tuple.
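As a rough illustration of how the tuple returned with return_X_y_costs=True might be used, the sketch below sums the cost of a set of predictions cell by cell. The helper name total_cost and the toy arrays are hypothetical and not part of empulse; in practice the target and cost values would come from load_upsell_bank_telemarketing(return_X_y_costs=True). The sketch assumes each cost may be either a scalar or a per-client array, so it broadcasts them to one value per sample.

```python
import numpy as np


def total_cost(y_true, y_pred, tp_cost=0.0, fp_cost=0.0, tn_cost=0.0, fn_cost=0.0):
    """Sum the cost of each prediction according to its confusion-matrix cell.

    Hypothetical helper (not part of empulse). Each cost may be a scalar or
    an array with one entry per sample.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    n = y_true.shape[0]
    # Broadcast scalar costs to one value per sample.
    tp, fp, tn, fn = (
        np.broadcast_to(np.asarray(c, dtype=float), (n,))
        for c in (tp_cost, fp_cost, tn_cost, fn_cost)
    )
    per_sample = np.where(
        y_true,
        np.where(y_pred, tp, fn),  # actual positives: contacted or missed
        np.where(y_pred, fp, tn),  # actual negatives: contacted or left alone
    )
    return float(per_sample.sum())


# Toy stand-ins for the values that would be returned by
# load_upsell_bank_telemarketing(return_X_y_costs=True).
y = np.array([1, 0, 1, 0])
fn_cost = np.array([5.0, 2.0, 8.0, 1.0])  # made-up per-client costs
y_pred = np.array([1, 1, 0, 0])

cost = total_cost(y, y_pred, tp_cost=1.0, fp_cost=1.0, tn_cost=0.0, fn_cost=fn_cost)
print(cost)
```

Comparing this total across candidate models is one simple way to rank them by cost rather than by accuracy.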
Notes
Cost matrix
| | Actual positive | Actual negative |
| Predicted positive | tp_cost | fp_cost |
| Predicted negative | fn_cost | tn_cost |
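The cost matrix entries depend on the following quantities:
- contact_cost: cost of contacting the client
- interest_rate: interest rate of the term deposit
- term_deposit_fraction: fraction of the client’s balance that is deposited in the term deposit
- the client’s balance
Using default parameters, it is assumed that contact_cost = 1, interest_rate = 0.02463333, and term_deposit_fraction = 0.25 for all clients.
References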
[1] Moro, S., Rita, P., & Cortez, P. (2014). Bank Marketing [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5K306.
[2] S. Moro, R. Laureano and P. Cortez. Using Data Mining for Bank Direct Marketing: An Application of the CRISP-DM Methodology. In P. Novais et al. (Eds.), Proceedings of the European Simulation and Modelling Conference - ESM’2011, pp. 117-121, Guimaraes, Portugal, October, 2011. EUROSIS.
[3] A. Correa Bahnsen, A. Stojanovic, D. Aouada, B. Ottersten, “Improving Credit Card Fraud Detection with Calibrated Probabilities”, in Proceedings of the fourteenth SIAM International Conference on Data Mining, 677-685, 2014.
Examples
from empulse.datasets import load_upsell_bank_telemarketing
from sklearn.model_selection import train_test_split

dataset = load_upsell_bank_telemarketing()
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, random_state=42
)
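The Notes describe the costs in terms of the contact cost, the interest rate, the deposit fraction, and the client's balance, but the formulas themselves are not reproduced on this page. The sketch below shows one plausible reading, assuming that every contacted client costs contact_cost, that leaving a non-subscriber alone costs nothing, and that a missed subscriber costs interest_rate * term_deposit_fraction * balance. The function name sketch_costs and these formulas are assumptions for illustration, not the library's implementation.

```python
import numpy as np


def sketch_costs(balance, interest_rate=0.02463333,
                 term_deposit_fraction=0.25, contact_cost=1.0):
    """Illustrative cost values under ASSUMED formulas (not taken from empulse)."""
    balance = np.asarray(balance, dtype=float)
    # Assumption: every contacted client (predicted positive) costs one contact.
    tp_cost = contact_cost
    fp_cost = contact_cost
    # Assumption: a non-subscriber who is not contacted costs nothing.
    tn_cost = 0.0
    # Assumption: a missed subscriber costs the interest on the deposited share
    # of their balance, so this cost differs per client.
    fn_cost = interest_rate * term_deposit_fraction * balance
    return tp_cost, fp_cost, tn_cost, fn_cost


# Two made-up client balances.
tp_cost, fp_cost, tn_cost, fn_cost = sketch_costs([1000.0, 4000.0])
print(fn_cost)
```

Under these assumptions only fn_cost varies between clients, which is why it would be returned as an array while the other costs could remain scalars.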