2.5. Profit-Driven Logistic Regression (ProfLogit)#
ProfLogit is a profit-driven logistic regression model that optimizes the expected maximum profit (EMP) with elastic net regularization [1].
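A minimal usage sketch, assuming the scikit-learn fit/predict_proba interface that empulse models expose (the synthetic data here is purely illustrative):
from empulse.models import ProfLogitClassifier
from sklearn.datasets import make_classification

X, y = make_classification(random_state=42)  # illustrative data only
proflogit = ProfLogitClassifier()
proflogit.fit(X, y)
y_score = proflogit.predict_proba(X)[:, 1]  # scores of the positive class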
2.5.1. Regularization#
The strength of the regularization can be controlled by the C parameter.
The l1_ratio parameter controls the ratio of L1 to L2 regularization.
By default, l1_ratio is set to 1, which means only L1 regularization is used and a sparse solution is found for the coefficients.
from empulse.models import ProfLogitClassifier
proflogit = ProfLogitClassifier(C=100, l1_ratio=0.2)
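Conceptually, the elastic net penalty is subtracted from the profit objective before optimization. The sketch below is illustrative only; it assumes C acts as the inverse of the regularization strength, as in scikit-learn's LogisticRegression, which may not match ProfLogit's exact internal scaling:
import numpy as np

def penalized_objective(emp_value, weights, C=1.0, l1_ratio=1.0):
    # Illustrative sketch only: an elastic net penalty subtracted from the
    # profit metric, assuming C is the inverse regularization strength.
    l1_penalty = l1_ratio * np.sum(np.abs(weights))
    l2_penalty = (1 - l1_ratio) * 0.5 * np.sum(weights ** 2)
    return emp_value - (l1_penalty + l2_penalty) / C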
By default, ProfLogit uses the EMPC metric (empc_score) as its loss function and is optimized by a real-coded genetic algorithm (RGA).
The RGA runs for 1000 iterations, but it can stop early if the loss converges.
However, ProfLogit offers flexibility in terms of customization.
You can modify the stopping conditions, use different loss functions, and even change the optimization algorithms.
2.5.2. Optimization#
2.5.2.1. Custom Stopping Conditions#
The number of iterations, the relative tolerance level, or the number of iterations without improvement can easily be adjusted by passing the desired values to the ProfLogitClassifier initializer.
from empulse.models import ProfLogitClassifier
proflogit = ProfLogitClassifier(optimizer_params={'max_iter': 10000, 'tolerance': 1e-3, 'patience': 100})
For more advanced customization of the stopping conditions, you can pass an optimize function to the ProfLogitClassifier initializer.
For example, if you want to run the RGA for a set amount of time, you can do the following:
from empulse.optimizers import Generation
from scipy.optimize import OptimizeResult
from time import perf_counter

def optimize(objective, X, max_time=5, **kwargs) -> OptimizeResult:
    generation = Generation(**kwargs)
    bounds = [(-5, 5)] * X.shape[1]  # box constraints for each coefficient
    start = perf_counter()
    for _ in generation.optimize(objective, bounds):
        # stop once the time budget is exceeded
        if perf_counter() - start > max_time:
            generation.result.message = "Maximum time reached."
            generation.result.success = True
            break
    return generation.result
proflogit = ProfLogitClassifier(optimize_fn=optimize, optimizer_params={'max_time': 10})
Or you can stop the RGA after a set number of fitness evaluations:
def optimize(objective, X, max_evals=10_000, **kwargs) -> OptimizeResult:
    generation = Generation(**kwargs)
    bounds = [(-5, 5)] * X.shape[1]
    for _ in generation.optimize(objective, bounds):
        # stop once the number of fitness evaluations exceeds the budget
        if generation.result.nfev > max_evals:
            generation.result.message = "Maximum number of evaluations reached."
            generation.result.success = True
            break
    return generation.result
proflogit = ProfLogitClassifier(optimize_fn=optimize, optimizer_params={'max_evals': 10_000})
2.5.2.2. Custom Loss Functions#
ProfLogit allows the use of any metric defined in the empulse.metrics module as the loss function.
To use a different metric, simply pass the metric function to the ProfLogitClassifier initializer.
from empulse.metrics import empa_score
proflogit = ProfLogitClassifier(loss=empa_score)
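Beyond the built-in metrics, your own callable should also work, assuming it is called with the same (y_true, y_score) signature as the empulse metric functions and that its return value is maximized. A hypothetical sketch:
from sklearn.metrics import roc_auc_score

def auc_loss(y_true, y_score):
    # Hypothetical custom loss, assuming the (y_true, y_score) signature of
    # the empulse metrics and that higher return values are better.
    return roc_auc_score(y_true, y_score)

proflogit = ProfLogitClassifier(loss=auc_loss)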
2.5.2.3. Custom Optimization Algorithms#
ProfLogit also supports the use of other optimization algorithms. As long as you can wrap them in an optimize function, you can use them to optimize the loss function. For instance, if you want to use the L-BFGS-B algorithm from scipy.optimize, you can do the following:
import numpy as np
from scipy.optimize import OptimizeResult, minimize

def optimize(objective, X, max_iter=10000, **kwargs) -> OptimizeResult:
    initial_guess = np.zeros(X.shape[1])
    bounds = [(-5, 5)] * X.shape[1]
    result = minimize(
        lambda x: -objective(x),  # negate the objective to maximize it
        initial_guess,
        method='L-BFGS-B',
        bounds=bounds,
        options={
            'maxiter': max_iter,
            'ftol': 1e-4,
        },
        **kwargs
    )
    return result
proflogit = ProfLogitClassifier(optimize_fn=optimize)
Note that EMPC is a maximization problem, so we need to pass the negated objective function to the minimizer.
You can also use unbounded optimization algorithms like BFGS:
def optimize(objective, X, **kwargs) -> OptimizeResult:
    initial_guess = np.zeros(X.shape[1])
    result = minimize(
        lambda x: -objective(x),  # negate the objective to maximize it
        initial_guess,
        method='BFGS',
        **kwargs
    )
    return result
proflogit = ProfLogitClassifier(optimize_fn=optimize)