Ensemble Module (API Reference)

Ensembles trained models into a more robust model using stacking and voting.

The ensemble logic works on all of the trained models and tries to improve the overall score by combining them with different processing logic. A better result is possible but not guaranteed.

This should be called only after the pipeline has finished, so that the trained models can be loaded from disk. It is therefore meant to be called from the parent automl training logic.

@author: Guangqiang.lu

class automl.model_ensemble.ModelEnsemble(backend, task_type='classification', ensemble_alg='voting', voting_logic='soft')

Bases: automl.classifier_algorithms.ClassifierClass

Currently supports two ensemble strategies:

voting – a weighted combination of model outputs: soft or hard voting for classification, weighted predictions for regression
stacking – the trained models' predictions are added to the training data

The logic applied depends on the task.

Parameters

task_type – which task to perform: classification or regression
ensemble_alg – which ensemble logic to use: voting or stacking
voting_logic – whether to use hard or soft voting

classmethod create_stacking_dataset(x, backend, task_type='classification', ensemble_alg='stacking')

Creates the new dataset used for stacking, based on all instances.

A class method is used to create it. Because stacking adds new features based on the trained models, the stacking_models attribute should be set to the model instances.
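The idea of adding trained-model predictions as new features can be sketched as follows. This is a minimal illustration in plain Python, not the module's implementation: `build_stacking_dataset`, `threshold_model`, and `sum_model` are hypothetical names, and the callables stand in for the predict step of models that would normally be loaded through the backend.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for a trained model's predict step on a single row.
Predictor = Callable[[Sequence[float]], float]


def build_stacking_dataset(
    x: Sequence[Sequence[float]], predictors: Sequence[Predictor]
) -> List[List[float]]:
    """Append one prediction column per trained model to every row of x."""
    stacked = []
    for row in x:
        # Original features come first, then each model's prediction.
        stacked.append(list(row) + [float(predict(row)) for predict in predictors])
    return stacked


# Two toy "trained models" (assumptions for illustration only).
def threshold_model(row: Sequence[float]) -> float:
    return 1.0 if row[0] > 0.5 else 0.0


def sum_model(row: Sequence[float]) -> float:
    return 1.0 if sum(row) > 1.0 else 0.0


x = [[0.2, 0.1], [0.9, 0.4], [0.4, 0.8]]
stacked_x = build_stacking_dataset(x, [threshold_model, sum_model])
print(stacked_x)
# [[0.2, 0.1, 0.0, 0.0], [0.9, 0.4, 1.0, 1.0], [0.4, 0.8, 0.0, 1.0]]
```

Each row keeps its two original features and gains one column per model, so a later estimator can learn from both the raw data and the base models' outputs.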

fit_bagging(x, y, **kwargs)

Applies ensemble logic such as hard voting (by number of votes) or soft voting (by weighted combination).

For a classification problem, voting logic is used to get the ensemble prediction. For regression, the ensemble prediction is weights * each model's prediction.
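The three combination rules described above can be sketched in plain Python. The function names and sample data are hypothetical, and dividing by the sum of the weights is an assumption made here so that the weights need not already sum to one; the module may handle weights differently.

```python
from collections import Counter
from typing import List, Sequence


def hard_vote(labels_per_model: Sequence[Sequence[int]]) -> List[int]:
    """Majority vote: every model casts one vote per sample."""
    n_samples = len(labels_per_model[0])
    result = []
    for i in range(n_samples):
        votes = Counter(labels[i] for labels in labels_per_model)
        result.append(votes.most_common(1)[0][0])
    return result


def soft_vote(
    probas_per_model: Sequence[Sequence[Sequence[float]]], weights: Sequence[float]
) -> List[int]:
    """Weighted average of class probabilities, then pick the largest."""
    total = sum(weights)  # assumption: normalise by the weight sum
    n_samples = len(probas_per_model[0])
    n_classes = len(probas_per_model[0][0])
    result = []
    for i in range(n_samples):
        avg = [
            sum(w * p[i][c] for w, p in zip(weights, probas_per_model)) / total
            for c in range(n_classes)
        ]
        result.append(max(range(n_classes), key=avg.__getitem__))
    return result


def weighted_regression(
    preds_per_model: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Weighted combination of each model's regression prediction."""
    total = sum(weights)  # assumption: normalise by the weight sum
    n_samples = len(preds_per_model[0])
    return [
        sum(w * p[i] for w, p in zip(weights, preds_per_model)) / total
        for i in range(n_samples)
    ]


hard = hard_vote([[0, 1, 1], [0, 0, 1], [1, 1, 1]])
soft = soft_vote(
    [[[0.9, 0.1], [0.4, 0.6]], [[0.4, 0.6], [0.3, 0.7]]], weights=[2, 1]
)
reg = weighted_regression([[10.0, 20.0], [14.0, 26.0]], weights=[3, 1])
print(hard, soft, reg)  # [0, 1, 1] [0, 1] [11.0, 21.5]
```

Hard voting only needs predicted labels, while soft voting needs per-class probabilities from every model, which is why the two are offered as separate options.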

fit_stacking(x, y, **kwargs)

Stacking combines the trained models' predictions with the original data to form new training data.

The best-scoring algorithm is selected for the later step: the factory class is used to get its original class name and to load a new classifier.

Parameters

x – training data
y – training label
kwargs –

Returns

get_model_score_list()

Returns the list of each model's accuracy score for later comparison.

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

static get_search_space()

Returns the predefined search space for the different algorithms. It should be used with cross-validation to find the best-fitting parameters.
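How a predefined search space feeds a cross-validated search can be illustrated with scikit-learn's `GridSearchCV`. The format actually returned by `get_search_space()` is not documented here, so the dictionary below is a hypothetical stand-in, and the estimator and dataset are chosen only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Hypothetical search space: parameter name mapped to candidate values.
search_space = {"C": [0.1, 1.0, 10.0]}

x, y = load_iris(return_X_y=True)

# 3-fold cross-validation over every candidate in the search space.
search = GridSearchCV(LogisticRegression(max_iter=1000), search_space, cv=3)
search.fit(x, y)

print(search.best_params_)
```

After fitting, `search.best_estimator_` holds a model refit on the full data with the best parameters found.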

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance
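The get_params and set_params descriptions follow scikit-learn's estimator conventions. Assuming ModelEnsemble behaves the same way (not confirmed here), the <component>__<parameter> form can be illustrated on a scikit-learn Pipeline; the step names "scale" and "clf" are arbitrary choices for this sketch.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])

# Update a parameter of the nested "clf" component; set_params returns self.
returned = pipe.set_params(clf__C=0.5)

# deep=True also lists the parameters of contained estimators.
deep_params = pipe.get_params(deep=True)
# deep=False lists only the pipeline's own constructor parameters.
shallow_params = pipe.get_params(deep=False)

print(deep_params["clf__C"])  # 0.5
print("clf__C" in shallow_params)  # False
```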