secretflow.ml.boost.sgb_v#
Classes:

- SgbModel: Sgboost Model & predict.
- Sgb: Provides both classification and regression tree boosting (also known as GBDT, GBM) for the vertically split dataset setting, using SecureBoost.
- SGBFactory: Lets you customize your own boosting algorithm based on any combination of ideas from SecureBoost, XGB, and LightGBM.
- class secretflow.ml.boost.sgb_v.SgbModel(label_holder: PYU, objective: RegType, base: float)[source]#
Bases: object

Sgboost Model & predict. It is a distributed tree in essence.
Methods:
- __init__(label_holder, objective, base)
- predict(dtrain[, to_pyu]): Predict on dtrain with this model.
- to_dict()
- save_model(device_path_dict[, ...]): Save the model to different parties.
- __init__(label_holder: PYU, objective: RegType, base: float) None[source]#
- Parameters:
label_holder – PYU device, label holder’s PYU device.
objective – RegType, specifies whether to perform logistic regression or (linear) regression
base – float, the base (initial) prediction score
- predict(dtrain: Union[FedNdarray, VDataFrame], to_pyu: Optional[PYU] = None) Union[PYUObject, FedNdarray][source]#
predict on dtrain with this model.
- Parameters:
dtrain – [FedNdarray, VDataFrame] vertical split dataset.
to_pyu – the prediction receiver. If not None, the prediction result is revealed to the to_pyu device and saved as a FedNdarray; otherwise, the prediction result is kept in plaintext and saved as a PYUObject on the label_holder device.
- Returns:
Predicted values, stored in a PYUObject or FedNdarray.
- save_model(device_path_dict: Dict, wait_before_proceed=True)[source]#
Save model to different parties
- Parameters:
device_path_dict (Dict) – {device: a path to save model for the device}.
wait_before_proceed (bool) – if False, a handle is returned so the user can wait for the model write to finish later (and do something else in the meantime).
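As a sketch of how save_model's device_path_dict is shaped: it maps each party's device to a local path where that party's share of the model is written. The party names and paths below are hypothetical, and in a real run the keys are PYU devices rather than strings.

```python
# Hypothetical sketch: in a real run the keys would be PYU devices
# (e.g. sf.PYU("alice")), not strings.
device_path_dict = {
    "alice": "/tmp/sgb_model/alice",  # each party saves its own share
    "bob": "/tmp/sgb_model/bob",      # of the distributed tree model
}

# With wait_before_proceed=False, save_model would return a handle so the
# caller can overlap other work with the write, e.g. (assumed usage):
#   handle = model.save_model(device_path_dict, wait_before_proceed=False)
#   ...do other work...
#   then wait on the handle before relying on the files.
```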
- class secretflow.ml.boost.sgb_v.Sgb(heu: HEU)[source]#
Bases: object

This class provides both classification and regression tree boosting (also known as GBDT, GBM) for the vertically split dataset setting, using SecureBoost.
SGB is short for SecureBoost. Compared to its safer counterpart SS-XGB, SecureBoost focuses on protecting the label holder.
- Parameters:
heu – secret device running homomorphic encryptions
Methods:
- __init__(heu)
- train(params, dtrain, label[, audit_paths]): Train on dtrain and label.
- train(params: Dict, dtrain: Union[FedNdarray, VDataFrame], label: Union[FedNdarray, VDataFrame], audit_paths: Dict = {}) SgbModel[source]#
train on dtrain and label.
- Parameters:
params – Dict, booster params; details are listed below.
dtrain – {FedNdarray, VDataFrame} vertical split dataset.
label – {FedNdarray, VDataFrame} label column.
audit_paths – {party: party_audit_path} for each party. party_audit_path is a file location for gradients. Leave it empty if you do not need the audit function.
- booster params details:
- ‘num_boost_round’: int. Number of boosting iterations.
default: 10 range: [1, 1024]
- ‘max_depth’: int, maximum depth of a tree.
default: 5 range: [1, 16]
- ‘learning_rate’: float, step size shrinkage used in update to prevent overfitting.
default: 0.3 range: (0, 1]
- ‘objective’: Specify the learning objective.
default: ‘logistic’ range: [‘linear’, ‘logistic’]
- ‘reg_lambda’: float. L2 regularization term on weights.
default: 0.1 range: [0, 10000]
- ‘gamma’: float. A value greater than 0 enables pre-pruning:
a split whose gain is less than gamma will not be made. default: 0.1 range: [0, 10000]
- ‘subsample’: Subsample ratio of the training instances.
default: 1 range: (0, 1]
- ‘colsample_by_tree’: Subsample ratio of columns when constructing each tree.
default: 1 range: (0, 1]
- ‘sketch_eps’: float. This roughly translates into O(1 / sketch_eps) bins.
default: 0.1 range: (0, 1]
- ‘base_score’: The initial prediction score of all instances, global bias.
default: 0
- ‘seed’: Pseudorandom number generator seed.
default: 42
- ‘fixed_point_parameter’: int. Any floating-point number encoded by the HEU
is multiplied by a scale and rounded, where scale = 2 ** fixed_point_parameter. A larger value may give better numerical accuracy, but too large a value leads to overflow. See HEU’s document for more details.
default: 20
- Returns:
SgbModel
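The booster params above and their documented defaults can be collected into a plain dict; the range check below is a client-side sketch mirroring the documented ranges, not part of the library itself.

```python
# Documented defaults (see the booster params details above).
default_params = {
    "num_boost_round": 10,        # int, range [1, 1024]
    "max_depth": 5,               # int, range [1, 16]
    "learning_rate": 0.3,         # float, range (0, 1]
    "objective": "logistic",      # one of {"linear", "logistic"}
    "reg_lambda": 0.1,            # float, range [0, 10000]
    "gamma": 0.1,                 # float, range [0, 10000]
    "subsample": 1.0,             # float, range (0, 1]
    "colsample_by_tree": 1.0,     # float, range (0, 1]
    "sketch_eps": 0.1,            # float, range (0, 1]; ~O(1/sketch_eps) bins
    "base_score": 0.0,            # initial prediction score (global bias)
    "seed": 42,
    "fixed_point_parameter": 20,  # HEU fixed-point scale exponent
}

def check_params(params: dict) -> None:
    """Sketch of a local range check against the documented ranges."""
    assert 1 <= params["num_boost_round"] <= 1024
    assert 1 <= params["max_depth"] <= 16
    assert 0 < params["learning_rate"] <= 1
    assert params["objective"] in ("linear", "logistic")
    assert 0 <= params["reg_lambda"] <= 10000
    assert 0 <= params["gamma"] <= 10000
    assert 0 < params["subsample"] <= 1
    assert 0 < params["colsample_by_tree"] <= 1
    assert 0 < params["sketch_eps"] <= 1

check_params(default_params)
```

A dict like this is what `Sgb.train` expects as its `params` argument; unset keys fall back to the defaults listed above.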
- class secretflow.ml.boost.sgb_v.SGBFactory[source]#
Bases: object

You can customize your own boosting algorithms based on any combination of ideas from SecureBoost, XGB, and LightGBM. The parameters of the produced booster algorithm depend on which components it consists of. See the components’ parameters.
- params_dict#
A dict containing params for the factory, the booster, and its components.
- Type:
dict
- factory_params#
Validated params for the factory.
- Type:
- heu#
The device for HE computations. Must be set before training.
Methods:
- __init__()
- set_params(params): Set params by a dictionary.
- set_heu(heu)
- get_params([detailed]): Get the params set.
- fit(dataset, label)
- train(params, dataset, label)

- get_params(detailed: bool = False) dict[source]#
Get the params set.
- Parameters:
detailed (bool, optional) – Whether to include default settings. Defaults to False.
- Returns:
current params.
- Return type:
dict
- fit(dataset: Union[FedNdarray, VDataFrame], label: Union[FedNdarray, VDataFrame]) SgbModel[source]#
- train(params: dict, dataset: Union[FedNdarray, VDataFrame], label: Union[FedNdarray, VDataFrame]) SgbModel[source]#
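A toy illustration of the documented `get_params(detailed=...)` semantics: with `detailed=False` only explicitly set params are returned, while `detailed=True` also includes default settings. This is a pure-Python sketch of the behavior described above, not the factory's actual implementation, and the defaults used are assumed for illustration only.

```python
# Assumed defaults, for illustration only.
DEFAULTS = {"num_boost_round": 10, "max_depth": 5, "learning_rate": 0.3}

class ToySGBFactory:
    """Stand-in mimicking set_params(params) / get_params([detailed])."""

    def __init__(self):
        self._user_params = {}

    def set_params(self, params: dict) -> None:
        # Set params by a dictionary; unknown keys are rejected in this sketch.
        for k, v in params.items():
            if k not in DEFAULTS:
                raise KeyError(f"unknown param: {k}")
            self._user_params[k] = v

    def get_params(self, detailed: bool = False) -> dict:
        if detailed:
            # Include default settings, overridden by user-set values.
            return {**DEFAULTS, **self._user_params}
        return dict(self._user_params)

factory = ToySGBFactory()
factory.set_params({"max_depth": 6})
```

With the real factory, `set_heu(heu)` must also be called before `fit` or `train`, since the HEU device is required for training.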
secretflow.ml.boost.sgb_v.model#
Classes:

- SgbModel: Sgboost Model & predict.
- class secretflow.ml.boost.sgb_v.model.SgbModel(label_holder: PYU, objective: RegType, base: float)[source]#
Bases: object

Sgboost Model & predict. It is a distributed tree in essence.
Methods:
- __init__(label_holder, objective, base)
- predict(dtrain[, to_pyu]): Predict on dtrain with this model.
- to_dict()
- save_model(device_path_dict[, ...]): Save the model to different parties.
- __init__(label_holder: PYU, objective: RegType, base: float) None[source]#
- Parameters:
label_holder – PYU device, label holder’s PYU device.
objective – RegType, specifies whether to perform logistic regression or (linear) regression
base – float, the base (initial) prediction score
- predict(dtrain: Union[FedNdarray, VDataFrame], to_pyu: Optional[PYU] = None) Union[PYUObject, FedNdarray][source]#
predict on dtrain with this model.
- Parameters:
dtrain – [FedNdarray, VDataFrame] vertical split dataset.
to_pyu – the prediction receiver. If not None, the prediction result is revealed to the to_pyu device and saved as a FedNdarray; otherwise, the prediction result is kept in plaintext and saved as a PYUObject on the label_holder device.
- Returns:
Predicted values, stored in a PYUObject or FedNdarray.
- save_model(device_path_dict: Dict, wait_before_proceed=True)[source]#
Save model to different parties
- Parameters:
device_path_dict (Dict) – {device: a path to save model for the device}.
wait_before_proceed (bool) – if False, a handle is returned so the user can wait for the model write to finish later (and do something else in the meantime).
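The model's `objective` and `base` determine how accumulated tree outputs become predictions. The sketch below assumes the standard GBDT convention (raw score for regression, sigmoid for logistic) rather than quoting this API's internals.

```python
import math

def predict_from_margin(margin: float, base: float, objective: str) -> float:
    """Combine the base score and accumulated tree outputs into a prediction.

    Assumed convention: 'linear' returns the raw score, 'logistic' maps it
    through a sigmoid to produce a probability.
    """
    raw = base + margin
    if objective == "linear":
        return raw
    if objective == "logistic":
        return 1.0 / (1.0 + math.exp(-raw))  # sigmoid -> probability in (0, 1)
    raise ValueError(f"unknown objective: {objective}")

# With a zero margin and zero base, the logistic prediction is 0.5.
p = predict_from_margin(margin=0.0, base=0.0, objective="logistic")
```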
secretflow.ml.boost.sgb_v.sgb#
Classes:

- Sgb: Provides both classification and regression tree boosting (also known as GBDT, GBM) for the vertically split dataset setting, using SecureBoost.
- class secretflow.ml.boost.sgb_v.sgb.Sgb(heu: HEU)[source]#
Bases: object

This class provides both classification and regression tree boosting (also known as GBDT, GBM) for the vertically split dataset setting, using SecureBoost.
SGB is short for SecureBoost. Compared to its safer counterpart SS-XGB, SecureBoost focuses on protecting the label holder.
- Parameters:
heu – secret device running homomorphic encryptions
Methods:
- __init__(heu)
- train(params, dtrain, label[, audit_paths]): Train on dtrain and label.
- train(params: Dict, dtrain: Union[FedNdarray, VDataFrame], label: Union[FedNdarray, VDataFrame], audit_paths: Dict = {}) SgbModel[source]#
train on dtrain and label.
- Parameters:
params – Dict, booster params; details are listed below.
dtrain – {FedNdarray, VDataFrame} vertical split dataset.
label – {FedNdarray, VDataFrame} label column.
audit_paths – {party: party_audit_path} for each party. party_audit_path is a file location for gradients. Leave it empty if you do not need the audit function.
- booster params details:
- ‘num_boost_round’: int. Number of boosting iterations.
default: 10 range: [1, 1024]
- ‘max_depth’: int, maximum depth of a tree.
default: 5 range: [1, 16]
- ‘learning_rate’: float, step size shrinkage used in update to prevent overfitting.
default: 0.3 range: (0, 1]
- ‘objective’: Specify the learning objective.
default: ‘logistic’ range: [‘linear’, ‘logistic’]
- ‘reg_lambda’: float. L2 regularization term on weights.
default: 0.1 range: [0, 10000]
- ‘gamma’: float. A value greater than 0 enables pre-pruning:
a split whose gain is less than gamma will not be made. default: 0.1 range: [0, 10000]
- ‘subsample’: Subsample ratio of the training instances.
default: 1 range: (0, 1]
- ‘colsample_by_tree’: Subsample ratio of columns when constructing each tree.
default: 1 range: (0, 1]
- ‘sketch_eps’: float. This roughly translates into O(1 / sketch_eps) bins.
default: 0.1 range: (0, 1]
- ‘base_score’: The initial prediction score of all instances, global bias.
default: 0
- ‘seed’: Pseudorandom number generator seed.
default: 42
- ‘fixed_point_parameter’: int. Any floating-point number encoded by the HEU
is multiplied by a scale and rounded, where scale = 2 ** fixed_point_parameter. A larger value may give better numerical accuracy, but too large a value leads to overflow. See HEU’s document for more details.
default: 20
- Returns:
SgbModel
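The fixed_point_parameter description above can be sketched directly: encoding multiplies by scale = 2 ** fixed_point_parameter and rounds, so the round-trip error is at most half a unit of 1 / scale, while larger parameters enlarge the encoded integers and risk overflowing the HEU plaintext space.

```python
fixed_point_parameter = 20            # documented default
scale = 2 ** fixed_point_parameter    # 1048576

def encode(x: float) -> int:
    # Multiply by the scale and round to the nearest integer.
    return round(x * scale)

def decode(n: int) -> float:
    return n / scale

# Round-trip error is bounded by half the encoding step, 0.5 / scale.
err = abs(decode(encode(0.3)) - 0.3)
```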