secretflow.ml.boost.sgb_v#
Classes:

- SgbModel: Sgboost Model & predict.
- Sgb: provides both classification and regression tree boosting (also known as GBDT, GBM) for the vertical-split dataset setting using SecureBoost.
- SGBFactory: customize your own boosting algorithms based on any combination of ideas from SecureBoost, XGBoost, and LightGBM.
- class secretflow.ml.boost.sgb_v.SgbModel(label_holder: PYU, objective: RegType, base: float)[source]#
Bases: object
Sgboost Model & predict. It is a distributed tree in essence.
Methods:

- __init__(label_holder, objective, base)
- predict(dtrain[, to_pyu]): predict on dtrain with this model.
- to_dict()
- save_model(device_path_dict[, ...]): save model to different parties.
- __init__(label_holder: PYU, objective: RegType, base: float) → None [source]#
- Parameters:
label_holder – PYU device, the label holder's PYU device.
objective – RegType, specifies logistic regression or (linear) regression.
base – float, the base (initial prediction) score.
- predict(dtrain: Union[FedNdarray, VDataFrame], to_pyu: Optional[PYU] = None) → Union[PYUObject, FedNdarray] [source]#
Predict on dtrain with this model.
- Parameters:
dtrain – {FedNdarray, VDataFrame}, vertical split dataset.
to_pyu – the prediction initiator. If not None, the prediction result is revealed to the to_pyu device and saved as a FedNdarray; otherwise, the prediction result is kept in plaintext and saved as a PYUObject on the label_holder device.
- Returns:
Prediction values stored in a PYUObject or FedNdarray.
- save_model(device_path_dict: Dict, wait_before_proceed=True)[source]#
Save model to different parties.
- Parameters:
device_path_dict (Dict) – {device: a path to save the model for that device}.
wait_before_proceed (bool) – if False, a handle will be returned, allowing the user to wait for the model write to finish later (and do something else in the meantime).
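A minimal sketch of how `device_path_dict` might look (hypothetical: in real use the keys are PYU device objects from an initialized SecretFlow cluster, shown here as party-name strings; the `save_model` calls are commented out because they need a running cluster and a trained model):

```python
# Hypothetical sketch: in a real cluster, `alice` and `bob` are PYU device
# objects; plain strings stand in for them here.
alice, bob = "alice", "bob"

# Each party saves its own part of the distributed model to its own path.
device_path_dict = {
    alice: "/tmp/sgb_model/alice",
    bob: "/tmp/sgb_model/bob",
}

# Blocking save (the default, wait_before_proceed=True):
# model.save_model(device_path_dict)

# Non-blocking save: a handle is returned, so other work can proceed and
# the caller waits on the handle before relying on the written files.
# handle = model.save_model(device_path_dict, wait_before_proceed=False)
# ...do other work...
# sf.wait(handle)  # assumed SecretFlow-style wait on the handle

assert all(p.startswith("/tmp/sgb_model/") for p in device_path_dict.values())
```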
- class secretflow.ml.boost.sgb_v.Sgb(heu: HEU)[source]#
Bases: object
This class provides both classification and regression tree boosting (also known as GBDT, GBM) for the vertical-split dataset setting using SecureBoost.
SGB is short for SecureBoost. Compared to its safer counterpart SS-XGB, SecureBoost focuses on protecting the label holder.
- Parameters:
heu – the secret device running homomorphic encryption.
Methods:

- __init__(heu)
- train(params, dtrain, label[, audit_paths]): train on dtrain and label.
- train(params: Dict, dtrain: Union[FedNdarray, VDataFrame], label: Union[FedNdarray, VDataFrame], audit_paths: Dict = {}) → SgbModel [source]#
Train on dtrain and label.
- Parameters:
params – Dict, booster params; details are as follows.
dtrain – {FedNdarray, VDataFrame}, vertical split dataset.
label – {FedNdarray, VDataFrame}, label column.
audit_paths – {party: party_audit_path} for each party. party_audit_path is a file location for gradients. Leave it empty if you do not need the audit function.
- booster params details:
- 'num_boost_round': int. Number of boosting iterations.
default: 10 range: [1, 1024]
- 'max_depth': int. Maximum depth of a tree.
default: 5 range: [1, 16]
- 'learning_rate': float. Step size shrinkage used in update to prevent overfitting.
default: 0.3 range: (0, 1]
- 'objective': Specify the learning objective.
default: 'logistic' range: ['linear', 'logistic']
- 'reg_lambda': float. L2 regularization term on weights.
default: 0.1 range: [0, 10000]
- 'gamma': float. Greater than 0 means pre-pruning is enabled; a split whose gain is less than gamma will not be made.
default: 0.1 range: [0, 10000]
- 'subsample': Subsample ratio of the training instances.
default: 1 range: (0, 1]
- 'colsample_by_tree': Subsample ratio of columns when constructing each tree.
default: 1 range: (0, 1]
- 'sketch_eps': This roughly translates into O(1 / sketch_eps) number of bins.
default: 0.1 range: (0, 1]
- 'base_score': The initial prediction score of all instances, global bias.
default: 0
- 'seed': Pseudorandom number generator seed.
default: 42
- 'fixed_point_parameter': int. Any floating point number encoded by HEU is multiplied by a scale and rounded, where scale = 2 ** fixed_point_parameter. A larger value may give better numerical accuracy, but too large a value will lead to overflow. See HEU's documentation for more details.
default: 20
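The fixed-point encoding described above can be illustrated in plain Python (a sketch of the idea only, not HEU's actual encoder):

```python
def encode(x: float, fixed_point_parameter: int = 20) -> int:
    """Multiply by scale = 2 ** fixed_point_parameter and round to an int."""
    scale = 2 ** fixed_point_parameter
    return round(x * scale)

def decode(n: int, fixed_point_parameter: int = 20) -> float:
    """Invert the encoding: divide by the same scale."""
    scale = 2 ** fixed_point_parameter
    return n / scale

# With the default parameter 20, precision is about 2 ** -20 (~1e-6):
assert abs(decode(encode(0.3)) - 0.3) < 2 ** -20

# A larger parameter is more accurate but produces larger encoded integers,
# which is why a value that is too large can overflow:
assert encode(0.3, 30) > encode(0.3, 20)
```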
- Returns:
SgbModel
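A hypothetical training sketch using only the parameter keys documented above (the `Sgb` calls are commented out because they require an initialized SecretFlow cluster and an HEU device):

```python
# Booster params: every key and default below comes from the list above.
params = {
    "num_boost_round": 10,
    "max_depth": 5,
    "learning_rate": 0.3,
    "objective": "logistic",
    "reg_lambda": 0.1,
    "gamma": 0.1,
    "subsample": 1,
    "colsample_by_tree": 1,
    "sketch_eps": 0.1,  # roughly 1 / 0.1 = 10 bins
    "base_score": 0,
    "seed": 42,
    "fixed_point_parameter": 20,
}

# sgb = Sgb(heu)                            # heu: an initialized HEU device
# model = sgb.train(params, dtrain, label)

# Sanity checks against the documented ranges:
assert 1 <= params["num_boost_round"] <= 1024
assert 1 <= params["max_depth"] <= 16
assert 0 < params["learning_rate"] <= 1
assert params["objective"] in ("linear", "logistic")
```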
- class secretflow.ml.boost.sgb_v.SGBFactory[source]#
Bases: object
You can customize your own boosting algorithms based on any combination of ideas from SecureBoost, XGBoost, and LightGBM. The parameters for the produced booster algorithm depend on which components it consists of. See the components' parameters.
- params_dict#
A dict containing params for the factory, the booster, and its components.
- Type:
dict
- factory_params#
Validated params for the factory.
- Type:
- heu#
The device for HE computations. Must be set before training.
Methods:

- __init__()
- set_params(params): set params by a dictionary.
- set_heu(heu)
- get_params([detailed]): get the params set.
- fit(dataset, label)
- train(params, dataset, label)

- get_params(detailed: bool = False) → dict [source]#
Get the params set.
- Parameters:
detailed (bool, optional) – whether to include default settings. Defaults to False.
- Returns:
The current params.
- Return type:
dict
- fit(dataset: Union[FedNdarray, VDataFrame], label: Union[FedNdarray, VDataFrame]) → SgbModel [source]#
- train(params: dict, dataset: Union[FedNdarray, VDataFrame], label: Union[FedNdarray, VDataFrame]) → SgbModel [source]#
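A hypothetical workflow sketch for the factory (all device-dependent calls are commented out, since they require a running cluster; the parameter keys shown are the generic booster keys documented for Sgb.train):

```python
# Parameters for the produced booster depend on its components; these are
# generic booster keys documented for Sgb.train.
factory_params = {"num_boost_round": 10, "objective": "logistic"}

# factory = SGBFactory()
# factory.set_params(factory_params)   # set params by a dictionary
# factory.set_heu(heu)                 # heu must be set before training
# model = factory.fit(dataset, label)
# factory.get_params(detailed=True)    # include default settings too

assert factory_params["objective"] == "logistic"
```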
secretflow.ml.boost.sgb_v.model#
Classes:

- SgbModel: Sgboost Model & predict.
- class secretflow.ml.boost.sgb_v.model.SgbModel(label_holder: PYU, objective: RegType, base: float)[source]#
Bases: object
Sgboost Model & predict. It is a distributed tree in essence.
Methods:

- __init__(label_holder, objective, base)
- predict(dtrain[, to_pyu]): predict on dtrain with this model.
- to_dict()
- save_model(device_path_dict[, ...]): save model to different parties.
- __init__(label_holder: PYU, objective: RegType, base: float) → None [source]#
- Parameters:
label_holder – PYU device, the label holder's PYU device.
objective – RegType, specifies logistic regression or (linear) regression.
base – float, the base (initial prediction) score.
- predict(dtrain: Union[FedNdarray, VDataFrame], to_pyu: Optional[PYU] = None) → Union[PYUObject, FedNdarray] [source]#
Predict on dtrain with this model.
- Parameters:
dtrain – {FedNdarray, VDataFrame}, vertical split dataset.
to_pyu – the prediction initiator. If not None, the prediction result is revealed to the to_pyu device and saved as a FedNdarray; otherwise, the prediction result is kept in plaintext and saved as a PYUObject on the label_holder device.
- Returns:
Prediction values stored in a PYUObject or FedNdarray.
- save_model(device_path_dict: Dict, wait_before_proceed=True)[source]#
Save model to different parties.
- Parameters:
device_path_dict (Dict) – {device: a path to save the model for that device}.
wait_before_proceed (bool) – if False, a handle will be returned, allowing the user to wait for the model write to finish later (and do something else in the meantime).
secretflow.ml.boost.sgb_v.sgb#
Classes:

- Sgb: provides both classification and regression tree boosting (also known as GBDT, GBM) for the vertical-split dataset setting using SecureBoost.
- class secretflow.ml.boost.sgb_v.sgb.Sgb(heu: HEU)[source]#
Bases: object
This class provides both classification and regression tree boosting (also known as GBDT, GBM) for the vertical-split dataset setting using SecureBoost.
SGB is short for SecureBoost. Compared to its safer counterpart SS-XGB, SecureBoost focuses on protecting the label holder.
- Parameters:
heu – the secret device running homomorphic encryption.
Methods:

- __init__(heu)
- train(params, dtrain, label[, audit_paths]): train on dtrain and label.
- train(params: Dict, dtrain: Union[FedNdarray, VDataFrame], label: Union[FedNdarray, VDataFrame], audit_paths: Dict = {}) → SgbModel [source]#
Train on dtrain and label.
- Parameters:
params – Dict, booster params; details are as follows.
dtrain – {FedNdarray, VDataFrame}, vertical split dataset.
label – {FedNdarray, VDataFrame}, label column.
audit_paths – {party: party_audit_path} for each party. party_audit_path is a file location for gradients. Leave it empty if you do not need the audit function.
- booster params details:
- 'num_boost_round': int. Number of boosting iterations.
default: 10 range: [1, 1024]
- 'max_depth': int. Maximum depth of a tree.
default: 5 range: [1, 16]
- 'learning_rate': float. Step size shrinkage used in update to prevent overfitting.
default: 0.3 range: (0, 1]
- 'objective': Specify the learning objective.
default: 'logistic' range: ['linear', 'logistic']
- 'reg_lambda': float. L2 regularization term on weights.
default: 0.1 range: [0, 10000]
- 'gamma': float. Greater than 0 means pre-pruning is enabled; a split whose gain is less than gamma will not be made.
default: 0.1 range: [0, 10000]
- 'subsample': Subsample ratio of the training instances.
default: 1 range: (0, 1]
- 'colsample_by_tree': Subsample ratio of columns when constructing each tree.
default: 1 range: (0, 1]
- 'sketch_eps': This roughly translates into O(1 / sketch_eps) number of bins.
default: 0.1 range: (0, 1]
- 'base_score': The initial prediction score of all instances, global bias.
default: 0
- 'seed': Pseudorandom number generator seed.
default: 42
- 'fixed_point_parameter': int. Any floating point number encoded by HEU is multiplied by a scale and rounded, where scale = 2 ** fixed_point_parameter. A larger value may give better numerical accuracy, but too large a value will lead to overflow. See HEU's documentation for more details.
default: 20
- Returns:
SgbModel