secretflow.ml.boost.homo_boost#
Classes:
- class secretflow.ml.boost.homo_boost.SFXgboost(server, clients)[source]#
Bases: object
Methods:
- __init__(server, clients)
- check_params(params)
- train(train_hdf, valid_hdf[, params, ...]): Federated XGBoost training interface
- save_model(model_path): Federated XGBoost save-model interface
- dump_model(model_path): Federated XGBoost dump-model interface
- eval(model_path, hdata, params): Federated XGBoost evaluation interface
- train(train_hdf: HDataFrame, valid_hdf: HDataFrame, params: Optional[Dict] = None, num_boost_round: int = 10, obj=None, feval=None, maximize: Optional[bool] = None, early_stopping_rounds: Optional[int] = None, evals_result: Optional[Dict] = None, verbose_eval: Union[int, bool] = True, xgb_model: Optional[Dict] = None, callbacks: Optional[List[Callable]] = None) → SFXgboost [source]#
Federated XGBoost training interface.
- Parameters:
train_hdf – horizontal federated table used for training
valid_hdf – horizontal federated table used for validation
params – dictionary of parameters
num_boost_round – number of boosting rounds (trees to build)
obj – custom objective function; the default objective is squared error
feval – custom evaluation function
maximize – whether feval should be maximized
early_stopping_rounds – same as the xgboost early_stopping_rounds option
evals_result – container for storing evaluation results
verbose_eval – same as xgboost verbose_eval
xgb_model – xgb model file path, used for training continuation from a checkpoint
callbacks – list of callback functions
- save_model(model_path: Dict)[source]#
Federated XGBoost save-model interface.
- Parameters:
model_path – path where the model is stored
- dump_model(model_path: Dict)[source]#
Federated XGBoost dump-model interface.
- Parameters:
model_path – path where the model is stored
- eval(model_path: Union[str, Dict[PYU, str]], hdata: HDataFrame, params: Dict)[source]#
Federated XGBoost evaluation interface.
- Parameters:
model_path – path where the model is stored
hdata – horizontal dataframe to be evaluated
params – xgboost params
- Returns:
Dict of evaluation results
- Return type:
result
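A typical call sequence ties these methods together. The sketch below is illustrative, not SecretFlow's canonical example: the device names, file paths, and the exact parameter keys are assumptions (most keys mirror TreeParam fields, while hess_key/grad_key/label_key name the working columns used by HomoDecisionTree).

```python
# Sketch of a typical SFXgboost workflow. The SecretFlow bootstrap is shown
# in comments only; device names and file paths are illustrative.
#
#   import secretflow as sf
#   sf.init(['alice', 'bob', 'server'], address='local')
#   alice, bob, server = sf.PYU('alice'), sf.PYU('bob'), sf.PYU('server')
#   bst = SFXgboost(server=server, clients=[alice, bob])

# xgboost-style parameter dict (keys assumed from TreeParam / HomoDecisionTree).
params = {
    "max_depth": 4,          # maximum depth of each tree
    "eta": 0.3,              # learning rate
    "objective": "binary:logistic",
    "min_child_weight": 1,   # minimum hessian sum in a child node
    "lambda": 0.1,           # L2 regularization (reg_lambda)
    "gamma": 0.0001,         # minimum gain required to split
    "hess_key": "hess",      # unique column name for hessian values
    "grad_key": "grad",      # unique column name for gradients
    "label_key": "label",    # unique column name for labels
}

# Train 10 rounds, then persist one model file per client device:
#   bst.train(train_hdf, valid_hdf, params=params, num_boost_round=10)
#   bst.save_model({alice: 'alice_model.json', bob: 'bob_model.json'})
```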
- secretflow.ml.boost.homo_boost.boost_core
- secretflow.ml.boost.homo_boost.tree_core
- secretflow.ml.boost.homo_boost.tree_core.criterion
- secretflow.ml.boost.homo_boost.tree_core.decision_tree
- secretflow.ml.boost.homo_boost.tree_core.feature_histogram
- secretflow.ml.boost.homo_boost.tree_core.feature_importance
- secretflow.ml.boost.homo_boost.tree_core.loss_function
- secretflow.ml.boost.homo_boost.tree_core.node
- secretflow.ml.boost.homo_boost.tree_core.splitter
secretflow.ml.boost.homo_boost.homo_booster#
Classes:
- class secretflow.ml.boost.homo_boost.homo_booster.SFXgboost(server, clients)[source]#
Bases: object
Methods:
- __init__(server, clients)
- check_params(params)
- train(train_hdf, valid_hdf[, params, ...]): Federated XGBoost training interface
- save_model(model_path): Federated XGBoost save-model interface
- dump_model(model_path): Federated XGBoost dump-model interface
- eval(model_path, hdata, params): Federated XGBoost evaluation interface
- train(train_hdf: HDataFrame, valid_hdf: HDataFrame, params: Optional[Dict] = None, num_boost_round: int = 10, obj=None, feval=None, maximize: Optional[bool] = None, early_stopping_rounds: Optional[int] = None, evals_result: Optional[Dict] = None, verbose_eval: Union[int, bool] = True, xgb_model: Optional[Dict] = None, callbacks: Optional[List[Callable]] = None) → SFXgboost [source]#
Federated XGBoost training interface.
- Parameters:
train_hdf – horizontal federated table used for training
valid_hdf – horizontal federated table used for validation
params – dictionary of parameters
num_boost_round – number of boosting rounds (trees to build)
obj – custom objective function; the default objective is squared error
feval – custom evaluation function
maximize – whether feval should be maximized
early_stopping_rounds – same as the xgboost early_stopping_rounds option
evals_result – container for storing evaluation results
verbose_eval – same as xgboost verbose_eval
xgb_model – xgb model file path, used for training continuation from a checkpoint
callbacks – list of callback functions
- save_model(model_path: Dict)[source]#
Federated XGBoost save-model interface.
- Parameters:
model_path – path where the model is stored
- dump_model(model_path: Dict)[source]#
Federated XGBoost dump-model interface.
- Parameters:
model_path – path where the model is stored
- eval(model_path: Union[str, Dict[PYU, str]], hdata: HDataFrame, params: Dict)[source]#
Federated XGBoost evaluation interface.
- Parameters:
model_path – path where the model is stored
hdata – horizontal dataframe to be evaluated
params – xgboost params
- Returns:
Dict of evaluation results
- Return type:
result
secretflow.ml.boost.homo_boost.homo_booster_worker#
Homo Booster
Classes:
- secretflow.ml.boost.homo_boost.homo_booster_worker.HomoBooster[source]#
alias of ActorProxy(HomoBooster)
Methods:
- __init__(*args, **kwargs): Abstraction device object base class.
- set_split_point(bin_split_points)
- gen_mock_data([data_num, columns, ...]): mock data with the same schema, used by the SERVER to synchronize the training process
- homo_train(train_hdf, valid_hdf[, params, ...]): federated xgboost entry point
- homo_eval(eval_hdf, params, model_path)
- save_model(model_path)
- dump_model(model_path)
- initialize(comm_or_links)
- recv(name, src_device[, step_id]): receive messages from the source device.
- send(name, value, dst_device[, step_id]): send a message to the target device.
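During homo_train, each client builds local gradient/hessian histograms and the server combines them via send/recv exchanges like those above. The following is a minimal standalone sketch of that server-side aggregation step; the array layout and helper name are assumptions for illustration, not SecretFlow's actual wire format.

```python
import numpy as np

def aggregate_histograms(client_hists: list) -> np.ndarray:
    """Element-wise sum of [n_features, n_bins, 2] histograms from all clients.

    The last axis holds (grad_sum, hess_sum) per bin; summing across clients
    yields the global histogram the server uses to pick splits.
    """
    return np.sum(np.stack(client_hists, axis=0), axis=0)

# One feature, two bins, (grad, hess) pairs per bin:
alice_hist = np.array([[[1.0, 2.0], [0.5, 1.0]]])
bob_hist   = np.array([[[0.5, 1.0], [1.5, 2.0]]])
global_hist = aggregate_histograms([alice_hist, bob_hist])
# global_hist[0, 0] is the per-bin sum: [1.5, 3.0]
```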
secretflow.ml.boost.homo_boost.homo_decision_tree#
Homo Decision Tree
Classes:
- HomoDecisionTree: class for the federated version of the decision tree
- class secretflow.ml.boost.homo_boost.homo_decision_tree.HomoDecisionTree(tree_param: Optional[TreeParam] = None, data: Optional[HDataFrame] = None, bin_split_points: Optional[ndarray] = None, group_id: Optional[int] = None, tree_id: Optional[int] = None, iter_round: Optional[int] = None, hess_key: str = 'hess', grad_key: str = 'grad', label_key: str = 'label')[source]#
Bases: DecisionTree
Class for the federated version of the decision tree.
- tree_param#
params for tree build
- data#
training data, an HDataFrame
- bin_split_points#
global binning information
- tree_id#
tree id
- group_id#
group_id
- iter_round#
iteration round within the overall XGBoost training process
- hess_key#
unique column name for hess value
- grad_key#
unique column name for grad value
- label_key#
unique column name for label key
Methods:
- __init__([tree_param, data, ...])
- key(name)
- cal_local_hist_bags(cur_to_split, ...)
- cal_split_info_list(agg_histograms)
- fit(): entry point for the homo decision tree
- __init__(tree_param: Optional[TreeParam] = None, data: Optional[HDataFrame] = None, bin_split_points: Optional[ndarray] = None, group_id: Optional[int] = None, tree_id: Optional[int] = None, iter_round: Optional[int] = None, hess_key: str = 'hess', grad_key: str = 'grad', label_key: str = 'label')[source]#
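The hess_key/grad_key columns hold the per-sample second- and first-order gradients that drive tree growth. A standalone sketch of how those columns can be derived for squared-error loss follows; the helper name and frame layout are illustrative, not the library's internals.

```python
import numpy as np
import pandas as pd

def append_grad_hess(df: pd.DataFrame, pred: np.ndarray,
                     grad_key: str = "grad", hess_key: str = "hess",
                     label_key: str = "label") -> pd.DataFrame:
    """Attach first- and second-order gradients for squared-error loss.

    For L = 1/2 * (pred - label)^2: grad = pred - label and hess = 1.
    """
    out = df.copy()
    out[grad_key] = pred - out[label_key].to_numpy()
    out[hess_key] = np.ones(len(out))
    return out

frame = pd.DataFrame({"x0": [0.1, 0.4, 0.9], "label": [0.0, 1.0, 1.0]})
frame = append_grad_hess(frame, pred=np.array([0.5, 0.5, 0.5]))
# grad column is pred - label: [0.5, -0.5, -0.5]; hess is all ones
```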
secretflow.ml.boost.homo_boost.tree_param#
Classes:
- TreeParam: param class, the externally exposed interface
- class secretflow.ml.boost.homo_boost.tree_param.TreeParam(max_depth: int = 3, eta: float = 0.3, verbosity: int = 0, objective: Optional[Union[callable, str]] = None, tree_method: str = 'hist', criterion_method: str = 'xgboost', gamma: float = 0.0001, min_child_weight: float = 1, subsample: float = 1, colsample_bytree: float = 1, colsample_byleval: float = 1, reg_alpha: float = 0.0, reg_lambda: float = 0.1, base_score: float = 0.5, random_state: int = 1234, num_parallel: Optional[int] = None, importance_type: str = 'split', use_missing: bool = False, min_sample_split: int = 2, max_split_nodes: int = 20, min_leaf_node: int = 1, decimal: int = 10, num_class: int = 0)[source]#
Bases: object
Param class, the externally exposed interface.
- max_depth#
the max depth of a decision tree.
- Type:
int
- eta#
learning rate, same as xgb’s “eta”
- Type:
float
- verbosity#
level of log printing; valid values are 0 (silent) to 3 (debug)
- Type:
int
- objective#
objective function, default 'squareloss'
- Type:
Union[callable, str]
- tree_method#
tree type; only 'hist' is supported
- Type:
str
- criterion_method#
split criterion method, default 'xgboost'
- Type:
str
- gamma#
same as min_impurity_split: the minimum gain required to make a split
- Type:
float
- min_child_weight#
minimum sum of instance hessian needed in a child node
- Type:
float
- subsample#
subsample rate for rows
- Type:
float
- colsample_bytree#
subsample rate for columns (by tree)
- Type:
float
- colsample_bylevel#
subsample rate for columns (by level)
- Type:
float
- reg_alpha#
L1 regularization term on weights (xgb's alpha)
- Type:
float
- reg_lambda#
L2 regularization term on weights (xgb's lambda)
- Type:
float
- base_score#
base score, the global bias
- Type:
float
- random_state#
random number seed
- Type:
int
- num_parallel#
number of parallel workers used when building a tree
- Type:
int
- importance_type#
feature importance type, one of ['gain', 'split']
- Type:
str
- use_missing#
whether missing values participate in training
- Type:
bool
- min_sample_split#
minimum number of samples required to split a node, default 2
- Type:
int
- max_split_nodes#
maximum number of nodes whose splits are searched in parallel in one batch
- Type:
int
- min_leaf_node#
minimum number of samples on a node to split
- Type:
int
- decimal#
number of decimal places kept for gain
- Type:
int
- num_class#
number of classes
- Type:
int
Attributes:
Methods:
- __init__([max_depth, eta, verbosity, ...])
- max_depth: int = 3#
- eta: float = 0.3#
- verbosity: int = 0#
- objective: Union[callable, str] = None#
- tree_method: str = 'hist'#
- criterion_method: str = 'xgboost'#
- gamma: float = 0.0001#
- min_child_weight: float = 1#
- subsample: float = 1#
- colsample_bytree: float = 1#
- colsample_byleval: float = 1#
- reg_alpha: float = 0.0#
- reg_lambda: float = 0.1#
- base_score: float = 0.5#
- random_state: int = 1234#
- num_parallel: int = None#
- importance_type: str = 'split'#
- use_missing: bool = False#
- min_sample_split: int = 2#
- max_split_nodes: int = 20#
- min_leaf_node: int = 1#
- decimal: int = 10#
- num_class: int = 0#
- __init__(max_depth: int = 3, eta: float = 0.3, verbosity: int = 0, objective: Optional[Union[callable, str]] = None, tree_method: str = 'hist', criterion_method: str = 'xgboost', gamma: float = 0.0001, min_child_weight: float = 1, subsample: float = 1, colsample_bytree: float = 1, colsample_byleval: float = 1, reg_alpha: float = 0.0, reg_lambda: float = 0.1, base_score: float = 0.5, random_state: int = 1234, num_parallel: Optional[int] = None, importance_type: str = 'split', use_missing: bool = False, min_sample_split: int = 2, max_split_nodes: int = 20, min_leaf_node: int = 1, decimal: int = 10, num_class: int = 0) None #
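TreeParam's gamma and reg_lambda feed the standard XGBoost split criterion (criterion_method='xgboost'). The sketch below shows that criterion in its usual histogram-based form; the helper functions are illustrative stand-ins, not SecretFlow's implementation.

```python
def leaf_weight(G: float, H: float, reg_lambda: float = 0.1) -> float:
    """Optimal leaf weight w* = -G / (H + lambda) for the XGBoost objective,
    where G and H are the sums of gradients and hessians at the leaf."""
    return -G / (H + reg_lambda)

def split_gain(GL: float, HL: float, GR: float, HR: float,
               reg_lambda: float = 0.1, gamma: float = 0.0001) -> float:
    """Gain of splitting a node into left/right children.

    Gain = 1/2 * [GL^2/(HL+lambda) + GR^2/(HR+lambda)
                  - (GL+GR)^2/(HL+HR+lambda)] - gamma
    """
    def score(G: float, H: float) -> float:
        return G * G / (H + reg_lambda)
    return 0.5 * (score(GL, HL) + score(GR, HR)
                  - score(GL + GR, HL + HR)) - gamma

# A split that separates negative from positive gradient sums has positive gain:
gain = split_gain(GL=-2.0, HL=3.0, GR=2.0, HR=3.0)
```

Defaults mirror TreeParam (reg_lambda=0.1, gamma=0.0001); a candidate split is kept only when its gain exceeds zero, which is how gamma prunes weak splits.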