secretflow.ml.boost.ss_xgb_v.core

secretflow.ml.boost.ss_xgb_v.core.node_split

Classes:

RegType(value)

An enumeration.

Functions:

sigmoid(pred)

compute_obj(G, H, reg_lambda)

compute objective values of input buckets.

compute_weight(G, H, reg_lambda, learning_rate)

compute weight values of tree leaf nodes.

compute_gh(y, pred, objective)

compute first and second order gradient of each sample.

tree_setup(pred, y, sub_choices, objective)

Set up pre-tree context.

compute_gradient_sums(nodes_s, cache, ...)

find_best_split_bucket(GHs, reg_lambda)

Compute the gradient sums of the instances contained in each split bucket and, for each node, find the split bucket with the maximum split gain.

init_pred(base, samples)

root_select(samples)

get_child_select(nodes_s, lchilds_ss, fragments)

compute the next level's select indexes.

predict_tree_weight(selects, weights)

get final pred for this tree.

get_weight(sums, reg_lambda, learning_rate)

sum_leaf(ss, gh, sub_choices)

update_train_pred(pred, current, fragments)

class secretflow.ml.boost.ss_xgb_v.core.node_split.RegType(value)

Bases: Enum

An enumeration.

Attributes:

Linear

Logistic

Linear = 'linear'
Logistic = 'logistic'

secretflow.ml.boost.ss_xgb_v.core.node_split.sigmoid(pred: ndarray) → ndarray

secretflow.ml.boost.ss_xgb_v.core.node_split.compute_obj(G: ndarray, H: ndarray, reg_lambda: float) → ndarray

Compute objective values of input buckets.

Parameters:
  • G/H – sums of first and second order gradients in each bucket.

  • reg_lambda – L2 regularization term.

Returns:

objective values.
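
As a rough illustration, the textbook XGBoost structure score of a bucket with gradient sums G and H is G² / (H + λ). The sketch below assumes that formulation; the source does not state the exact formula, so the library may scale or sign the value differently. `bucket_objective` is a hypothetical name, not part of the module.

```python
import numpy as np


def bucket_objective(G: np.ndarray, H: np.ndarray, reg_lambda: float) -> np.ndarray:
    # Assumed textbook XGBoost structure score: G^2 / (H + lambda).
    return G * G / (H + reg_lambda)


G = np.array([2.0, -4.0])   # first order gradient sums per bucket
H = np.array([3.0, 7.0])    # second order gradient sums per bucket
obj = bucket_objective(G, H, reg_lambda=1.0)
```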

secretflow.ml.boost.ss_xgb_v.core.node_split.compute_weight(G: float, H: float, reg_lambda: float, learning_rate: float) → ndarray

Compute weight values of tree leaf nodes.

Parameters:
  • G/H – sums of first and second order gradients in each node.

  • reg_lambda – L2 regularization term.

  • learning_rate – step size shrinkage applied in each update to prevent overfitting.

Returns:

weight values.
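
A minimal sketch of how such a leaf weight is usually derived in XGBoost: the optimal weight is -G / (H + λ), shrunk by the learning rate. This is the textbook formula and is assumed here rather than taken from the source; `leaf_weight` is a hypothetical name.

```python
import numpy as np


def leaf_weight(G, H, reg_lambda: float, learning_rate: float):
    # Assumed textbook XGBoost leaf weight, scaled by the learning rate.
    return -G / (H + reg_lambda) * learning_rate


w = leaf_weight(
    np.array([2.0, -4.0]),  # G per leaf
    np.array([3.0, 7.0]),   # H per leaf
    reg_lambda=1.0,
    learning_rate=0.5,
)
```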

secretflow.ml.boost.ss_xgb_v.core.node_split.compute_gh(y: ndarray, pred: ndarray, objective: RegType) → Tuple[ndarray, ndarray]

Compute first and second order gradient of each sample.

Parameters:
  • y – true label of each sample.

  • pred – prediction of each sample.

  • objective – regression learning objective (a RegType value).

Returns:

first and second order gradients of each sample.
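
For illustration, the usual gradients for the two objectives are sketched below: squared error for the linear case and log loss for the logistic case. The source does not say whether the sigmoid is applied inside this function or by the caller, so applying it here is an assumption. `Objective` and `gradients` are hypothetical stand-ins for `RegType` and `compute_gh`.

```python
from enum import Enum

import numpy as np


class Objective(Enum):  # hypothetical stand-in for RegType
    Linear = "linear"
    Logistic = "logistic"


def sigmoid(pred: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-pred))


def gradients(y: np.ndarray, pred: np.ndarray, objective: Objective):
    if objective == Objective.Linear:
        # Squared error 0.5 * (pred - y)^2: g = pred - y, h = 1.
        return pred - y, np.ones_like(pred)
    # Log loss on a raw margin (assumption): g = p - y, h = p * (1 - p).
    p = sigmoid(pred)
    return p - y, p * (1.0 - p)


y = np.array([1.0, 0.0])
pred = np.array([0.0, 0.0])
g_lin, h_lin = gradients(y, pred, Objective.Linear)
g_log, h_log = gradients(y, pred, Objective.Logistic)
```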

secretflow.ml.boost.ss_xgb_v.core.node_split.tree_setup(pred: ndarray, y: ndarray, sub_choices: ndarray, objective: RegType) → Tuple[ndarray, ndarray]

Set up pre-tree context.

secretflow.ml.boost.ss_xgb_v.core.node_split.compute_gradient_sums(nodes_s: List[ndarray], cache: List[List[ndarray]], col_choices: ndarray, sub_choices: ndarray, gh: List[ndarray], buckets_map: ndarray)

secretflow.ml.boost.ss_xgb_v.core.node_split.find_best_split_bucket(GHs: List[List[ndarray]], reg_lambda: float) → Tuple[ndarray, Dict[str, Any]]

Compute the gradient sums of the instances contained in each split bucket and, for each node, find the split bucket with the maximum split gain.

Parameters:
  • context – comparison context.

  • nodes_s – sample select indexes of each node from same tree level.

  • last_level – if this split is last level, next level is leaf nodes.

Returns:

idx of split bucket for each node.
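
The idea of picking the bucket with the maximum split gain can be sketched for a single node as follows. The sketch assumes cumulative gradient sums per bucket and the textbook XGBoost gain (left score + right score - parent score); the data layout and the name `best_split_bucket` are hypothetical and much simpler than the `GHs` structure used by the module.

```python
import numpy as np


def best_split_bucket(GL: np.ndarray, HL: np.ndarray, reg_lambda: float):
    # GL[i], HL[i]: gradient sums of samples in buckets 0..i (the left side
    # when splitting after bucket i). The last entry holds the node totals.
    def score(g, h):
        return g * g / (h + reg_lambda)

    G_total, H_total = GL[-1], HL[-1]
    GR, HR = G_total - GL, H_total - HL
    gain = score(GL, HL) + score(GR, HR) - score(G_total, H_total)
    # Skip the last bucket: splitting there leaves the right child empty.
    return int(np.argmax(gain[:-1])), gain


GL = np.array([-3.0, -2.0, 0.0])
HL = np.array([3.0, 4.0, 6.0])
idx, gain = best_split_bucket(GL, HL, reg_lambda=1.0)
```

Here the first bucket wins because it separates the negative gradients from the positive ones most cleanly.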

secretflow.ml.boost.ss_xgb_v.core.node_split.init_pred(base: float, samples: int)

secretflow.ml.boost.ss_xgb_v.core.node_split.root_select(samples: int) → List[ndarray]

secretflow.ml.boost.ss_xgb_v.core.node_split.get_child_select(nodes_s: List[ndarray], lchilds_ss: List[ndarray], fragments: int) → List[ndarray]

Compute the next level’s select indexes.

Parameters:
  • nodes_s – sample select indexes of each node from current level’s nodes.

  • lchilds_ss – left children’s sample select idx for current level’s nodes.

Returns:

sample select indexes for nodes in next tree level.
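
One way to picture this step, assuming each select index is a 0/1 mask over samples: a sample goes to the left child if it is in the node and in the left-children select, and to the right child otherwise. The sketch handles a single node and ignores the `fragments` argument; `child_selects` is a hypothetical name.

```python
import numpy as np


def child_selects(node_select: np.ndarray, left_select: np.ndarray):
    # Left child: samples in this node that fall on the left of the split.
    left = node_select * left_select
    # Right child: the remaining samples of this node.
    right = node_select - left
    return left, right


node = np.array([1, 1, 0, 1])    # samples currently in the node
lchild = np.array([1, 0, 1, 0])  # samples on the left side of the split
left, right = child_selects(node, lchild)
```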

secretflow.ml.boost.ss_xgb_v.core.node_split.predict_tree_weight(selects: List[ndarray], weights: ndarray) → ndarray

Get the final prediction for this tree.

Parameters:
  • selects – leaf nodes’ sample selects from each model handler.

  • weights – leaf weights in secure share.

Returns:

pred
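
A plaintext sketch of the idea, under the assumption that each party contributes a 0/1 matrix (samples × leaves) marking the leaves a sample could reach given only that party's splits: multiplying the matrices elementwise leaves one leaf per sample, and a dot product with the leaf weights gives the prediction. The real function works on secret shares; `tree_prediction` is a hypothetical name.

```python
import numpy as np


def tree_prediction(selects, weights: np.ndarray) -> np.ndarray:
    # Intersect the candidate leaves reported by every party.
    combined = selects[0]
    for s in selects[1:]:
        combined = combined * s
    # Each row now selects a single leaf; pick up its weight.
    return combined @ weights


alice = np.array([[1, 1, 0], [0, 0, 1]])
bob = np.array([[0, 1, 1], [1, 0, 1]])
weights = np.array([0.1, -0.2, 0.3])
pred = tree_prediction([alice, bob], weights)
```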

secretflow.ml.boost.ss_xgb_v.core.node_split.get_weight(sums: List[List[ndarray]], reg_lambda: float, learning_rate: float) → ndarray

secretflow.ml.boost.ss_xgb_v.core.node_split.sum_leaf(ss: List[ndarray], gh: List[ndarray], sub_choices: ndarray)

secretflow.ml.boost.ss_xgb_v.core.node_split.update_train_pred(pred: List[ndarray], current: ndarray, fragments: int)

secretflow.ml.boost.ss_xgb_v.core.tree_worker

Classes:

XgbTreeWorker

Alias of ActorProxy(XgbTreeWorker)

secretflow.ml.boost.ss_xgb_v.core.tree_worker.XgbTreeWorker

Alias of ActorProxy(XgbTreeWorker)

Methods:

__init__(*args, **kwargs)

Abstraction device object base class.

predict_weight_select(x, tree)

Compute leaf nodes' sample selects known by this partition.

build_bucket_map(start, length)

Build bucket_map fragment based on order_map.

global_setup(x, buckets, seed)

Set up global context.

update_buckets_count(buckets_count, ...)

Save how many buckets each partition's features have.

tree_setup(colsample)

Set up tree context and do col sample if colsample < 1

tree_finish()

do_split(split_buckets)

record split info and generate next level's left children select.

secretflow.ml.boost.ss_xgb_v.core.utils

Functions:

prepare_dataset(ds)

check data setting and get total shape.

secretflow.ml.boost.ss_xgb_v.core.utils.prepare_dataset(ds: Union[FedNdarray, VDataFrame]) → Tuple[FedNdarray, Tuple[int, int]]

Check data setting and get total shape.

Parameters:

ds – input dataset

Returns:

First: dataset in unified type. Second: shape concatenated over all partitions.

Return type:

Tuple[FedNdarray, Tuple[int, int]]
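
To illustrate what a concatenated shape could mean here, the sketch below assumes vertically partitioned data: every party holds the same rows and a disjoint set of columns, so the total shape keeps the row count and sums the column counts. It uses plain tuples rather than SecretFlow objects, and `total_shape` is a hypothetical helper.

```python
from typing import List, Tuple


def total_shape(partition_shapes: List[Tuple[int, int]]) -> Tuple[int, int]:
    # Assumption: all partitions share the same number of rows (samples).
    rows = {r for r, _ in partition_shapes}
    if len(rows) != 1:
        raise ValueError("partitions must have the same number of rows")
    # Columns (features) from each party are stacked side by side.
    return rows.pop(), sum(c for _, c in partition_shapes)


shape = total_shape([(100, 3), (100, 5)])
```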

secretflow.ml.boost.ss_xgb_v.core.xgb_tree

Classes:

XgbTree()

class secretflow.ml.boost.ss_xgb_v.core.xgb_tree.XgbTree

Bases: object

Methods:

__init__()

insert_split_node(feature, value)

__init__() → None

insert_split_node(feature: int, value: float) → None