secretflow.ml.nn

Classes:

FLModel([server, device_list, model, ...])

SLModel([base_model_dict, device_y, ...])

class secretflow.ml.nn.FLModel(server=None, device_list: List[PYU] = [], model: Union[TorchModel, Callable[[], tensorflow.keras.Model]] = None, aggregator=None, strategy='fed_avg_w', consensus_num=1, backend='tensorflow', random_seed=None, **kwargs)

Bases: object

Methods:

__init__([server, device_list, model, ...])

Interface for horizontal federated learning.

init_workers(model, device_list, strategy, ...)

initialize_weights()

fit(x, y[, batch_size, batch_sampling_rate, ...])

Horizontal federated training interface

predict(x[, batch_size, label_decoder, ...])

Horizontal federated offline prediction interface

evaluate(x[, y, batch_size, sample_weight, ...])

Horizontal federated offline evaluation interface

save_model(model_path[, is_test, saved_model])

Horizontal federated save model interface

load_model(model_path[, is_test, ...])

Horizontal federated load model interface

__init__(server=None, device_list: List[PYU] = [], model: Union[TorchModel, Callable[[], tensorflow.keras.Model]] = None, aggregator=None, strategy='fed_avg_w', consensus_num=1, backend='tensorflow', random_seed=None, **kwargs)

Interface for horizontal federated learning.

server

The PYU that acts as the server.

device_list

List of participating parties.

model

Model definition function.

aggregator

Secure aggregator; it can be selected according to the required security level.

strategy

Federated training strategy.

consensus_num

Number of parties that must reach consensus. Some strategies require multiple parties to reach consensus.

backend

Engine backend. It must be consistent with the model type.

random_seed

If specified, the initial values of the model stay the same across runs, which ensures reproducibility.
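The default strategy name 'fed_avg_w' suggests federated averaging over model weights. The sketch below illustrates that idea only; it is not SecretFlow's aggregator. It assumes each party reports a flat weight vector and a sample count, and the name `weighted_average` is hypothetical.

```python
from typing import List, Sequence


def weighted_average(party_weights: Sequence[Sequence[float]],
                     sample_counts: Sequence[int]) -> List[float]:
    """Average per-party weight vectors, weighting each party by its sample count."""
    if len(party_weights) != len(sample_counts):
        raise ValueError("one sample count is required per party")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(party_weights[0])
    averaged = [0.0] * dim
    for weights, count in zip(party_weights, sample_counts):
        if len(weights) != dim:
            raise ValueError("all parties must share the same model shape")
        for i, w in enumerate(weights):
            averaged[i] += w * count / total
    return averaged


# Hypothetical two-party round: alice holds 100 samples, bob holds 300.
alice = [1.0, 2.0]
bob = [3.0, 6.0]
global_weights = weighted_average([alice, bob], [100, 300])
```

Because bob contributes three times as many samples, the averaged weights sit closer to bob's vector than to alice's.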

init_workers(model, device_list, strategy, backend, random_seed)

initialize_weights()

fit(x: Union[HDataFrame, FedNdarray, Dict[PYU, str]], y: Union[HDataFrame, FedNdarray, str], batch_size: Union[int, Dict[PYU, int]] = 32, batch_sampling_rate: Optional[float] = None, epochs: int = 1, verbose: int = 1, callbacks=None, validation_data=None, shuffle=False, class_weight=None, sample_weight=None, validation_freq=1, aggregate_freq=1, label_decoder=None, max_batch_size=20000, prefetch_buffer_size=None, sampler_method='batch', random_seed=None, dp_spent_step_freq=None, audit_log_dir=None, dataset_builder: Optional[Dict[PYU, Callable]] = None) -> History

Horizontal federated training interface

Parameters:
  • x – Features: FedNdarray, HDataFrame or Dict {PYU: model_path}

  • y – Labels: FedNdarray, HDataFrame or str (column name of the label)

  • batch_size – Number of samples per gradient update, int or Dict. 64 or more is recommended for safety

  • batch_sampling_rate – Ratio of samples per batch, float

  • epochs – Number of epochs to train the model

  • verbose – Verbosity mode, 0 or 1

  • callbacks – List of keras.callbacks.Callback instances

  • validation_data – Data on which to evaluate

  • shuffle – Whether to shuffle the training data

  • class_weight – Dict mapping class indices (integers) to a weight (float)

  • sample_weight – Weights for the training samples

  • validation_freq – Specifies how many training epochs to run before a new validation run is performed

  • aggregate_freq – Number of steps between aggregations

  • label_decoder – Only used for CSV reading, to preprocess the label

  • max_batch_size – Upper limit of the batch size

  • prefetch_buffer_size – An int specifying the number of feature batches to prefetch for performance improvement. Only used by the CSV reader

  • sampler_method – Name of the sampler method

  • random_seed – Seed of the pseudorandom generator used for shuffling

  • dp_spent_step_freq – Specifies how many training steps to run between checks of the DP budget

  • audit_log_dir – Path of the audit log directory. Checkpoints are saved if audit_log_dir is not None

  • dataset_builder – Callable that defines how to build the dataset. It must return (dataset, steps_per_epoch)

Returns:

A history object. Its history.global_history attribute is an aggregated record of training loss values and metrics, while its history.local_history attribute is a record of the training loss values and metrics of each party.
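The dataset_builder contract above, a per-party callable that returns (dataset, steps_per_epoch), can be sketched in plain Python. This is an illustration under assumptions: the input the real builder receives is not stated here, so the sketch simply takes a sequence of samples, and `make_builder` and the string party keys are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple


def make_builder(batch_size: int) -> Callable[[Sequence], Tuple[List[list], int]]:
    """Return a builder that batches its input and reports steps_per_epoch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    def builder(samples: Sequence) -> Tuple[List[list], int]:
        # The last batch may be smaller, so round the step count up.
        steps_per_epoch = math.ceil(len(samples) / batch_size)
        dataset = [list(samples[i * batch_size:(i + 1) * batch_size])
                   for i in range(steps_per_epoch)]
        return dataset, steps_per_epoch

    return builder


# Hypothetical per-party mapping; real keys would be PYU devices, not strings.
dataset_builder = {"alice": make_builder(4), "bob": make_builder(2)}
dataset, steps = dataset_builder["alice"](list(range(10)))
```

Returning steps_per_epoch alongside the dataset lets the caller know how many batches make up one epoch without iterating the dataset first.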

predict(x: Union[HDataFrame, FedNdarray, Dict], batch_size=None, label_decoder=None, sampler_method='batch', random_seed=1234, dataset_builder: Optional[Dict[PYU, Callable]] = None) -> Dict[PYU, PYUObject]

Horizontal federated offline prediction interface

Parameters:
  • x – Features: FedNdarray or HDataFrame

  • batch_size – Number of samples per batch, int or Dict

  • label_decoder – Only used for CSV reading, to preprocess the label

  • sampler_method – Name of the sampler method

  • random_seed – Seed of the pseudorandom generator used for shuffling

  • dataset_builder – Callable that defines how to build the dataset. It must return (dataset, steps_per_epoch)

Returns:

Prediction results (numpy.array); per the signature, they are returned per party as Dict[PYU, PYUObject].

evaluate(x: Union[HDataFrame, FedNdarray, Dict], y: Optional[Union[HDataFrame, FedNdarray, str]] = None, batch_size: Union[int, Dict[PYU, int]] = 32, sample_weight: Optional[Union[HDataFrame, FedNdarray]] = None, label_decoder=None, return_dict=False, sampler_method='batch', random_seed=None, dataset_builder: Optional[Dict[PYU, Callable]] = None) -> Tuple[Union[List[Metric], Dict[str, Metric]], Union[Dict[str, List[Metric]], Dict[str, Dict[str, Metric]]]]

Horizontal federated offline evaluation interface

Parameters:
  • x – Input data. It could be a FedNdarray, an HDataFrame, or a Dict {PYU: model_path}

  • y – Label. It could be a FedNdarray, an HDataFrame, or a str (column name in the CSV)

  • batch_size – Integer or Dict. Number of samples per batch of computation. If unspecified, batch_size defaults to 32.

  • sample_weight – Optional array of weights for the test samples, used for weighting the loss function.

  • label_decoder – User-defined handling of the label column when the CSV reader is used

  • return_dict – If True, loss and metric results are returned as a dict, with each key being the name of the metric. If False, they are returned as a list.

  • sampler_method – Name of the sampler method.

  • dataset_builder – Callable that defines how to build the dataset. It must return (dataset, steps_per_epoch)

Returns:

A tuple of two objects. The first object is an aggregated record of metrics, and the second object is a record of the loss values and metrics of each party.
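The effect of return_dict on the shape of the result can be sketched as follows. This is an illustration only: `NamedValue` and `format_metrics` are hypothetical stand-ins, not SecretFlow classes or functions.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class NamedValue:
    """Hypothetical stand-in for a metric object with a name and a value."""
    name: str
    value: float


def format_metrics(metrics: List[NamedValue],
                   return_dict: bool = False
                   ) -> Union[List[NamedValue], Dict[str, NamedValue]]:
    """Return metrics as a list, or as a dict keyed by metric name."""
    if return_dict:
        return {m.name: m for m in metrics}
    return list(metrics)


metrics = [NamedValue("loss", 0.31), NamedValue("accuracy", 0.9)]
as_list = format_metrics(metrics)                    # positional access
as_dict = format_metrics(metrics, return_dict=True)  # access by metric name
```

The dict form avoids relying on the order in which the metrics were declared.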

save_model(model_path: Union[str, Dict[PYU, str]], is_test=False, saved_model=False)

Horizontal federated save model interface

Parameters:
  • model_path – Model path. Only paths of the form 'a/b/c' are supported, where c is the model name

  • is_test – Whether to run in test mode

  • saved_model – bool. Whether to save in SavedModel or TorchScript format
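The 'a/b/c' convention for model_path, and the fact that it may be a single str or a per-party dict, can be sketched with the standard library. The helper names are hypothetical, string keys stand in for PYU devices, and reusing one shared path for every party is an assumption of this sketch.

```python
from pathlib import PurePosixPath
from typing import Dict, Tuple, Union


def split_model_path(model_path: str) -> Tuple[str, str]:
    """Split 'a/b/c' into the directory 'a/b' and the model name 'c'."""
    path = PurePosixPath(model_path)
    if not path.name:
        raise ValueError("model_path must end with a model name")
    return str(path.parent), path.name


def expand_paths(model_path: Union[str, Dict[str, str]],
                 parties: Tuple[str, ...]) -> Dict[str, str]:
    """Give every party a path: reuse a shared str or copy the per-party dict."""
    if isinstance(model_path, str):
        return {party: model_path for party in parties}
    return dict(model_path)


directory, name = split_model_path("a/b/c")
paths = expand_paths("a/b/c", ("alice", "bob"))
```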

load_model(model_path: Union[str, Dict[PYU, str]], is_test=False, saved_model=False, force_all_participate=False)

Horizontal federated load model interface

Parameters:
  • model_path – Model path

  • is_test – Whether to run in test mode

  • saved_model – bool. Whether to load from SavedModel or TorchScript format

class secretflow.ml.nn.SLModel(base_model_dict: Dict[Device, Callable[[], tensorflow.keras.Model]] = {}, device_y: PYU = None, model_fuse: Callable[[], tensorflow.keras.Model] = None, compressor: Compressor = None, dp_strategy_dict: Dict[Device, DPStrategy] = None, random_seed: int = None, strategy='split_nn', **kwargs)

Bases: object

Methods:

__init__([base_model_dict, device_y, ...])

Interface for vertical split learning.

handle_data(x[, y, sample_weight, ...])

fit(x, y[, batch_size, epochs, verbose, ...])

Vertical split learning training interface

predict(x[, batch_size, verbose, ...])

Vertical split learning offline prediction interface

evaluate(x, y[, batch_size, sample_weight, ...])

Vertical split learning evaluate interface

save_model([base_model_path, ...])

Vertical split learning save model interface

load_model([base_model_path, ...])

Vertical split learning load model interface

export_model([base_model_path, ...])

Vertical split learning export model interface

get_cpus()

__init__(base_model_dict: Dict[Device, Callable[[], tensorflow.keras.Model]] = {}, device_y: PYU = None, model_fuse: Callable[[], tensorflow.keras.Model] = None, compressor: Compressor = None, dp_strategy_dict: Dict[Device, DPStrategy] = None, random_seed: int = None, strategy='split_nn', **kwargs)

Interface for vertical split learning.

base_model_dict

Base model dictionary. The key is a PYU and the value is the base model defined by that party.

device_y

Defines which party holds the label.

model_fuse

Fuse model definition.

compressor

Tensor compression algorithm used to speed up transmission.

dp_strategy_dict

Differential privacy (DP) strategy dictionary.

random_seed

If specified, the initial values of the model stay the same across runs, which ensures reproducibility.

strategy

Strategy of split learning.
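The roles of base_model_dict and model_fuse can be illustrated with a forward pass in plain Python. This is a conceptual sketch only, not SLModel's implementation: models are ordinary functions, string keys stand in for PYU devices, and `split_forward` is a hypothetical name.

```python
from typing import Callable, Dict, List

Vector = List[float]


def split_forward(base_models: Dict[str, Callable[[Vector], Vector]],
                  features: Dict[str, Vector],
                  fuse_model: Callable[[Vector], float]) -> float:
    """Run each party's base model locally, then fuse the hidden outputs."""
    hidden: Vector = []
    for party in sorted(base_models):  # fixed order keeps the fused input stable
        hidden.extend(base_models[party](features[party]))
    # In split learning the fuse model runs on the party that holds the label.
    return fuse_model(hidden)


# Hypothetical parties, each holding different columns of the same samples.
base_models = {
    "alice": lambda x: [sum(x)],        # one hidden value from alice's columns
    "bob": lambda x: [max(x), min(x)],  # two hidden values from bob's columns
}
features = {"alice": [1.0, 2.0], "bob": [4.0, 0.5]}
prediction = split_forward(base_models, features,
                           fuse_model=lambda h: sum(h) / len(h))
```

Only the hidden outputs cross party boundaries in this sketch; the raw feature columns stay with the party that owns them.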

handle_data(x: Union[VDataFrame, FedNdarray, List[Union[HDataFrame, VDataFrame, FedNdarray]]], y: Optional[Union[FedNdarray, VDataFrame, PYUObject]] = None, sample_weight: Optional[Union[FedNdarray, VDataFrame]] = None, batch_size=32, shuffle=False, epochs=1, stage='train', random_seed=1234, dataset_builder: Optional[Dict] = None)

fit(x: Union[VDataFrame, FedNdarray, List[Union[HDataFrame, VDataFrame, FedNdarray]]], y: Union[VDataFrame, FedNdarray, PYUObject], batch_size=32, epochs=1, verbose=1, callbacks=None, validation_data=None, shuffle=False, sample_weight=None, validation_freq=1, dp_spent_step_freq=None, dataset_builder: Optional[Callable[[List], Tuple[int, Iterable]]] = None, audit_log_dir: Optional[str] = None, audit_log_params: dict = {}, random_seed: Optional[int] = None)

Vertical split learning training interface

Parameters:
  • x – Input data. It could be:
      - VDataFrame: a vertically aligned dataframe.
      - FedNdarray: a vertically aligned ndarray.
      - List[Union[HDataFrame, VDataFrame, FedNdarray]]: a list of dataframes or ndarrays.

  • y – Target data. It could be a VDataFrame or FedNdarray which has only one partition, or a PYUObject.

  • batch_size – Number of samples per gradient update.

  • epochs – Number of epochs to train the model

  • verbose – Verbosity mode, 0 or 1

  • callbacks – List of keras.callbacks.Callback instances.

  • validation_data – Data on which to validate

  • shuffle – Whether to shuffle the dataset

  • validation_freq – Specifies how many training epochs to run before a new validation run is performed

  • sample_weight – Weights for the training samples

  • dp_spent_step_freq – Specifies how many training steps to run between checks of the DP budget

  • dataset_builder – Callable whose input is x, or [x, y] if y is set. It should return a dataset.

  • audit_log_dir – If audit_log_dir is set, the audit model is enabled

  • audit_log_params – Kwargs for saving the audit model, e.g. {'save_traces': True, 'save_format': 'h5'}

  • random_seed – Seed of the pseudorandom generator; it only affects dataset shuffling

predict(x: Union[VDataFrame, FedNdarray, List[Union[HDataFrame, VDataFrame, FedNdarray]]], batch_size=32, verbose=0, dataset_builder: Optional[Callable[[List], Tuple[int, Iterable]]] = None, compress: bool = False)

Vertical split learning offline prediction interface

Parameters:
  • x – Input data. It could be:
      - VDataFrame: a vertically aligned dataframe.
      - FedNdarray: a vertically aligned ndarray.
      - List[Union[HDataFrame, VDataFrame, FedNdarray]]: a list of dataframes or ndarrays.

  • batch_size – Number of samples per batch, int

  • verbose – Verbosity mode, 0 or 1

  • dataset_builder – Callable whose input is x, or [x, y] if y is set. It should return steps_per_epoch and an iterable dataset. The dataset builder is mainly intended for building graph datasets.

  • compress – Whether to use the compressor to compress cross-device data.

evaluate(x: Union[VDataFrame, FedNdarray, List[Union[HDataFrame, VDataFrame, FedNdarray]]], y: Union[VDataFrame, FedNdarray, PYUObject], batch_size: int = 32, sample_weight=None, verbose=1, dataset_builder: Dict = None, random_seed: int = None, compress: bool = False)

Vertical split learning evaluate interface

Parameters:
  • x – Input data. It could be:
      - VDataFrame: a vertically aligned dataframe.
      - FedNdarray: a vertically aligned ndarray.
      - List[Union[HDataFrame, VDataFrame, FedNdarray]]: a list of dataframes or ndarrays.

  • y – Target data. It could be a VDataFrame or FedNdarray which has only one partition, or a PYUObject.

  • batch_size – Integer. Number of samples per batch of computation. If unspecified, batch_size defaults to 32.

  • sample_weight – Optional array of weights for the test samples, used for weighting the loss function.

  • verbose – Verbosity mode. 0 = silent, 1 = progress bar.

  • dataset_builder – Callable whose input is x, or [x, y] if y is set. It should return a dataset.

  • random_seed – Seed of the pseudorandom generator; it only affects shuffling

  • compress – Whether to use the compressor to compress cross-device data.

Returns:

Federated evaluation result

Return type:

metrics

save_model(base_model_path: Optional[Union[str, Dict[PYU, str]]] = None, fuse_model_path: Optional[str] = None, is_test=False, **kwargs)

Vertical split learning save model interface

Parameters:
  • base_model_path – Base model path. Only paths of the form 'a/b/c' are supported, where c is the model name

  • fuse_model_path – Fuse model path

  • is_test – Whether to run in test mode

  • kwargs – Other arguments inherited from tf or torch

Example

>>> save_params = {'save_traces': True,
...                'save_format': 'h5'}
>>> # assumed intent: forward save_params as keyword arguments
>>> slmodel.save_model(base_model_path,
...                    fuse_model_path,
...                    is_test=True,
...                    **save_params)
>>> # or pass the keyword arguments directly
>>> slmodel.save_model(base_model_path,
...                    fuse_model_path,
...                    is_test=True,
...                    save_traces=True,
...                    save_format='h5')

load_model(base_model_path: Optional[Union[str, Dict[PYU, str]]] = None, fuse_model_path: Optional[str] = None, is_test=False, base_custom_objects=None, fuse_custom_objects=None)

Vertical split learning load model interface

Parameters:
  • base_model_path – Base model path

  • fuse_model_path – Fuse model path

  • is_test – Whether to run in test mode

  • base_custom_objects – Optional dictionary mapping names (strings) to custom classes or functions of the base model to be considered during deserialization.

  • fuse_custom_objects – Optional dictionary mapping names (strings) to custom classes or functions of the fuse model to be considered during deserialization.

export_model(base_model_path: Optional[Union[str, Dict[PYU, str]]] = None, fuse_model_path: Optional[str] = None, save_format='tf', is_test=False, **kwargs)

Vertical split learning export model interface

Parameters:
  • base_model_path – Base model path. Only paths of the form 'a/b/c' are supported, where c is the model name

  • fuse_model_path – Fuse model path

  • save_format – Format to export to

  • kwargs – Other arguments inherited from the ONNX saver

get_cpus() -> List[int]

secretflow.ml.nn.metrics

keras global evaluation metrics

Classes:

Metric()

Default(name, total, count)

Mean(name, total, count)

Federated keras.metrics.Mean

AUC(name, thresholds, true_positives, ...[, ...])

Federated keras.metrics.AUC

Precision(name, thresholds, true_positives, ...)

Federated keras.metrics.Precision

Recall(name, thresholds, true_positives, ...)

Federated keras.metrics.Recall

Functions:

aggregate_metrics(local_metrics)

Aggregate Model metrics values of each party and calculate global metrics.

class secretflow.ml.nn.metrics.Metric

Bases: ABC

Methods:

result()

abstract result()

class secretflow.ml.nn.metrics.Default(name: str, total: float, count: float)

Bases: Metric

Attributes:

name

total

count

Methods:

result()

__init__(name, total, count)

name: str

total: float

count: float

result()

__init__(name: str, total: float, count: float) -> None

class secretflow.ml.nn.metrics.Mean(name: str, total: float, count: float)

Bases: Metric

Federated keras.metrics.Mean

total

Sum of the metric values.

Type:

float

count

Number of samples.

Type:

float

Attributes:

name

total

count

Methods:

result()

__init__(name, total, count)

name: str

total: float

count: float

result()

__init__(name: str, total: float, count: float) -> None
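Since Mean carries a total and a count, a global mean can be obtained by summing both across parties before dividing. The sketch below illustrates that idea with a hypothetical `MeanState` stand-in; it is not the library class, and returning 0.0 for an empty count is a choice made for this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MeanState:
    """Hypothetical stand-in holding the same fields as Mean."""
    name: str
    total: float
    count: float

    def result(self) -> float:
        # Sketch-only convention: an empty metric evaluates to 0.0.
        return self.total / self.count if self.count else 0.0


def merge_means(states: List[MeanState]) -> MeanState:
    """Sum totals and counts so the merged result is sample-weighted."""
    return MeanState(states[0].name,
                     sum(s.total for s in states),
                     sum(s.count for s in states))


alice = MeanState("loss", total=30.0, count=100.0)  # local mean 0.30
bob = MeanState("loss", total=150.0, count=300.0)   # local mean 0.50
merged = merge_means([alice, bob])
```

Merging totals and counts weights each party by its number of samples, which a plain average of the two local means would not do.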
class secretflow.ml.nn.metrics.AUC(name: str, thresholds: List[float], true_positives: List[float], true_negatives: List[float], false_positives: List[float], false_negatives: List[float], curve=None)

Bases: Metric

Federated keras.metrics.AUC

thresholds

Thresholds of the buckets, the same as in tf.keras.metrics.AUC. They must contain 0 and 1.

true_positives

Number of true positive samples.

true_negatives

Number of true negative samples.

false_positives

Number of false positive samples.

false_negatives

Number of false negative samples.

curve

Type of AUC curve, the same as in tf.keras.metrics.AUC. It can be 'ROC' or 'PR'.

Methods:

__init__(name, thresholds, true_positives, ...)

result()

__init__(name: str, thresholds: List[float], true_positives: List[float], true_negatives: List[float], false_positives: List[float], false_negatives: List[float], curve=None)

result()
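Per-threshold confusion counts are enough to approximate an ROC AUC, and they can be summed across parties first. The sketch below uses a trapezoidal approximation and assumes ascending thresholds; it covers only the 'ROC' curve, its function names are hypothetical, and it is not the library's result() implementation.

```python
from typing import List


def sum_counts(per_party: List[List[float]]) -> List[float]:
    """Add the counts of all parties threshold by threshold."""
    return [sum(values) for values in zip(*per_party)]


def roc_auc(tp: List[float], tn: List[float],
            fp: List[float], fn: List[float]) -> float:
    """Trapezoidal area under the ROC curve from per-threshold counts."""
    tpr = [p / (p + n) if (p + n) else 0.0 for p, n in zip(tp, fn)]
    fpr = [p / (p + n) if (p + n) else 0.0 for p, n in zip(fp, tn)]
    area = 0.0
    for i in range(len(tpr) - 1):
        # Thresholds ascend, so FPR shrinks from one bucket to the next.
        area += (fpr[i] - fpr[i + 1]) * (tpr[i] + tpr[i + 1]) / 2.0
    return area


# Hypothetical counts at thresholds [0.0, 0.5, 1.0] for two parties.
tp = sum_counts([[2.0, 2.0, 0.0], [2.0, 1.0, 0.0]])
fn = sum_counts([[0.0, 0.0, 2.0], [0.0, 1.0, 2.0]])
fp = sum_counts([[2.0, 1.0, 0.0], [2.0, 0.0, 0.0]])
tn = sum_counts([[0.0, 1.0, 2.0], [0.0, 2.0, 2.0]])
auc = roc_auc(tp, tn, fp, fn)
```

Including 0 and 1 among the thresholds anchors the curve at the (1, 1) and (0, 0) corners, which is consistent with the requirement stated for thresholds.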
class secretflow.ml.nn.metrics.Precision(name: str, thresholds: float, true_positives: float, false_positives: float)

Bases: Metric

Federated keras.metrics.Precision

thresholds

Threshold value, a float or list in [0, 1].

Type:

float

true_positives

Number of true positive samples.

Type:

float

false_positives

Number of false positive samples.

Type:

float

Attributes:

name

thresholds

true_positives

false_positives

Methods:

result()

__init__(name, thresholds, true_positives, ...)

name: str

thresholds: float

true_positives: float

false_positives: float

result()

__init__(name: str, thresholds: float, true_positives: float, false_positives: float) -> None
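Precision is defined as true positives divided by all predicted positives, so a federated value can be computed from pooled counts. The sketch below illustrates this with a hypothetical helper; the 0.0 fallback for an empty denominator is a choice made for this sketch.

```python
from typing import List, Tuple


def federated_precision(counts: List[Tuple[float, float]]) -> float:
    """counts holds one (true_positives, false_positives) pair per party."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    # Sketch-only convention: no predicted positives evaluates to 0.0.
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical counts: local precisions are 0.80 and 0.75.
precision = federated_precision([(8.0, 2.0), (30.0, 10.0)])
```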
class secretflow.ml.nn.metrics.Recall(name: str, thresholds: float, true_positives: float, false_negatives: float)

Bases: Metric

Federated keras.metrics.Recall

thresholds

Threshold value, a float or list in [0, 1].

Type:

float

true_positives

Number of true positive samples.

Type:

float

false_negatives

Number of false negative samples.

Type:

float

Attributes:

name

thresholds

true_positives

false_negatives

Methods:

result()

__init__(name, thresholds, true_positives, ...)

name: str

thresholds: float

true_positives: float

false_negatives: float

result()

__init__(name: str, thresholds: float, true_positives: float, false_negatives: float) -> None
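Recall is true positives divided by all actual positives. The sketch below, with hypothetical helper names and counts, contrasts pooling the counts across parties with naively averaging the local recalls, which may explain why the class stores raw counts rather than a ratio.

```python
from typing import List, Tuple


def recall_from_counts(true_positives: float, false_negatives: float) -> float:
    """Recall = TP / (TP + FN); 0.0 when there are no actual positives."""
    denominator = true_positives + false_negatives
    return true_positives / denominator if denominator else 0.0


# Hypothetical (true_positives, false_negatives) pairs, one per party.
local_counts: List[Tuple[float, float]] = [(9.0, 1.0), (10.0, 30.0)]

# Pooled: sum the counts first, then divide once.
pooled_recall = recall_from_counts(sum(c[0] for c in local_counts),
                                   sum(c[1] for c in local_counts))

# Naive: average the local ratios, ignoring how many samples each party has.
naive_recall = sum(recall_from_counts(tp, fn)
                   for tp, fn in local_counts) / len(local_counts)
```

The two values differ whenever the parties hold different numbers of positive samples; the pooled value matches what a single party holding all the data would compute.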
secretflow.ml.nn.metrics.aggregate_metrics(local_metrics: List[List]) -> List

Aggregate the model metric values of each party and calculate the global metrics.

Parameters:

local_metrics – Model metric values reported by the parties, one list per party.

Returns:

A list of the metrics aggregated across the parties.
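The List[List] shape of local_metrics suggests one inner list per party, aggregated position by position. The sketch below illustrates that reading for Mean-like metrics only; the tuple layout and the name `aggregate_mean_metrics` are hypothetical, and the real function may handle other metric types differently.

```python
from typing import List, Tuple

MeanLike = Tuple[str, float, float]  # (name, total, count), a hypothetical stand-in


def aggregate_mean_metrics(local_metrics: List[List[MeanLike]]) -> List[MeanLike]:
    """Merge the i-th metric of every party into one global metric."""
    aggregated = []
    for same_metric in zip(*local_metrics):
        names = {m[0] for m in same_metric}
        if len(names) != 1:
            raise ValueError("parties must report metrics in the same order")
        aggregated.append((same_metric[0][0],
                           sum(m[1] for m in same_metric),
                           sum(m[2] for m in same_metric)))
    return aggregated


alice = [("loss", 30.0, 100.0), ("accuracy", 90.0, 100.0)]
bob = [("loss", 150.0, 300.0), ("accuracy", 240.0, 300.0)]
global_metrics = aggregate_mean_metrics([alice, bob])
```

Each aggregated entry keeps the summed total and count, so the global value can still be derived as total / count afterwards.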