secretflow.ml.nn.fl.backend.torch#

secretflow.ml.nn.fl.backend.torch.fl_base#

Classes:

BaseTorchModel(builder_base[, random_seed])

class secretflow.ml.nn.fl.backend.torch.fl_base.BaseTorchModel(builder_base: Callable[[], TorchModel], random_seed: Optional[int] = None)[source]#

Bases: ABC

Methods:

__init__(builder_base[, random_seed])

build_dataset_from_csv(csv_file_path, label)

Build a torch DataLoader.

build_dataset(x[, y, s_w, sampling_rate, ...])

Build a torch DataLoader.

build_dataset_from_builder(dataset_builder, x)

Build a dataset from a user-provided dataset builder.

get_rows_count(filename)

get_weights()

set_weights(weights)

Set the weights of the client model.

set_validation_metrics(global_metrics)

wrap_local_metrics()

evaluate([evaluate_steps])

predict([predict_steps])

init_training(callbacks[, epochs, steps, ...])

on_train_begin()

on_epoch_begin(epoch)

on_epoch_end(epoch)

transform_metrics(logs[, stage])

on_train_end()

get_stop_training()

train_step(weights, cur_steps, train_steps, ...)

save_model(model_path)

For compatibility reasons it is recommended to save only the model's state dict. Ref: https://pytorch.org/docs/master/notes/serialization.html#id5

load_model(model_path)

Load the model from a state dict; the model structure must be defined before loading.

__init__(builder_base: Callable[[], TorchModel], random_seed: Optional[int] = None)[source]#
build_dataset_from_csv(csv_file_path: str, label: str, sampling_rate=None, shuffle=False, random_seed=1234, na_value='?', repeat_count=1, sample_length=0, buffer_size=None, ignore_errors=True, prefetch_buffer_size=None, stage='train', label_decoder=None)[source]#

Build a torch DataLoader.

Parameters:
  • csv_file_path – path to the CSV file

  • label – name of the label column

  • sampling_rate – fraction of the dataset drawn per batch (determines the batch size)

  • shuffle – a bool indicating whether the input should be shuffled

  • random_seed – randomization seed to use for shuffling

  • na_value – additional string to recognize as NA/NaN

  • repeat_count – number of times to repeat the dataset

  • sample_length – length of the sample to read

  • buffer_size – size of the shuffle buffer

  • ignore_errors – if True, ignore errors while parsing the CSV file

  • prefetch_buffer_size – an int specifying the number of feature batches to prefetch for performance improvement

  • stage – the stage of the dataset ("train" or "eval")

  • label_decoder – callable used to preprocess labels
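
A hedged sketch of the call shape follows; BaseTorchModel is abstract, so MyFLModel, the builder, and the file path below are hypothetical stand-ins rather than names from this API.

    # Hypothetical: MyFLModel implements the abstract train_step;
    # my_torch_model_builder is a Callable[[], TorchModel].
    worker = MyFLModel(builder_base=my_torch_model_builder, random_seed=42)
    worker.build_dataset_from_csv(
        csv_file_path="train.csv",  # local CSV partition (hypothetical path)
        label="y",                  # name of the label column
        sampling_rate=0.01,         # fraction of rows drawn per batch
        shuffle=True,
        random_seed=1234,
        stage="train",
    )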

build_dataset(x: ndarray, y: Optional[ndarray] = None, s_w: Optional[ndarray] = None, sampling_rate=None, buffer_size=None, shuffle=False, random_seed=1234, repeat_count=1, sampler_method='batch', stage='train')[source]#

Build a torch DataLoader.

Parameters:
  • x – features; the local ndarray partition of a FedNdArray or HDataFrame

  • y – labels; the local partition of a FedNdArray or HDataFrame

  • s_w – sample weights for this dataset

  • sampling_rate – fraction of the dataset drawn per batch (determines the batch size)

  • buffer_size – size of the shuffle buffer

  • shuffle – a bool indicating whether the input should be shuffled

  • random_seed – PRNG seed for shuffling

  • repeat_count – number of times to repeat the dataset

  • sampler_method – sampling method, "batch" or "possion" (spelling as in the API)

  • stage – the stage of the dataset ("train" or "eval")
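
A hedged sketch of the in-memory path, reusing the hypothetical worker from above; the toy shapes are assumptions.

    import numpy as np

    x = np.random.rand(1000, 20).astype(np.float32)  # toy local features
    y = np.random.randint(0, 2, size=(1000,))        # toy local labels
    worker.build_dataset(
        x, y,
        sampling_rate=0.032,     # ~32 samples per batch out of 1000
        shuffle=True,
        sampler_method="batch",  # or "possion", spelling as in the API
        stage="train",
    )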

build_dataset_from_builder(dataset_builder: Callable, x: Union[DataFrame, str], y: Optional[ndarray] = None, s_w: Optional[ndarray] = None, repeat_count=1, stage='train')[source]#

Build a dataset from a user-provided dataset builder.

Parameters:
  • dataset_builder – function that builds the dataset; must return the dataset and steps_per_epoch

  • x – a pandas DataFrame, or a string path to a CSV file or data folder containing the input data

  • y – an optional NumPy array containing the labels for the dataset. Defaults to None.

  • s_w – an optional NumPy array containing the sample weights for the dataset. Defaults to None.

  • repeat_count – an integer specifying the number of times to repeat the dataset; useful for increasing the effective size of the dataset

  • stage – a string indicating the stage of the dataset (either "train" or "eval"). Defaults to "train".

Returns:

The dataset produced by dataset_builder.
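
A minimal sketch of a conforming builder; only the (dataset, steps_per_epoch) return contract comes from the description above, while the argument layout and the "y" label column are assumptions.

    import pandas as pd
    import torch
    from torch.utils.data import DataLoader, TensorDataset

    def my_dataset_builder(x, stage="train"):  # argument layout is an assumption
        df = pd.read_csv(x) if isinstance(x, str) else x
        features = torch.tensor(df.drop(columns=["y"]).values, dtype=torch.float32)
        labels = torch.tensor(df["y"].values)
        loader = DataLoader(TensorDataset(features, labels), batch_size=32,
                            shuffle=(stage == "train"))
        return loader, len(loader)  # the dataset and steps_per_epoch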

get_rows_count(filename)[source]#
get_weights()[source]#
set_weights(weights)[source]#

Set the weights of the client model.

set_validation_metrics(global_metrics)[source]#
wrap_local_metrics()[source]#
evaluate(evaluate_steps=0)[source]#
predict(predict_steps=0)[source]#
init_training(callbacks, epochs=1, steps=0, verbose=0)[source]#
on_train_begin()[source]#
on_epoch_begin(epoch)[source]#
on_epoch_end(epoch)[source]#
transform_metrics(logs, stage='train')[source]#
on_train_end()[source]#
get_stop_training()[source]#
abstract train_step(weights, cur_steps, train_steps, **kwargs)[source]#
save_model(model_path: str)[source]#

For compatibility reasons it is recommended to save only the model's state dict. Ref: https://pytorch.org/docs/master/notes/serialization.html#id5

load_model(model_path: str)[source]#

Load the model from a state dict; the model structure must be defined before loading.
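
These two methods follow the standard PyTorch state-dict round trip from the linked serialization notes; the linear layer below is a stand-in for the real model.

    import torch
    import torch.nn as nn

    net = nn.Linear(20, 2)                    # stand-in for the real model
    torch.save(net.state_dict(), "model.pt")  # persist parameters only

    # To load, the structure must be constructed first, then filled in:
    net = nn.Linear(20, 2)
    net.load_state_dict(torch.load("model.pt"))
    net.eval()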

secretflow.ml.nn.fl.backend.torch.sampler#

Functions:

batch_sampler(x, y, s_w, sampling_rate, ...)

Implementation of the batch sampler.

possion_sampler(x, y, s_w, sampling_rate, ...)

Implementation of the Poisson sampler.

sampler_data([sampler_method, x, y, s_w, ...])

Sample data according to sampler_method.

secretflow.ml.nn.fl.backend.torch.sampler.batch_sampler(x, y, s_w, sampling_rate, buffer_size, shuffle, repeat_count, random_seed)[source]#

Implementation of the batch sampler.

Parameters:
  • x – features; the local ndarray partition of a FedNdArray or HDataFrame

  • y – labels; the local partition of a FedNdArray or HDataFrame

  • s_w – sample weights for this dataset

  • sampling_rate – fraction of the dataset drawn per batch (determines the batch size)

  • buffer_size – size of the shuffle buffer

  • shuffle – a bool indicating whether the input should be shuffled

  • repeat_count – number of times to repeat the dataset

  • random_seed – PRNG seed for shuffling

Returns:

a torch DataLoader over the sampled data

Return type:

torch.utils.data.DataLoader

secretflow.ml.nn.fl.backend.torch.sampler.possion_sampler(x, y, s_w, sampling_rate, random_seed)[source]#

Implementation of the Poisson sampler (the function name retains the original spelling).

Parameters:
  • x – features; the local ndarray partition of a FedNdArray or HDataFrame

  • y – labels; the local partition of a FedNdArray or HDataFrame

  • s_w – sample weights for this dataset

  • sampling_rate – probability with which each sample is independently included in a batch

  • random_seed – PRNG seed for sampling

Returns:

a torch DataLoader over the Poisson-sampled data

Return type:

torch.utils.data.DataLoader
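
To illustrate the scheme (this is not library code): Poisson sampling includes each row independently with probability sampling_rate, so batch sizes vary around len(x) * sampling_rate; this is the subsampling style commonly paired with differentially private training.

    import numpy as np

    x = np.random.rand(1000, 20)               # toy feature matrix
    rng = np.random.default_rng(1234)
    sampling_rate = 0.01
    mask = rng.random(len(x)) < sampling_rate  # independent Bernoulli draw per row
    batch = x[mask]                            # expected size: 1000 * 0.01 = 10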

secretflow.ml.nn.fl.backend.torch.sampler.sampler_data(sampler_method='batch', x=None, y=None, s_w=None, sampling_rate=None, buffer_size=None, shuffle=False, repeat_count=1, random_seed=1234)[source]#

Sample data according to sampler_method.

Parameters:
  • sampler_method – sampling method, "batch" or "possion" (spelling as in the API)

  • x – features; the local ndarray partition of a FedNdArray or HDataFrame

  • y – labels; the local partition of a FedNdArray or HDataFrame

  • s_w – sample weights for this dataset

  • sampling_rate – fraction of the dataset drawn per batch (determines the batch size)

  • buffer_size – size of the shuffle buffer

  • shuffle – a bool indicating whether the input should be shuffled

  • repeat_count – number of times to repeat the dataset

  • random_seed – PRNG seed for shuffling

Returns:

a torch DataLoader built by the selected sampler

Return type:

torch.utils.data.DataLoader
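
A hedged usage sketch with toy arrays; the import path is the one documented on this page.

    import numpy as np
    from secretflow.ml.nn.fl.backend.torch.sampler import sampler_data

    x = np.random.rand(1000, 20).astype(np.float32)
    y = np.random.randint(0, 2, size=(1000,))
    loader = sampler_data(
        sampler_method="batch",  # or "possion", spelling as in the API
        x=x, y=y, s_w=None,
        sampling_rate=0.032,
        shuffle=True,
        random_seed=1234,
    )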

secretflow.ml.nn.fl.backend.torch.utils#

Classes:

BaseModule(*args, **kwargs)

TorchModel([model_fn, loss_fn, optim_fn, ...])

class secretflow.ml.nn.fl.backend.torch.utils.BaseModule(*args, **kwargs)[source]#

Bases: ABC, Module

Methods:

forward(x)

Defines the computation performed at every call.

get_weights([return_numpy])

set_weights(weights)

update_weights(weights)

get_gradients([parameters])

set_gradients(gradients[, parameters])

Attributes:

training

abstract forward(x)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance rather than forward directly, since the former takes care of running the registered hooks while the latter silently ignores them.
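
In short, for any Module instance and suitable input x:

    out = module(x)          # preferred: __call__ runs the registered hooks
    out = module.forward(x)  # computes the same result but skips the hooks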

get_weights(return_numpy=False)[source]#
set_weights(weights)[source]#
update_weights(weights)[source]#
get_gradients(parameters=None)[source]#
set_gradients(gradients: List[Union[Tensor, ndarray]], parameters: Optional[List[Tensor]] = None)[source]#
training: bool#
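A hedged sketch of a concrete subclass: only forward must be supplied, after which the weight helpers above apply (the exact return format of get_weights is an assumption).

    import torch.nn as nn
    from secretflow.ml.nn.fl.backend.torch.utils import BaseModule

    class MLP(BaseModule):
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2)
            )

        def forward(self, x):
            return self.net(x)

    model = MLP()
    weights = model.get_weights(return_numpy=True)  # params as numpy (assumed)
    model.set_weights(weights)                      # round-trip into the module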
class secretflow.ml.nn.fl.backend.torch.utils.TorchModel(model_fn: Optional[BaseModule] = None, loss_fn: Optional[_Loss] = None, optim_fn: Optional[Optimizer] = None, metrics: List[Metric] = [])[source]#

Bases: object

Methods:

__init__([model_fn, loss_fn, optim_fn, metrics])

__init__(model_fn: Optional[BaseModule] = None, loss_fn: Optional[_Loss] = None, optim_fn: Optional[Optimizer] = None, metrics: List[Metric] = [])[source]#
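
A hedged construction sketch: despite the Optimizer and _Loss type hints, the *_fn arguments are conventionally passed as factories that the framework invokes later, once the model's parameters exist; treat that convention as an assumption to verify against the library's tutorials. MLP is the subclass sketched above, and torchmetrics supplies the metric.

    import torch
    import torch.nn as nn
    from torchmetrics import Accuracy
    from secretflow.ml.nn.fl.backend.torch.utils import TorchModel

    torch_model = TorchModel(
        model_fn=MLP,                 # class, not an instance
        loss_fn=nn.CrossEntropyLoss,  # loss factory
        optim_fn=lambda params: torch.optim.Adam(params, lr=1e-3),
        metrics=[lambda: Accuracy(task="multiclass", num_classes=2)],
    )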