secretflow.ml.nn.fl.backend.torch#

secretflow.ml.nn.fl.backend.torch.fl_base#

Classes:

BaseTorchModel(builder_base[, random_seed])

class secretflow.ml.nn.fl.backend.torch.fl_base.BaseTorchModel(builder_base: Callable[[], TorchModel], random_seed: Optional[int] = None)[source]#

Bases: ABC

Methods:

  • __init__(builder_base[, random_seed])

  • build_dataset_from_csv(csv_file_path, label) – Build a torch DataLoader from CSV files.

  • build_dataset(x[, y, s_w, sampling_rate, ...]) – Build a torch DataLoader from in-memory data.

  • build_dataset_from_builder(dataset_builder, x) – Build a dataset with a user-provided dataset_builder.

  • get_rows_count(filename)

  • get_weights()

  • set_weights(weights) – Set weights of the client model.

  • set_validation_metrics(global_metrics)

  • wrap_local_metrics()

  • evaluate([evaluate_steps])

  • predict([predict_steps])

  • init_training(callbacks[, epochs, steps, ...])

  • on_train_begin()

  • on_epoch_begin(epoch)

  • on_epoch_end(epoch)

  • transform_metrics(logs[, stage])

  • on_train_end()

  • get_stop_training()

  • train_step(weights, cur_steps, train_steps, ...)

  • save_model(model_path) – Save the model; for compatibility reasons it is recommended to save only its state dict (see https://pytorch.org/docs/master/notes/serialization.html#id5).

  • load_model(model_path) – Load the model from a state dict; the model structure must be defined before loading.

__init__(builder_base: Callable[[], TorchModel], random_seed: Optional[int] = None)[source]#
build_dataset_from_csv(csv_file_path: str, label: str, sampling_rate=None, shuffle=False, random_seed=1234, na_value='?', repeat_count=1, sample_length=0, buffer_size=None, ignore_errors=True, prefetch_buffer_size=None, stage='train', label_decoder=None)[source]#

Build a torch DataLoader from CSV files.

Parameters:
  • csv_file_path – Dict of CSV file paths

  • label – Label column name

  • sampling_rate – Sampling rate of a batch

  • shuffle – A bool that indicates whether the input should be shuffled

  • random_seed – Randomization seed to use for shuffling

  • na_value – Additional string to recognize as NA/NaN

  • repeat_count – Number of repeats

  • sample_length – Number of samples (sample length)

  • buffer_size – Shuffle buffer size

  • ignore_errors – If True, ignore errors while parsing the CSV file

  • prefetch_buffer_size – An int specifying the number of feature batches to prefetch for performance improvement

  • stage – The stage of the dataset

  • label_decoder – Callable function for label preprocessing
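
A minimal usage sketch, assuming model is an instance of a concrete BaseTorchModel subclass and that "train.csv" and the label column "y" are placeholder names:

    # Hypothetical illustration: `model`, "train.csv" and the label column "y"
    # are assumptions, not values taken from the documentation above.
    model.build_dataset_from_csv(
        csv_file_path="train.csv",   # path to this party's CSV file
        label="y",                   # name of the label column
        sampling_rate=0.01,          # fraction of rows drawn into each batch
        shuffle=True,
        random_seed=1234,
        repeat_count=1,
        stage="train",               # build the training-stage loader
    )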

build_dataset(x: ndarray, y: Optional[ndarray] = None, s_w: Optional[ndarray] = None, sampling_rate=None, buffer_size=None, shuffle=False, random_seed=1234, repeat_count=1, sampler_method='batch', stage='train')[source]#

Build a torch DataLoader from in-memory data.

Parameters:
  • x – Feature, FedNdArray or HDataFrame

  • y – Label, FedNdArray or HDataFrame

  • s_w – Sample weights of this dataset

  • sampling_rate – Sampling rate of a batch

  • buffer_size – Shuffle buffer size

  • shuffle – A bool that indicates whether the input should be shuffled

  • random_seed – PRG seed for shuffling

  • repeat_count – Number of repeats

  • sampler_method – Method of sampler, e.g. 'batch' or 'possion'
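
A minimal sketch with in-memory NumPy arrays, again assuming model is a concrete BaseTorchModel subclass instance:

    import numpy as np

    # Hypothetical illustration: `model` is assumed to be a concrete
    # BaseTorchModel subclass instance; x and y are small random arrays.
    x = np.random.rand(128, 10).astype(np.float32)   # 128 samples, 10 features
    y = np.random.randint(0, 2, size=(128,))          # binary labels

    model.build_dataset(
        x, y,
        sampling_rate=0.25,       # each batch draws 25% of the samples
        shuffle=True,
        random_seed=1234,
        sampler_method="batch",   # dispatches to the batch sampler module below
        stage="train",
    )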

build_dataset_from_builder(dataset_builder: Callable, x: Union[DataFrame, str], y: Optional[ndarray] = None, s_w: Optional[ndarray] = None, repeat_count=1, stage='train')[source]#

Build a dataset using a user-provided dataset_builder.

Parameters:
  • dataset_builder – Function of how to build the dataset; must return the dataset and the steps per epoch

  • x – A pandas DataFrame, or a string representing the path to a CSV file or data folder containing the input data

  • y – An optional NumPy array containing the labels for the dataset. Defaults to None.

  • s_w – An optional NumPy array containing the sample weights for the dataset. Defaults to None.

  • repeat_count – An integer specifying the number of times to repeat the dataset. This is useful for increasing the effective size of the dataset.

  • stage – A string indicating the stage of the dataset (either "train" or "eval"). Defaults to "train".

Returns:

The dataset produced by dataset_builder
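
The only contract stated above is that dataset_builder returns the dataset together with the number of steps per epoch. A hedged sketch of such a builder, assuming it receives the raw pandas DataFrame and that the label column is named "y":

    import math

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    # Hypothetical builder: the argument it receives (the raw DataFrame `df`)
    # and the label column name "y" are assumptions for illustration only.
    def dataset_builder(df, batch_size=32):
        features = torch.tensor(df.drop(columns=["y"]).values, dtype=torch.float32)
        labels = torch.tensor(df["y"].values, dtype=torch.long)
        loader = DataLoader(TensorDataset(features, labels),
                            batch_size=batch_size, shuffle=True)
        steps_per_epoch = math.ceil(len(df) / batch_size)
        return loader, steps_per_epoch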

get_rows_count(filename)[source]#
get_weights()[source]#
set_weights(weights)[source]#

Set weights of the client model.

set_validation_metrics(global_metrics)[source]#
wrap_local_metrics()[source]#
evaluate(evaluate_steps=0)[source]#
predict(predict_steps=0)[source]#
init_training(callbacks, epochs=1, steps=0, verbose=0)[source]#
on_train_begin()[source]#
on_epoch_begin(epoch)[source]#
on_epoch_end(epoch)[source]#
transform_metrics(logs, stage='train')[source]#
on_train_end()[source]#
get_stop_training()[source]#
abstract train_step(weights, cur_steps, train_steps, **kwargs)[source]#
save_model(model_path: str)[source]#

For compatibility reasons it is recommended to save only the model's state dict. Ref: https://pytorch.org/docs/master/notes/serialization.html#id5

load_model(model_path: str)[source]#

Load the model from a state dict; the model structure must be defined before loading.
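
The recommendation above follows standard PyTorch practice: persist only the state dict and rebuild the model structure before restoring it. A minimal sketch with a plain torch module (the file name is a placeholder):

    import torch
    import torch.nn as nn

    net = nn.Linear(10, 2)

    # Save only the state dict, as recommended above.
    torch.save(net.state_dict(), "model_path.pth")

    # The model structure must already be defined before loading.
    restored = nn.Linear(10, 2)
    restored.load_state_dict(torch.load("model_path.pth"))
    restored.eval()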

secretflow.ml.nn.fl.backend.torch.sampler#

Functions:

  • batch_sampler(x, y, s_w, sampling_rate, ...) – Implementation of batch sampler.

  • possion_sampler(x, y, s_w, sampling_rate, ...) – Implementation of Poisson sampler.

  • sampler_data([sampler_method, x, y, s_w, ...]) – Sample data according to sampler_method.

secretflow.ml.nn.fl.backend.torch.sampler.batch_sampler(x, y, s_w, sampling_rate, buffer_size, shuffle, repeat_count, random_seed)[source]#

Implementation of batch sampler.

Parameters:
  • x – Feature, FedNdArray or HDataFrame

  • y – Label, FedNdArray or HDataFrame

  • s_w – Sample weights of this dataset

  • sampling_rate – Sampling rate of a batch

  • buffer_size – Shuffle buffer size

  • shuffle – A bool that indicates whether the input should be shuffled

  • repeat_count – Number of repeats

  • random_seed – PRG seed for shuffling

Returns:

A torch DataLoader over the sampled data

Return type:

data_set
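
A hedged sketch of calling the sampler directly with small in-memory arrays (the array shapes and the 10% sampling rate are arbitrary):

    import numpy as np
    from secretflow.ml.nn.fl.backend.torch.sampler import batch_sampler

    x = np.random.rand(100, 4).astype(np.float32)
    y = np.random.randint(0, 2, size=(100,))

    data_set = batch_sampler(
        x, y, s_w=None,
        sampling_rate=0.1,    # each batch holds roughly 10% of the samples
        buffer_size=None,
        shuffle=True,
        repeat_count=1,
        random_seed=1234,
    )
    for batch in data_set:    # iterate like any torch DataLoader
        pass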

secretflow.ml.nn.fl.backend.torch.sampler.possion_sampler(x, y, s_w, sampling_rate, random_seed)[source]#

Implementation of Poisson sampler.

Parameters:
  • x – Feature, FedNdArray or HDataFrame

  • y – Label, FedNdArray or HDataFrame

  • s_w – Sample weights of this dataset

  • sampling_rate – Sampling rate of a batch

  • random_seed – PRG seed for shuffling

Returns:

A torch DataLoader over the sampled data

Return type:

dataloader

secretflow.ml.nn.fl.backend.torch.sampler.sampler_data(sampler_method='batch', x=None, y=None, s_w=None, sampling_rate=None, buffer_size=None, shuffle=False, repeat_count=1, random_seed=1234)[source]#

Sample data according to sampler_method.

Parameters:
  • sampler_method – Method of sampler, 'batch' (default) or 'possion'

  • x – Feature, FedNdArray or HDataFrame

  • y – Label, FedNdArray or HDataFrame

  • s_w – Sample weights of this dataset

  • sampling_rate – Sampling rate of a batch

  • buffer_size – Shuffle buffer size

  • shuffle – A bool that indicates whether the input should be shuffled

  • repeat_count – Number of repeats

  • random_seed – PRG seed for shuffling

Returns:

A torch DataLoader built by the selected sampler

Return type:

data_set
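
A hedged sketch of the dispatch entry point; 'batch' routes to batch_sampler and 'possion' to possion_sampler, and the arrays below are placeholders:

    import numpy as np
    from secretflow.ml.nn.fl.backend.torch.sampler import sampler_data

    x = np.random.rand(100, 4).astype(np.float32)
    y = np.random.randint(0, 2, size=(100,))

    # 'batch' dispatches to batch_sampler; 'possion' dispatches to possion_sampler.
    data_set = sampler_data(
        sampler_method="batch",
        x=x, y=y, s_w=None,
        sampling_rate=0.1,
        shuffle=True,
        repeat_count=1,
        random_seed=1234,
    )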

secretflow.ml.nn.fl.backend.torch.utils#

Classes:

BaseModule(*args, **kwargs)

TorchModel([model_fn, loss_fn, optim_fn, ...])

class secretflow.ml.nn.fl.backend.torch.utils.BaseModule(*args, **kwargs)[source]#

Bases: ABC, Module

Methods:

  • forward(x) – Defines the computation performed at every call.

  • get_weights([return_numpy])

  • set_weights(weights)

  • update_weights(weights)

  • get_gradients([parameters])

  • set_gradients(gradients[, parameters])

Attributes:

  • training

abstract forward(x)[source]#

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.

get_weights(return_numpy=False)[source]#
set_weights(weights)[source]#
update_weights(weights)[source]#
get_gradients(parameters=None)[source]#
set_gradients(gradients: List[Union[Tensor, ndarray]], parameters: Optional[List[Tensor]] = None)[source]#
training: bool#
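
A minimal sketch of a concrete subclass, with arbitrary layer sizes; as the note above says, the instance is called rather than forward() directly, and the weights are round-tripped through get_weights / set_weights:

    import torch
    import torch.nn as nn
    from secretflow.ml.nn.fl.backend.torch.utils import BaseModule

    class MLP(BaseModule):
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 2))

        def forward(self, x):
            return self.net(x)

    model = MLP()
    out = model(torch.rand(4, 10))                 # call the instance, not forward()
    weights = model.get_weights(return_numpy=True)
    model.set_weights(weights)                     # restore the weights into the module
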
class secretflow.ml.nn.fl.backend.torch.utils.TorchModel(model_fn: Optional[BaseModule] = None, loss_fn: Optional[_Loss] = None, optim_fn: Optional[Optimizer] = None, metrics: List[Metric] = [])[source]#

Bases: object

Methods:

__init__([model_fn, loss_fn, optim_fn, metrics])

__init__(model_fn: Optional[BaseModule] = None, loss_fn: Optional[_Loss] = None, optim_fn: Optional[Optimizer] = None, metrics: List[Metric] = [])[source]#
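
A hedged construction sketch: judging by the *_fn names, the arguments are assumed here to be builder callables (a class or a functools.partial) rather than constructed objects, MLP is the hypothetical BaseModule subclass sketched above, and the learning rate is arbitrary. Per the BaseTorchModel signature earlier on this page, builder_base is a Callable[[], TorchModel], so a zero-argument function returning such a bundle could serve as builder_base.

    from functools import partial

    import torch
    from secretflow.ml.nn.fl.backend.torch.utils import TorchModel

    # Assumptions: the *_fn arguments are builder callables, and MLP is the
    # hypothetical BaseModule subclass from the example above.
    torch_model = TorchModel(
        model_fn=MLP,
        loss_fn=torch.nn.CrossEntropyLoss,
        optim_fn=partial(torch.optim.Adam, lr=1e-3),
    )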