secretflow.ml.linear.ss_sgd
Classes:
SSRegression(spu): This method provides both linear and logistic regression linear models for vertical split dataset setting by using secret sharing with mini batch SGD training solver.
- class secretflow.ml.linear.ss_sgd.SSRegression(spu: SPU) [source]
Bases: object
This method provides both linear and logistic regression models for vertically split datasets, using secret sharing with a mini-batch SGD training solver. SS-SGD is short for secret sharing SGD training.
More detail on SGD: https://stats.stackexchange.com/questions/488017/understanding-mini-batch-gradient-descent
Linear regression fits a linear model with coefficients w = (w1, …, wp) to minimize the residual sum of squares between the observed targets in the dataset and the targets predicted by the linear approximation.
More detail on linear regression: https://en.wikipedia.org/wiki/Linear_regression
Logistic regression, despite its name, is a linear model for classification rather than regression. It is also known in the literature as logit regression, maximum-entropy classification (MaxEnt), or the log-linear classifier. The probabilities describing the possible outcomes of a single trial are modeled using a logistic function. This method can fit binary classification models with optional L2 regularization.
More detail on logistic regression: https://en.wikipedia.org/wiki/Logistic_regression
SPU is a verifiable and measurable secure computing device that runs under various MPC protocols to provide provable security.
More detail on SPU: https://www.secretflow.org.cn/docs/spu/en/
This method protects the original dataset and the final model by secret sharing the dataset to the SPU device and running the model fit under SPU.
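To give a feel for what "secret sharing the dataset" means, the following is a toy additive secret sharing sketch in plain Python. It is not SPU's implementation: the ring size `MODULUS`, the party count, and the helper names `share`, `reconstruct`, and `add_shared` are all assumptions made for illustration, and real MPC protocols handle fixed-point encoding and multiplication in ways this sketch ignores.

```python
import secrets
from typing import List

# Toy ring size; an assumption for illustration, not SPU's configuration.
MODULUS = 2 ** 64


def share(value: int, n_parties: int = 3) -> List[int]:
    """Split `value` into additive shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares: List[int]) -> int:
    """Recombine all shares; any strict subset looks uniformly random."""
    return sum(shares) % MODULUS


def add_shared(a: List[int], b: List[int]) -> List[int]:
    """Each party adds its own two shares locally, with no communication."""
    return [(x + y) % MODULUS for x, y in zip(a, b)]


a_shares = share(25)
b_shares = share(17)
total = reconstruct(add_shared(a_shares, b_shares))
```

No single share reveals the underlying value, yet the parties can still compute on the shared data, which is the property the training procedure relies on.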
- Parameters:
spu – secure device.
Note
The training dataset should be normalized or standardized; otherwise the SGD solver may fail to converge.
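As a reminder of what standardization involves, here is a minimal z-score sketch using only the Python standard library. The helper name `standardize` and the sample column are hypothetical. In a vertically split setting one would presumably apply such a transform to each party's own feature columns before training, but that workflow is an assumption, not something this page specifies.

```python
from statistics import fmean, pstdev
from typing import List


def standardize(column: List[float]) -> List[float]:
    """Rescale a feature column to zero mean and unit variance."""
    mean = fmean(column)
    std = pstdev(column)
    if std == 0.0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(value - mean) / std for value in column]


ages = [20.0, 30.0, 40.0]  # hypothetical raw feature values
scaled = standardize(ages)
```

Features on very different scales produce gradients of very different magnitudes, which is why a single learning_rate can be too large for one column and too small for another.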
Methods:
__init__(spu)
fit(x, y, epochs[, learning_rate, ...]): Fit the model according to the given training data.
save_model(): Save fit model in LinearModel format.
load_model(m): Load LinearModel format model.
predict(x[, batch_size, to_pyu]): Predict using the model.
- fit(x: Union[FedNdarray, VDataFrame], y: Union[FedNdarray, VDataFrame], epochs: int, learning_rate: float = 0.1, batch_size: int = 1024, sig_type: str = 't1', reg_type: str = 'logistic', penalty: str = 'None', l2_norm: float = 0.5, eps: float = 0.001, decay_epoch: Optional[int] = None, decay_rate: Optional[float] = None, strategy: str = 'naive_sgd') → None [source]
Fit the model according to the given training data.
- Parameters:
x – {FedNdarray, VDataFrame} of shape (n_samples, n_features). Training vector, where n_samples is the number of samples and n_features is the number of features.
y – {FedNdarray, VDataFrame} of shape (n_samples,). Target vector relative to x.
epochs – int. Number of iteration rounds.
learning_rate – float, default=0.1. Controls how much the model changes in one epoch.
batch_size – int, default=1024. Number of samples used in one calculation.
sig_type – str, default='t1'. Sigmoid approximation type.
reg_type – str, default='logistic'. Linear or logistic regression.
penalty – str, default='None'. The penalty (also known as the regularization term) to use.
l2_norm – float, default=0.5. L2 regularization term.
eps – float, default=1e-3. If the change rate of the weights W is less than this threshold, the model is considered converged and training stops early. Set to 0 to disable early stopping.
decay_epoch / decay_rate – int / float, default=None. Decays the learning rate as learning_rate * (decay_rate ** floor(epoch / decay_epoch)). None disables decay. If strategy=policy_sgd, decay_rate and decay_epoch default to 0.5 and 5.
strategy – str, default='naive_sgd'. Optimization strategy used in training. naive_sgd is the original SGD. policy_sgd (LR only) scales the learning_rate in each update, similar to Adam but with a uniform factor, so batch_size can be larger and the early-stop strategy can be more aggressive. This accelerates training in most scenarios, but it is not recommended for training with large regularization.
- Returns:
Final weights in SPUObject.
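The decay formula quoted for decay_epoch and decay_rate can be tabulated with a few lines of plain Python. This sketch assumes epochs are counted from 0, which the page does not state; the helper name `decayed_learning_rate` is hypothetical.

```python
from math import floor
from typing import Optional


def decayed_learning_rate(
    learning_rate: float,
    epoch: int,
    decay_epoch: Optional[int] = None,
    decay_rate: Optional[float] = None,
) -> float:
    """Apply learning_rate * (decay_rate ** floor(epoch / decay_epoch))."""
    if decay_epoch is None or decay_rate is None:
        # None disables decay, as described for these parameters.
        return learning_rate
    return learning_rate * (decay_rate ** floor(epoch / decay_epoch))


# Values documented as the policy_sgd defaults: decay_rate=0.5, decay_epoch=5.
schedule = [
    decayed_learning_rate(0.1, epoch, decay_epoch=5, decay_rate=0.5)
    for epoch in range(12)
]
```

Under the 0-based assumption, the rate stays at its initial value for the first decay_epoch epochs and is then multiplied by decay_rate once per block of decay_epoch epochs.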
- save_model() → LinearModel [source]
Save the fitted model in LinearModel format.
- load_model(m: LinearModel) → None [source]
Load a model in LinearModel format.
- predict(x: Union[FedNdarray, VDataFrame], batch_size: int = 1024, to_pyu: Optional[PYU] = None) → Union[SPUObject, FedNdarray] [source]
Predict using the model.
- Parameters:
x – {FedNdarray, VDataFrame} of shape (n_samples, n_features). Samples to predict.
batch_size – int, default=1024. Number of samples used in one calculation.
to_pyu – the prediction initiator. If not None, the prediction result is revealed to the to_pyu device and saved as a FedNdarray; otherwise, the result is kept secret and saved as an SPUObject.
- Returns:
Prediction scores in SPUObject or FedNdarray, shape (n_samples,).
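Putting the documented signatures together, a call sequence might look like the sketch below. It uses only the class, method, and parameter names listed on this page; the wrapper `train_and_predict` and its argument names are hypothetical, and creating the SPU device, the PYU device, and the federated inputs is assumed to have happened elsewhere.

```python
def train_and_predict(spu, x, y, receiver):
    """Fit a logistic model on SPU and reveal the scores to `receiver`.

    spu: an SPU device; x, y: FedNdarray or VDataFrame; receiver: a PYU.
    All four are assumed to be prepared by the caller.
    """
    # Imported lazily so this sketch can be loaded without SecretFlow installed.
    from secretflow.ml.linear.ss_sgd import SSRegression

    model = SSRegression(spu)
    model.fit(
        x,
        y,
        epochs=5,
        learning_rate=0.1,
        batch_size=1024,
        sig_type='t1',
        reg_type='logistic',
        penalty='l2',
        l2_norm=0.5,
    )
    # With to_pyu set, the result is revealed to that device as a FedNdarray;
    # with to_pyu=None it would stay secret as an SPUObject.
    return model.predict(x, batch_size=1024, to_pyu=receiver)
```

The epochs value and the choice of penalty='l2' are arbitrary illustration values, not recommendations.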
secretflow.ml.linear.ss_sgd.model
Classes:
Penalty(value): An enumeration.
Strategy(value): An enumeration.
SSRegression(spu): This method provides both linear and logistic regression linear models for vertical split dataset setting by using secret sharing with mini batch SGD training solver.
- class secretflow.ml.linear.ss_sgd.model.Penalty(value) [source]
Bases: Enum
An enumeration.
Attributes:
- NONE = 'None'
- L1 = 'l1'
- L2 = 'l2'
- class secretflow.ml.linear.ss_sgd.model.Strategy(value) [source]
Bases: Enum
An enumeration.
Attributes:
- NAIVE_SGD = 'naive_sgd'
- POLICY_SGD = 'policy_sgd'
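Because fit takes penalty and strategy as strings, the enum values above are what matter at the call site. The sketch below re-declares the two enums locally, mirroring the listed members, to show how Python resolves a string to a member. The helper `parse_penalty` is hypothetical, and whether the library performs the lookup this way is an assumption.

```python
from enum import Enum


class Penalty(Enum):
    # Local mirror of the documented members, for illustration only.
    NONE = 'None'
    L1 = 'l1'
    L2 = 'l2'


class Strategy(Enum):
    NAIVE_SGD = 'naive_sgd'
    POLICY_SGD = 'policy_sgd'


def parse_penalty(name: str) -> Penalty:
    """Look up a Penalty by its string value, with a readable error."""
    try:
        return Penalty(name)
    except ValueError:
        valid = ', '.join(repr(member.value) for member in Penalty)
        raise ValueError(
            f"unknown penalty {name!r}; expected one of {valid}"
        ) from None


chosen = parse_penalty('l2')
```

Note that the "no penalty" member has the string value 'None', which is distinct from Python's None object.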
- class secretflow.ml.linear.ss_sgd.model.SSRegression(spu: SPU) [source]
Bases: object
This method provides both linear and logistic regression models for vertically split datasets, using secret sharing with a mini-batch SGD training solver. SS-SGD is short for secret sharing SGD training.
More detail on SGD: https://stats.stackexchange.com/questions/488017/understanding-mini-batch-gradient-descent
Linear regression fits a linear model with coefficients w = (w1, …, wp) to minimize the residual sum of squares between the observed targets in the dataset and the targets predicted by the linear approximation.
More detail on linear regression: https://en.wikipedia.org/wiki/Linear_regression
Logistic regression, despite its name, is a linear model for classification rather than regression. It is also known in the literature as logit regression, maximum-entropy classification (MaxEnt), or the log-linear classifier. The probabilities describing the possible outcomes of a single trial are modeled using a logistic function. This method can fit binary classification models with optional L2 regularization.
More detail on logistic regression: https://en.wikipedia.org/wiki/Logistic_regression
SPU is a verifiable and measurable secure computing device that runs under various MPC protocols to provide provable security.
More detail on SPU: https://www.secretflow.org.cn/docs/spu/en/
This method protects the original dataset and the final model by secret sharing the dataset to the SPU device and running the model fit under SPU.
- Parameters:
spu – secure device.
Note
The training dataset should be normalized or standardized; otherwise the SGD solver may fail to converge.
Methods:
__init__(spu)
fit(x, y, epochs[, learning_rate, ...]): Fit the model according to the given training data.
save_model(): Save fit model in LinearModel format.
load_model(m): Load LinearModel format model.
predict(x[, batch_size, to_pyu]): Predict using the model.
- fit(x: Union[FedNdarray, VDataFrame], y: Union[FedNdarray, VDataFrame], epochs: int, learning_rate: float = 0.1, batch_size: int = 1024, sig_type: str = 't1', reg_type: str = 'logistic', penalty: str = 'None', l2_norm: float = 0.5, eps: float = 0.001, decay_epoch: Optional[int] = None, decay_rate: Optional[float] = None, strategy: str = 'naive_sgd') → None [source]
Fit the model according to the given training data.
- Parameters:
x – {FedNdarray, VDataFrame} of shape (n_samples, n_features). Training vector, where n_samples is the number of samples and n_features is the number of features.
y – {FedNdarray, VDataFrame} of shape (n_samples,). Target vector relative to x.
epochs – int. Number of iteration rounds.
learning_rate – float, default=0.1. Controls how much the model changes in one epoch.
batch_size – int, default=1024. Number of samples used in one calculation.
sig_type – str, default='t1'. Sigmoid approximation type.
reg_type – str, default='logistic'. Linear or logistic regression.
penalty – str, default='None'. The penalty (also known as the regularization term) to use.
l2_norm – float, default=0.5. L2 regularization term.
eps – float, default=1e-3. If the change rate of the weights W is less than this threshold, the model is considered converged and training stops early. Set to 0 to disable early stopping.
decay_epoch / decay_rate – int / float, default=None. Decays the learning rate as learning_rate * (decay_rate ** floor(epoch / decay_epoch)). None disables decay. If strategy=policy_sgd, decay_rate and decay_epoch default to 0.5 and 5.
strategy – str, default='naive_sgd'. Optimization strategy used in training. naive_sgd is the original SGD. policy_sgd (LR only) scales the learning_rate in each update, similar to Adam but with a uniform factor, so batch_size can be larger and the early-stop strategy can be more aggressive. This accelerates training in most scenarios, but it is not recommended for training with large regularization.
- Returns:
Final weights in SPUObject.
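For readers who want to see how the parameters above interact, here is a plaintext mini-batch SGD sketch for logistic regression with an L2 term and an early-stop check. It is not the library's code: it runs on ordinary Python lists rather than secret shares, uses the exact sigmoid instead of an approximation chosen by sig_type, and interprets "change rate" as the relative norm of the weight update per epoch, which is an assumption. The names `fit_logistic` and `predict_proba` are hypothetical.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Exact logistic function, clamped to avoid overflow in exp."""
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w: Sequence[float], row: Sequence[float]) -> float:
    """Score one sample; the last entry of w is the bias."""
    features = list(row) + [1.0]
    return sigmoid(sum(wi * fi for wi, fi in zip(w, features)))


def fit_logistic(
    x: List[List[float]],
    y: List[float],
    epochs: int,
    learning_rate: float = 0.1,
    batch_size: int = 4,
    l2_norm: float = 0.0,
    eps: float = 1e-3,
) -> List[float]:
    n_features = len(x[0])
    w = [0.0] * (n_features + 1)
    for _ in range(epochs):
        old_w = list(w)
        for start in range(0, len(x), batch_size):
            xb = x[start:start + batch_size]
            yb = y[start:start + batch_size]
            grad = [0.0] * len(w)
            for row, target in zip(xb, yb):
                err = predict_proba(w, row) - target
                for j, fj in enumerate(list(row) + [1.0]):
                    grad[j] += err * fj
            for j in range(len(w)):
                # The L2 term is applied to the weights but not the bias.
                reg = l2_norm * w[j] if j < n_features else 0.0
                w[j] -= learning_rate * (grad[j] / len(xb) + reg)
        # Early stop: relative size of this epoch's update (assumed metric).
        change = math.sqrt(sum((a - b) ** 2 for a, b in zip(w, old_w)))
        scale = math.sqrt(sum(b * b for b in old_w))
        if eps > 0 and scale > 0 and change / scale < eps:
            break
    return w


# Hypothetical one-feature dataset, already centered around zero.
samples = [[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]]
labels = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
weights = fit_logistic(samples, labels, epochs=200, eps=0.0)
```

Setting eps=0.0 in the example mirrors the documented way to disable early stopping, so all 200 epochs run.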
- save_model() → LinearModel [source]
Save the fitted model in LinearModel format.
- load_model(m: LinearModel) → None [source]
Load a model in LinearModel format.
- predict(x: Union[FedNdarray, VDataFrame], batch_size: int = 1024, to_pyu: Optional[PYU] = None) → Union[SPUObject, FedNdarray] [source]
Predict using the model.
- Parameters:
x – {FedNdarray, VDataFrame} of shape (n_samples, n_features). Samples to predict.
batch_size – int, default=1024. Number of samples used in one calculation.
to_pyu – the prediction initiator. If not None, the prediction result is revealed to the to_pyu device and saved as a FedNdarray; otherwise, the result is kept secret and saved as an SPUObject.
- Returns:
Prediction scores in SPUObject or FedNdarray, shape (n_samples,).