Logistic Regression with SPU#
The following code is for demonstration purposes only; do not use it directly in production.
SPU is a domain-specific compiler and runtime suite that provides provably secure computation services. The SPU compiler uses XLA as its frontend IR, which is supported by multiple AI frameworks (such as TensorFlow, JAX and PyTorch), and lowers XLA into an IR that the SPU runtime can interpret. At the moment, the SPU team strongly recommends using JAX as the frontend.
Learning objectives:#
After this lab, you will know how to:
- Write a logistic regression training program with JAX.
- Convert a JAX program to an SPU (MPC) program painlessly.
In this lab, we use Breast Cancer as the dataset: we need to decide whether a tumor is malignant or benign from 30 features. In the MPC program, two parties train the model jointly, and each party holds half of the features (15).
First, let's forget about the MPC semantics for a moment and write a logistic regression training program directly with JAX.
Train a model with JAX#
Load the dataset#
We split the whole dataset into train and test subsets after a simple min-max normalization. If train is True, the train subset is returned; in addition, to simulate training on a vertically partitioned dataset, a party_id argument needs to be provided. Otherwise, the test subset is returned.
[1]:
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import Normalizer
def breast_cancer(party_id=None, train: bool = True) -> (np.ndarray, np.ndarray):
    x, y = load_breast_cancer(return_X_y=True)
    x = (x - np.min(x)) / (np.max(x) - np.min(x))
    x_train, x_test, y_train, y_test = train_test_split(
        x, y, test_size=0.2, random_state=42
    )

    if train:
        if party_id:
            if party_id == 1:
                # Party 1 holds the first 15 features and no labels.
                return x_train[:, :15], None
            else:
                # Party 2 holds the remaining 15 features and the labels.
                return x_train[:, 15:], y_train
        else:
            return x_train, y_train
    else:
        return x_test, y_test
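As a quick sanity check of the vertical split (this snippet is an addition to the original lab, reusing only the loader defined above), each party should end up with 15 of the 30 features:
# Sanity-check sketch: inspect the shapes produced by the vertical split.
x1_check, _ = breast_cancer(party_id=1, train=True)        # party 1: 15 features, no labels
x2_check, y_check = breast_cancer(party_id=2, train=True)  # party 2: 15 features + labels
print(x1_check.shape, x2_check.shape, y_check.shape)       # e.g. (455, 15) (455, 15) (455,)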
Define the model#
First, let's define the loss function, which in our case is the negative log-likelihood.
[2]:
import jax.numpy as jnp


def sigmoid(x):
    return 1 / (1 + jnp.exp(-x))


# Outputs probability of a label being true.
def predict(W, b, inputs):
    return sigmoid(jnp.dot(inputs, W) + b)


# Training loss is the negative log-likelihood of the training examples.
def loss(W, b, inputs, targets):
    preds = predict(W, b, inputs)
    label_probs = preds * targets + (1 - preds) * (1 - targets)
    return -jnp.mean(jnp.log(label_probs))
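As a minimal check of the loss definition (an extra snippet, not part of the original lab): with an all-zero model every prediction is sigmoid(0) = 0.5, so the loss on any batch should be close to -log(0.5) ≈ 0.6931.
# Sketch: the loss at the all-zero initialization equals log(2) ≈ 0.6931.
x_full, y_full = breast_cancer(train=True)  # full 30-feature training split
print(loss(jnp.zeros((30,)), 0.0, x_full, y_full))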
Next, let's define a single training step with the SGD optimizer. As a reminder, x1 represents the 15 features from one party and x2 represents the 15 features from the other party.
[3]:
from jax import grad


def train_step(W, b, x1, x2, y, learning_rate):
    # Concatenate the two feature halves back into the full 30-feature matrix.
    x = jnp.concatenate([x1, x2], axis=1)
    Wb_grad = grad(loss, (0, 1))(W, b, x, y)
    W -= learning_rate * Wb_grad[0]
    b -= learning_rate * Wb_grad[1]
    return W, b
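If you want to see this step in isolation before building the training loop, the following sketch (an addition; it reuses the plaintext loader above) performs a single SGD step from the zero initialization and prints the loss before and after, which should drop slightly:
# Sketch: a single SGD step should reduce the training loss a little.
x1_p, _ = breast_cancer(party_id=1)
x2_p, y_p = breast_cancer(party_id=2)
W0, b0 = jnp.zeros((30,)), 0.0
W1, b1 = train_step(W0, b0, x1_p, x2_p, y_p, learning_rate=1e-2)
x_all = jnp.concatenate([x1_p, x2_p], axis=1)
print(loss(W0, b0, x_all, y_p), loss(W1, b1, x_all, y_p))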
Finally, let's put everything together into a fit method that runs SGD for the given number of epochs and returns the final model.
[4]:
def fit(W, b, x1, x2, y, epochs=1, learning_rate=1e-2):
    for _ in range(epochs):
        W, b = train_step(W, b, x1, x2, y, learning_rate=learning_rate)
    return W, b
Validate the model#
We can use AUC to validate a binary classification model.
[5]:
from sklearn.metrics import roc_auc_score
def validate_model(W, b, X_test, y_test):
    y_pred = predict(W, b, X_test)
    return roc_auc_score(y_test, y_pred)
Have a try!#
Let's put everything together and train an LR model!
[6]:
%matplotlib inline

# Load the data
x1, _ = breast_cancer(party_id=1, train=True)
x2, y = breast_cancer(party_id=2, train=True)

# Hyperparameters
W = jnp.zeros((30,))
b = 0.0
epochs = 10
learning_rate = 1e-2

# Train the model
W, b = fit(W, b, x1, x2, y, epochs=epochs, learning_rate=learning_rate)

# Validate the model
X_test, y_test = breast_cancer(train=False)
auc = validate_model(W, b, X_test, y_test)
print(f'auc={auc}')
No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
auc=0.9878807730101539
Remember this AUC, since we want to achieve something similar with SPU!
Train a model with SPU#
In this part, we will show you how to do the same training securely with MPC!
Init the environment#
We are going to initialize three virtual devices in our physical environment:
- alice, bob: two PYU devices for local plaintext computation.
- spu: an SPU device consisting of alice and bob, used for MPC secure computation.
[7]:
import secretflow as sf
# In case you have a running secretflow runtime already.
sf.shutdown()
sf.init(['alice', 'bob'], address='local')
alice, bob = sf.PYU('alice'), sf.PYU('bob')
spu = sf.SPU(sf.utils.testing.cluster_def(['alice', 'bob']))
INFO:root:Run secretflow in simulation mode.
2023-03-14 12:59:36,141 INFO worker.py:1538 -- Started a local Ray instance.
Load the dataset#
We instruct alice and bob to load their respective parts of the train subset.
[8]:
x1, _ = alice(breast_cancer)(party_id=1)
x2, y = bob(breast_cancer)(party_id=2)
x1, x2, y
[8]:
(<secretflow.device.device.pyu.PYUObject at 0x7f74b6586a30>,
<secretflow.device.device.pyu.PYUObject at 0x7f74b58fd6d0>,
<secretflow.device.device.pyu.PYUObject at 0x7f74b58fd8e0>)
Before training, we need to pass the hyperparameters and all the data to the SPU device. SecretFlow provides two methods:
- secretflow.to: transfers a PythonObject or DeviceObject to a specific device.
- DeviceObject.to: transfers a DeviceObject to a specific device.
[9]:
device = spu

W = jnp.zeros((30,))
b = 0.0

W_, b_, x1_, x2_, y_ = (
    sf.to(alice, W).to(device),
    sf.to(alice, b).to(device),
    x1.to(device),
    x2.to(device),
    y.to(device),
)
(_run pid=1517660) INFO:jax._src.lib.xla_bridge:Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker:
(_run pid=1517660) INFO:jax._src.lib.xla_bridge:Unable to initialize backend 'cuda': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'
(_run pid=1517660) INFO:jax._src.lib.xla_bridge:Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'
(_run pid=1517660) INFO:jax._src.lib.xla_bridge:Unable to initialize backend 'tpu': INVALID_ARGUMENT: TpuPlatform is not available.
(_run pid=1517660) INFO:jax._src.lib.xla_bridge:Unable to initialize backend 'plugin': xla_extension has no attributes named get_plugin_device_client. Compile TensorFlow with //tensorflow/compiler/xla/python:enable_plugin_device set to true (defaults to false) to enable this.
(_run pid=1517660) WARNING:jax._src.lib.xla_bridge:No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
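At this point all inputs live on the SPU device as secret shares. As a quick check (an extra snippet, not in the original lab), the transferred handles are SPUObjects rather than plaintext arrays:
# Sketch: the transferred values are opaque SPUObject handles, not plaintext.
print(type(W_), type(x1_))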
Train the model#
Now we are ready to train an LR model with SPU. After training, the trained model is still secret, held as SPUObjects.
[10]:
W_, b_ = device(
    fit,
    static_argnames=['epochs'],
    num_returns_policy=sf.device.SPUCompilerNumReturnsPolicy.FROM_USER,
    user_specified_num_returns=2,
)(W_, b_, x1_, x2_, y_, epochs=10, learning_rate=1e-2)

W_, b_
[10]:
(<secretflow.device.device.spu.SPUObject at 0x7f7608ad88e0>,
<secretflow.device.device.spu.SPUObject at 0x7f7608ad83a0>)
(SPURuntime pid=1525949) 2023-03-14 12:59:46.548 [info] [thread_pool.cc:ThreadPool:30] Create a fixed thread pool with size 127
(SPURuntime pid=1525959) 2023-03-14 12:59:46.548 [info] [thread_pool.cc:ThreadPool:30] Create a fixed thread pool with size 127
Reveal the results#
To check the trained model, we need to convert the SPUObject (secret) into a Python object (plaintext). SecretFlow provides sf.reveal to convert any DeviceObject into a Python object.
Be cautious when using sf.reveal, since it may lead to secret leakage.
Finally, let's validate the model with AUC.
[11]:
auc = validate_model(sf.reveal(W_), sf.reveal(b_), X_test, y_test)
print(f'auc={auc}')
auc=0.987880773010154
You may find that the model from the SPU training program achieves the same AUC as the JAX program.
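If you would like to compare the two runs numerically, the sketch below (an addition; it reloads the plaintext data because x1, x2 and y were re-bound to PYU objects earlier) retrains the plaintext model and prints both AUCs, which should be nearly identical up to SPU's fixed-point error:
# Sketch: retrain in plaintext and compare with the revealed SPU model.
x1_p, _ = breast_cancer(party_id=1)
x2_p, y_p = breast_cancer(party_id=2)
W_jax, b_jax = fit(jnp.zeros((30,)), 0.0, x1_p, x2_p, y_p, epochs=10, learning_rate=1e-2)
print(f'jax auc={validate_model(W_jax, b_jax, X_test, y_test)}')
print(f'spu auc={validate_model(sf.reveal(W_), sf.reveal(b_), X_test, y_test)}')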
This is the end of the lab.