secretflow.device.device#
secretflow.device.device.base#
Classes:
Functions:
Register to (device object conversion) as a device kernel.
- class secretflow.device.device.base.Device(device_type: DeviceType)[source]#
Bases: ABC
Methods:
__init__(device_type): Abstract device base class.
Attributes:
device_type: Get the underlying device type.
- __init__(device_type: DeviceType)[source]#
Abstract device base class.
- Parameters:
device_type (DeviceType) – underlying device type
- property device_type#
Get the underlying device type.
- class secretflow.device.device.base.DeviceObject(device: Device)[source]#
Bases: ABC
Methods:
__init__(device): Abstract device object base class.
to(device, *args, **kwargs): Device object conversion.
Attributes:
device_type: Get the underlying device type.
- __init__(device: Device)[source]#
Abstract device object base class.
- Parameters:
device (Device) – Device where this object is located.
- property device_type#
Get the underlying device type.
secretflow.device.device.heu#
Classes:
Homomorphic encryption device
- class secretflow.device.device.heu.HEUMoveConfig(heu_dest_party: str = 'auto', heu_encoder: Union[heu.phe.IntegerEncoder, heu.phe.FloatEncoder, heu.phe.BigintEncoder, heu.phe.IntegerEncoderParams, heu.phe.FloatEncoderParams, heu.phe.BigintEncoderParams, heu.phe.BatchFloatEncoderParams, heu.phe.BatchIntegerEncoderParams] = None, heu_audit_log: str = None)[source]#
Bases: object
Attributes:
heu_dest_party: Where the encrypted data is located.
heu_encoder: Encoder applied before the data is moved to the HEU.
heu_audit_log: File path used to record the audit log.
Methods:
__init__([heu_dest_party, heu_encoder, ...])
- heu_dest_party: str = 'auto'#
Where the encrypted data is located.
- heu_encoder: Union[IntegerEncoder, FloatEncoder, BigintEncoder, IntegerEncoderParams, FloatEncoderParams, BigintEncoderParams, BatchFloatEncoderParams, BatchIntegerEncoderParams] = None#
Encoder applied before the data is moved to the HEU.
- heu_audit_log: str = None#
File path used to record the audit log.
- __init__(heu_dest_party: str = 'auto', heu_encoder: Optional[Union[IntegerEncoder, FloatEncoder, BigintEncoder, IntegerEncoderParams, FloatEncoderParams, BigintEncoderParams, BatchFloatEncoderParams, BatchIntegerEncoderParams]] = None, heu_audit_log: Optional[str] = None) → None#
- class secretflow.device.device.heu.HEUActor(heu_id, party: str, hekit: Union[HeKit, DestinationHeKit], cleartext_type: dtype, encoder)[source]#
Bases: object
Methods:
__init__(heu_id, party, hekit, ...): Init HEU actor class.
getitem(data, item): Delegate of hnp ndarray.__getitem__().
setitem(data, key, value): Delegate of hnp ndarray.__setitem__().
sum(data): Sum of data elements.
select_sum(data, item): Sum of data on selected elements.
batch_select_sum(data, item): Sum of data on selected elements.
feature_wise_bucket_sum(data, subgroup_map, ...): Sum of data on selected elements.
batch_feature_wise_bucket_sum(data, ...[, ...]): Sum of data on selected elements.
encode(data[, edr]): Encode cleartext to plaintext.
decode(data[, edr]): Decode plaintext to cleartext.
encrypt(data[, heu_audit_log]): Encrypt data.
do_binary_op(fn_name, data1, data2): Perform a math operation. fn_name names an hnp.Evaluator function, such as hnp.Evaluator.add or hnp.Evaluator.sub.
- __init__(heu_id, party: str, hekit: Union[HeKit, DestinationHeKit], cleartext_type: dtype, encoder)[source]#
Init HEU actor class.
- Parameters:
heu_id – HEU instance id, globally unique.
party – The party id.
hekit – hnp.HeKit for sk_keeper or hnp.DestinationHeKit for evaluator.
encoder – Encodes cleartext (float value) to plaintext (big int value). Available encoders:
phe.IntegerEncoder
phe.FloatEncoder
phe.BigintEncoder
phe.BatchIntegerEncoder
phe.BatchFloatEncoder
- feature_wise_bucket_sum(data, subgroup_map, order_map, bucket_num, cumsum=False)[source]#
Sum of data on selected elements.
- batch_feature_wise_bucket_sum(data, subgroup_map, order_map, bucket_num, cumsum=False)[source]#
Sum of data on selected elements.
- encode(data: ndarray, edr=None)[source]#
Encode cleartext to plaintext.
- Parameters:
data – cleartext
edr – encoder
- decode(data: PlaintextArray, edr=None)[source]#
Decode plaintext to cleartext.
- Parameters:
data – plaintext
edr – encoder
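The cleartext/plaintext distinction used by encode and decode can be pictured with a fixed-point scheme. The following is a minimal sketch in plain Python that does not call HEU. It assumes the relation `Plaintext = Cleartext * scale` that the HEU config comments give for FloatEncoder; the class name `ScaleEncoder` and the scale value are hypothetical choices made for illustration.

```python
from typing import List, Sequence


class ScaleEncoder:
    """Hypothetical stand-in for a scale-based encoder (not heu.phe.FloatEncoder)."""

    def __init__(self, scale: int = 10**6) -> None:
        if scale <= 0:
            raise ValueError("scale must be positive")
        self.scale = scale

    def encode(self, cleartext: Sequence[float]) -> List[int]:
        # Cleartext (float) -> plaintext (integer): multiply by scale and round.
        return [round(x * self.scale) for x in cleartext]

    def decode(self, plaintext: Sequence[int]) -> List[float]:
        # Plaintext (integer) -> cleartext (float): divide by the same scale.
        return [p / self.scale for p in plaintext]


encoder = ScaleEncoder(scale=1000)
plain = encoder.encode([1.5, -0.25, 3.0])
clear = encoder.decode(plain)
```

Decoding only recovers the input when the same scale is used on both sides, which is one way to read the requirement that the encoder be passed along with the data.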
- encrypt(data: PlaintextArray, heu_audit_log: Optional[str] = None) → CiphertextArray[source]#
Encrypt data.
If the data has already been encoded, it is encrypted directly, so there is no risk of encoding it twice.
Even if the data has already been encrypted, you still need to pass in the encoder parameter, because decryption uses it.
- Parameters:
data – The data to be encrypted.
heu_audit_log – File path to log audit info.
- Returns:
The encrypted ndarray data.
- class secretflow.device.device.heu.HEUSkKeeper(heu_id, config, cleartext_type: dtype, encoder)[source]#
Bases: HEUActor
Methods:
__init__(heu_id, config, cleartext_type, encoder): Init HEU actor class.
dump_pk(path): Dump public key to the specified file.
decrypt(data): Decrypt data: ciphertext -> plaintext.
decrypt_and_decode(data[, edr]): Decrypt data: ciphertext -> cleartext.
h2a_decrypt_make_share(data_with_mask, ...): H2A: Decrypt the masked data array.
- __init__(heu_id, config, cleartext_type: dtype, encoder)[source]#
Init HEU actor class.
- Parameters:
heu_id – HEU instance id, globally unique.
party – The party id.
hekit – hnp.HeKit for sk_keeper or hnp.DestinationHeKit for evaluator.
encoder – Encodes cleartext (float value) to plaintext (big int value). Available encoders:
phe.IntegerEncoder
phe.FloatEncoder
phe.BigintEncoder
phe.BatchIntegerEncoder
phe.BatchFloatEncoder
- decrypt_and_decode(data: CiphertextArray, edr=None)[source]#
Decrypt data: ciphertext -> cleartext.
- Parameters:
data – ciphertext
edr – encoder
H2A: Decrypt the masked data array
- class secretflow.device.device.heu.HEUEvaluator(heu_id, party: str, config, pk, cleartext_type: dtype, encoder)[source]#
Bases: HEUActor
Methods:
__init__(heu_id, party, config, pk, ...): Init HEU actor class.
dump(data, path): Dump data to file.
dump_pk(path): Dump public key to the specified file.
a2h_sum_shards(*shards): A2H: get sum of arithmetic shares.
h2a_make_share(data, evaluator_parties, ...): H2A: make share of data, runs on the side (party) where the data resides.
- __init__(heu_id, party: str, config, pk, cleartext_type: dtype, encoder)[source]#
Init HEU actor class.
- Parameters:
heu_id – HEU instance id, globally unique.
party – The party id.
hekit – hnp.HeKit for sk_keeper or hnp.DestinationHeKit for evaluator.
encoder – Encodes cleartext (float value) to plaintext (big int value). Available encoders:
phe.IntegerEncoder
phe.FloatEncoder
phe.BigintEncoder
phe.BatchIntegerEncoder
phe.BatchFloatEncoder
H2A: make share of data, runs on the side (party) where the data resides
- Parameters:
data – HeCiphertext array
evaluator_parties –
spu_protocol – part of the SPU runtime config.
spu_field_type – part of the SPU runtime config.
spu_fxp_fraction_bits – part of the SPU runtime config.
- Returns:
A dynamic number of return values, equal to len(evaluator_parties) + 2: spu_meta_info, sk_keeper's shard, and each evaluator's shard.
- class secretflow.device.device.heu.HEU(config: dict, spu_field_type)[source]#
Bases: Device
Homomorphic encryption device
Methods:
__init__(config, spu_field_type): Initialize HEU.
init()
get_participant(party): Get ray actor by name.
has_party(party)
- __init__(config: dict, spu_field_type)[source]#
Initialize HEU.
- Parameters:
config –
HEU init config, for example:
    {
        'sk_keeper': {
            'party': 'alice'
        },
        'evaluators': [{
            'party': 'bob'
        }],
        # The HEU working mode, choose from PHEU / LHEU / FHEU_ROUGH / FHEU
        'mode': 'PHEU',
        # TODO: cleartext_type should be migrated to HeObject.
        'encoding': {
            # DT_I1, DT_I8, DT_I16, DT_I32, DT_I64 or DT_FXP (default)
            'cleartext_type': "DT_FXP",
            # see https://www.secretflow.org.cn/docs/heu/en/getting_started/quick_start.html#id3 for detail
            # available encoders:
            #     - IntegerEncoder: Plaintext = Cleartext * scale
            #     - FloatEncoder (default): Plaintext = Cleartext * scale
            #     - BigintEncoder: Plaintext = Cleartext
            #     - BatchIntegerEncoder: Plaintext = Pack[Cleartext, Cleartext]
            #     - BatchFloatEncoder: Plaintext = Pack[Cleartext, Cleartext]
            'encoder': 'FloatEncoder'
        },
        'he_parameters': {
            'schema': 'paillier',
            'key_pair': {
                'generate': {
                    'bit_size': 2048,
                },
            }
        }
    }
spu_field_type – Field type in SPU. The Device.to operation requires the data scale of HEU to be aligned with SPU.
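Because the config is a nested dict, a small pre-flight check can catch missing entries before the device is built. The following is a rough sketch only: `check_heu_config` is a hypothetical helper, and treating the listed top-level keys as required is an assumption drawn from the example config rather than a rule stated by the library.

```python
from typing import Any, Dict, List

# Key names are taken from the example config; treating exactly these as
# required is an assumption of this sketch, not a documented rule.
REQUIRED_TOP_LEVEL_KEYS = ("sk_keeper", "evaluators", "mode", "he_parameters")


def check_heu_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in an HEU-style config (empty if none)."""
    problems = [f"missing key: {k}" for k in REQUIRED_TOP_LEVEL_KEYS if k not in config]
    if "sk_keeper" in config and "party" not in config["sk_keeper"]:
        problems.append("sk_keeper has no party")
    for i, evaluator in enumerate(config.get("evaluators", [])):
        if "party" not in evaluator:
            problems.append(f"evaluators[{i}] has no party")
    return problems


heu_config = {
    "sk_keeper": {"party": "alice"},
    "evaluators": [{"party": "bob"}],
    "mode": "PHEU",
    "encoding": {"cleartext_type": "DT_FXP", "encoder": "FloatEncoder"},
    "he_parameters": {
        "schema": "paillier",
        "key_pair": {"generate": {"bit_size": 2048}},
    },
}

problems = check_heu_config(heu_config)
```

An empty `problems` list means the dict has the same overall shape as the documented example; it says nothing about whether the values themselves are accepted by HEU.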
secretflow.device.device.heu_object#
Classes:
HEU Object
- class secretflow.device.device.heu_object.HEUObject(device, data: ObjectRef, location_party: str, is_plain: bool = False)[source]#
Bases: DeviceObject
HEU Object
- data#
The data held by this HEU object.
- location#
The party where the data actually resides.
- is_plain#
Whether the data is plaintext (True) or encrypted (False).
Methods:
__init__(device, data, location_party[, ...]): Abstract device object base class.
encrypt([heu_audit_log]): Force encrypt if data is plaintext.
sum(): Sum of HeObject elements over a given axis.
dump(path): Dump ciphertext into files.
select_sum(item): Sum of HEUObject selected elements.
batch_select_sum(item): Sum of HEUObject selected elements.
feature_wise_bucket_sum(subgroup_map, ...[, ...]): Sum of HEUObject selected elements.
batch_feature_wise_bucket_sum(subgroup_map, ...): Sum of HEUObject selected elements.
- __init__(device, data: ObjectRef, location_party: str, is_plain: bool = False)[source]#
Abstract device object base class.
- Parameters:
device (Device) – Device where this object is located.
secretflow.device.device.pyu#
Classes:
PYU device object.
PYU is the device doing computation in a single domain.
- class secretflow.device.device.pyu.PYUObject(device: PYU, data: Union[ObjectRef, FedObject])[source]#
Bases: DeviceObject
PYU device object.
- data#
Reference to underlying data.
Methods:
__init__(device, data): Abstract device object base class.
- class secretflow.device.device.pyu.PYU(party: str)[source]#
Bases: Device
PYU is the device doing computation in a single domain.
Essentially, PYU is a Python worker that can execute any Python code.
Methods:
__init__(party): PYU constructor.
dump(obj, path)
load(path)
secretflow.device.device.register#
Classes:
An enumeration.
Device kernel registry
Functions:
Register device kernel
Dispatch device kernel.
- class secretflow.device.device.register.DeviceType(value)[source]#
Bases: IntEnum
An enumeration.
Attributes:
- PYU = 0#
- SPU = 1#
- TEEU = 2#
- HEU = 3#
- NUM = 4#
- class secretflow.device.device.register.Registrar[source]#
Bases: object
Device kernel registry
Methods:
__init__()
register(device_type, name, op): Register device kernel.
dispatch(device_type, name, *args, **kwargs): Dispatch device kernel.
- register(device_type: DeviceType, name: str, op: Callable)[source]#
Register device kernel.
- Parameters:
device_type (DeviceType) – Device type.
name (str) – Op kernel name.
op (Callable) – Op kernel implementation.
- Raises:
KeyError – Duplicate device kernel registered.
- dispatch(device_type: DeviceType, name: str, *args, **kwargs)[source]#
Dispatch device kernel.
- Parameters:
device_type (DeviceType) – Device type.
name (str) – Op kernel name.
- Raises:
KeyError – Device kernel not registered.
- Returns:
Kernel execution result.
- secretflow.device.device.register.register(device_type: DeviceType, op_name: Optional[str] = None)[source]#
Register device kernel.
- Parameters:
device_type (DeviceType) – Device type.
op_name (str, optional) – Op kernel name. Defaults to None.
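The register/dispatch pattern described above can be sketched as a table keyed by device type and kernel name. This is an illustrative stand-in written in plain Python, not the library's Registrar: `KernelRegistry` and the module-level `registry` are hypothetical names, and letting the kernel name fall back to the function name when `op_name` is None is an assumption of this sketch.

```python
from enum import IntEnum
from typing import Any, Callable, Dict, Optional, Tuple


class DeviceType(IntEnum):
    # Values mirror the enumeration documented above.
    PYU = 0
    SPU = 1
    TEEU = 2
    HEU = 3


class KernelRegistry:
    """Hypothetical stand-in for a (device_type, name) -> kernel table."""

    def __init__(self) -> None:
        self._ops: Dict[Tuple[DeviceType, str], Callable[..., Any]] = {}

    def register(self, device_type: DeviceType, name: str, op: Callable[..., Any]) -> None:
        key = (device_type, name)
        if key in self._ops:
            raise KeyError(f"duplicate kernel: {name} on {device_type.name}")
        self._ops[key] = op

    def dispatch(self, device_type: DeviceType, name: str, *args: Any, **kwargs: Any) -> Any:
        key = (device_type, name)
        if key not in self._ops:
            raise KeyError(f"kernel not registered: {name} on {device_type.name}")
        return self._ops[key](*args, **kwargs)


registry = KernelRegistry()


def register(device_type: DeviceType, op_name: Optional[str] = None):
    """Decorator form; falling back to the function name is an assumption."""
    def wrapper(op: Callable[..., Any]) -> Callable[..., Any]:
        registry.register(device_type, op_name or op.__name__, op)
        return op
    return wrapper


@register(DeviceType.PYU)
def add(a: int, b: int) -> int:
    return a + b


result = registry.dispatch(DeviceType.PYU, "add", 2, 3)
```

Keying on the pair (device type, name) lets the same kernel name exist once per device type, while a second registration for the same pair fails with KeyError, matching the documented behavior.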
secretflow.device.device.spu#
Classes:
The metadata of an SPU value, which is a NumPy array or equivalent.
Tell SPU device how to decide num of returns of called function.
- class secretflow.device.device.spu.SPUValueMeta(shape: Sequence[int], dtype: numpy.dtype, vtype: <google.protobuf.internal.enum_type_wrapper.EnumTypeWrapper object>, protocol: <google.protobuf.internal.enum_type_wrapper.EnumTypeWrapper object>, field: <google.protobuf.internal.enum_type_wrapper.EnumTypeWrapper object>, fxp_fraction_bits: int)[source]#
Bases: object
The metadata of an SPU value, which is a NumPy array or equivalent.
Attributes:
Methods:
__init__(shape, dtype, vtype, protocol, ...)
- shape: Sequence[int]#
- dtype: dtype#
- vtype: <google.protobuf.internal.enum_type_wrapper.EnumTypeWrapper object>#
- protocol: <google.protobuf.internal.enum_type_wrapper.EnumTypeWrapper object>#
- field: <google.protobuf.internal.enum_type_wrapper.EnumTypeWrapper object>#
- fxp_fraction_bits: int#
- __init__(shape: Sequence[int], dtype: numpy.dtype, vtype: <google.protobuf.internal.enum_type_wrapper.EnumTypeWrapper object>, protocol: <google.protobuf.internal.enum_type_wrapper.EnumTypeWrapper object>, field: <google.protobuf.internal.enum_type_wrapper.EnumTypeWrapper object>, fxp_fraction_bits: int) → None#
- class secretflow.device.device.spu.SPUObject(device: Device, meta: Union[ObjectRef, FedObject], shares_name: Sequence[Union[ObjectRef, FedObject]])[source]#
Bases: DeviceObject
Methods:
__init__(device, meta, shares_name): SPUObject refers to a Python object which could be flattened to a list of SPU values.
- __init__(device: Device, meta: Union[ObjectRef, FedObject], shares_name: Sequence[Union[ObjectRef, FedObject]])[source]#
SPUObject refers to a Python object which could be flattened to a list of SPU values. An SPU value is a NumPy array or equivalent. For example:
1. If the referred Python object is [1,2,3], then meta refers to a single SPUValueMeta, and shares is a list of references to the pieces of the share of [1,2,3].
2. If the referred Python object is {'a': 1, 'b': [3, np.array(…)]}, then meta refers to something like {'a': SPUValueMeta1, 'b': [SPUValueMeta2, SPUValueMeta3]}, and each element of shares refers to something like {'a': share1, 'b': [share2, share3]}.
3. shares is a list of ObjectRef to share slices, and these share slices are not necessarily located at the SPU device. Data transfer only happens when the SPU device consumes SPU objects.
- Parameters:
meta – Union[ray.ObjectRef, fed.FedObject]: Ref to the metadata.
shares_name – Sequence[Union[ray.ObjectRef, fed.FedObject]]: names of shares of data in each SPU node.
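The idea of a nested Python object whose meta and shares keep the same shape can be illustrated by separating leaves from structure. The sketch below is plain Python and is not how the library flattens objects; `flatten` and `unflatten` are hypothetical helpers, and every non-container value is treated as one leaf (so a NumPy array would count as a single leaf, as in the second example of the description).

```python
from typing import Any, List, Tuple


def flatten(obj: Any) -> Tuple[List[Any], Any]:
    """Split a nested list/dict into (leaves, structure).

    The structure mirrors the input but holds leaf indices instead of values.
    """
    leaves: List[Any] = []

    def walk(node: Any) -> Any:
        if isinstance(node, dict):
            return {key: walk(value) for key, value in node.items()}
        if isinstance(node, (list, tuple)):
            return [walk(item) for item in node]
        leaves.append(node)
        return len(leaves) - 1

    structure = walk(obj)
    return leaves, structure


def unflatten(leaves: List[Any], structure: Any) -> Any:
    """Rebuild the nested object from leaves and the index structure."""
    if isinstance(structure, dict):
        return {key: unflatten(leaves, value) for key, value in structure.items()}
    if isinstance(structure, list):
        return [unflatten(leaves, item) for item in structure]
    return leaves[structure]


leaves, structure = flatten({"a": 1, "b": [3, 4.5]})
rebuilt = unflatten(leaves, structure)
```

Replacing each leaf with its metadata, or with one party's share, gives a tree of the same shape, which is the relationship between the object, its meta, and each element of shares described above.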
- class secretflow.device.device.spu.SPUIO(runtime_config: RuntimeConfig, world_size: int)[source]#
Bases: object
Methods:
__init__(runtime_config, world_size): A wrapper of spu.Io.
make_shares(data, vtype): Convert a Python object to meta and shares of an SPUObject.
reconstruct(shares[, meta]): Convert shares of an SPUObject to the original Python object.
- __init__(runtime_config: RuntimeConfig, world_size: int) → None[source]#
A wrapper of spu.Io.
- Parameters:
runtime_config (RuntimeConfig) – runtime_config of SPU device.
world_size (int) – world_size of SPU device.
Convert a Python object to meta and shares of an SPUObject.
- Parameters:
data (Any) – Any Python object.
vtype (Visibility) – Visibility
- Returns:
meta and shares of an SPUObject
- Return type:
Tuple[Any, List[Any]]
- reconstruct(shares: List[Any], meta: Optional[Any] = None) → Any[source]#
Convert shares of an SPUObject to the original Python object.
- Parameters:
shares (List[Any]) – Shares of an SPUObject
meta (Any) – Meta of an SPUObject. If not provided, the sanity check is skipped.
- Returns:
the original Python object.
- Return type:
Any
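To give a feel for what splitting a value into shares and recombining them means, here is a minimal additive secret sharing sketch over integers. It is an illustration of one common scheme and makes no claim about what spu.Io does internally; the function names reuse make_shares and reconstruct only for readability, and the ring size and signed decoding are assumptions of this sketch.

```python
import secrets
from typing import List

MODULUS = 2**64  # ring size chosen for illustration only


def make_shares(value: int, world_size: int) -> List[int]:
    """Split an integer into world_size additive shares modulo MODULUS."""
    if world_size < 1:
        raise ValueError("world_size must be at least 1")
    # All shares but the last are uniformly random.
    shares = [secrets.randbelow(MODULUS) for _ in range(world_size - 1)]
    # The last share is chosen so that all shares sum to value modulo MODULUS.
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares: List[int]) -> int:
    """Recombine additive shares; sums at or above MODULUS // 2 map to negatives."""
    total = sum(shares) % MODULUS
    return total - MODULUS if total >= MODULUS // 2 else total


shares = make_shares(-42, world_size=3)
recovered = reconstruct(shares)
```

Any proper subset of the shares is uniformly random on its own, so the value only becomes visible once all world_size pieces are brought together.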
- class secretflow.device.device.spu.SPUCompilerNumReturnsPolicy(value)[source]#
Bases: Enum
Tells the SPU device how to decide the number of returns of the called function.
Attributes:
FROM_COMPILER: The number of returns comes from the compiler result.
FROM_USER: If users are sure that the return value is a list, they can specify the length of the list.
SINGLE: The number of returns is fixed to 1.
- FROM_COMPILER = 'from_compiler'#
The number of returns comes from the compiler result.
- FROM_USER = 'from_user'#
If users are sure that the return value is a list, they can specify the length of the list.
- SINGLE = 'single'#
The number of returns is fixed to 1.
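The three policies amount to a choice of where the return count comes from. A rough sketch of that decision follows; `NumReturnsPolicy` and `resolve_num_returns` are hypothetical names that only mirror the enum values above, and the error handling is an assumption of this sketch.

```python
from enum import Enum
from typing import Optional


class NumReturnsPolicy(Enum):
    # Values mirror the enumeration documented above.
    FROM_COMPILER = "from_compiler"
    FROM_USER = "from_user"
    SINGLE = "single"


def resolve_num_returns(
    policy: NumReturnsPolicy,
    compiler_count: Optional[int] = None,
    user_count: Optional[int] = None,
) -> int:
    """Pick the number of return values according to the policy."""
    if policy is NumReturnsPolicy.SINGLE:
        return 1
    if policy is NumReturnsPolicy.FROM_USER:
        if user_count is None:
            raise ValueError("FROM_USER requires user_count")
        return user_count
    if compiler_count is None:
        raise ValueError("FROM_COMPILER requires compiler_count")
    return compiler_count
```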
- class secretflow.device.device.spu.SPURuntime(rank: int, cluster_def: Dict, link_desc: Optional[Dict] = None, log_options: spu.libspu.logging.LogOptions = <spu.libspu.logging.LogOptions object>, use_link: bool = True)[source]#
Bases: object
Methods:
__init__(rank, cluster_def[, link_desc, ...]): Wrapper of spu.Runtime.
infeed_share(val)
outfeed_share(val)
del_share(val)
dump(meta, val, path)
load(path)
run(num_returns_policy, out_shape, ...): Run executable.
a2h(value, exp_heu_data_type, schema): Convert SPUObject to HEUObject.
psi_df(key, data, receiver[, protocol, ...]): Private set intersection with DataFrame.
psi_csv(key, input_path, output_path, receiver): Private set intersection with csv file.
psi_join_df(key, data, receiver, join_party): Private set intersection with DataFrame.
psi_join_csv(key, input_path, output_path, ...): Private set intersection with csv file.
pir_setup(server, input_path, key_columns, ...): Private information retrieval offline setup phase.
pir_query(server, config[, protocol]): Private information retrieval online query phase.
- __init__(rank: int, cluster_def: Dict, link_desc: Optional[Dict] = None, log_options: spu.libspu.logging.LogOptions = <spu.libspu.logging.LogOptions object>, use_link: bool = True)[source]#
Wrapper of spu.Runtime.
- Parameters:
rank (int) – rank of runtime
cluster_def (Dict) – config of spu cluster
link_desc (Dict, optional) – link config. Defaults to None.
log_options (spu_logging.LogOptions, optional) – spu log options.
use_link – Optional. Flag for creating the brpc link. Defaults to True.
- run(num_returns_policy: SPUCompilerNumReturnsPolicy, out_shape, executable: ExecutableProto, *val)[source]#
Run executable.
- Parameters:
executable (spu_pb2.ExecutableProto) – the executable.
*inputs – input vars, need to follow the exec.input_names.
- Returns:
The first parts are output vars following the exec.output_names. The last item is metadata.
- Return type:
List
- a2h(value, exp_heu_data_type: str, schema)[source]#
Convert SPUObject to HEUObject.
- Parameters:
tree (PyTreeLeaf) – SPUObject meta info.
exp_heu_data_type (str) – HEU data type.
- Returns:
Array of phe.Plaintext.
- Return type:
np.ndarray
- psi_df(key: Union[str, List[str]], data: DataFrame, receiver: str, protocol='KKRT_PSI_2PC', precheck_input=True, sort=True, broadcast_result=True, bucket_size=1048576, curve_type='CURVE_25519', preprocess_path=None, ecdh_secret_key_path=None, dppsi_bob_sub_sampling=0.9, dppsi_epsilon=3, ic_mode: bool = False)[source]#
Private set intersection with DataFrame.
- Parameters:
key (str, List[str]) – Column(s) used to join.
data (pd.DataFrame) – DataFrame to be joined.
receiver (str) – Which party gets the joined data; others get None.
protocol (str) – PSI protocol; see spu.psi.PsiType.
precheck_input (bool) – Whether to check input data before the join.
sort (bool) – Whether to sort data by key after the join.
broadcast_result (bool) – Whether to broadcast joined data to all parties.
bucket_size (int) – Specifies the hash bucket size used in PSI. Larger values consume more memory.
curve_type (str) – Curve for ECDH PSI.
preprocess_path (str) – Preprocess file path for unbalanced PSI.
ecdh_secret_key_path (str) – ecdh_oprf secret key file path, binary format, 32B.
dppsi_bob_sub_sampling (double) – Bernoulli distribution probability used for Bob's subsampling in DP PSI.
dppsi_epsilon (double) – Epsilon of DP PSI.
ic_mode (bool) – Whether to run PSI in interconnection mode.
- Returns:
Joined DataFrame.
- Return type:
pd.DataFrame or None
- psi_csv(key: Union[str, List[str]], input_path: str, output_path: str, receiver: str, protocol='KKRT_PSI_2PC', precheck_input=True, sort=True, broadcast_result=True, bucket_size=1048576, curve_type='CURVE_25519', preprocess_path=None, ecdh_secret_key_path=None, dppsi_bob_sub_sampling=0.9, dppsi_epsilon=3, ic_mode: bool = False)[source]#
Private set intersection with csv file.
Example
>>> spu = sf.SPU(utils.cluster_def)
>>> alice, bob = sf.PYU('alice'), sf.PYU('bob')
>>> input_path = {alice: '/path/to/alice.csv', bob: '/path/to/bob.csv'}
>>> output_path = {alice: '/path/to/alice_psi.csv', bob: '/path/to/bob_psi.csv'}
>>> spu.psi_csv(['c1', 'c2'], input_path, output_path, 'alice')
- Parameters:
key (str, List[str]) – Column(s) used to join.
input_path – CSV file to be joined, comma separated and containing a header. Use an absolute path.
output_path – Joined csv file, comma separated and containing a header. Use an absolute path.
receiver (str) – Which party gets the joined data. Other parties do not generate an output file, and their intersection_count is -1. For unbalanced PSI, the receiver is the client (the small dataset party):
unbalanced PSI offline phase: the receiver (client) gets the preprocess_path data.
unbalanced PSI online phase: the receiver (client) gets the PSI result.
unbalanced PSI shuffle online phase: only the receiver (large set party) gets the PSI result.
protocol (str) – PSI protocol.
precheck_input (bool) – Whether to check input data before the join. Checks the input file for duplicated data and checks the csv column ids.
sort (bool) – Whether to sort data by key after the join.
broadcast_result (bool) – Whether to broadcast joined data to all parties.
bucket_size (int) – Specifies the hash bucket size used in PSI. Larger values consume more memory.
curve_type (str) – Curve for ECDH PSI.
dppsi_bob_sub_sampling (double) – Bernoulli distribution probability used for Bob's subsampling in DP PSI.
dppsi_epsilon (double) – Epsilon of DP PSI.
ic_mode (bool) – Whether to run PSI in interconnection mode.
- Returns:
PSI report output by SPU.
- Return type:
Dict
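What the joined output looks like can be shown without any cryptography. The sketch below computes the same kind of result in the clear, which is exactly what PSI avoids doing, so it is useful only for understanding the `key` and `sort` parameters. `read_rows` and `intersect_on_key` are hypothetical helpers and the CSV contents are made up for illustration.

```python
import csv
import io
from typing import Dict, List, Sequence, Tuple


def read_rows(csv_text: str) -> List[Dict[str, str]]:
    """Parse comma-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def intersect_on_key(
    left: List[Dict[str, str]],
    right: List[Dict[str, str]],
    key: Sequence[str],
    sort: bool = True,
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Keep only rows whose key tuple appears on both sides (NOT private)."""

    def key_of(row: Dict[str, str]) -> Tuple[str, ...]:
        return tuple(row[column] for column in key)

    common = {key_of(row) for row in left} & {key_of(row) for row in right}
    left_out = [row for row in left if key_of(row) in common]
    right_out = [row for row in right if key_of(row) in common]
    if sort:
        # Sorting both sides by key aligns the rows across parties.
        left_out.sort(key=key_of)
        right_out.sort(key=key_of)
    return left_out, right_out


alice_rows = read_rows("c1,c2,age\nk3,x,30\nk1,x,41\nk2,y,25\n")
bob_rows = read_rows("c1,c2,score\nk1,x,0.7\nk3,x,0.1\nk9,z,0.5\n")
alice_out, bob_out = intersect_on_key(alice_rows, bob_rows, key=["c1", "c2"])
```

Each side keeps its own non-key columns; only the set of matching keys is shared between the two outputs.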
- psi_join_df(key: Union[str, List[str]], data: DataFrame, receiver: str, join_party: str, protocol='KKRT_PSI_2PC', precheck_input=True, bucket_size=1048576, curve_type='CURVE_25519', ic_mode: bool = False)[source]#
Private set intersection with DataFrame.
Example
>>> spu = sf.SPU(utils.cluster_def)
>>> spu.psi_join_df(['c1', 'c2'], [df_alice, df_bob], 'alice', 'alice')
- Parameters:
key (str, List[str]) – Column(s) used to join.
data (pd.DataFrame) – DataFrame to be joined.
receiver (str) – Which party gets the joined data; others get None.
join_party (str) – The party that joins the data.
protocol (str) – PSI protocol; see spu.psi.PsiType.
precheck_input (bool) – Whether to check input data before the join.
bucket_size (int) – Specifies the hash bucket size used in PSI. Larger values consume more memory.
curve_type (str) – Curve for ECDH PSI.
ic_mode (bool) – Whether to run PSI in interconnection mode.
- Returns:
Joined DataFrame.
- Return type:
pd.DataFrame or None
- psi_join_csv(key: Union[str, List[str]], input_path: str, output_path: str, receiver: str, join_party: str, protocol='KKRT_PSI_2PC', precheck_input=True, bucket_size=1048576, curve_type='CURVE_25519', ic_mode: bool = False)[source]#
Private set intersection with csv file.
Example
>>> spu = sf.SPU(utils.cluster_def)
>>> alice, bob = sf.PYU('alice'), sf.PYU('bob')
>>> input_path = {alice: '/path/to/alice.csv', bob: '/path/to/bob.csv'}
>>> output_path = {alice: '/path/to/alice_psi.csv', bob: '/path/to/bob_psi.csv'}
>>> spu.psi_join_csv(['c1', 'c2'], input_path, output_path, 'alice', 'alice')
- Parameters:
key (str, List[str]) – Column(s) used to join.
input_path – CSV file to be joined, comma separated and containing a header. Use an absolute path.
output_path – Joined csv file, comma separated and containing a header. Use an absolute path.
receiver (str) – Which party gets the joined data. Other parties do not generate an output file, and their intersection_count is -1.
join_party (str) – The party that joins the data.
protocol (str) – PSI protocol.
precheck_input (bool) – Whether to check input data before the join.
bucket_size (int) – Specifies the hash bucket size used in PSI. Larger values consume more memory.
curve_type (str) – Curve for ECDH PSI.
ic_mode (bool) – Whether to run PSI in interconnection mode.
- Returns:
PSI report output by SPU.
- Return type:
Dict
- pir_setup(server: str, input_path: str, key_columns: Union[str, List[str]], label_columns: Union[str, List[str]], oprf_key_path: str, setup_path: str, num_per_query: int, label_max_len: int, protocol='KEYWORD_PIR_LABELED_PSI')[source]#
Private information retrieval offline setup phase.
- Parameters:
server (str) – Which party is the PIR server.
input_path (str) – Server's CSV file path, comma separated and containing a header. Use an absolute path.
key_columns (str, List[str]) – Column(s) used as PIR key.
label_columns (str, List[str]) – Column(s) used as PIR label.
oprf_key_path (str) – ECC OPRF secret key path, 32B binary format. Use an absolute path.
setup_path (str) – Offline/setup phase output data dir. Use an absolute path.
num_per_query (int) – Number of items per query.
label_max_len (int) – Maximum number of bytes of a label; data is padded to label_max_len. This is the maximum label byte length plus 4 bytes (len).
- Returns:
PIR report output by SPU.
- Return type:
Dict
- pir_query(server: str, config: Dict, protocol='KEYWORD_PIR_LABELED_PSI')[source]#
Private information retrieval online query phase.
- Parameters:
server (str) – Which party is the PIR server.
config (Dict) – Server/client config dict. For example:
    {
        # client config
        alice: {
            'input_path': '/path/input.csv',
            'key_columns': 'id',
            'output_path': '/path/output.csv',
        },
        # server config
        bob: {
            'oprf_key_path': '/path/oprf_key.bin',
            'setup_path': '/path/setup_dir',
        },
    }
- The server config dict must have 'oprf_key_path' and 'setup_path':
oprf_key_path (str): ECC OPRF secret key path, 32B binary format. Use an absolute path.
setup_path (str): Offline/setup phase output data dir. Use an absolute path.
- The client config dict must have 'input_path', 'key_columns' and 'output_path':
input_path (str): Client's CSV file path, comma separated and containing a header. Use an absolute path.
key_columns (str, List[str]): Column(s) used as PIR key.
output_path (str): The query result is saved to output_path in csv format.
- Returns:
PIR report output by SPU.
- Return type:
Dict
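Since the server and client entries of the config need different keys, a small check before calling pir_query can make mistakes easier to spot. This is a sketch only: `missing_pir_keys` is a hypothetical helper, and it keys the config by party name strings for simplicity, whereas the documented example keys it by device objects.

```python
from typing import Any, Dict, List

# Required keys as listed in the description above.
SERVER_KEYS = ("oprf_key_path", "setup_path")
CLIENT_KEYS = ("input_path", "key_columns", "output_path")


def missing_pir_keys(server: str, config: Dict[str, Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each party to the required keys its config dict lacks."""
    missing: Dict[str, List[str]] = {}
    for party, party_config in config.items():
        required = SERVER_KEYS if party == server else CLIENT_KEYS
        absent = [key for key in required if key not in party_config]
        if absent:
            missing[party] = absent
    return missing


example_config = {
    # client config
    "alice": {
        "input_path": "/path/input.csv",
        "key_columns": "id",
        "output_path": "/path/output.csv",
    },
    # server config, deliberately missing setup_path
    "bob": {"oprf_key_path": "/path/oprf_key.bin"},
}

report = missing_pir_keys("bob", example_config)
```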
- class secretflow.device.device.spu.SPU(cluster_def: Dict, link_desc: Optional[Dict] = None, log_options: spu.libspu.logging.LogOptions = <spu.libspu.logging.LogOptions object>, use_link: bool = True)[source]#
Bases: Device
Methods:
__init__(cluster_def[, link_desc, ...]): SPU device constructor.
init(): Init SPU runtime in each party.
reset(): Reset SPU to clear corrupted internal state, for test only.
shutdown()
dump(obj, paths)
load(paths)
infeed_shares(shares)
outfeed_shares(shares_name)
psi_df(key, dfs, receiver[, protocol, ...]): Private set intersection with DataFrame.
psi_csv(key, input_path, output_path, receiver): Private set intersection with csv file.
psi_join_df(key, dfs, receiver, join_party): Private set intersection with DataFrame.
psi_join_csv(key, input_path, output_path, ...): Private set intersection with csv file.
pir_setup(server, input_path, key_columns, ...): Private information retrieval offline setup.
pir_query(server, config[, protocol]): Private information retrieval online query.
- __init__(cluster_def: Dict, link_desc: Optional[Dict] = None, log_options: spu.libspu.logging.LogOptions = <spu.libspu.logging.LogOptions object>, use_link: bool = True)[source]#
SPU device constructor.
- Parameters:
cluster_def –
SPU cluster definition. For more details, refer to the SPU runtime config.
For example:
    {
        'nodes': [
            {
                'party': 'alice',
                # The address for other peers.
                'address': '127.0.0.1:9001',
                # The listen address of this node.
                # Optional. Address will be used if listen_address is empty.
                'listen_address': '',
                # Optional. TLS related options.
                'tls_opts': {
                    'server_ssl_opts': {
                        'certificate_path': 'servercert.pem',
                        'private_key_path': 'serverkey.pem',
                        # The options used for verify peer's client certificate
                        'ca_file_path': 'cacert.pem',
                        # Maximum depth of the certificate chain for verification
                        'verify_depth': 1
                    },
                    'client_ssl_opts': {
                        'certificate_path': 'clientcert.pem',
                        'private_key_path': 'clientkey.pem',
                        # The options used for verify peer's server certificate
                        'ca_file_path': 'cacert.pem',
                        # Maximum depth of the certificate chain for verification
                        'verify_depth': 1
                    }
                }
            },
            {
                'party': 'bob',
                'address': '127.0.0.1:9002',
                'listen_address': '',
                'tls_opts': {
                    'server_ssl_opts': {
                        'certificate_path': "bob's servercert.pem",
                        'private_key_path': "bob's serverkey.pem",
                        'ca_file_path': "other's client cacert.pem",
                        'verify_depth': 1
                    },
                    'client_ssl_opts': {
                        'certificate_path': "bob's clientcert.pem",
                        'private_key_path': "bob's clientkey.pem",
                        'ca_file_path': "other's server cacert.pem",
                        'verify_depth': 1
                    }
                }
            },
        ],
        'runtime_config': {
            'protocol': spu.spu_pb2.SEMI2K,
            'field': spu.spu_pb2.FM128,
            'sigmoid_mode': spu.spu_pb2.RuntimeConfig.SIGMOID_REAL,
        }
    }
link_desc –
Optional. A dict that specifies the link parameters. Available parameters are:
connect_retry_times
connect_retry_interval_ms
recv_timeout_ms
http_max_payload_size
http_timeout_ms
throttle_window_size
brpc_channel_protocol: refer to https://github.com/apache/brpc/blob/master/docs/en/client.md#protocols
brpc_channel_connection_type: refer to https://github.com/apache/brpc/blob/master/docs/en/client.md#connection-type
log_options – Optional. Options of SPU logging.
use_link – Optional. Flag for creating the brpc link. Defaults to True.
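When every node follows the same pattern, the `nodes` list of a cluster_def can be generated from a party-to-address mapping. The following is a minimal sketch: `make_cluster_def` is a hypothetical helper, TLS options are omitted, and the runtime_config values are placeholder strings where the documented example uses spu.spu_pb2 constants.

```python
from typing import Any, Dict


def make_cluster_def(addresses: Dict[str, str], runtime_config: Dict[str, Any]) -> Dict[str, Any]:
    """Build a cluster_def-shaped dict from a party -> address mapping."""
    nodes = [
        # An empty listen_address means the peer-facing address is used,
        # per the comments in the documented example.
        {"party": party, "address": address, "listen_address": ""}
        for party, address in addresses.items()
    ]
    return {"nodes": nodes, "runtime_config": runtime_config}


cluster_def = make_cluster_def(
    {"alice": "127.0.0.1:9001", "bob": "127.0.0.1:9002"},
    # Placeholder strings only; real configs use spu.spu_pb2 constants here.
    {"protocol": "SEMI2K", "field": "FM128"},
)
```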
- psi_df(key: Union[str, List[str], Dict[Device, List[str]]], dfs: List[PYUObject], receiver: str, protocol='KKRT_PSI_2PC', precheck_input=True, sort=True, broadcast_result=True, bucket_size=1048576, curve_type='CURVE_25519', preprocess_path=None, ecdh_secret_key_path=None, dppsi_bob_sub_sampling=0.9, dppsi_epsilon=3)[source]#
Private set intersection with DataFrame.
- Parameters:
key (str, List[str], Dict[Device, List[str]]) – Column(s) used to join.
dfs (List[PYUObject]) – DataFrames to be joined, which should be colocated with SPU runtimes.
receiver (str) – Which party gets the joined data; others get None.
protocol (str) – PSI protocol.
precheck_input (bool) – Whether to check input data before the join.
sort (bool) – Whether to sort data by key after the join.
broadcast_result (bool) – Whether to broadcast joined data to all parties.
bucket_size (int) – Specifies the hash bucket size used in PSI. Larger values consume more memory.
curve_type (str) – Curve for ECDH PSI.
preprocess_path (str) – Preprocess file path for unbalanced PSI.
ecdh_secret_key_path (str) – ecdh_oprf secret key file path, binary format, 32B, for unbalanced PSI.
dppsi_bob_sub_sampling (double) – Bernoulli distribution probability used for Bob's subsampling in DP PSI.
dppsi_epsilon (double) – Epsilon of DP PSI.
- Returns:
Joined DataFrames with order preserved.
- Return type:
List[PYUObject]
- psi_csv(key: Union[str, List[str], Dict[Device, List[str]]], input_path: Union[str, Dict[Device, str]], output_path: Union[str, Dict[Device, str]], receiver: str, protocol='KKRT_PSI_2PC', precheck_input=True, sort=True, broadcast_result=True, bucket_size=1048576, curve_type='CURVE_25519', preprocess_path=None, ecdh_secret_key_path=None, dppsi_bob_sub_sampling=0.9, dppsi_epsilon=3)[source]#
Private set intersection with csv file.
- Parameters:
key (str, List[str], Dict[Device, List[str]]) – Column(s) used to join.
input_path – CSV files to be joined, comma separated with a header. Use an absolute path.
output_path – Joined CSV files, comma separated with a header. Use an absolute path.
receiver (str) – Which party can get the joined data. Others won't generate an output file and get an intersection_count of -1.
protocol (str) – PSI protocol.
precheck_input (bool) – Whether to check input data before joining; for now, it checks whether keys are duplicated.
sort (bool) – Whether to sort data by key after joining.
broadcast_result (bool) – Whether to broadcast the joined data to all parties.
bucket_size (int) – Hash bucket size used in PSI. Larger values consume more memory.
curve_type (str) – Curve for ECDH PSI.
preprocess_path (str) – Preprocess file path for unbalanced PSI.
ecdh_secret_key_path (str) – ecdh_oprf secret key file path, binary format, 32B.
dppsi_bob_sub_sampling (double) – Bernoulli subsampling probability for Bob in DP PSI.
dppsi_epsilon (double) – Epsilon of DP PSI.
- Returns:
PSI reports output by SPU with order preserved.
- Return type:
List[Dict]
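The input requirements above (absolute path, header row, no duplicate keys) can be checked locally before a run. The following is a minimal standard-library sketch, not SecretFlow's own precheck; the helper name `precheck_csv` is hypothetical.

```python
import csv
import os
from typing import List


def precheck_csv(path: str, key: List[str]) -> int:
    """Return the number of data rows, or raise ValueError on a failed check."""
    if not os.path.isabs(path):
        raise ValueError(f"expected an absolute path, got {path!r}")
    seen = set()
    with open(path, newline="") as f:
        reader = csv.DictReader(f)  # the first line is treated as the header
        missing = [k for k in key if k not in (reader.fieldnames or [])]
        if missing:
            raise ValueError(f"key column(s) missing from header: {missing}")
        for row in reader:
            value = tuple(row[k] for k in key)
            if value in seen:
                raise ValueError(f"duplicate key {value}")
            seen.add(value)
    return len(seen)
```

Running a check like this on each party's own file catches duplicate keys early, before any protocol traffic starts.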
- psi_join_df(key: Union[str, List[str], Dict[Device, List[str]]], dfs: List[PYUObject], receiver: str, join_party: str, protocol='KKRT_PSI_2PC', precheck_input=True, bucket_size=1048576, curve_type='CURVE_25519')[source]#
Private set intersection with DataFrame.
- Parameters:
key (str, List[str], Dict[Device, List[str]]) – Column(s) used to join.
dfs (List[PYUObject]) – DataFrames to be joined, which should be colocated with SPU runtimes.
receiver (str) – Which party can get the joined data. Others won’t generate an output file and get an intersection_count of -1.
join_party (str) – Party that can get the joined data.
protocol (str) – PSI protocol.
precheck_input (bool) – Whether to check input data before joining; for now, it checks whether keys are duplicated.
bucket_size (int) – Hash bucket size used in PSI. Larger values consume more memory.
curve_type (str) – Curve for ECDH PSI.
- Returns:
Joined DataFrames with order preserved.
- Return type:
List[PYUObject]
- psi_join_csv(key: Union[str, List[str], Dict[Device, List[str]]], input_path: Union[str, Dict[Device, str]], output_path: Union[str, Dict[Device, str]], receiver: str, join_party: str, protocol='KKRT_PSI_2PC', precheck_input=True, bucket_size=1048576, curve_type='CURVE_25519')[source]#
Private set intersection with csv file.
- Parameters:
key (str, List[str], Dict[Device, List[str]]) – Column(s) used to join.
input_path – CSV files to be joined, comma separated with a header. Use an absolute path.
output_path – Joined CSV files, comma separated with a header. Use an absolute path.
receiver (str) – Which party can get the joined data. Others won’t generate an output file and get an intersection_count of -1.
join_party (str) – Party that can get the joined data.
protocol (str) – PSI protocol.
precheck_input (bool) – Whether to check input data before joining; for now, it checks whether keys are duplicated.
bucket_size (int) – Hash bucket size used in PSI. Larger values consume more memory.
curve_type (str) – Curve for ECDH PSI.
- Returns:
PSI reports output by SPU with order preserved.
- Return type:
List[Dict]
- pir_setup(server: str, input_path: Union[str, Dict[Device, str]], key_columns: Union[str, List[str]], label_columns: Union[str, List[str]], oprf_key_path: str, setup_path: str, num_per_query: int, label_max_len: int, protocol='KEYWORD_PIR_LABELED_PSI')[source]#
Private information retrieval offline setup.
- Parameters:
server (str) – Which party is the PIR server.
input_path – Server’s CSV file path, comma separated with a header. Use an absolute path.
key_columns (str, List[str]) – Column(s) used as the PIR key.
label_columns (str, List[str]) – Column(s) used as the PIR label.
oprf_key_path (str) – ECC OPRF secret key path, 32B binary format. Use an absolute path.
setup_path (str) – Offline/setup phase output data dir. Use an absolute path.
num_per_query (int) – Number of items per query.
label_max_len (int) – Maximum number of bytes in a label; data is padded to label_max_len. The maximum label byte length adds 4 bytes (len).
- Returns:
PIR report output by SPU.
- Return type:
Dict
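The oprf_key_path parameter expects a 32-byte binary file at an absolute path. One possible way to create such a file is sketched below. It assumes that any 32 random bytes are acceptable as the key, which the text above does not confirm, and the helper name `write_oprf_key` is hypothetical.

```python
import os
import secrets


def write_oprf_key(path: str) -> None:
    """Write 32 random bytes to `path` (assumed key format: raw 32B binary)."""
    if not os.path.isabs(path):
        raise ValueError(f"expected an absolute path, got {path!r}")
    # Request owner-only permissions at creation, since the file holds a secret.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(secrets.token_bytes(32))
```

The same file would then be passed as oprf_key_path to both the setup and the query phase on the server side.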
- pir_query(server: str, config: Dict, protocol='KEYWORD_PIR_LABELED_PSI')[source]#
Private information retrieval online query.
- Parameters:
server (str) – Which party is the PIR server.
config (Dict) – Server/client config dict. For example:
    {
        # client config
        alice: {
            'input_path': '/path/input.csv',
            'key_columns': 'id',
            'output_path': '/path/output.csv',
        },
        # server config
        bob: {
            'oprf_key_path': '/path/oprf_key.bin',
            'setup_path': '/path/setup_dir',
        },
    }
- The server config dict must have ‘oprf_key_path’ and ‘setup_path’:
oprf_key_path (str) – ECC OPRF secret key path, 32B binary format. Use an absolute path.
setup_path (str) – Offline/setup phase output data dir. Use an absolute path.
- The client config dict must have ‘input_path’, ‘key_columns’, and ‘output_path’:
input_path (str) – Client’s CSV file path, comma separated with a header. Use an absolute path.
key_columns (str, List[str]) – Column(s) used as the PIR key.
output_path (str) – Path where the query result is saved, in CSV format.
- Returns:
PIR report output by SPU.
- Return type:
Dict
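The required keys listed for the server and client config dicts can be validated before calling pir_query. This is a minimal sketch, not part of SecretFlow; the helper `check_pir_query_config` is hypothetical, and party-name strings stand in for the device objects used as dict keys in the example above.

```python
from typing import Dict

SERVER_KEYS = {"oprf_key_path", "setup_path"}
CLIENT_KEYS = {"input_path", "key_columns", "output_path"}


def check_pir_query_config(config: Dict[str, dict], server: str) -> None:
    """Raise ValueError if any party's config lacks a required key."""
    if server not in config:
        raise ValueError(f"no config entry for server {server!r}")
    for party, party_config in config.items():
        # The server needs the setup outputs; every other party is a client.
        required = SERVER_KEYS if party == server else CLIENT_KEYS
        missing = sorted(required - set(party_config))
        if missing:
            raise ValueError(f"{party!r} config is missing {missing}")
```

A check like this fails fast with a readable message, instead of surfacing a missing key only after the query has started.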
secretflow.device.device.teeu#
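Classes:
| TEEUData(data, data_uuid[, nonce, aad]) | Input/output data for teeu. |
| TEEUObject(device, data) | |
| TEEUWorker(auth_host, auth_mr_enclave[, ...]) | The teeu worker which runs inside TEE as an actor. |
| TEEU(party, mr_enclave) | TEEU is the Python processing unit of TEE. |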
- class secretflow.device.device.teeu.TEEUData(data: Any, data_uuid: str, nonce: Optional[bytes] = None, aad: Optional[bytes] = None)[source]#
Bases: object
Input/output data for teeu.
Attributes:
data – The underlying data, can be plaintext or ciphertext (encrypted with AES256-GCM).
data_uuid – The uuid of data for authority manager.
nonce – The nonce of AES-GCM.
aad – The associated data of AES-GCM.
Methods:
__init__(data, data_uuid[, nonce, aad])
- data: Any#
The underlying data, can be plaintext or ciphertext (encrypted with AES256-GCM).
- data_uuid: str#
The uuid of data for authority manager.
- nonce: bytes = None#
The nonce of AES-GCM.
- aad: bytes = None#
The associated data of AES-GCM.
- __init__(data: Any, data_uuid: str, nonce: Optional[bytes] = None, aad: Optional[bytes] = None) None #
- class secretflow.device.device.teeu.TEEUObject(device: TEEU, data: Union[ObjectRef, FedObject])[source]#
Bases: DeviceObject
- data#
a reference to TEEUData.
Methods:
__init__(device, data) – Abstraction device object base class.
- class secretflow.device.device.teeu.TEEUWorker(auth_host: str, auth_mr_enclave: str, auth_ca_cert: Optional[str] = None, tls_cert: Optional[str] = None, tls_key: Optional[str] = None, simluation: bool = False)[source]#
Bases: object
The teeu worker which runs inside TEE as an actor.
Methods:
__init__(auth_host, auth_mr_enclave[, ...])
run(func, *args, **kwargs)
- class secretflow.device.device.teeu.TEEU(party: str, mr_enclave: str)[source]#
Bases: Device
TEEU is the Python processing unit of TEE.
TEEU is designed to run Python functions inside a TEE so that some computations can be done safely. The input data of TEEU is encrypted, and nobody except TEEU itself can open it. Be careful: for now, the result of the function is plaintext, which means all parties can read it. Use it with caution unless you fully understand the risk.
- party#
the party this TEEU belongs to.
- mr_enclave#
the measurement of the TEEU enclave.
Example
>>> # Here is an example showing alice and bob calculate their average.
>>> import numpy as np
>>> alice = PYU('alice')
>>> bob = PYU('bob')
>>> teeu = TEEU('carol', mr_enclave='the mr_enclave of TEEU.')
>>> def average(data):
...     return np.average(data, axis=0)
>>> a = alice(lambda: np.random.random([2, 4]))()
>>> b = bob(lambda: np.random.random([2, 4]))()
>>> a_tee = a.to(teeu, allow_funcs=average)
>>> b_tee = b.to(teeu, allow_funcs=average)
>>> avg_val = teeu(average)([a_tee, b_tee])
Methods:
__init__(party, mr_enclave) – Init function.
secretflow.device.device.type_traits#
Functions:
|
Fixed point integer default precision bits |
|
Fixed point integer size in bytes |
|
|
|
|
|
- secretflow.device.device.type_traits.spu_fxp_precision(field_type)[source]#
Fixed point integer default precision bits