secretflow.data.mix
Classes:
- MixDataFrame([partitions]): Mixed DataFrame consisting of HDataFrame/VDataFrame.
- PartitionWay(value): The partitioning.
- class secretflow.data.mix.MixDataFrame(partitions: Optional[Tuple[Union[HDataFrame, VDataFrame]]] = None)
Bases: object
Mixed DataFrame consisting of HDataFrame/VDataFrame.
MixDataFrame provides two perspectives based on how the data is partitioned. As an example, assume the following partitions: alice_part0, alice_part1, bob, carol, dave_part0, dave_part1.
Among them, (alice_part0, bob, dave_part0) are aligned, and (alice_part1, carol, dave_part1) are aligned.
| col1        | col2, col3 | col4, col5 |
| alice_part0 | bob        | dave_part0 |
| alice_part1 | carol      | dave_part1 |
1. If horizontally partitioned (PartitionWay.HORIZONTAL), the perspective of the mixed DataFrame is as follows:
| col1, col2, col3, col4, col5   |
| alice_part0, bob, dave_part0   |
| alice_part1, carol, dave_part1 |
2. If vertically partitioned (PartitionWay.VERTICAL), the perspective of the mixed DataFrame is as follows:
| col1                     | col2, col3 | col4, col5             |
| alice_part0, alice_part1 | bob, carol | dave_part0, dave_part1 |
MixDataFrame has the following characteristics:
1. Multiple partitions corresponding to a column can be provided by different parties or by the same party.
2. The number of partitions corresponding to each column is the same.
3. The number of samples in aligned partitions is the same.
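The layout and characteristics above can be modeled in plain Python. The following is a minimal sketch, not SecretFlow code: the `Part` dataclass, the row-major `grid`, and `check_grid` are hypothetical names introduced for illustration, and the row counts are invented.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Part:
    """Hypothetical stand-in for one party's block of data."""
    party: str
    columns: Tuple[str, ...]
    n_rows: int


def check_grid(grid: List[List[Part]]) -> None:
    """Check characteristics 2 and 3 on a row-major grid.

    Each inner list is one group of aligned partitions.
    """
    if not grid or not grid[0]:
        raise ValueError("grid must not be empty")
    width = len(grid[0])
    for row in grid:
        # A rectangular grid means every column group has the same
        # number of partitions (one per aligned row).
        if len(row) != width:
            raise ValueError("each column group needs the same number of partitions")
        # Aligned partitions must hold the same number of samples.
        if len({p.n_rows for p in row}) != 1:
            raise ValueError("aligned partitions must have equal sample counts")


# The example from the text; the row counts (100 and 40) are invented.
grid = [
    [Part("alice", ("col1",), 100),
     Part("bob", ("col2", "col3"), 100),
     Part("dave", ("col4", "col5"), 100)],
    [Part("alice", ("col1",), 40),
     Part("carol", ("col2", "col3"), 40),
     Part("dave", ("col4", "col5"), 40)],
]
check_grid(grid)  # raises ValueError if a characteristic is violated

# Horizontal perspective: each aligned row forms one wide block.
horizontal_view = [[p.party for p in row] for row in grid]
# Vertical perspective: each column group forms one tall block.
vertical_view = [[p.party for p in col] for col in zip(*grid)]
```

Both views are built from the same grid; only the grouping differs, which is the point of the two perspectives described above.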
Attributes:
- partitions: The blocks that make up a mixed DataFrame.
- partition_way: Data partitioning.
- dtypes: Returns the dtypes in the DataFrame.
- columns: The column labels of the DataFrame.
- shape: Returns a tuple representing the dimensionality of the DataFrame.
Methods:
- mean(*args, **kwargs): Returns the mean of the values over the requested axis.
- min(*args, **kwargs): Returns the min of the values over the requested axis.
- max(*args, **kwargs): Returns the max of the values over the requested axis.
- count(*args, **kwargs): Count non-NA cells for each column or row.
- isna()
- quantile([q, axis])
- kurtosis(*args, **kwargs)
- skew(*args, **kwargs)
- sem(*args, **kwargs)
- std(*args, **kwargs)
- var(*args, **kwargs)
- astype(dtype[, copy, errors]): Cast object to the specified dtype.
- copy(): Shallow copy of this dataframe.
- drop([labels, axis, index, columns, level, ...]): Drop specified labels from rows or columns.
- fillna([value, method, axis, inplace, ...]): Fill NA/NaN values using the specified method.
- __init__([partitions])
- partitions: Tuple[Union[HDataFrame, VDataFrame]] = None
The blocks that make up a mixed DataFrame. All blocks must be HDataFrame, or all must be VDataFrame; the two types must not be mixed.
- property partition_way: PartitionWay
The way the data is partitioned.
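The rule that blocks must not mix HDataFrame and VDataFrame can be expressed as a small type check. This is a hedged sketch only: the two classes below are empty stand-ins for the real SecretFlow types, `check_partitions` is a hypothetical helper that is not part of the SecretFlow API, and rejecting an empty input is an assumption.

```python
from typing import Sequence


class HDataFrame:
    """Empty stand-in for SecretFlow's HDataFrame (illustration only)."""


class VDataFrame:
    """Empty stand-in for SecretFlow's VDataFrame (illustration only)."""


def check_partitions(partitions: Sequence[object]) -> type:
    """Return the common block type, or raise if the rule is violated."""
    if not partitions:
        # Assumption: an empty collection is treated as invalid here.
        raise ValueError("partitions must not be empty")
    first_type = type(partitions[0])
    if first_type not in (HDataFrame, VDataFrame):
        raise TypeError("blocks must be HDataFrame or VDataFrame")
    for part in partitions[1:]:
        if type(part) is not first_type:
            raise TypeError("HDataFrame and VDataFrame blocks must not be mixed")
    return first_type


block_type = check_partitions((VDataFrame(), VDataFrame()))
```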
- mean(*args, **kwargs) -> Series
Returns the mean of the values over the requested axis.
All arguments are the same as pandas.DataFrame.mean().
- Returns:
pd.Series
- min(*args, **kwargs) -> Series
Returns the min of the values over the requested axis.
All arguments are the same as pandas.DataFrame.min().
- Returns:
pd.Series
- max(*args, **kwargs) -> Series
Returns the max of the values over the requested axis.
All arguments are the same as pandas.DataFrame.max().
- Returns:
pd.Series
- count(*args, **kwargs) -> Series
Count non-NA cells for each column or row.
All arguments are the same as pandas.DataFrame.count().
- Returns:
pd.Series
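To picture how a column statistic can be obtained when the rows of a column live in several blocks, the sketch below merges per-block summaries in plain Python. It is an illustration under stated assumptions, not the library's implementation: `summarize` and `combine` are hypothetical helpers, the values are invented, and each block is assumed to hold at least one non-missing value.

```python
from typing import Dict, List, Optional, Tuple

# Per-block summary of one column: (sum, non-NA count, min, max).
Summary = Tuple[float, int, float, float]


def summarize(values: List[Optional[float]]) -> Summary:
    """Summarize one block's column, skipping missing values (None)."""
    present = [v for v in values if v is not None]
    return (sum(present), len(present), min(present), max(present))


def combine(blocks: List[Summary]) -> Dict[str, float]:
    """Merge block summaries into statistics for the whole column."""
    total = sum(b[0] for b in blocks)
    count = sum(b[1] for b in blocks)
    return {
        "count": count,
        # Total sum over total count equals the count-weighted mean
        # of the per-block means.
        "mean": total / count,
        "min": min(b[2] for b in blocks),
        "max": max(b[3] for b in blocks),
    }


block0 = summarize([1.0, 2.0, None, 3.0])  # rows held in a first block
block1 = summarize([10.0, 20.0])           # rows held in a second block
stats = combine([block0, block1])
```

Averaging the two block means directly would give the wrong result when blocks differ in size, which is why the sketch carries sums and counts instead.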
- property values
- property dtypes: Series
Returns the dtypes in the DataFrame.
- Returns:
the data type of each column.
- Return type:
pd.Series
- astype(dtype, copy: bool = True, errors: str = 'raise')
Cast object to the specified dtype.
All arguments are the same as pandas.DataFrame.astype().
- property columns
The column labels of the DataFrame.
- property shape: Tuple
Returns a tuple representing the dimensionality of the DataFrame.
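As a rough illustration of how an overall shape relates to block shapes, the sketch below assumes the pandas-style (rows, columns) convention. The helper names and the block sizes are hypothetical and are not taken from SecretFlow.

```python
from typing import List, Tuple

Shape = Tuple[int, int]  # (rows, columns), pandas-style; an assumption


def stacked_rows_shape(blocks: List[Shape]) -> Shape:
    """Blocks share the same columns and are stacked row-wise."""
    n_cols = {cols for _, cols in blocks}
    if len(n_cols) != 1:
        raise ValueError("row-stacked blocks must have the same column count")
    return (sum(rows for rows, _ in blocks), n_cols.pop())


def joined_columns_shape(blocks: List[Shape]) -> Shape:
    """Blocks share the same rows and are placed side by side."""
    n_rows = {rows for rows, _ in blocks}
    if len(n_rows) != 1:
        raise ValueError("side-by-side blocks must have the same row count")
    return (n_rows.pop(), sum(cols for _, cols in blocks))


# Two wide blocks of 100 and 40 rows, five columns each (invented sizes).
shape_from_rows = stacked_rows_shape([(100, 5), (40, 5)])
# Three tall blocks of 140 rows with 1, 2 and 2 columns (invented sizes).
shape_from_cols = joined_columns_shape([(140, 1), (140, 2), (140, 2)])
```

Both groupings describe the same data, so they arrive at the same overall shape.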
- copy() -> MixDataFrame
Shallow copy of this dataframe.
- Returns:
MixDataFrame.
- drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise') -> Optional[MixDataFrame]
Drop specified labels from rows or columns.
All arguments are the same as pandas.DataFrame.drop().
- Returns:
MixDataFrame without the removed index or column labels, or None if inplace=True.
- fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None) -> Optional[MixDataFrame]
Fill NA/NaN values using the specified method.
All arguments are the same as pandas.DataFrame.fillna().
- Returns:
MixDataFrame with missing values filled, or None if inplace=True.
- __init__(partitions: Optional[Tuple[Union[HDataFrame, VDataFrame]]] = None) -> None
- class secretflow.data.mix.PartitionWay(value)
Bases: Enum
The partitioning. HORIZONTAL: horizontal partitioning. VERTICAL: vertical partitioning.
Attributes:
- HORIZONTAL = 'horizontal'
- VERTICAL = 'vertical'
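Because PartitionWay is a standard Enum with string values, members can be looked up by value. The snippet below defines a local mirror of the two documented members so that it runs without SecretFlow installed; in real code the class would be imported from secretflow.data.mix instead.

```python
from enum import Enum


class PartitionWay(Enum):
    """Local mirror of the documented members, for illustration only."""
    HORIZONTAL = 'horizontal'
    VERTICAL = 'vertical'


# Look a member up by its string value, e.g. when reading a setting.
way = PartitionWay('vertical')
# Iterating an Enum yields members in definition order.
names = [member.name for member in PartitionWay]
```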
secretflow.data.mix.dataframe
Classes:
- PartitionWay(value): The partitioning.
- MixDataFrame([partitions]): Mixed DataFrame consisting of HDataFrame/VDataFrame.
- class secretflow.data.mix.dataframe.PartitionWay(value)
Bases: Enum
The partitioning. HORIZONTAL: horizontal partitioning. VERTICAL: vertical partitioning.
Attributes:
- HORIZONTAL = 'horizontal'
- VERTICAL = 'vertical'
- class secretflow.data.mix.dataframe.MixDataFrame(partitions: Optional[Tuple[Union[HDataFrame, VDataFrame]]] = None)
Bases: object
Mixed DataFrame consisting of HDataFrame/VDataFrame.
MixDataFrame provides two perspectives based on how the data is partitioned. As an example, assume the following partitions: alice_part0, alice_part1, bob, carol, dave_part0, dave_part1.
Among them, (alice_part0, bob, dave_part0) are aligned, and (alice_part1, carol, dave_part1) are aligned.
| col1        | col2, col3 | col4, col5 |
| alice_part0 | bob        | dave_part0 |
| alice_part1 | carol      | dave_part1 |
1. If horizontally partitioned (PartitionWay.HORIZONTAL), the perspective of the mixed DataFrame is as follows:
| col1, col2, col3, col4, col5   |
| alice_part0, bob, dave_part0   |
| alice_part1, carol, dave_part1 |
2. If vertically partitioned (PartitionWay.VERTICAL), the perspective of the mixed DataFrame is as follows:
| col1                     | col2, col3 | col4, col5             |
| alice_part0, alice_part1 | bob, carol | dave_part0, dave_part1 |
MixDataFrame has the following characteristics:
1. Multiple partitions corresponding to a column can be provided by different parties or by the same party.
2. The number of partitions corresponding to each column is the same.
3. The number of samples in aligned partitions is the same.
Attributes:
- partitions: The blocks that make up a mixed DataFrame.
- partition_way: Data partitioning.
- dtypes: Returns the dtypes in the DataFrame.
- columns: The column labels of the DataFrame.
- shape: Returns a tuple representing the dimensionality of the DataFrame.
Methods:
- mean(*args, **kwargs): Returns the mean of the values over the requested axis.
- min(*args, **kwargs): Returns the min of the values over the requested axis.
- max(*args, **kwargs): Returns the max of the values over the requested axis.
- count(*args, **kwargs): Count non-NA cells for each column or row.
- isna()
- quantile([q, axis])
- kurtosis(*args, **kwargs)
- skew(*args, **kwargs)
- sem(*args, **kwargs)
- std(*args, **kwargs)
- var(*args, **kwargs)
- astype(dtype[, copy, errors]): Cast object to the specified dtype.
- copy(): Shallow copy of this dataframe.
- drop([labels, axis, index, columns, level, ...]): Drop specified labels from rows or columns.
- fillna([value, method, axis, inplace, ...]): Fill NA/NaN values using the specified method.
- __init__([partitions])
- partitions: Tuple[Union[HDataFrame, VDataFrame]] = None
The blocks that make up a mixed DataFrame. All blocks must be HDataFrame, or all must be VDataFrame; the two types must not be mixed.
- property partition_way: PartitionWay
The way the data is partitioned.
- mean(*args, **kwargs) -> Series
Returns the mean of the values over the requested axis.
All arguments are the same as pandas.DataFrame.mean().
- Returns:
pd.Series
- min(*args, **kwargs) -> Series
Returns the min of the values over the requested axis.
All arguments are the same as pandas.DataFrame.min().
- Returns:
pd.Series
- max(*args, **kwargs) -> Series
Returns the max of the values over the requested axis.
All arguments are the same as pandas.DataFrame.max().
- Returns:
pd.Series
- count(*args, **kwargs) -> Series
Count non-NA cells for each column or row.
All arguments are the same as pandas.DataFrame.count().
- Returns:
pd.Series
- property values
- property dtypes: Series
Returns the dtypes in the DataFrame.
- Returns:
the data type of each column.
- Return type:
pd.Series
- astype(dtype, copy: bool = True, errors: str = 'raise')
Cast object to the specified dtype.
All arguments are the same as pandas.DataFrame.astype().
- property columns
The column labels of the DataFrame.
- property shape: Tuple
Returns a tuple representing the dimensionality of the DataFrame.
- copy() -> MixDataFrame
Shallow copy of this dataframe.
- Returns:
MixDataFrame.
- drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise') -> Optional[MixDataFrame]
Drop specified labels from rows or columns.
All arguments are the same as pandas.DataFrame.drop().
- Returns:
MixDataFrame without the removed index or column labels, or None if inplace=True.
- fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None) -> Optional[MixDataFrame]
Fill NA/NaN values using the specified method.
All arguments are the same as pandas.DataFrame.fillna().
- Returns:
MixDataFrame with missing values filled, or None if inplace=True.
- __init__(partitions: Optional[Tuple[Union[HDataFrame, VDataFrame]]] = None) -> None