{
"cells": [
{
"cell_type": "markdown",
"id": "1701e451-b3a0-45b7-a6e4-50928fbf1636",
"metadata": {
"tags": []
},
"source": [
"# DataFrame"
]
},
{
"cell_type": "markdown",
"id": "c19b35a5",
"metadata": {},
"source": [
The following codes">
">The following code is for demonstration only. It is **NOT for production** use because of system security concerns; please **DO NOT** use it directly in production."
]
},
{
"cell_type": "markdown",
"id": "d4bd0033",
"metadata": {},
"source": [
"It is recommended to use [jupyter](https://jupyter.org/) to run this tutorial."
]
},
{
"cell_type": "markdown",
"id": "7ad971ee-e595-42ab-a1af-60269a82c6f8",
"metadata": {},
"source": [
"Secretflow provides federated data encapsulation in the form of DataFrame. DataFrame is composed of data blocks of multiple parties and supports horizontal or vertical partitioned data.\n",
"\n",
"\n",
"\n",
"Currently secretflow.DataFrame provides a subset of pandas operations, which are basically the same as pandas. During the calculation process, the original data is kept in the data holder and will not go out of the domain.\n",
"\n",
"\n",
"\n",
"The following will demonstrate how to use a DataFrame."
]
},
{
"cell_type": "markdown",
"id": "2e1bee0f-115a-4ba1-8ffb-06fa9f7aac10",
"metadata": {
"tags": []
},
"source": [
"## Preparation\n",
"\n",
"Initialize secretflow and create three parties alice, bob and carol."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "34143d30-4177-4470-88a1-b1d0fd96671d",
"metadata": {},
"outputs": [],
"source": [
"import secretflow as sf\n",
"\n",
"# In case you have a running secretflow runtime already.\n",
"sf.shutdown()\n",
"\n",
"sf.init(['alice', 'bob', 'carol'], address='local')\n",
"alice, bob, carol = sf.PYU('alice'), sf.PYU('bob'), sf.PYU('carol')"
]
},
{
"cell_type": "markdown",
"id": "f4a3923c-d722-4d77-a9c1-d405fd8800d3",
"metadata": {
"tags": []
},
"source": [
"## Data preparation"
]
},
{
"cell_type": "markdown",
"id": "89e73a98-322e-4afe-a03a-5da7e9e30671",
"metadata": {},
"source": [
"Here we use [iris](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_iris.html) as example data."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "372a0d20-e081-460f-9850-d62c8550c146",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n", " | sepal length (cm) | \n", "sepal width (cm) | \n", "petal length (cm) | \n", "petal width (cm) | \n", "target | \n", "
---|---|---|---|---|---|
0 | \n", "5.1 | \n", "3.5 | \n", "1.4 | \n", "0.2 | \n", "0 | \n", "
1 | \n", "4.9 | \n", "3.0 | \n", "1.4 | \n", "0.2 | \n", "0 | \n", "
2 | \n", "4.7 | \n", "3.2 | \n", "1.3 | \n", "0.2 | \n", "0 | \n", "
3 | \n", "4.6 | \n", "3.1 | \n", "1.5 | \n", "0.2 | \n", "0 | \n", "
4 | \n", "5.0 | \n", "3.6 | \n", "1.4 | \n", "0.2 | \n", "0 | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
145 | \n", "6.7 | \n", "3.0 | \n", "5.2 | \n", "2.3 | \n", "2 | \n", "
146 | \n", "6.3 | \n", "2.5 | \n", "5.0 | \n", "1.9 | \n", "2 | \n", "
147 | \n", "6.5 | \n", "3.0 | \n", "5.2 | \n", "2.0 | \n", "2 | \n", "
148 | \n", "6.2 | \n", "3.4 | \n", "5.4 | \n", "2.3 | \n", "2 | \n", "
149 | \n", "5.9 | \n", "3.0 | \n", "5.1 | \n", "1.8 | \n", "2 | \n", "
150 rows × 5 columns
\n", "