Create a Data Owner client.
# Google Colab
do_client = login_do(email="owner@example.com")
# Jupyter Lab (local)
do_client = login_do(email="owner@example.com", token_path="path/to/token.json")
Create a Data Scientist client.
# Google Colab
ds_client = login_ds(email="scientist@example.com")
# Jupyter Lab (local)
ds_client = login_ds(email="scientist@example.com", token_path="path/to/token.json")
The email address of the client.
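The Colab and local login calls differ only in whether token_path is passed. A minimal sketch of choosing the arguments at runtime could look like the following. The helper names in_colab and login_kwargs are hypothetical, and treating token_path as required outside Colab is an assumption drawn from the examples, not a documented rule.

```python
import sys
from typing import Optional


def in_colab() -> bool:
    """Heuristic: Colab kernels have the google.colab module loaded."""
    return "google.colab" in sys.modules


def login_kwargs(email: str, token_path: Optional[str] = None,
                 colab: Optional[bool] = None) -> dict:
    """Build keyword arguments for login_do / login_ds (hypothetical helper)."""
    if colab is None:
        colab = in_colab()
    if colab:
        # The Colab examples pass only the email.
        return {"email": email}
    if token_path is None:
        # Assumption: a local session needs an explicit token file.
        raise ValueError("token_path is required outside Colab")
    return {"email": email, "token_path": token_path}


# Usage, with login_do imported from the library:
# do_client = login_do(**login_kwargs("owner@example.com", "path/to/token.json"))
```

Passing colab explicitly overrides the detection, which is convenient when the same notebook is tested in both environments.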
Get the list of peers. Auto-syncs before returning.
- DO: Returns approved peers followed by pending peer requests.
- DS: Returns all connected peers.
Returns a PeerList.
Get the list of jobs. Auto-syncs before returning.
Returns a JobsList.
Get the dataset manager. Auto-syncs before returning.
Returns a SyftDatasetManager. Use .get_all() or .get(name, datasite) to query datasets.
Request a peer connection.
- DS calls this to request access to a DO.
- The DO must approve the request before syncing is enabled.
ds_client.add_peer("owner@example.com")
Reload the peer list from the transport layer.
Approve a pending peer request. DO only.
do_client.approve_peer_request("scientist@example.com")
Reject a pending peer request. DO only.
do_client.reject_peer_request("scientist@example.com")
Sync local state with Google Drive.
- DO: Pulls incoming messages from approved peers and optionally creates a checkpoint.
- DS: Pushes pending changes and pulls results from peers.
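The peer request, approval, and sync calls can be chained into one walkthrough, sketched below. The function connect_and_sync is hypothetical, and it holds both clients in one process only for illustration; in practice the DO and DS usually run in separate sessions. The order of the two sync calls is an assumption based on the DS pushing and the DO pulling.

```python
def connect_and_sync(ds_client, do_client, ds_email: str, do_email: str) -> None:
    """Walk through request, approval, and a first sync (illustrative only)."""
    # DS asks the DO for a connection.
    ds_client.add_peer(do_email)
    # DO approves; syncing is only enabled after this step.
    do_client.approve_peer_request(ds_email)
    # Assumption: the DS syncs first so its pending changes are pushed,
    # then the DO syncs to pull them in.
    ds_client.sync()
    do_client.sync()


# Usage, with clients created by login_ds / login_do:
# connect_and_sync(ds_client, do_client, "scientist@example.com", "owner@example.com")
```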
client.sync()
client.create_dataset(name, mock_path, private_path=None, summary=None, users=None, upload_private=False)
Create and upload a dataset. DO only.
- mock_path: Path to public mock data (shared with approved peers).
- private_path: Path to private data (never leaves the DO).
- users: List of emails to share with, or "any" for all approved peers.
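Because create_dataset takes file paths and a users value that is either a list or the string "any", it can help to validate both before calling it. The wrapper below is a hedged sketch: publish_dataset is a hypothetical name, the checks are not part of the library, and only keyword arguments from the signature above are forwarded.

```python
from pathlib import Path
from typing import List, Optional, Union


def publish_dataset(do_client, name: str, mock_path: str,
                    private_path: Optional[str] = None,
                    users: Union[List[str], str, None] = None) -> None:
    """Check inputs, then call create_dataset (hypothetical wrapper)."""
    for label, p in (("mock_path", mock_path), ("private_path", private_path)):
        if p is not None and not Path(p).exists():
            raise FileNotFoundError(f"{label} does not exist: {p}")
    if isinstance(users, str) and users != "any":
        # The only documented string value is "any".
        raise ValueError('users must be a list of emails or the string "any"')
    do_client.create_dataset(name=name, mock_path=mock_path,
                             private_path=private_path, users=users)
```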
do_client.create_dataset(
    name="my dataset",
    mock_path="/path/to/mock.csv",
    private_path="/path/to/private.csv",
    summary="Example dataset",
    users=["scientist@example.com"],
)
Delete a dataset. DO only.
do_client.delete_dataset(name="my dataset", datasite="owner@example.com")
Share an existing dataset with additional users. DO only.
- tag: Dataset name.
- users: List of email addresses or "any".
do_client.share_dataset("my dataset", users=["new_user@example.com"])
Submit a Python job to a Data Owner. DS only.
- user: DO email to submit the job to.
- code_path: Path to a Python script or folder.
- entrypoint: Entry script (auto-detected if main.py exists in folder).
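The entrypoint rule can be pictured as a small resolution function. This is a sketch of the behavior described in the parameter list, not the library's code: resolve_entrypoint is a hypothetical name, and raising an error when a folder has no main.py and no explicit entrypoint is an assumption.

```python
from pathlib import Path
from typing import Optional, Union


def resolve_entrypoint(code_path: Union[str, Path],
                       entrypoint: Optional[str] = None) -> str:
    """Return the script name that would be run for code_path (illustrative)."""
    path = Path(code_path)
    if path.is_file():
        # A single script is its own entrypoint.
        return path.name
    if entrypoint is not None:
        # An explicit entrypoint takes precedence inside a folder.
        return entrypoint
    if (path / "main.py").is_file():
        # Auto-detection: main.py in the folder.
        return "main.py"
    # Assumption: with nothing to detect, the caller must name the script.
    raise ValueError(f"no main.py in {path}; pass entrypoint explicitly")
```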
ds_client.submit_python_job(
    user="owner@example.com",
    code_path="/path/to/script.py",
)
Submit a bash job to a Data Owner. DS only.
ds_client.submit_bash_job(
    user="owner@example.com",
    code_path="/path/to/script.sh",
)
Run all approved jobs. DO only.
- stream_output: Stream stdout/stderr in real-time.
- timeout: Timeout in seconds per job (default: 300).
- force_execution: Skip version compatibility checks.
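For long-running jobs the default 300-second timeout may be too short. The sketch below shows one way to pass the options explicitly, assuming the parameters listed above are accepted as keyword arguments under those names. The wrapper run_approved_jobs is hypothetical, and force_execution is left at its default.

```python
def run_approved_jobs(do_client, timeout_s: int = 300, stream: bool = True) -> None:
    """Run approved jobs with an explicit per-job timeout (hypothetical wrapper)."""
    if timeout_s <= 0:
        raise ValueError("timeout_s must be positive")
    # Assumption: stream_output and timeout are keyword arguments.
    do_client.process_approved_jobs(stream_output=stream, timeout=timeout_s)


# Usage: allow each job up to ten minutes.
# run_approved_jobs(do_client, timeout_s=600)
```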
do_client.process_approved_jobs()
Delete all SyftBox state: Google Drive files, local caches, and local folder.
- broadcast_delete_events: Notify approved peers about deleted files before cleanup.