Merge branch 'main' into kmeans

NVIDIA · Feb 21, 2025 · c3e359c
2 parents 82ebb21 + 8662a98
commit c3e359c
Showing 65 changed files with 4,686 additions and 145 deletions.
diff --git a/...erated_computing_platform/03.1_federated_computing_architecture/system_architecture.ipynb b/...erated_computing_platform/03.1_federated_computing_architecture/system_architecture.ipynb
@@ -85,12 +85,16 @@
     "From top to bottom, FCI has the following layers:\n",
     "\n",
     "* **API Layer**: This is the API exposed to application developers, like Communicator and Cellnet.\n",
+    "\n",
     "* **Streamable Framed Message (SFM)**: This is the core of FCI and it provides abstraction on top of different communication protocols. It manages endpoints and connections.\n",
+    "\n",
     "* **Transport Drivers**: This layer is responsible for sending frames to other endpoints. It treats the frame as opaque bytes.\n",
+    "You can use one of the out-of-the-box drivers, such as gRPC, TCP, or HTTP/WebSocket, or develop a custom driver for an alternative protocol. Switching drivers does not affect the application layers.\n",
     "\n",
     "<img src=\"./fci.png\" alt=\"FLARE Communication Interface\" width=\"300\" height=\"400\">\n",
     "\n",
-    "## Federated Computing Architecture\n",
+    "\n",
+    "## Federated Job Processing Architecture\n",
     "\n",
     "There are two parent control processes with corresponding job processes on each site. This enables support of concurrent, multi-job processes.\n",
     "\n",

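Review note on the Transport Drivers bullet in the hunk above: the claim that switching drivers leaves the application layers untouched can be illustrated with a minimal sketch. This is not FLARE's actual driver API. `Driver`, `LoopbackDriver`, `ChunkedDriver`, and `send_message` are hypothetical names, and both drivers are in-memory stand-ins rather than real gRPC or TCP transports. The only property taken from the text is that a driver treats a frame as opaque bytes.

```python
from typing import Protocol


class Driver(Protocol):
    """Hypothetical transport driver: moves opaque frames and nothing more."""

    def send(self, frame: bytes) -> bytes: ...


class LoopbackDriver:
    """Stand-in for one transport; echoes the frame back unchanged."""

    def send(self, frame: bytes) -> bytes:
        return frame


class ChunkedDriver:
    """Stand-in for another transport; splits the frame into chunks and reassembles it."""

    def __init__(self, chunk_size: int = 4) -> None:
        self.chunk_size = chunk_size

    def send(self, frame: bytes) -> bytes:
        chunks = [
            frame[i:i + self.chunk_size]
            for i in range(0, len(frame), self.chunk_size)
        ]
        return b"".join(chunks)


def send_message(driver: Driver, text: str) -> str:
    """Application-layer code: encodes text, hands the bytes to any driver, decodes the echo."""
    frame = text.encode("utf-8")
    return driver.send(frame).decode("utf-8")
```

Because `send_message` depends only on the `send(bytes)` contract, passing `LoopbackDriver()` or `ChunkedDriver(3)` gives the same result to the caller, which is the decoupling the bullet describes.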
diff --git a/...r-6_Security_in_federated_compute_system/06.0_introduction/federated_policy.png b/...r-6_Security_in_federated_compute_system/06.0_introduction/federated_policy.png
diff --git a/...ty_in_federated_compute_system/06.0_introduction/filters_and_privacy_policy.png b/...ty_in_federated_compute_system/06.0_introduction/filters_and_privacy_policy.png
diff --git a/...ivacy/chapter-6_Security_in_federated_compute_system/06.0_introduction/introduction.ipynb b/...ivacy/chapter-6_Security_in_federated_compute_system/06.0_introduction/introduction.ipynb
@@ -1,11 +1,206 @@
 {
  "cells": [
   {
-   "cell_type": "code",
-   "execution_count": null,
+   "cell_type": "markdown",
    "id": "ceca45d8-437c-44ae-8ed9-7a784983731f",
    "metadata": {},
-   "outputs": [],
+   "source": [
+    "# Security in NVIDIA FLARE Federated Computing Systems \n",
+    "\n",
+    "\n",
+    "### Critical Security Concerns in Federated Learning Systems\n",
+    "\n",
+    "#### Data Privacy\n",
+    "* Model inversion attacks (reconstructing training data from model parameters)\n",
+    "* Membership inference attacks (determining if specific data was used in training)\n",
+    "* Property inference attacks (learning properties about training data)\n",
+    "* Gradient leakage during parameter sharing\n",
+    "\n",
+    "#### System Security\n",
+    "* Authentication of participants\n",
+    "* Man-in-the-middle attacks\n",
+    "* Sybil attacks (malicious entities creating multiple fake identities)\n",
+    "* Denial of Service (DoS) attacks\n",
+    "* Network security during model/gradient transmission\n",
+    "\n",
+    "#### Model Security\n",
+    "* Model poisoning attacks\n",
+    "* Backdoor attacks\n",
+    "* Model stealing/extraction\n",
+    "* Adversarial attacks on the trained model\n",
+    "\n",
+    "#### Participant Privacy\n",
+    "* Protection of participant identities\n",
+    "* Confidentiality of participation in the FL system\n",
+    "* Protection of organizational intellectual property\n",
+    "\n",
+    "#### Computation Integrity\n",
+    "* Verification of correct computation by participants\n",
+    "* Detection of malicious or faulty updates\n",
+    "* Ensuring honest execution of the FL protocol\n",
+    "\n",
+    "#### Access Control\n",
+    "* Role-based access control\n",
+    "* Resource usage control\n",
+    "* Model access permissions\n",
+    "* Data access restrictions\n",
+    "\n",
+    "#### Regulatory Compliance\n",
+    "* Adherence to data protection regulations (GDPR, HIPAA, etc.)\n",
+    "* Cross-border data governance\n",
+    "* Audit trails and accountability\n",
+    "\n",
+    "#### Infrastructure Security\n",
+    "* Edge device security\n",
+    "* Server security\n",
+    "* Communication channel security\n",
+    "* Storage security for model checkpoints\n",
+    "\n",
+    "#### Trust Management\n",
+    "* Reputation systems for participants\n",
+    "* Trust establishment between parties\n",
+    "* Verification of participant legitimacy\n",
+    "\n",
+    "#### Aggregation Security\n",
+    "* Secure aggregation protocols\n",
+    "* Protection against colluding participants\n",
+    "* Byzantine-robust aggregation\n",
+    "\n",
+    "----------\n",
+    "\n",
+    "### Security Mechanisms in Federated Learning Systems\n",
+    "\n",
+    "A federated computing system requires robust security mechanisms to ensure that only legitimate and trusted participants can contribute, while also protecting communication channels and enforcing authorization policies. Below are the critical security components of a Federated Learning system:\n",
+    "\n",
+    "\n",
+    "* **Authentication**\n",
+    "\n",
+    "Ensures that communicating parties can be confident of each other's identities: everyone is who they claim to be.\n",
+    "\n",
+    "* **Authorization** \n",
+    "\n",
+    "Ensures that users can only perform actions they are authorized to do.\n",
+    "\n",
+    "Due to the distributed nature of federated computing systems, additional authentication and authorization are needed for each participating organization. \n",
+    "\n",
+    "You can learn how NVIDIA FLARE implements these through event-based Federated Authentication and Authorization.\n",
+    "\n",
+    "* **Privacy Protection**\n",
+    "\n",
+    "\n",
+    "Privacy protection in Federated Learning (FL) refers to techniques and mechanisms that ensure sensitive user data remains private while enabling collaborative model training across decentralized devices or servers. Since FL involves training models without sharing raw data, privacy protection is crucial to prevent information leakage from model updates.\n",
+    "\n",
+    "\n",
+    "We have introduced different privacy-enhancing technologies (PETs) in [Chapter 5](../../chapter-5_Privacy_In_Federated_Learning/05.0_introduction/introduction.ipynb). Here, we are going to explore privacy protection mechanisms at the organization level. \n",
+    "\n",
+    "* **Trust-based Security** \n",
+    "\n",
+    "Trust-based mechanisms add another layer of protection to the security system by leveraging confidential computing's VM-based trusted execution environment (TEE). NVIDIA FLARE will enable end-to-end confidential federated AI. We will briefly touch on this topic in this chapter, with more details to be added in the future. \n",
+    "\n",
+    "* **Communication Security**\n",
+    "\n",
+    "Uses secure protocols, specifically TLS, for data transmission. FLARE supports both mutual TLS (mTLS) and normal TLS with signed messages.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c7a65a24",
+   "metadata": {},
+   "source": [
+    "# NVIDIA FLARE Security Architecture\n",
+    "\n",
+    "NVFLARE is an application that runs in the IT environment of each participating site. The overall security of this application is a combination of security measures implemented within the application and those provided by the site's IT infrastructure.\n",
+    "\n",
+    "\n",
+    "NVFLARE implements security measures in the following areas:\n",
+    "\n",
+    "* **Identity Security**: the authentication and authorization of communicating parties\n",
+    "\n",
+    "* **Site Policy Management**: the policies for resource management, authorization, and privacy protection defined by each site\n",
+    "\n",
+    "* **Communication Security**: the confidentiality of data communication messages\n",
+    "\n",
+    "* **Message Serialization**: techniques for ensuring a safe serialization/deserialization process between communicating parties\n",
+    "\n",
+    "* **Data Privacy Protection**: techniques for preventing local data from being leaked and/or reverse-engineered\n",
+    "\n",
+    "\n",
+    "All other security concerns must be handled by the site’s IT security infrastructure. The security framework does not operate in a vacuum; we assume that physical security is already in place for all participating server and client machines. TLS provides the authentication mechanism within the trusted environments.\n",
+    "\n",
+    "\n",
+    "--- \n",
+    "\n",
+    "## Terminology and Roles\n",
+    "### Terminology\n",
+    "NVIDIA FLARE uses the following terminology:\n",
+    "\n",
+    "* Project: A federated learning study with identified participants\n",
+    "* Org: An organization that participates in the study\n",
+    "* Site: The computing system that runs the NVIDIA FLARE application as part of the study. There are two kinds of sites: Server and Clients. Each site belongs to an organization.\n",
+    "* FL Server: An application running on a Server site responsible for client coordination based on federation workflows\n",
+    "* FL Client: An application running on a client site that responds to the Server's task assignments and performs learning actions based on its local data\n",
+    "* User: A person who participates in the FL project\n",
+    "\n",
+    "### Roles\n",
+    "\n",
+    "A role defines a type of user with certain privileges for system operations. Each user is assigned a role in the project. There are four defined roles: Project Admin, Org Admin, Lead Researcher, and Member Researcher.\n",
+    "\n",
+    "* Project Admin Role: The Project Admin is responsible for provisioning the participants and coordinating personnel from all sites for the project. There is only one Project Admin for each project.\n",
+    "\n",
+    "* Org Admin Role: This role is responsible for managing the sites of their organization.\n",
+    "\n",
+    "* Lead Researcher Role: This role can be configured with a higher level of privileges for a scientist within an organization who collaborates with other researchers to ensure the project's success.\n",
+    "\n",
+    "* Member Researcher Role: This role can be configured with a lower level of privileges for a scientist who works with the Lead Researcher to ensure their site is properly prepared for the project.\n",
+    "\n",
+    "* FLARE Console: A console application running on a user’s machine that allows the user to perform NVFLARE system operations with a command line interface.\n",
+    "\n",
+    "\n",
+    "## Security Architecture\n",
+    "\n",
+    "NVIDIA FLARE uses PKI for identity authentication and TLS for data transmission, in addition to the following security mechanisms:\n",
+    "\n",
+    "* Filter mechanism and local organization privacy policy\n",
+    "* Federated Authorization - allows local control of authorization rules\n",
+    "* Site-specific authentication - each site can have custom local authenticators\n",
+    "* Privacy Algorithms:\n",
+    "    * Differential privacy\n",
+    "    * Homomorphic Encryption\n",
+    "    * Multi-party computation (Private Set Intersection)\n",
+    "* Confidential Computing\n",
+    "\n",
+    "<img src=\"./federated_policy.png\" alt=\"Security Architecture\" width=\"60%\"/>  \n",
+    "\n",
+    "\n",
+    "<img src=\"./filters_and_privacy_policy.png\" alt=\"Filters and Privacy Policy\" width=\"60%\"/>\n",
+    "\n",
+    "\n",
+    "\n",
+    "In this chapter, we will cover all of these security mechanisms:\n",
+    "\n",
+    "[6.1 Identity Security](../06.1_identity_security/identity_security.ipynb)\n",
+    " \n",
+    "[6.2 Site Security and Privacy Policy](../06.2_site_security_privacy_policy/site_policy.ipynb)\n",
+    "  \n",
+    "[6.3 Customized Site Security](../06.3_customized_site_security/customized_site_security.ipynb) \n",
+    "\n",
+    "[6.4 Communication Security](../06.4_communication_security/communication_security.ipynb)   \n",
+    "\n",
+    "[6.5 Message Serialization](../06.5_message_serialization/message_serialization.ipynb)\n",
+    "\n",
+    "[6.6 Trust-based Security](../06.6_trust_based_security/trust_based_security.ipynb)\n",
+    "\n",
+    "\n",
+    " \n",
+    "\n",
+    "\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1db61f71",
+   "metadata": {},
    "source": []
   }
  ],