|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "id": "021cfd04-df68-414d-a864-48f62fc8ddfb", |
| 6 | + "metadata": {}, |
| 7 | + "source": [ |
| 8 | + "# Distributed state vector simulations on multiple GPUs (advanced)\n", |
| 9 | + "\n", |
| 10 | + "In the notebook \"2_parallel_simulations.ipynb\", you learned how to use CUDA-Q and Braket Hybrid Jobs to parallelize the simulation of a batch of observables and circuits over multiple GPUs, where each GPU simulates a single QPU. For workloads with larger qubit counts, however, it may be necessary to distribute a single state vector simulation across multiple GPUs, so that multiple GPUs together simulate a single QPU.\n", |
| 11 | + "\n", |
| 12 | + "In this notebook, you will learn how to use CUDA-Q and Braket Hybrid Jobs to tackle this." |
| 13 | + ] |
| 14 | + }, |
| 15 | + { |
| 16 | + "cell_type": "markdown", |
| 17 | + "id": "32b46659-6dcc-4900-a13a-e971f8bf0590", |
| 18 | + "metadata": {}, |
| 19 | + "source": [ |
| 20 | + "We start with necessary imports that are used in the examples below." |
| 21 | + ] |
| 22 | + }, |
| 23 | + { |
| 24 | + "cell_type": "code", |
| 25 | + "execution_count": null, |
| 26 | + "id": "8738f65f-969c-4b58-96f8-69bbc1bad5e1", |
| 27 | + "metadata": {}, |
| 28 | + "outputs": [], |
| 29 | + "source": [ |
| 30 | + "from braket.jobs import hybrid_job\n", |
| 31 | + "from braket.jobs.config import InstanceConfig" |
| 32 | + ] |
| 33 | + }, |
| 34 | + { |
| 35 | + "cell_type": "markdown", |
| 36 | + "id": "1331136a-a369-4eef-8bfe-252c79103a3e", |
| 37 | + "metadata": {}, |
| 38 | + "source": [ |
| 39 | + "Next, we need to create and upload a container which contains both CUDA-Q and the underlying CUDA support required for distributing our computation across multiple GPUs. Note: this container image will be different than the one used in the previous notebooks illustrating more basic CUDA-Q scenarios.\n", |
| 40 | + "\n", |
| 41 | + "To do this, we need to run the commands in the cell below. (For more information about what these commands are doing, please see the detailed documentation in \"0_hello_cudaq_jobs.ipynb\". The difference here is that we specify the dockerfile `Dockerfile.mgpu` in order to ensure full support for this advanced scenario.)" |
| 42 | + ] |
| 43 | + }, |
| 44 | + { |
| 45 | + "cell_type": "code", |
| 46 | + "execution_count": null, |
| 47 | + "id": "f552c738", |
| 48 | + "metadata": {}, |
| 49 | + "outputs": [], |
| 50 | + "source": [ |
| 51 | + "!chmod +x container/container_build_and_push.sh\n", |
| 52 | + "!container/container_build_and_push.sh cudaq-mgpu-job us-west-2 Dockerfile.mgpu" |
| 53 | + ] |
| 54 | + }, |
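| 55 | + { |
| 56 | + "cell_type": "markdown", |
| 57 | + "id": "3f9d2b1c-7a54-4e08-9c6d-2b8e5f0a1d42", |
| 58 | + "metadata": {}, |
| 59 | + "source": [ |
| 60 | + "Once the script finishes, you can optionally confirm that the image was pushed to Amazon ECR. The check below is a sketch: it assumes the script created a repository named `cudaq-mgpu-job` in `us-west-2`, matching the arguments passed above." |
| 61 | + ] |
| 62 | + }, |
| 63 | + { |
| 64 | + "cell_type": "code", |
| 65 | + "execution_count": null, |
| 66 | + "id": "a81c4e2f-0d3b-4c57-8e9a-6f1b2c3d4e5f", |
| 67 | + "metadata": {}, |
| 68 | + "outputs": [], |
| 69 | + "source": [ |
| 70 | + "# List the tags of the images in the repository; expect to see \"latest\"\n", |
| 71 | + "!aws ecr describe-images --repository-name cudaq-mgpu-job --region us-west-2 --query \"imageDetails[].imageTags\"" |
| 72 | + ] |
| 73 | + }, |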
| 55 | + { |
| 56 | + "cell_type": "markdown", |
| 57 | + "id": "fc46e446", |
| 58 | + "metadata": {}, |
| 59 | + "source": [ |
| 60 | + "Now we prepare the URI of the container image. Fill the proper value of `aws_account_id`, `region_name` and `container_image_name` in the cell below. For example, with the shell command above, `region_name=\"us-west-2\"` and `container_image_name=\"cudaq-mgpu-job\"`. The cell below prints out the image URI. When you use a container image to run a job, it ensures that your code is run in the same environment every time. " |
| 61 | + ] |
| 62 | + }, |
| 63 | + { |
| 64 | + "cell_type": "code", |
| 65 | + "execution_count": null, |
| 66 | + "id": "25fdc720-3143-411a-8bef-f9623369b516", |
| 67 | + "metadata": {}, |
| 68 | + "outputs": [], |
| 69 | + "source": [ |
| 70 | + "aws_account_id = \"<aws-account-id>\"\n", |
| 71 | + "region_name = \"<region-name>\"\n", |
| 72 | + "container_image_name = \"<container-image-name>\"\n", |
| 73 | + "\n", |
| 74 | + "image_uri = f\"{aws_account_id}.dkr.ecr.{region_name}.amazonaws.com/{container_image_name}:latest\"\n", |
| 75 | + "print(image_uri)" |
| 76 | + ] |
| 77 | + }, |
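| 78 | + { |
| 79 | + "cell_type": "markdown", |
| 80 | + "id": "5e7a1c9d-2f4b-4a36-b8d0-9c1e2f3a4b5c", |
| 81 | + "metadata": {}, |
| 82 | + "source": [ |
| 83 | + "If you prefer not to hard-code these values, you can usually look them up at runtime. The sketch below assumes your AWS credentials and a default region are already configured for `boto3`, and that the container image name matches the one passed to the build script above." |
| 84 | + ] |
| 85 | + }, |
| 86 | + { |
| 87 | + "cell_type": "code", |
| 88 | + "execution_count": null, |
| 89 | + "id": "9b0c1d2e-3f4a-4b5c-8d6e-7f8a9b0c1d2e", |
| 90 | + "metadata": {}, |
| 91 | + "outputs": [], |
| 92 | + "source": [ |
| 93 | + "import boto3\n", |
| 94 | + "\n", |
| 95 | + "# Look up the account ID from STS and the region from the default boto3 session\n", |
| 96 | + "aws_account_id = boto3.client(\"sts\").get_caller_identity()[\"Account\"]\n", |
| 97 | + "region_name = boto3.session.Session().region_name\n", |
| 98 | + "container_image_name = \"cudaq-mgpu-job\"  # the name passed to the build script above\n", |
| 99 | + "\n", |
| 100 | + "image_uri = f\"{aws_account_id}.dkr.ecr.{region_name}.amazonaws.com/{container_image_name}:latest\"\n", |
| 101 | + "print(image_uri)" |
| 102 | + ] |
| 103 | + }, |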
| 78 | + { |
| 79 | + "cell_type": "markdown", |
| 80 | + "id": "a844344d-0978-4b11-8fe4-66b387e80c72", |
| 81 | + "metadata": {}, |
| 82 | + "source": [ |
| 83 | + "## Distributed state vector simulations\n", |
| 84 | + "Now that we have the container image URI, we are ready to run our workload. The `nvidia` target with `mgpu` option supports distributing state vector simulations to multiple GPUs. This enables GPU simulations for circuits with higher qubit count, to up to 34 qubits. The example below shows how to submit a job with the `mgpu` option." |
| 85 | + ] |
| 86 | + }, |
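| 87 | + { |
| 88 | + "cell_type": "markdown", |
| 89 | + "id": "6c2d8e4f-1a3b-4c5d-9e7f-0a1b2c3d4e5f", |
| 90 | + "metadata": {}, |
| 91 | + "source": [ |
| 92 | + "A state vector of n qubits holds 2^n complex amplitudes, so memory grows exponentially with the qubit count. The back-of-the-envelope estimate below assumes single-precision complex amplitudes (8 bytes each); double precision doubles these figures." |
| 93 | + ] |
| 94 | + }, |
| 95 | + { |
| 96 | + "cell_type": "code", |
| 97 | + "execution_count": null, |
| 98 | + "id": "d4e5f6a7-b8c9-4d0e-8f1a-2b3c4d5e6f7a", |
| 99 | + "metadata": {}, |
| 100 | + "outputs": [], |
| 101 | + "source": [ |
| 102 | + "# Memory footprint of a full state vector: 2**n amplitudes, 8 bytes per complex64 amplitude\n", |
| 103 | + "GIB = 2**30\n", |
| 104 | + "for n in (25, 30, 34):\n", |
| 105 | + " print(f\"{n} qubits: {(2**n) * 8 / GIB:.2f} GiB\")\n", |
| 106 | + "\n", |
| 107 | + "# At 34 qubits the state vector needs ~128 GiB, more than a single GPU's memory,\n", |
| 108 | + "# which is why it must be sharded across multiple GPUs." |
| 109 | + ] |
| 110 | + }, |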
| 87 | + { |
| 88 | + "cell_type": "code", |
| 89 | + "execution_count": null, |
| 90 | + "id": "d38b73de-6a7e-45b7-b64b-afec60c0a6c3", |
| 91 | + "metadata": {}, |
| 92 | + "outputs": [], |
| 93 | + "source": [ |
| 94 | + "@hybrid_job(\n", |
| 95 | + " device=\"local:nvidia/nvidia-mgpu\",\n", |
| 96 | + " instance_config=InstanceConfig(instanceType=\"ml.p3.8xlarge\", instanceCount=1),\n", |
| 97 | + " image_uri=image_uri,\n", |
| 98 | + ")\n", |
| 99 | + "def distributed_gpu_job(\n", |
| 100 | + " n_qubits,\n", |
| 101 | + " n_shots,\n", |
| 102 | + " sagemaker_mpi_enabled=True,\n", |
| 103 | + "):\n", |
| 104 | + " import cudaq\n", |
| 105 | + "\n", |
| 106 | + " # Define target\n", |
| 107 | + " cudaq.set_target(\"nvidia\", option=\"mgpu\")\n", |
| 108 | + " print(\"CUDA-Q backend: \", cudaq.get_target())\n", |
| 109 | + " print(\"num_available_gpus: \", cudaq.num_available_gpus())\n", |
| 110 | + "\n", |
| 111 | + " # Initialize MPI and view the MPI properties\n", |
| 112 | + " cudaq.mpi.initialize()\n", |
| 113 | + " rank = cudaq.mpi.rank()\n", |
| 114 | + "\n", |
| 115 | + " # Define circuit and observables\n", |
| 116 | + " @cudaq.kernel\n", |
| 117 | + " def ghz():\n", |
| 118 | + " qubits = cudaq.qvector(n_qubits)\n", |
| 119 | + " h(qubits[0])\n", |
| 120 | + " for q in range(1, n_qubits):\n", |
| 121 | + " cx(qubits[0], qubits[q])\n", |
| 122 | + "\n", |
| 123 | + " hamiltonian = cudaq.SpinOperator.random(n_qubits, 1)\n", |
| 124 | + "\n", |
| 125 | + " # Parallelize circuit simulation\n", |
| 126 | + " result = cudaq.observe(ghz, hamiltonian, shots_count=n_shots)\n", |
| 127 | + "\n", |
| 128 | + " # End the MPI interface\n", |
| 129 | + " cudaq.mpi.finalize()\n", |
| 130 | + "\n", |
| 131 | + " if rank == 0:\n", |
| 132 | + " return {\"expectation\": result.expectation()}\n", |
| 133 | + "\n", |
| 134 | + "\n", |
| 135 | + "n_qubits = 25\n", |
| 136 | + "n_shots = 1000\n", |
| 137 | + "distributed_job = distributed_gpu_job(n_qubits, n_shots)\n", |
| 138 | + "print(\"Job ARN: \", distributed_job.arn)" |
| 139 | + ] |
| 140 | + }, |
| 141 | + { |
| 142 | + "cell_type": "code", |
| 143 | + "execution_count": null, |
| 144 | + "id": "a054c24e", |
| 145 | + "metadata": {}, |
| 146 | + "outputs": [], |
| 147 | + "source": [ |
| 148 | + "distributed_job_result = distributed_job.result()\n", |
| 149 | + "print(f\"result: {distributed_job_result['expectation']}\")" |
| 150 | + ] |
| 151 | + }, |
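| 152 | + { |
| 153 | + "cell_type": "markdown", |
| 154 | + "id": "e7f8a9b0-c1d2-4e3f-8a4b-5c6d7e8f9a0b", |
| 155 | + "metadata": {}, |
| 156 | + "source": [ |
| 157 | + "Note that `result()` blocks until the job finishes. While the job is running, you can check on it from another cell; `state()` and `logs()` are available on the job object returned by the decorated function." |
| 158 | + ] |
| 159 | + }, |
| 160 | + { |
| 161 | + "cell_type": "code", |
| 162 | + "execution_count": null, |
| 163 | + "id": "f0a1b2c3-d4e5-4f6a-8b7c-9d0e1f2a3b4c", |
| 164 | + "metadata": {}, |
| 165 | + "outputs": [], |
| 166 | + "source": [ |
| 167 | + "# Check the job status without blocking\n", |
| 168 | + "print(distributed_job.state())\n", |
| 169 | + "\n", |
| 170 | + "# Uncomment to stream the container logs while the job runs\n", |
| 171 | + "# distributed_job.logs()" |
| 172 | + ] |
| 173 | + }, |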
| 152 | + { |
| 153 | + "cell_type": "markdown", |
| 154 | + "id": "18004fe8-24a0-4316-ab0a-a4e0aac6ba1e", |
| 155 | + "metadata": {}, |
| 156 | + "source": [ |
| 157 | + "## Summary\n", |
| 158 | + "This notebook shows you how to distribute a single state vector simulation across multiple GPUs, so that multiple GPUs together simulate a single QPU. If you have workloads with a qubit count that is too large to simulate on a single GPU, you can use this technique to make these large workloads feasible." |
| 159 | + ] |
| 160 | + } |
| 161 | + ], |
| 162 | + "metadata": { |
| 163 | + "kernelspec": { |
| 164 | + "display_name": "Python 3 (ipykernel)", |
| 165 | + "language": "python", |
| 166 | + "name": "python3" |
| 167 | + }, |
| 168 | + "language_info": { |
| 169 | + "codemirror_mode": { |
| 170 | + "name": "ipython", |
| 171 | + "version": 3 |
| 172 | + }, |
| 173 | + "file_extension": ".py", |
| 174 | + "mimetype": "text/x-python", |
| 175 | + "name": "python", |
| 176 | + "nbconvert_exporter": "python", |
| 177 | + "pygments_lexer": "ipython3", |
| 178 | + "version": "3.10.13" |
| 179 | + } |
| 180 | + }, |
| 181 | + "nbformat": 4, |
| 182 | + "nbformat_minor": 5 |
| 183 | +} |