
Use DMA framework to create DMAble buffers #722

Draft
wants to merge 62 commits into base: main

Conversation

bhargavshah1988
Contributor

@bhargavshah1988 bhargavshah1988 commented Jan 24, 2025

Introduces the initial implementation of the DMA framework, providing a centralized mechanism for DMA-capable memory allocation. It establishes the foundation for managing direct memory access (DMA) operations efficiently in a unified manner. Future enhancements will include pin/unpin support and bounce buffer management to optimize memory handling and improve data transfer performance.

@mattkur
Contributor

mattkur commented Jan 24, 2025

Please add more context in your PR description and update the title.

@bhargavshah1988 bhargavshah1988 changed the title User/bhsh/dma plumbing Use DMA framework to create DMAble buffers Jan 24, 2025
smalis-msft and others added 3 commits January 27, 2025 09:40
Our implementation of the hypercall supports it, so just leave it on.
Use the `expect_test` crate to generate inline test result values to
compare against. This makes the tests more reliable and easier to debug.
The one function provided by tempfile_helpers had the footgun that files
would be kept around forever instead of properly cleaned up. It also
forced a specific error handling pattern. Remove it, fix its usage in
pal_async to properly clean up test files, and inline it into its only
remaining uses in petri, with better error handling.
}
}

pub fn create_client(&mut self, pci_id: String) -> DmaClient {
Member

I'm not sure we want to clone DmaClients; it seems reasonable to me that there is at most one client per pci_id. I'd rather make the Arc external here, i.e., don't implement Clone, but instead return an Arc<DmaClient>.

self.clients.get(pci_id).cloned()
}

pub fn get_dma_buffer_allocator(
Member

This doesn't make sense here - why is this not a method on DmaClient itself?

save_restore_supported: bool,
saved_state: Option<NvmeSavedState>,
dma_manager: GlobalDmaManager,
Member

Passing this by value raises questions as noted in the Clone derive for GlobalDmaManager.

@@ -1863,43 +1872,67 @@ async fn new_underhill_vm(
crate::inspect_proc::periodic_telemetry_task(driver_source.simple()),
);

let dma_pool = if let Some(shared_vis_pages_pool) = shared_vis_pages_pool.clone() {
Member

We cannot allow cloning the pools, for the reasons stated above. Which means, you'll need to refactor DmaManager to either take ownership of the pool, or have some method it can use to spawn allocators.

Member

You can probably temporarily get away with just taking the pool spawner to allow you to spawn allocators, but you cannot allow multiple instances of the pool around because then usage / allocation is broken (multiple users of the same GPA range)

.device()
.host_allocator()

let dma_client = gdma.device().get_dma_client().context("Failed to get DMA client from device")?;
Member

We talked about this before, but this get_dma_client should not return an Option, i.e., the method should not be fallible. Every device will want a DMA client.

pub async fn host_allocator(&self) -> DmaAllocator<T> {
self.inner.gdma.lock().await.device().host_allocator()
/// Returns an object that can allocate dma memory to be shared with the device.
pub async fn get_dma_client(&self) -> anyhow::Result<Arc<dyn DmaClient>>
Member

Prefer not using get_<thing> naming in Rust; generally just call it by its name, so in this case dma_client, like how previously it was host_dma_allocator.

tjones60 and others added 7 commits January 27, 2025 13:06
Adds support for running Hyper-V VMM tests to Petri. Includes
functionality to run a basic boot test; more configuration options will
be added in future PRs to run more specialized tests.
…ft#731)

The nested cargo invocation limitation is unfortunate, but unavoidable
according to Wesley. At least we can tell people about it.

Also refactor all the argument handling here to be more robust.
…oft#732)

For our use of PowerShell, we should not need user profile
customizations (which are sometimes quite expensive and/or chatty). Pass
`-NoProfile` to disable loading of the user profile.
When running the nvme_fuzzer locally I came across an addition-overflow
panic. Computing the queue size for the I/O queues can cause an
addition-overflow panic if the value read from the mqes register is
u16::MAX.

The fix restructures how the I/O queue size is calculated in a way that
eliminates the possibility of addition overflow. Also added a check to
make sure that mqes_z() does not report an invalid value (< 1), in
accordance with the NVMe spec:

![image](https://github.com/user-attachments/assets/9506ff06-9f5a-4a33-8b9e-70794070bdd6)
bhargavshah1988 and others added 2 commits January 28, 2025 14:34
… dma) (microsoft#729)

Revert "Use the Shared Visibilty pool on aarch64. For reasons that we do
not yet"

This reverts commit a7c59b7.
smalis-msft and others added 9 commits January 30, 2025 21:30
)

This operating mode was originally added so that we could get automated
CI test coverage of our APIC emulation without needing to run a full
CVM. However, CVM CI tests are now just around the corner. And, as it
turns out, we accidentally weren't using this in our current tests at
all, and it's now broken. Seeing as we haven't been hit with any APIC
bugs in a very long time, and that proper coverage will be coming soon,
just remove it.

More cleanup is definitely possible here, such as removing support for
certain hypercalls that are no longer needed, or moving trait functions
off of Backing and onto HardwareIsolatedBacking. That can all come as
follow ups.
Add a safe abstraction for temporarily lending on-stack data into thread
local storage. Use it in various places across the stack.

This fixes a use-after-free in `pal_async`, and it reduces the overhead
of TLS in `pal_async` and `underhill_threadpool`.
…crosoft#748)

Attempts to mitigate an issue where on non-ephemeral ARM test runners,
if the previous job orphaned a running VM, all subsequent jobs would
fail. This is accomplished by stopping and then removing the VM both in
the destructor and before creation (if one exists).
Add infrastructure to simulate OpenHCL test failures:
1. Introduced an OpenHCL command-line argument to define test scenarios.
2. Implemented custom actions based on the specified test scenario
string.

This enables better debugging and validation of OpenHCL failure
scenarios.
Don't remove the deferred action list from TLS until the
`ProcessorRunner` is actually dropped.

This fixes crashes after a failed servicing operation.
Get rid of the weird sidecar `elements_processed` field and have the
hypercall infra code construct the hypercall output directly.
…osoft#740)

Start bringing up missing coverage for private pool and NVMe keepalive
to VMM test suite.
This is not complete end-to-end test yet, but brings necessary
infrastructure changes.

Update device tree properties to sync with Windows host changes.

---------

Co-authored-by: Daniel Prilik <71350465+daprilik@users.noreply.github.com>
Refactoring work so that the startvp hypercall is not handled by OpenHCL
for non-CVMs. Will help to keep future CVM-only work isolated to the CVM
backings.

Tested:
- SNP VMs boot
- x86 and aarch64 TVMs boot
- TDX VMs boot
@@ -0,0 +1,88 @@
// Copyright (c) Microsoft Corporation.
Member

module comments please

@@ -68,3 +71,9 @@ pub trait HostDmaAllocator: Send + Sync {
/// Attach to a previously allocated memory block with contiguous PFNs.
fn attach_dma_buffer(&self, len: usize, base_pfn: u64) -> anyhow::Result<MemoryBlock>;
}

pub trait DmaClient: Send + Sync {
Member

please document this trait

jstarks and others added 20 commits January 31, 2025 14:12
`HvError` is supposed to represent a non-success status code, but it's
also used for a possibly-successful status code in some places. Fix this
by adding a new `HvStatus` type and making `HvError` a wrapper around
`NonZeroU16`.

This also gives us a nice niche optimization, so that `HvResult<(),
HvError>` is equivalent to `HvStatus`.
I noticed "hvlite" was in a few user-facing strings where it should not
be. Fix the places where this is not a breaking change: in openvmm,
flowey, and petri.
Get the physical address width via device tree to determine the alias
map on systems that don't reliably report the physical address width
architecturally (ARM).