Use DMA framework to create DMAble buffers #722
base: main
Conversation
…crosoft#706) Adds a new boolean cxl_memory_enabled that gets passed to mu_msvm via config flags and indicates that the CXL root device ACPI0017 should be exposed in the DSDT ACPI table. Co-authored-by: Patrick Payne <patpa@microsoft.com>
Please add more context in your PR description and update the title.
Our implementation of the hypercall supports it, so just leave it on.
Use the `expect_test` crate to generate inline test result values to compare against. This makes the tests more reliable and easier to debug.
The one function provided by tempfile_helpers had the footgun that files would be kept around forever instead of properly cleaned up. It also forced a specific error handling pattern. Remove it, fix the usage of it in pal_async to properly clean up test files, and inline it into the only remaining uses in petri, with better error handling.
}
}

pub fn create_client(&mut self, pci_id: String) -> DmaClient {
I'm not sure we want to clone DmaClients; it seems reasonable to me that there is at most one client per pci_id. I'd rather we make the Arc external here, i.e., don't implement Clone, but instead return an `Arc<DmaClient>`.
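A minimal sketch of the suggested shape, assuming hypothetical simplified versions of `GlobalDmaManager` and `DmaClient` (field names and the entry-based construction are illustrative, not the PR's actual code):

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical stand-in for the real DmaClient type; note it does NOT
// implement Clone — sharing happens only through the external Arc.
struct DmaClient {
    pci_id: String,
}

struct GlobalDmaManager {
    clients: HashMap<String, Arc<DmaClient>>,
}

impl GlobalDmaManager {
    /// Creates (or returns the existing) client for a PCI ID, so there
    /// is at most one client instance per pci_id.
    fn create_client(&mut self, pci_id: String) -> Arc<DmaClient> {
        self.clients
            .entry(pci_id.clone())
            .or_insert_with(|| Arc::new(DmaClient { pci_id }))
            .clone()
    }
}
```

Callers that need shared access clone the `Arc`, not the client itself, so identity (one client per device) is preserved.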
self.clients.get(pci_id).cloned()
}

pub fn get_dma_buffer_allocator(
This doesn't make sense here - why is this not a method on DmaClient itself?
save_restore_supported: bool,
saved_state: Option<NvmeSavedState>,
dma_manager: GlobalDmaManager,
Passing this by value raises questions, as noted in the Clone derive for GlobalDmaManager.
openhcl/underhill_core/src/worker.rs
@@ -1863,43 +1872,67 @@ async fn new_underhill_vm(
crate::inspect_proc::periodic_telemetry_task(driver_source.simple()),
);

let dma_pool = if let Some(shared_vis_pages_pool) = shared_vis_pages_pool.clone() {
We cannot allow cloning the pools, for the reasons stated above. That means you'll need to refactor DmaManager to either take ownership of the pool, or have some method it can use to spawn allocators.
You can probably temporarily get away with just taking the pool spawner to allow you to spawn allocators, but you cannot allow multiple instances of the pool around, because then usage/allocation is broken (multiple users of the same GPA range).
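One way the suggested refactor could look, as a sketch: the manager owns the single pool behind an `Arc` and exposes only a spawner handle that mints allocators. All type and method names here (`SharedPool`, `PoolSpawner`, `allocator_for`) are hypothetical, not the PR's real API:

```rust
use std::sync::Arc;

// Hypothetical stand-in for the real shared-visibility pool.
struct SharedPool {
    base_pfn: u64,
    size_pages: u64,
}

/// A cheap handle that can spawn allocators without duplicating the pool.
#[derive(Clone)]
struct PoolSpawner {
    pool: Arc<SharedPool>,
}

struct PoolAllocator {
    pool: Arc<SharedPool>,
    tag: String,
}

impl PoolSpawner {
    fn spawn_allocator(&self, tag: &str) -> PoolAllocator {
        PoolAllocator { pool: self.pool.clone(), tag: tag.to_string() }
    }
}

/// The manager takes ownership of the one pool; every allocator shares
/// it through the Arc, so no second copy of the GPA range can exist.
struct GlobalDmaManager {
    spawner: PoolSpawner,
}

impl GlobalDmaManager {
    fn new(pool: SharedPool) -> Self {
        Self { spawner: PoolSpawner { pool: Arc::new(pool) } }
    }

    fn allocator_for(&self, device: &str) -> PoolAllocator {
        self.spawner.spawn_allocator(device)
    }
}
```

The key property is that allocators and spawners are clonable handles, while the pool itself is constructed exactly once.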
.device()
.host_allocator()

let dma_client = gdma.device().get_dma_client().context("Failed to get DMA client from device")?;
We talked about this before, but this get_dma_client should not return an Option, i.e., the method should not be fallible. Every device will want a DMA client.
pub async fn host_allocator(&self) -> DmaAllocator<T> {
self.inner.gdma.lock().await.device().host_allocator()
/// Returns an object that can allocate dma memory to be shared with the device.
pub async fn get_dma_client(&self) -> anyhow::Result<Arc<dyn DmaClient>>
Prefer not using get_<thing> naming in Rust; generally just call it by its name, so in this case dma_client, like how previously it was host_dma_allocator.
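Putting the two review points together (accessor naming without the `get_` prefix, and an infallible signature), the method might look like this sketch; the `Device` struct and its field are illustrative assumptions, not the PR's code:

```rust
use std::sync::Arc;

// Hypothetical stand-ins for the real types under review.
trait DmaClient: Send + Sync {}
struct GdmaDmaClient;
impl DmaClient for GdmaDmaClient {}

struct Device {
    dma_client: Arc<dyn DmaClient>,
}

impl Device {
    /// Returns the device's DMA client.
    ///
    /// Note the Rust-style accessor name (`dma_client`, not
    /// `get_dma_client`) and the infallible return type: every device
    /// has a DMA client, so callers have nothing to unwrap.
    fn dma_client(&self) -> Arc<dyn DmaClient> {
        self.dma_client.clone()
    }
}
```

This follows the common Rust API guideline that getters are named after the thing they return, reserving `get` for lookups that can fail or take a key.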
Adds support for running Hyper-V VMM tests to Petri. Includes functionality to run a basic boot test; more configuration options will be added in future PRs to run more specialized tests.
…ft#731) The nested cargo invocation limitation is unfortunate, but unavoidable according to Wesley. At least we can tell people about it. Also refactor all the argument handling here to be more robust.
…oft#732) For our use of PowerShell, we should not need user profile customizations (which are sometimes quite expensive and/or chatty). Pass `-NoProfile` to disable loading of the user profile.
When running the nvme_fuzzer locally I came across an addition-overflow panic. Computing the queue size for the I/O queues can overflow if the value read from the mqes register is MAX_U16. The fix restructures how the I/O queue size is calculated in a way that eliminates the possibility of addition overflow. Also added a check to make sure that mqes_z() does not report an invalid value (< 1), in accordance with the NVMe spec.
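The overflow comes from MQES being a 0's-based field: the device supports `mqes + 1` entries, and adding 1 in `u16` wraps when `mqes == u16::MAX`. A sketch of an overflow-free computation (function and parameter names are illustrative, not the driver's actual code):

```rust
/// Computes the I/O queue size from the 0's-based MQES field without
/// risking overflow: widen to u32 before adding 1, then clamp to the
/// driver's own maximum queue depth.
fn io_queue_size(mqes: u16, driver_max: u32) -> u32 {
    // MQES is 0's-based per the NVMe spec, so the device supports
    // mqes + 1 entries. Doing the add in u32 cannot overflow even
    // when mqes == u16::MAX.
    let device_max = u32::from(mqes) + 1;
    device_max.min(driver_max)
}
```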
This fixes vmbus relaying with SNP.
… dma) (microsoft#729) Revert "Use the Shared Visibilty pool on aarch64. For reasons that we do not yet" This reverts commit a7c59b7.
) This operating mode was originally added so that we could get automated CI test coverage of our APIC emulation without needing to run a full CVM. However, CVM CI tests are now just around the corner. And, as it turns out, we accidentally weren't using this in our current tests at all, and it's now broken. Seeing as we haven't been hit with any APIC bugs in a very long time, and that proper coverage will be coming soon, just remove it. More cleanup is definitely possible here, such as removing support for certain hypercalls that are no longer needed, or moving trait functions off of Backing and onto HardwareIsolatedBacking. That can all come as follow ups.
Add a safe abstraction for temporarily lending on-stack data into thread local storage. Use it in various places across the stack. This fixes a use-after-free in `pal_async`, and it reduces the overhead of TLS in `pal_async` and `underhill_threadpool`.
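The idea of lending on-stack data into TLS can be sketched as follows. This is a simplified, single-type illustration of the pattern (set the slot, run a closure, restore on drop so the borrow never outlives the frame); the real abstraction in the repo is more general and differently named:

```rust
use std::cell::Cell;
use std::ptr;

thread_local! {
    // Slot holding a borrowed pointer to on-stack data; null when
    // nothing is lent.
    static LENT: Cell<*const u32> = Cell::new(ptr::null());
}

/// Lends `value` into TLS for the duration of `f`, restoring the
/// previous slot contents afterwards, even if `f` panics.
fn with_lent<R>(value: &u32, f: impl FnOnce() -> R) -> R {
    struct Restore(*const u32);
    impl Drop for Restore {
        fn drop(&mut self) {
            LENT.with(|slot| slot.set(self.0));
        }
    }
    let prev = LENT.with(|slot| slot.replace(value as *const u32));
    let _restore = Restore(prev);
    f()
}

/// Reads a copy of the currently lent value, if any.
fn current_lent() -> Option<u32> {
    LENT.with(|slot| {
        let p = slot.get();
        // SAFETY: with_lent only stores the pointer while the closure
        // it runs is on the stack, so a non-null pointer is valid here.
        (!p.is_null()).then(|| unsafe { *p })
    })
}
```

Because the pointer is cleared (restored) by a drop guard before `with_lent` returns, no dangling TLS reference can survive the stack frame, which is the use-after-free class of bug the commit describes fixing.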
…crosoft#748) Attempts to mitigate an issue where on non-ephemeral ARM test runners, if the previous job orphaned a running VM, all subsequent jobs would fail. This is accomplished by stopping and then removing the VM both in the destructor and before creation (if one exists).
Add infrastructure to simulate OpenHCL test failures 1. Introduced a OpenHCL command-line argument to define test scenarios. 2. Implemented custom actions based on the specified test scenario string. This enables better debugging and validation of OpenHCL failure scenarios.
Don't remove the deferred action list from TLS until the `ProcessorRunner` is actually dropped. This fixes crashes after a failed servicing operation.
Get rid of the weird sidecar `elements_processed` field and have the hypercall infra code construct the hypercall output directly.
…osoft#740) Start bringing up missing coverage for private pool and NVMe keepalive to VMM test suite. This is not complete end-to-end test yet, but brings necessary infrastructure changes. Update device tree properties to sync with Windows host changes. --------- Co-authored-by: Daniel Prilik <71350465+daprilik@users.noreply.github.com>
Refactoring work so that the startvp hypercall is not handled by OpenHCL for non-CVMs. Will help to keep future CVM-only work isolated to the CVM backings. Tested: - SNP VMs boot - x86 and aarch64 TVMs boot - TDX VMs boot
@@ -0,0 +1,88 @@
// Copyright (c) Microsoft Corporation.
module comments please
@@ -68,3 +71,9 @@ pub trait HostDmaAllocator: Send + Sync {
/// Attach to a previously allocated memory block with contiguous PFNs.
fn attach_dma_buffer(&self, len: usize, base_pfn: u64) -> anyhow::Result<MemoryBlock>;
}

pub trait DmaClient: Send + Sync {
please document this trait
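A sketch of the documentation style being requested, on a standalone version of the trait. The method set and the simplified placeholder types (`MemoryBlock` as a plain struct, `String` errors instead of `anyhow::Result`) are illustrative, not the PR's actual trait body:

```rust
// Placeholder so this sketch stands alone; the real code uses the
// crate's MemoryBlock and anyhow::Result.
pub struct MemoryBlock {
    pub base_pfn: u64,
    pub len: usize,
}

/// Client interface for allocating DMA-capable memory on behalf of a
/// single device.
///
/// Implementations hand out buffers that the device can access
/// directly, e.g. from a shared-visibility or private pool.
pub trait DmaClient: Send + Sync {
    /// Allocates a new DMA buffer of `len` bytes.
    fn allocate_dma_buffer(&self, len: usize) -> Result<MemoryBlock, String>;

    /// Attaches to a previously allocated buffer with contiguous PFNs,
    /// e.g. when restoring state across a servicing operation.
    fn attach_dma_buffer(&self, len: usize, base_pfn: u64) -> Result<MemoryBlock, String>;
}
```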
`HvError` is supposed to represent a non-success status code, but it's also used for a possibly-successful status code in some places. Fix this by adding a new `HvStatus` type and making `HvError` a wrapper around `NonZeroU16`. This also gives us a nice niche optimization, so that `Result<(), HvError>` is the same size as `HvStatus`.
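The niche optimization can be demonstrated in miniature. This is a simplified sketch of the wrapper described above, not the repo's actual `HvError` definition:

```rust
use std::mem::size_of;
use std::num::NonZeroU16;

// Simplified sketch: wrapping NonZeroU16 guarantees the value is a
// non-success (non-zero) status code.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct HvError(NonZeroU16);

// Because HvError has no all-zero bit pattern, Result<(), HvError> gets
// a niche: Ok(()) is represented by the zero pattern, so the whole
// Result fits in the same two bytes as a raw u16 status word.
fn result_is_status_sized() -> bool {
    size_of::<Result<(), HvError>>() == size_of::<u16>()
}
```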
I noticed "hvlite" was in a few user-facing strings where it should not be. Fix the places where this is not a breaking change: in openvmm, flowey, and petri.
Get the physical address width via device tree to determine the alias map on systems that don't reliably report the physical address width architecturally (ARM).
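To illustrate why the width must be accurate: if the alias map is keyed off the top bit of the physical address space (an assumption made here purely for illustration; the actual layout is defined by the host, not by this sketch), the device-tree-reported width would be used like:

```rust
/// Computes a mask for the top physical address bit from the address
/// width reported via device tree. ASSUMPTION: this models the alias
/// map as living at the top bit of the physical address space, to show
/// why the width matters; it is not the actual OpenHCL layout.
fn alias_map_bit(physical_address_bits: u32) -> u64 {
    assert!((1..=64).contains(&physical_address_bits));
    1u64 << (physical_address_bits - 1)
}
```

An off-by-one in the reported width would shift this bit and alias the wrong half of the address space, which is why relying on unreliable architectural reporting on ARM is a problem.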
…h1988/openvmm into user/bhsh/dmaPlumbing
Introduces the initial implementation of the DMA framework, providing a centralized mechanism for DMA-capable memory allocation. It establishes the foundation for managing direct memory access (DMA) operations efficiently in a unified manner. Future enhancements will include pin/unpin support and bounce buffer management to optimize memory handling and improve data transfer performance.