Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RwLock::read for UFT hits, kstats use AtomicU64 #636

Merged
merged 20 commits into from
Mar 14, 2025
Merged

RwLock::read for UFT hits, kstats use AtomicU64 #636

merged 20 commits into from
Mar 14, 2025

Conversation

FelixMcFelix
Copy link
Collaborator

@FelixMcFelix FelixMcFelix commented Dec 16, 2024

This PR replaces the per-Port KMutex with a KRwLock. This allows
ioctls to read port state without preventing packets from being served.
This also prevents successful UFT hits from blocking one another on the
port for any length of time.

A requirement is that we move to atomic accesses on kstat values, which
allows us to update kstats from outside a write-lock on the port.

The odd thing is that there doesn't seem to be a performance
improvement (in Mbps).
Contention according to lockstat is solved, and
apparently the time spent in xde_rx/xde_mc_tx is reduced -- we're no
longer:

  • S2C -- the 3rd and 10th most spin-happy locks,
  • C2S -- the 2rd and 3rd most spin-happy locks.

After some refactoring, this has moved from 2.8->2.9 Gbps (pre-LSO) on glasgow.

Nor are we egregiously blocking on the port lock.

Closes #435.

This PR replaces the per-`Port` `KMutex` with a `KRwLock`. This allows
ioctls to read port state without preventing packets from being served.
This also prevents successful UFT hits from blocking one another on the
port for any length of time.

A requirement is that we move to atomic accesses on kstat values, which
allows us to update kstats from outside a write-lock on the port.

The odd thing is that there doesn't seem to be a performance
improvement (in Mbps). Contention according to lockstat is solved, and
apparently the time spent in `xde_rx`/`xde_mc_tx` is reduced -- we're no
longer:

 * S2C -- the 3rd and 10th most spin-happy locks,
 * C2S -- the 2rd and 3rd most spin-happy locks.

Nor are we egregiously blocking on the port lock.
@FelixMcFelix FelixMcFelix self-assigned this Dec 18, 2024
@FelixMcFelix FelixMcFelix added this to the 13 milestone Dec 18, 2024
@morlandi7 morlandi7 modified the milestones: 13, 14 Feb 11, 2025
At the very least, this compiles.

We needed to regenerate `ip.rs`, on account of the many `extern` ->
`unsafe extern` block changes.
FelixMcFelix added a commit that referenced this pull request Feb 21, 2025
Pulled out of #636, we were creating some `Spin` locks which would end
up falling back to adaptive behaviour as part of LFT creation. Moreover,
they would hit a debug assert and panic on test bits. (!!!)

Given that we have no means of specifying priority in `mutex_init`, we
make sure that all created `KMutex`es are corret by removing the
illusion of free choice.
FelixMcFelix added a commit that referenced this pull request Feb 21, 2025
Pulled out of #636, we were creating some `Spin` locks which would end
up falling back to adaptive behaviour as part of LFT creation. Moreover,
they would hit a debug assert and panic on test bits. (!!!)

Given that we have no means of specifying priority in `mutex_init`, we
make sure that all created `KMutex`es are corret by removing the
illusion of free choice.
FelixMcFelix added a commit that referenced this pull request Feb 21, 2025
Pulled out of #636. We were creating some `Spin` locks which,
misconfigured, would end up falling back to adaptive behaviour as part
of LFT creation. Moreover, they would hit a debug assert and panic on
test bits. (!!!)

Given that we have no means of specifying priority in `mutex_init`, we
now make sure that all created `KMutex`es are correct by removing the
illusion of free choice.
CI on recent PRs is breaking, due to rustup 1.28.0+ no longer
autoinstalling the correct rust toolchain version. This hurts us
immediately since we have *two* toolchains (pinned nightly and stable),
and deliberately specified the nightly for some tooling.

This PR changes this over to use buildomat's auto-installation for the
stable variant, and the new toolchain show -> install pattern for
nightly. This also lets us place `$NIGHTLY` into most of our `cargo fmt`
invocations, which should reduce the busywork in future compiler bumps
for XDE.
@FelixMcFelix FelixMcFelix changed the base branch from master to rust-2024 March 11, 2025 09:12
Makes some aspects clearer -- no more fixing up wrongly incremented
stats, and makes it more obvious that a `reprocess` is a downgrade to
the slowpath.

Catches one locking invariant that was being upset by an accidental
epoch re-check.
@FelixMcFelix FelixMcFelix marked this pull request as ready for review March 11, 2025 19:35
@FelixMcFelix FelixMcFelix requested a review from luqmana March 11, 2025 19:38
Base automatically changed from rust-2024 to master March 11, 2025 19:49
Forgot an `UnsafeCell`, everything's good now!
Copy link
Contributor

@luqmana luqmana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Includes some changes discussed out-of-band on Just Including The Atomic
Types in the union definition, getting us away from the explicit
UnsafeCell.

A couple of nice things fell out of this -- we have setup a pattern for
if/when we need the other KStat variants, but we can now deduplicate the
code between test/kernel since we always have an `AtomicU64` now.
@FelixMcFelix FelixMcFelix merged commit eafcc70 into master Mar 14, 2025
10 checks passed
@FelixMcFelix FelixMcFelix deleted the rwlock branch March 14, 2025 21:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Want less/finer-grained locking in the datapath
3 participants