Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(deps): upgrade to tower 0.5 #3744

Merged
merged 3 commits into from
Mar 21, 2025
Merged

Conversation

cratelyn
Copy link
Collaborator

@cratelyn cratelyn commented Mar 11, 2025

chore(deps)!: upgrade to tower 0.5

this commit updates our tower dependency from 0.4 to 0.5.

note that this commit does not affect the tower-service and
tower-layer crates, reëxported by tower itself. the Service<T>
trait and the closely related Layer<S> trait have not been changed.

the tower crate's utilities have changed in various ways, some of
particular note for the linkerd2 proxy. see these items, excerpted from
the tower changelog:

see:

the Either trait bounds are particularly impactful for us. because
this runs counter to how we treat errors (skewing towards boxed errors,
in general), we temporarily vendor a version of Either from the 0.4
release, whose variants have been renamed to match the 0.5 interface.

updating to box the inner A and B services' errors, so we satiate
the new A::Error = B::Error bounds, can be addressed as a follow-on.
that's intentionally left as a separate change, due to the net size of
our patchset between this branch and #3504.

this work is based upon #3504. for more information, see:

fix(stack/loadshed): update test affected by tower-rs/tower#635

this commit updates a test that was affected by breaking changes in
tower's Buffer middleware. see this excerpt from the description of
that change:

I had to change some of the integration tests slightly as part of this
change. This is because the buffer implementation using semaphore
permits is very subtly different from one using a bounded channel. In
the Semaphore-based implementation, a semaphore permit is stored in
the Message struct sent over the channel. This is so that the capacity
is used as long as the message is in flight. However, when the worker
task is processing a message that's been recieved from the channel,
the permit is still not dropped. Essentially, the one message actively
held by the worker task also occupies one "slot" of capacity, so the
actual channel capacity is one less than the value passed to the
constructor, once the first request has been sent to the worker. The
bounded MPSC changed this behavior so that capacity is only occupied
while a request is actually in the channel, which broke some tests
that relied on the old (and technically wrong) behavior.

bear particular attention to this:

The bounded MPSC changed this behavior so that capacity is only
occupied while a request is actually in the channel, which broke some
tests that relied on the old (and technically wrong) behavior.

that pr adds an additional message to the channel in tests exercising
the laod-shedding behavior, on account of the previous (incorrect)
behavior.

https://github.com/tower-rs/tower/pull/635/files#r797108274

this commit performs the same change for our corresponding test, adding
an additional ready() call before we hit the buffer's limit.

@olix0r
Copy link
Member

olix0r commented Mar 12, 2025

https://crates.io/crates/drain/0.2.0 is available

@cratelyn cratelyn force-pushed the kate/hyper-1.x-upgrade branch 3 times, most recently from 24f6e18 to 13cc7a0 Compare March 12, 2025 19:14
@cratelyn cratelyn force-pushed the kate/hyper-1.x-follow-up.tower-0.5 branch 2 times, most recently from 2553f4a to 959e0e6 Compare March 13, 2025 02:43
cratelyn added a commit that referenced this pull request Mar 13, 2025
this commit performs a small refactor to one of the unit tests in
`linkerd-stack`'s load-shedding middleware.

this adds a span to the worker tasks spawned in this test, so that
tracing logs can be associated with particular oneshot services.

see #3744 for more information on upgrading our tower dependency. this
is cherry-picked from investigations on that branch related to breaking
changes in 0.5 related to the `Buffer` middleware.

after this change, logs now look like this:

```
; RUST_LOG="trace" cargo test -p linkerd-stack buffer_load_shed -- --nocapture

running 1 test
[     0.002770s] TRACE worker{id=oneshot1}: tower::buffer::service: sending request to buffer worker
[     0.002809s] TRACE worker{id=oneshot2}: tower::buffer::service: sending request to buffer worker
[     0.002823s] TRACE worker{id=oneshot3}: tower::buffer::service: sending request to buffer worker
[     0.002843s] DEBUG worker{id=oneshot4}: linkerd_stack::loadshed: Service has become unavailable
[     0.002851s] DEBUG worker{id=oneshot4}: linkerd_stack::loadshed: Service shedding load
[     0.002878s] TRACE tower::buffer::worker: worker polling for next message
[     0.002885s] TRACE tower::buffer::worker: processing new request
[     0.002892s] TRACE worker{id=oneshot1}: tower::buffer::worker: resumed=false worker received request; waiting for service readiness
[     0.002901s] DEBUG worker{id=oneshot1}: tower::buffer::worker: service.ready=true processing request
[     0.002914s] TRACE worker{id=oneshot1}: tower::buffer::worker: returning response future
[     0.002926s] TRACE tower::buffer::worker: worker polling for next message
[     0.002931s] TRACE tower::buffer::worker: processing new request
[     0.002935s] TRACE worker{id=oneshot2}: tower::buffer::worker: resumed=false worker received request; waiting for service readiness
[     0.002946s] TRACE worker{id=oneshot2}: tower::buffer::worker: service.ready=false delay
[     0.002983s] TRACE worker{id=oneshot5}: tower::buffer::service: sending request to buffer worker
[     0.003001s] DEBUG worker{id=oneshot6}: linkerd_stack::loadshed: Service has become unavailable
[     0.003007s] DEBUG worker{id=oneshot6}: linkerd_stack::loadshed: Service shedding load
[     0.003017s] DEBUG worker{id=oneshot7}: linkerd_stack::loadshed: Service has become unavailable
[     0.003024s] DEBUG worker{id=oneshot7}: linkerd_stack::loadshed: Service shedding load
[     0.003035s] TRACE tower::buffer::worker: worker polling for next message
[     0.003041s] TRACE tower::buffer::worker: resuming buffered request
[     0.003045s] TRACE worker{id=oneshot2}: tower::buffer::worker: resumed=true worker received request; waiting for service readiness
[     0.003052s] DEBUG worker{id=oneshot2}: tower::buffer::worker: service.ready=true processing request
[     0.003060s] TRACE worker{id=oneshot2}: tower::buffer::worker: returning response future
[     0.003068s] TRACE tower::buffer::worker: worker polling for next message
[     0.003073s] TRACE tower::buffer::worker: processing new request
[     0.003077s] TRACE worker{id=oneshot3}: tower::buffer::worker: resumed=false worker received request; waiting for service readiness
[     0.003084s] DEBUG worker{id=oneshot3}: tower::buffer::worker: service.ready=true processing request
[     0.003091s] TRACE worker{id=oneshot3}: tower::buffer::worker: returning response future
[     0.003099s] TRACE tower::buffer::worker: worker polling for next message
[     0.003103s] TRACE tower::buffer::worker: processing new request
[     0.003107s] TRACE worker{id=oneshot5}: tower::buffer::worker: resumed=false worker received request; waiting for service readiness
[     0.003114s] DEBUG worker{id=oneshot5}: tower::buffer::worker: service.ready=true processing request
[     0.003121s] TRACE worker{id=oneshot5}: tower::buffer::worker: returning response future
[     0.003129s] TRACE tower::buffer::worker: worker polling for next message
test loadshed::tests::buffer_load_shed ... ok
```

Signed-off-by: katelyn martin <kate@buoyant.io>
cratelyn added a commit that referenced this pull request Mar 13, 2025
this commit performs a small refactor to one of the unit tests in
`linkerd-stack`'s load-shedding middleware.

this adds a span to the worker tasks spawned in this test, so that
tracing logs can be associated with particular oneshot services.

see #3744 for more information on upgrading our tower dependency. this
is cherry-picked from investigations on that branch related to breaking
changes in 0.5 related to the `Buffer` middleware.

after this change, logs now look like this:

```
; RUST_LOG="trace" cargo test -p linkerd-stack buffer_load_shed -- --nocapture

running 1 test
[     0.002770s] TRACE worker{id=oneshot1}: tower::buffer::service: sending request to buffer worker
[     0.002809s] TRACE worker{id=oneshot2}: tower::buffer::service: sending request to buffer worker
[     0.002823s] TRACE worker{id=oneshot3}: tower::buffer::service: sending request to buffer worker
[     0.002843s] DEBUG worker{id=oneshot4}: linkerd_stack::loadshed: Service has become unavailable
[     0.002851s] DEBUG worker{id=oneshot4}: linkerd_stack::loadshed: Service shedding load
[     0.002878s] TRACE tower::buffer::worker: worker polling for next message
[     0.002885s] TRACE tower::buffer::worker: processing new request
[     0.002892s] TRACE worker{id=oneshot1}: tower::buffer::worker: resumed=false worker received request; waiting for service readiness
[     0.002901s] DEBUG worker{id=oneshot1}: tower::buffer::worker: service.ready=true processing request
[     0.002914s] TRACE worker{id=oneshot1}: tower::buffer::worker: returning response future
[     0.002926s] TRACE tower::buffer::worker: worker polling for next message
[     0.002931s] TRACE tower::buffer::worker: processing new request
[     0.002935s] TRACE worker{id=oneshot2}: tower::buffer::worker: resumed=false worker received request; waiting for service readiness
[     0.002946s] TRACE worker{id=oneshot2}: tower::buffer::worker: service.ready=false delay
[     0.002983s] TRACE worker{id=oneshot5}: tower::buffer::service: sending request to buffer worker
[     0.003001s] DEBUG worker{id=oneshot6}: linkerd_stack::loadshed: Service has become unavailable
[     0.003007s] DEBUG worker{id=oneshot6}: linkerd_stack::loadshed: Service shedding load
[     0.003017s] DEBUG worker{id=oneshot7}: linkerd_stack::loadshed: Service has become unavailable
[     0.003024s] DEBUG worker{id=oneshot7}: linkerd_stack::loadshed: Service shedding load
[     0.003035s] TRACE tower::buffer::worker: worker polling for next message
[     0.003041s] TRACE tower::buffer::worker: resuming buffered request
[     0.003045s] TRACE worker{id=oneshot2}: tower::buffer::worker: resumed=true worker received request; waiting for service readiness
[     0.003052s] DEBUG worker{id=oneshot2}: tower::buffer::worker: service.ready=true processing request
[     0.003060s] TRACE worker{id=oneshot2}: tower::buffer::worker: returning response future
[     0.003068s] TRACE tower::buffer::worker: worker polling for next message
[     0.003073s] TRACE tower::buffer::worker: processing new request
[     0.003077s] TRACE worker{id=oneshot3}: tower::buffer::worker: resumed=false worker received request; waiting for service readiness
[     0.003084s] DEBUG worker{id=oneshot3}: tower::buffer::worker: service.ready=true processing request
[     0.003091s] TRACE worker{id=oneshot3}: tower::buffer::worker: returning response future
[     0.003099s] TRACE tower::buffer::worker: worker polling for next message
[     0.003103s] TRACE tower::buffer::worker: processing new request
[     0.003107s] TRACE worker{id=oneshot5}: tower::buffer::worker: resumed=false worker received request; waiting for service readiness
[     0.003114s] DEBUG worker{id=oneshot5}: tower::buffer::worker: service.ready=true processing request
[     0.003121s] TRACE worker{id=oneshot5}: tower::buffer::worker: returning response future
[     0.003129s] TRACE tower::buffer::worker: worker polling for next message
test loadshed::tests::buffer_load_shed ... ok
```

Signed-off-by: katelyn martin <kate@buoyant.io>
cratelyn added a commit that referenced this pull request Mar 14, 2025
this commit performs a small refactor to one of the unit tests in
`linkerd-stack`'s load-shedding middleware.

this adds a span to the worker tasks spawned in this test, so that
tracing logs can be associated with particular oneshot services.

see #3744 for more information on upgrading our tower dependency. this
is cherry-picked from investigations on that branch related to breaking
changes in 0.5 related to the `Buffer` middleware.

after this change, logs now look like this:

```
; RUST_LOG="trace" cargo test -p linkerd-stack buffer_load_shed -- --nocapture

running 1 test
[     0.002770s] TRACE worker{id=oneshot1}: tower::buffer::service: sending request to buffer worker
[     0.002809s] TRACE worker{id=oneshot2}: tower::buffer::service: sending request to buffer worker
[     0.002823s] TRACE worker{id=oneshot3}: tower::buffer::service: sending request to buffer worker
[     0.002843s] DEBUG worker{id=oneshot4}: linkerd_stack::loadshed: Service has become unavailable
[     0.002851s] DEBUG worker{id=oneshot4}: linkerd_stack::loadshed: Service shedding load
[     0.002878s] TRACE tower::buffer::worker: worker polling for next message
[     0.002885s] TRACE tower::buffer::worker: processing new request
[     0.002892s] TRACE worker{id=oneshot1}: tower::buffer::worker: resumed=false worker received request; waiting for service readiness
[     0.002901s] DEBUG worker{id=oneshot1}: tower::buffer::worker: service.ready=true processing request
[     0.002914s] TRACE worker{id=oneshot1}: tower::buffer::worker: returning response future
[     0.002926s] TRACE tower::buffer::worker: worker polling for next message
[     0.002931s] TRACE tower::buffer::worker: processing new request
[     0.002935s] TRACE worker{id=oneshot2}: tower::buffer::worker: resumed=false worker received request; waiting for service readiness
[     0.002946s] TRACE worker{id=oneshot2}: tower::buffer::worker: service.ready=false delay
[     0.002983s] TRACE worker{id=oneshot5}: tower::buffer::service: sending request to buffer worker
[     0.003001s] DEBUG worker{id=oneshot6}: linkerd_stack::loadshed: Service has become unavailable
[     0.003007s] DEBUG worker{id=oneshot6}: linkerd_stack::loadshed: Service shedding load
[     0.003017s] DEBUG worker{id=oneshot7}: linkerd_stack::loadshed: Service has become unavailable
[     0.003024s] DEBUG worker{id=oneshot7}: linkerd_stack::loadshed: Service shedding load
[     0.003035s] TRACE tower::buffer::worker: worker polling for next message
[     0.003041s] TRACE tower::buffer::worker: resuming buffered request
[     0.003045s] TRACE worker{id=oneshot2}: tower::buffer::worker: resumed=true worker received request; waiting for service readiness
[     0.003052s] DEBUG worker{id=oneshot2}: tower::buffer::worker: service.ready=true processing request
[     0.003060s] TRACE worker{id=oneshot2}: tower::buffer::worker: returning response future
[     0.003068s] TRACE tower::buffer::worker: worker polling for next message
[     0.003073s] TRACE tower::buffer::worker: processing new request
[     0.003077s] TRACE worker{id=oneshot3}: tower::buffer::worker: resumed=false worker received request; waiting for service readiness
[     0.003084s] DEBUG worker{id=oneshot3}: tower::buffer::worker: service.ready=true processing request
[     0.003091s] TRACE worker{id=oneshot3}: tower::buffer::worker: returning response future
[     0.003099s] TRACE tower::buffer::worker: worker polling for next message
[     0.003103s] TRACE tower::buffer::worker: processing new request
[     0.003107s] TRACE worker{id=oneshot5}: tower::buffer::worker: resumed=false worker received request; waiting for service readiness
[     0.003114s] DEBUG worker{id=oneshot5}: tower::buffer::worker: service.ready=true processing request
[     0.003121s] TRACE worker{id=oneshot5}: tower::buffer::worker: returning response future
[     0.003129s] TRACE tower::buffer::worker: worker polling for next message
test loadshed::tests::buffer_load_shed ... ok
```

Signed-off-by: katelyn martin <kate@buoyant.io>
@cratelyn cratelyn force-pushed the kate/hyper-1.x-upgrade branch 2 times, most recently from 2e2fc42 to f0a22e1 Compare March 14, 2025 16:46
@cratelyn cratelyn force-pushed the kate/hyper-1.x-follow-up.tower-0.5 branch 4 times, most recently from f500b25 to 10b1612 Compare March 14, 2025 17:42
@cratelyn cratelyn marked this pull request as ready for review March 14, 2025 17:57
@cratelyn cratelyn requested a review from a team as a code owner March 14, 2025 17:57
@olix0r olix0r self-assigned this Mar 17, 2025
@olix0r
Copy link
Member

olix0r commented Mar 18, 2025

This looks good to me, but we should let the Hyper change land first.

@cratelyn cratelyn force-pushed the kate/hyper-1.x-upgrade branch 2 times, most recently from 77a657e to aecfc7b Compare March 18, 2025 22:46
@cratelyn cratelyn force-pushed the kate/hyper-1.x-follow-up.tower-0.5 branch from 10b1612 to 7e5bf22 Compare March 18, 2025 23:00
cratelyn added a commit that referenced this pull request Mar 18, 2025
#3744 (comment)

Signed-off-by: katelyn martin <kate@buoyant.io>
@cratelyn
Copy link
Collaborator Author

♻️ this branch has been freshly rebased atop #3504, now that #3779 has landed.

@olix0r
Copy link
Member

olix0r commented Mar 18, 2025

Holding approval until the parent PR merges, but LGTM!

Base automatically changed from kate/hyper-1.x-upgrade to main March 21, 2025 16:53
this commit updates our tower dependency from 0.4 to 0.5.

note that this commit does not affect the `tower-service` and
`tower-layer` crates, reëxported by `tower` itself. the `Service<T>`
trait and the closely related `Layer<S>` trait have not been changed.

the `tower` crate's utilities have changed in various ways, some of
particular note for the linkerd2 proxy. see these items, excerpted from
the tower changelog:

- **retry**: **Breaking Change** `retry::Policy::retry` now accepts `&mut Req` and `&mut Res` instead of the previous mutable versions. This
  increases the flexibility of the retry policy. To update, update your method signature to include `mut` for both parameters. ([tower-rs/tower#584])
- **retry**: **Breaking Change** Change Policy to accept &mut self ([tower-rs/tower#681])
- **retry**: **Breaking Change** `Budget` is now a trait. This allows end-users to implement their own budget and bucket implementations. ([tower-rs/tower#703])
- **util**: **Breaking Change** `Either::A` and `Either::B` have been renamed `Either::Left` and `Either::Right`, respectively. ([tower-rs/tower#637])
- **util**: **Breaking Change** `Either` now requires its two services to have the same error type. ([tower-rs/tower#637])
- **util**: **Breaking Change** `Either` no longer implemenmts `Future`. ([tower-rs/tower#637])
- **buffer**: **Breaking Change** `Buffer<S, Request>` is now generic over `Buffer<Request, S::Future>.` ([tower-rs/tower#654])

see:

* <tower-rs/tower#584>
* <tower-rs/tower#681>
* <tower-rs/tower#703>
* <tower-rs/tower#637>
* <tower-rs/tower#654>

the `Either` trait bounds are particularly impactful for us. because
this runs counter to how we treat errors (skewing towards boxed errors,
in general), we temporarily vendor a version of `Either` from the 0.4
release, whose variants have been renamed to match the 0.5 interface.

updating to box the inner `A` and `B` services' errors, so we satiate
the new `A::Error = B::Error` bounds, can be addressed as a follow-on.
that's intentionally left as a separate change, due to the net size of
our patchset between this branch and #3504.

* <tower-rs/tower@v0.4.x...master>
* <https://github.com/tower-rs/tower/blob/master/tower/CHANGELOG.md>

this work is based upon #3504. for more information, see:

* linkerd/linkerd2#8733
* #3504

Signed-off-by: katelyn martin <kate@buoyant.io>
X-Ref: tower-rs/tower#815
X-Ref: tower-rs/tower#817
X-Ref: tower-rs/tower#818
X-Ref: tower-rs/tower#819
this commit updates a test that was affected by breaking changes in
tower's `Buffer` middleware. see this excerpt from the description of
that change:

> I had to change some of the integration tests slightly as part of this
> change. This is because the buffer implementation using semaphore
> permits is _very subtly_ different from one using a bounded channel. In
> the `Semaphore`-based implementation, a semaphore permit is stored in
> the `Message` struct sent over the channel. This is so that the capacity
> is used as long as the message is in flight. However, when the worker
> task is processing a message that's been recieved from the channel,
> the permit is still not dropped. Essentially, the one message actively
> held by the worker task _also_ occupies one "slot" of capacity, so the
> actual channel capacity is one less than the value passed to the
> constructor, _once the first request has been sent to the worker_. The
> bounded MPSC changed this behavior so that capacity is only occupied
> while a request is actually in the channel, which broke some tests
> that relied on the old (and technically wrong) behavior.

bear particular attention to this:

> The bounded MPSC changed this behavior so that capacity is only
> occupied while a request is actually in the channel, which broke some
> tests that relied on the old (and technically wrong) behavior.

that pr adds an additional message to the channel in tests exercising
the laod-shedding behavior, on account of the previous (incorrect)
behavior.

https://github.com/tower-rs/tower/pull/635/files#r797108274

this commit performs the same change for our corresponding test, adding
an additional `ready()` call before we hit the buffer's limit.

Signed-off-by: katelyn martin <kate@buoyant.io>
#3744 (comment)

Signed-off-by: katelyn martin <kate@buoyant.io>
@cratelyn cratelyn force-pushed the kate/hyper-1.x-follow-up.tower-0.5 branch from 9bded3d to 1ba2b17 Compare March 21, 2025 16:59
@cratelyn
Copy link
Collaborator Author

rebased on main, now that #3504 has landed 🙂

@cratelyn cratelyn enabled auto-merge (squash) March 21, 2025 17:07
@cratelyn cratelyn merged commit 76d9695 into main Mar 21, 2025
15 checks passed
@cratelyn cratelyn deleted the kate/hyper-1.x-follow-up.tower-0.5 branch March 21, 2025 17:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants