
Commit 8c904cf

use the multi_thread executor to run the telemetry hang test
The test wasn't using the multi_thread executor, so it would eventually trip over the opentelemetry bug we've hit in the past, where the runtime hangs for reasons that are not fully understood. See: open-telemetry/opentelemetry-rust#536

Using the multi_thread executor is the recommended work-around.

I've also removed the redundant thread spawn from the telemetry Drop implementation, since I don't think it's required in addition to the spawn_blocking() call.

I've run this test in a tight loop for a couple of hours (on my M1 laptop) with no issues.
1 parent 80eb019 commit 8c904cf

2 files changed: +8 -7 lines changed

apollo-router/src/plugins/telemetry/mod.rs

+5 -6
@@ -100,14 +100,13 @@ impl Drop for Telemetry {
             // Tracer providers must be flushed. This may happen as part of otel if the provider was set
             // as the global, but may also happen in the case of an failed config reload.
             // If the tracer prover is present then it was not handed over so we must flush it.
-            // The magic incantation seems to be that the flush MUST happen in a separate thread.
+            // We must make the call to force_flush() from spawn_blocking() (or spawn a thread) to
+            // ensure that the call to force_flush() is made from a separate thread.
             ::tracing::debug!("flushing telemetry");
-            std::thread::spawn(|| async {
-                let jh = tokio::task::spawn_blocking(move || {
-                    opentelemetry::trace::TracerProvider::force_flush(&tracer_provider);
-                });
-                futures::executor::block_on(jh).expect("failed to flush tracer provider");
+            let jh = tokio::task::spawn_blocking(move || {
+                opentelemetry::trace::TracerProvider::force_flush(&tracer_provider);
             });
+            futures::executor::block_on(jh).expect("failed to flush tracer provider");
         }
 
         if let Some(sender) = self.spaceport_shutdown.take() {

apollo-router/src/router_factory.rs

+3 -1
@@ -294,7 +294,9 @@ mod test {
         assert!(service.is_err())
     }
 
-    #[tokio::test]
+    // This test must use the mult_thread tokio executor or the opentelemetry hang bug will
+    // be encountered. (See https://github.com/open-telemetry/opentelemetry-rust/issues/536)
+    #[tokio::test(flavor = "multi_thread")]
     async fn test_telemetry_doesnt_hang_with_invalid_schema() {
         use crate::subscriber::{set_global_subscriber, RouterSubscriber};
         use tracing_subscriber::EnvFilter;
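
Note on the Drop change: the pattern in telemetry/mod.rs is to run the blocking flush on another thread and wait for it to finish before drop() returns. The following is a minimal std-only sketch of that idea, not the router's code. It swaps tokio::task::spawn_blocking for std::thread::spawn so that it compiles without extra crates, and FakeProvider, Telemetry, and drop_flushes are hypothetical names invented for the illustration. As far as the diff shows, the removed wrapper passed a closure returning an unpolled `async` block to std::thread::spawn, so the inner flush future would not have been driven by that thread; the sketch joins the thread explicitly instead.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a tracer provider (not the opentelemetry type).
struct FakeProvider {
    flushed: Arc<AtomicBool>,
}

impl FakeProvider {
    /// Stand-in for a blocking flush call; here it only records that it ran.
    fn force_flush(&self) {
        self.flushed.store(true, Ordering::SeqCst);
    }
}

/// Hypothetical owner that must flush its provider when dropped.
struct Telemetry {
    provider: Option<FakeProvider>,
}

impl Drop for Telemetry {
    fn drop(&mut self) {
        // If the provider is still present it was not handed over, so flush it.
        if let Some(provider) = self.provider.take() {
            // Make the flush call from a separate thread, then wait for it.
            let handle = thread::spawn(move || provider.force_flush());
            handle.join().expect("failed to flush tracer provider");
        }
    }
}

/// Drops a Telemetry value and reports whether the flush ran before drop returned.
fn drop_flushes() -> bool {
    let flushed = Arc::new(AtomicBool::new(false));
    let telemetry = Telemetry {
        provider: Some(FakeProvider {
            flushed: Arc::clone(&flushed),
        }),
    };
    drop(telemetry);
    flushed.load(Ordering::SeqCst)
}
```

Calling drop_flushes() from a main function or a unit test exercises the whole path. In the commit itself the equivalent wait is futures::executor::block_on(jh) on the handle returned by spawn_blocking, which is why the test needs the multi_thread flavor to leave a worker free for the blocking task.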
