Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add global::meter_provider_shutdown #1623

Merged
merged 20 commits into from
Mar 16, 2024

Conversation

lalitb
Copy link
Member

@lalitb lalitb commented Mar 15, 2024

Fixes #1619

Changes

  • Add global::shutdown_meter_provider() which internally sets the global provider to NoopMeterProvider.
  • Add Drop implementation for SdkMeterProvider, which ensures that the pipeline shutdown is invoked once SdkMeterProvider goes out of scope.
  • Update examples to use global::shutdown_meter_provider()

Merge requirement checklist

  • CONTRIBUTING guidelines followed
  • Unit tests added/updated (if applicable)
  • Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
  • Changes in public API reviewed (if applicable)

@lalitb lalitb requested a review from a team March 15, 2024 07:37
Copy link

codecov bot commented Mar 15, 2024

Codecov Report

Attention: Patch coverage is 92.72727% with 4 lines in your changes are missing coverage. Please review.

Project coverage is 69.0%. Comparing base (a4fae95) to head (1fdb432).

❗ Current head 1fdb432 differs from pull request most recent head aefbaaf. Consider uploading reports for the commit aefbaaf to get more accurate results

Files Patch % Lines
...telemetry-sdk/src/testing/metrics/metric_reader.rs 83.3% 3 Missing ⚠️
opentelemetry-sdk/src/metrics/meter_provider.rs 97.0% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main   #1623     +/-   ##
=======================================
+ Coverage   68.2%   69.0%   +0.7%     
=======================================
  Files        139     139             
  Lines      19801   19849     +48     
=======================================
+ Hits       13521   13702    +181     
+ Misses      6280    6147    -133     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@cijothomas
Copy link
Member

can you add a targeted test for this? Similar to #1620

lalitb and others added 4 commits March 15, 2024 09:53
Co-authored-by: Cijo Thomas <cithomas@microsoft.com>
Co-authored-by: Cijo Thomas <cithomas@microsoft.com>
@lalitb
Copy link
Member Author

lalitb commented Mar 15, 2024

can you add a targeted test for this? Similar to #1620

Good point. Have added the tests now.

lalitb and others added 5 commits March 15, 2024 15:29
Co-authored-by: Cijo Thomas <cithomas@microsoft.com>
Co-authored-by: Cijo Thomas <cithomas@microsoft.com>
Co-authored-by: Cijo Thomas <cithomas@microsoft.com>
Co-authored-by: Zhongyang Wu <zhongyang.wu@outlook.com>
impl Drop for SdkMeterProvider {
fn drop(&mut self) {
if let Err(err) = self.shutdown() {
global::handle_error(err);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this might cause false alarms.. it is normal for users to call shutdown of their own, and then drop will call it again, causing the global error handler to print a message. It is unactionable for users, and they haven't anything wrong either.... Might need to revisit.

Copy link

@bwalk-at-ibm bwalk-at-ibm Apr 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's even worse because this is bugged because dropping just one reference to any metrics provider will shutdown and terminate the underlying data (pipelines), invalidating all copies, especially the registered global metrics provider.

Copy link
Member

@cijothomas cijothomas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!
we need follow ups for the below:

  1. Multi shutdown errors can cause unwanted confusion. (user calling shutdown() themselves, + drop)
  2. after shutdown and drop, we need to reclaim full memory used for instruments. This might not be possible with current design.

@cijothomas cijothomas merged commit bae8fb3 into open-telemetry:main Mar 16, 2024
14 of 15 checks passed
@hdost
Copy link
Contributor

hdost commented Mar 16, 2024

Isn't this a restore of sorts?

Should we consider a 0.22.2 patch?

@cijothomas
Copy link
Member

Isn't this a restore of sorts?

Should we consider a 0.22.2 patch?

Good point. Lets discuss in the next community meeting and decide.
This is still not ideal, so we may want some follow ups before doing patch release.

@cijothomas
Copy link
Member

On rethinking, I think we can live without shutdown on the opentelemetry crate. Or we can keep it, but rename it to something like reset_global_provider or something!

The drop impl to invoke shutdown is still required in sdk.

The only drawback is when app owners don't have meter_provider - but that should only occur when using tracing-opentelemetry or others, and can be fixed in those places. The OTLP Exporter in the repo also exposes pipeline builders, which also would require tweaks to account for this.

@TommyCpp
Copy link
Contributor

TommyCpp commented Mar 17, 2024

I think we can live without shutdown on the opentelemetry crate. Or we can keep it, but rename it to something like reset_global_provider or something!

I actually prefer shutdown to keep the concept in global module the same as the MeterProvider itself. But I agree app owner should keep control of the meter provider.

The only drawback is when app owners don't have meter_provider - but that should only occur when using tracing-opentelemetry or others, and can be fixed in those places

Hmm the app owner should always has control of providers even in tracing-opentelemetry. It looks like we are taking a MeterProvider in tracing-opentelemetry for some reasons. https://github.com/tokio-rs/tracing-opentelemetry/blob/v0.1.x/src/metrics.rs#L332. But it only uses the meter provider to create a meter, which doesn't require ownership of the meter provider. So a simple change is to take a &MeterProvider as parameter

@lalitb
Copy link
Member Author

lalitb commented Mar 17, 2024

  1. The global::meter_provider_shutdown() method was introduced to align with the existing shutdown methods for other signals - global::tracer_provider_shutdown() and global::logger_provider_shutdown. However, MeterProvider::Drop() was indeed required to ensure the cleanup logic is always executed upon application termination.

  2. Ideally, the OpenTelemetry crate should not include any global::<signal>_provider_shutdown()/reset() methods. There should be no scenario where the instrumented-library is required to call these methods.

  3. The application should always have the handle of the opentelemetry_sdk::<signal>Provider. This will ensure that they can always invoke shutdown and forceflush on the handle. For this to work:

    • There may be differing opinions on this, but it's worth discussing the possibility of eliminating the exporter-specific pipeline conveniences like OtlpTracePipeline, OtlpMetricPipeline, OtlpLogPipeline, and ZipkinPipeline. These pipelines internally create a provider and set it as global, and then we don't have the handle to it.

    • The tokio-tracing layer or subscriber should either maintain a reference to the Provider, similar to the approach used for logs, or take ownership of the Tracer, Meter, or Logger, as implemented for traces. The adaptation of this approach for metrics within the tracing-opentelemetry is missing and needs to be fixed. This issue has been previously brought up in discussions here and here. It appears necessary for someone, possibly us, to formally create an issue in the tracing-opentelemetry repository to address this.

@TommyCpp
Copy link
Contributor

TommyCpp commented Mar 17, 2024

Just to clearify: are we proposing removing support of the global Tracer/Meter/LoggerProvider or we are only thinking about removing global::<single>_shutdown method.

The application should always have the handle of the opentelemetry_sdk::Provider

Yes. App owner should be the one creating providers but the benifit of a global signal provider is app owner doesn't need to manage the lifetime of signal provider until shutdown. Otherwise it can be painful to pass the signal provider along the app due to the ownership model

@cijothomas
Copy link
Member

There may be differing opinions on this, but it's worth discussing the possibility of eliminating the exporter-specific pipeline conveniences like OtlpTracePipeline, OtlpMetricPipeline, OtlpLogPipeline, and ZipkinPipeline. These pipelines internally create a provider and set it as global, and then we don't have the handle to it.

yes agree. its covered in #1592

@cijothomas
Copy link
Member

Ideally, the OpenTelemetry crate should not include any global::_provider_shutdown()/reset() methods. There should be no scenario where the instrumented-library is required to call these methods.

The application should always have the handle of the opentelemetry_sdk::Provider

I am in agreement. We need to revert adding global_shutdown part of this PR, and follow through and cleanup other signals too.

@lalitb
Copy link
Member Author

lalitb commented Mar 18, 2024

Just to clearify: are we proposing removing support of the global Tracer/Meter/LoggerProvider or we are only thinking about removing global::_shutdown method.

Suggestion is to remove the global::<signal>_shutdown method without introducing any replacements such as global::<signal>_reset. Since these methods are intended exclusively for use by the application owner, they do not have a place in the API. While instrumented libraries can achieve a similar shutdown or reset effect by executing set_meter_provider(NoopMeterProvider::new());, the presence of a dedicated method might misleadingly imply that it is designed for use by libraries.

@TommyCpp
Copy link
Contributor

TommyCpp commented Mar 18, 2024

I am OK if we want to remove global::<signal>_shutdown method from API.

However, for end users, if they set the global signal prvoider using set_<signal>_provider method, they will lose the handle of the provider. For them to shutdown the provider, they need to call let _ = set_<singal>_provider(NoopProvider::new()) method to regain the handle of the provider, which seem to be unintuitive

@lalitb
Copy link
Member Author

lalitb commented Mar 18, 2024

However, for end users, if they set the global signal prvoider using set_provider method, they will lose the handle of the provider. For them to shutdown the provider, they need to call let _ = set_provider(NoopProvider::new()) method to regain the handle of the provider, which seem to be unintuitive

Yes, that's a good point. This is an issue as of now too as there is no way to do force_flush after the call to set_<signal>_provider.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[OTEL 0.22] shutdown SdkMeterProvider when built from opentelemetry-otlp?
5 participants