Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracer::Clone is at least 30% of the cost of Tracer::start, because of InstrumentationLibrary #1092

Closed
shaun-cox opened this issue Jun 5, 2023 · 5 comments · Fixed by #1093

Comments

@shaun-cox
Copy link
Contributor

shaun-cox commented Jun 5, 2023

Here's a flame graph of starting portion of a span from the sdk's trace benchmark.
image

Cloning a Tracer needs to be cheap, because we do it when creating new spans.

/// `Tracer` implementation to create and manage spans
#[derive(Clone)]
pub struct Tracer {
    instrumentation_lib: InstrumentationLibrary,
    provider: Weak<TracerProviderInner>,
}

But, InstrumentationLibrary contains 3 Cow<'static, str> instances and a Vec<KeyValue> for a total of 120 bytes per instance. All of that needs to get copied when Cloning one.

I think we need to focus on the design of the InstrumentationLibrary type, as it's in the api. To me, it feels like it is (or should be) immutable and needs to be cheaply cloneable, so my first thought was it should resemble something like an Arc wrapping an inner struct of 3 `static string slices and the optional vector of attributes.

I'm willing to contribute at PR to change this but want to coordinate and get consensus before choosing an implementation path that might ultimately get rejected after putting in a bunch of work.

My first question is about the seeming preference for Cow<'static, str> rather than, say Arc<str> for representing things like library name, version, and schema url, knowing that they are all fundamentally immutable. IOW, I'm confused for what scenario would ever want to mutate these, and hence need the "copy on write" functionality?

@lalitb
Copy link
Member

lalitb commented Jun 5, 2023

I'm willing to contribute at PR to change this but want to coordinate and get consensus before choosing an implementation path that might ultimately get rejected after putting in a bunch of work.

I would vote for this, I did notice this while reviewing logging SDK (#788 (comment)), and find it unnecessary to clone InstrumentationLibrary and Resource for each log getting exported. Though final say would be by approvers/maintainers :)

@cijothomas
Copy link
Member

I think similar optimization can be done for Resource too?

@shaun-cox
Copy link
Contributor Author

I think similar optimization can be done for Resource too?

For sure. I'm working on a change to make schema_url a Option<Arc<str>> rather than Option<Cow<'static, str>>, but honestly i wonder if it could not just be a literal ('static str)

@TommyCpp
Copy link
Contributor

TommyCpp commented Jun 6, 2023

but honestly i wonder if it could not just be a literal ('static str)

It used to be a literal but we received feedback that some use case populate the value during runtime. (See #666)

@djc
Copy link
Contributor

djc commented Jun 6, 2023

I think the Cow<'static, str> scheme came about as a way to allow (a) usually just &'static str but (b) sometimes allow a run-time populated String. &'static str is potentially nice because it avoids allocating, but I suppose if InstrumentationLibrary is long-lived it probably makes more sense to eat the cost of the initial allocation for the benefit of cheap cloning and less memory use over time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants