-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(blockifier): add cairo native cache rate metric #4393
feat(blockifier): add cairo native cache rate metric #4393
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
474752c
to
6e1d0dc
Compare
1e9c1d7
to
67bf4e8
Compare
6e1d0dc
to
cb72896
Compare
cb72896
to
8cd464e
Compare
f8ad409
to
5f1026c
Compare
Is there a reason to use Suggestion: pub fn miss_rate(&self) -> f64 { |
Do we want to also keep track of the number of cache hits getting a native compiled classes? Code quote: pub struct CacheMetrics {
cache_misses: Arc<AtomicUsize>,
total_cache_calls: Arc<AtomicUsize>,
} |
I think this is slightly cleaner. None-blocking Suggestion: cache_metrics: Arc<CacheMetrics>,
}
#[derive(Default)]
pub struct CacheMetrics {
cache_misses: AtomicUsize,
total_cache_calls: AtomicUsize,
} |
How about adding a private Non-blocking Code quote: NativeClassManager {
cairo_native_run_config,
cache,
sender: Some(sender),
compiler: None,
cache_metrics: CacheMetrics::new(),
} |
Suggestion: fn get_class_cache_miss_rate(&self) -> Ratio<usize>; |
Why do you need to Code quote: CAIRO_NATIVE_CACHE_MISS_RATIO.set(0.0); |
Suggestion: MetricCounter { CLASS_CACHE_MISSES, "blockifier_class_cache_misses", "Counter of cache misses in the global class cache"}
MetricCounter { CLASS_CACHE_CALLS, "blockifier_class_cache_calls", "Counter of total cache calls in the global class cache"} |
Actually, better make the second counter |
This way we only have to update a single counter in each case. Suggestion: pub struct CacheMetrics {
cache_misses: Arc<AtomicUsize>,
cache_hits: Arc<AtomicUsize>,
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General comment:
I think it is better to have two counter metrics:
- cache_hits
- cache_misses
No need to calculate the ratio - we can do the calculation in grafana
And, cache_hits
is better than total_cache_calls
since we only have to update it when we get a hit and not upon a miss.
Reviewed 5 of 11 files at r1, 6 of 6 files at r3, all commit messages.
Reviewable status: 11 of 12 files reviewed, 7 unresolved discussions (waiting on @AvivYossef-starkware and @noaov1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 11 files at r1.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @AvivYossef-starkware and @noaov1)
f59fcde
to
cf25f1c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 14 files reviewed, 6 unresolved discussions (waiting on @avi-starkware and @noaov1)
crates/blockifier/src/state/native_class_manager.rs
line 63 at r3 (raw file):
Previously, avi-starkware (Avi Cohen) wrote…
Do we want to also keep track of the number of cache hits getting a native compiled classes?
added
crates/blockifier/src/state/native_class_manager.rs
line 63 at r3 (raw file):
Previously, avi-starkware (Avi Cohen) wrote…
I think this is slightly cleaner.
This way we just clone the Arc pointer when cloning the contract class manager, and not the struct itself.None-blocking
Liked it
crates/blockifier/src/state/native_class_manager.rs
line 63 at r3 (raw file):
Previously, avi-starkware (Avi Cohen) wrote…
This way we only have to update a single counter in each case.
Thanks
crates/blockifier/src/state/native_class_manager.rs
line 82 at r3 (raw file):
Previously, avi-starkware (Avi Cohen) wrote…
Is there a reason to use
Ratio<usize>
and notf64
? You convert it tof64
anyway when setting the number in the gauge right?
Not relevant anymore
crates/blockifier/src/state/native_class_manager.rs
line 138 at r3 (raw file):
Previously, avi-starkware (Avi Cohen) wrote…
How about adding a private
NativeClassManager::new()
method for handling all the different possibilities for creating this struct in one place?Non-blocking
sounds good, I'll open a separate PR for it
crates/starknet_batcher/src/metrics.rs
line 19 at r3 (raw file):
Previously, avi-starkware (Avi Cohen) wrote…
Why do you need to
set
the starting value here and it is not needed when registering other metrics?
Can't you just put aninitial_value
in the metric definition?
It changed last week
crates/starknet_sequencer_metrics/src/metric_definitions.rs
line 59 at r3 (raw file):
Previously, avi-starkware (Avi Cohen) wrote…
Actually, better make the second counter
cache_hits
. This way we don't have to update it on a cache miss.
Done.
crates/starknet_batcher/src/block_builder.rs
line 354 at r3 (raw file):
) -> BlockBuilderResult<(Box<dyn BlockBuilderTrait>, AbortSignalSender)>; fn get_contract_class_manager_cache_miss_rate(&self) -> Ratio<usize>;
Done
Why Code quote: /// Retrieves the current cache miss counter and resets it to zero.
pub fn take_cache_miss_counter(&self) -> u64 {
self.cache_metrics.cache_misses.swap(0, Ordering::Relaxed)
}
/// Retrieves the current cache hit counter and resets it to zero.
pub fn take_cache_hit_counter(&self) -> u64 {
self.cache_metrics.cache_hits.swap(0, Ordering::Relaxed)
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 8 of 13 files at r4, all commit messages.
Reviewable status: 9 of 14 files reviewed, 3 unresolved discussions (waiting on @AvivYossef-starkware and @noaov1)
crates/starknet_sequencer_dashboard/src/dashboard_definitions.rs
line 66 at r4 (raw file):
"The ratio of cache misses in the Cairo native cache", formatcp!( "100 * ({} / ({} + {}))",
What happens when you have division by zero here? (note you will always have division by zero here when cairo_native is off)
Code quote:
"100 * ({} / ({} + {}))",
crates/starknet_batcher/src/batcher_test.rs
line 125 at r4 (raw file):
let mut block_builder_factory = MockBlockBuilderFactoryTrait::new(); block_builder_factory.expect_take_class_cache_miss_counter().return_const(0_u64); block_builder_factory.expect_take_class_cache_hit_counter().return_const(0_u64);
Why is the return_const now zero?
shouldn't the behavior be the same (the ratio of 1/2 we had before means 1 in both counters now)
I am not sure I fully understand what return_const does...
Code quote:
block_builder_factory.expect_take_class_cache_miss_counter().return_const(0_u64);
block_builder_factory.expect_take_class_cache_hit_counter().return_const(0_u64);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 13 files at r4.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @AvivYossef-starkware and @noaov1)
Previously, AvivYossef-starkware wrote…
My question in this comment was not about cache hits in general Code snippet: pub struct CacheMetrics {
cache_misses: Arc<AtomicUsize>,
cache_hits: Arc<AtomicUsize>,
} |
cf25f1c
to
9424c39
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 10 of 14 files reviewed, 4 unresolved discussions (waiting on @avi-starkware and @noaov1)
crates/blockifier/src/state/native_class_manager.rs
line 63 at r3 (raw file):
Previously, avi-starkware (Avi Cohen) wrote…
My question in this comment was not about cache hits in general
I think we want to additionally have the number of cache hits that return a cairo_native contract
So we could keep track of how many of the entry_points are run with cairo_native and how many with casm
This PR aims to add the cache miss rate, which will help us decide the right cache size. It’s also a good idea to include the cairo_native compilation rate. I will do that in a separate PR.
crates/blockifier/src/state/native_class_manager.rs
line 248 at r4 (raw file):
Previously, avi-starkware (Avi Cohen) wrote…
Why
take
and notget
?
Won't you get the exact same number in the metric by updating it to theget
value instead of incrementing by thetake
value?
MetricCounter
has no set attribute, only an increment. This is beneficial because every time we create a new contract manager, it won’t need to track the previous cache hits and misses.
crates/starknet_batcher/src/batcher_test.rs
line 125 at r4 (raw file):
Previously, avi-starkware (Avi Cohen) wrote…
Why is the return_const now zero?
shouldn't the behavior be the same (the ratio of 1/2 we had before means 1 in both counters now)I am not sure I fully understand what return_const does...
The return const tells the mock object to return a constant value every time the function is called.
Nothing tests these values, so I could put every possible number here.
crates/starknet_sequencer_dashboard/src/dashboard_definitions.rs
line 66 at r4 (raw file):
Previously, avi-starkware (Avi Cohen) wrote…
What happens when you have division by zero here? (note you will always have division by zero here when cairo_native is off)
According to chatgpt:
This 0/0 yields NaN, and Grafana will typically display this as a gap or missing datapoint in your graph.
but I prefer using clamp_min and avoid it
Previously, AvivYossef-starkware wrote…
Does that mean that the metric keeps growing infinitely? I think we would like to be able to see how changing the cache size affects the cache miss rate, and that will only be possible if the metrics reset when restarting the batcher |
Previously, AvivYossef-starkware wrote…
Oh okay sounds good |
Suggestion: MetricCounter { CLASS_CACHE_MISSES, "class_cache_misses", "Counter of global class cache misses", init=0 },
MetricCounter { CLASS_CACHE_HITS, "class_cache_hits", "Counter of global class cache hits", init=0 }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r5, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @AvivYossef-starkware and @noaov1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @AvivYossef-starkware and @noaov1)
9424c39
to
b7ba41e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 9 of 14 files reviewed, 2 unresolved discussions (waiting on @avi-starkware, @Itay-Tsabary-Starkware, and @noaov1)
crates/blockifier/src/state/native_class_manager.rs
line 248 at r4 (raw file):
Previously, avi-starkware (Avi Cohen) wrote…
Does that mean that the metric keeps growing infinitely?
Will it reset to 0 when restarting the batcher?I think we would like to be able to see how changing the cache size affects the cache miss rate, and that will only be possible if the metrics reset when restarting the batcher
I think that the metric will reset to 0 every time that we reset the batcher because the register metric function is called
and the initial value is 0
pub fn register(&self) {
counter!(self.name).absolute(self.initial_value);
describe_counter!(self.name, self.description);
}
@Itay-Tsabary-Starkware am I right?
crates/starknet_batcher/src/metrics.rs
line 12 at r5 (raw file):
MetricCounter { PROPOSAL_STARTED, "batcher_proposal_started", "Counter of proposals started", init = 0 }, MetricCounter { CAIRO_NATIVE_CACHE_MISSES, "cairo_native_cache_misses", "Counter for cache misses in the Cairo native class manager", init=0 }, MetricCounter { CAIRO_NATIVE_CACHE_HITS, "cairo_native_cache_hits", "Counter for cache hits in the Cairo native class manager", init=0 },
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r6, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Itay-Tsabary-Starkware and @noaov1)
No description provided.