-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Execution, Cadence, and Metering parameters to the Protocol State #7020
base: master
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## feature/efm-recovery #7020 +/- ##
========================================================
- Coverage 41.13% 41.10% -0.03%
========================================================
Files 2122 2123 +1
Lines 187233 187618 +385
========================================================
+ Hits 77010 77120 +110
- Misses 103776 104055 +279
+ Partials 6447 6443 -4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this looks great! No concerns ... just thoughts about the long-term evolution of this code and establishing beneficial design patterns early:
- keeping maps out of the protocol state entirely for performance reasons 👉 comment for details
- If we want a notion of "unset" to be represented in the protocol state I would consistently use pointers for such fields in the protocol state. Or we leave it to the higher-level business logic to define special values (such as the zero value representing unset) 👉 comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noticed the error messages in the Replicate
method seem a bit too short:
flow-go/state/protocol/protocol_state/kvstore/models.go
Lines 211 to 213 in c44ea8e
// can only Replicate into model with numerically consecutive version return nil, fmt.Errorf("unsupported replication version %d, expect %d: %w", protocolVersion, 1, ErrIncompatibleVersionChange) flow-go/state/protocol/protocol_state/kvstore/models.go
Lines 84 to 85 in c44ea8e
return nil, fmt.Errorf("unsupported replication version %d, expect %d: %w", protocolVersion, 1, ErrIncompatibleVersionChange)
We didn't update the error message betweenModelv0
andModelv1
😅 . Maybe the following would be more expressive:
return nil, fmt.Errorf("unsupported replication version %d, current model %d supports target version %d or %d: %w",
protocolVersion, currentVersion, currentVersion, nextVersion, ErrIncompatibleVersionChange)
// In particular, two versions only differing in their minor, might be entirely downwards-INCOMPATIBLE. | ||
// | ||
// We generally recommend to use Integer Versioning for components. The MagnitudeOfChangeVersion scheme should | ||
// be only used when there is a clear advantage Integer Versioning which outweighs the risk of falsely |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if the following would be the best english wording
// be only used when there is a clear advantage Integer Versioning which outweighs the risk of falsely | |
// be only used when there is a clear advantage over Integer Versioning, which outweighs the risk of falsely |
// is in no way reflected by the versioning scheme. Any automated decisions of compatibility for different | ||
// versions are to be avoided (including versions where only the minor is different). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// is in no way reflected by the versioning scheme. Any automated decisions of compatibility for different | |
// versions are to be avoided (including versions where only the minor is different). | |
// is in no way reflected by the versioning scheme. Any automated decisions regarding compatibility of | |
// different versions are to be avoided (including versions where only the minor is different). |
// Once the ViewBasedActivator A is persisted to the protocol state, P is updated | ||
// to value A.Data in the first block on or after view A.ActivationView. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Once the ViewBasedActivator A is persisted to the protocol state, P is updated | |
// to value A.Data in the first block on or after view A.ActivationView. | |
// Once the ViewBasedActivator A is persisted to the protocol state, P is updated to value | |
// A.Data in the first block with view ≥ A.ActivationView (in each fork independently). |
// TODO docs | ||
ExecutionEffortParameters map[uint]uint64 | ||
// TODO docs | ||
ExecutionMemoryParameters map[uint]uint64 | ||
// TODO docs | ||
ExecutionMemoryLimit uint64 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
We commit to the Protocol State ID in each block. This means that we need to compute the ID of the protocol state on the hot-path of consensus, which is quite latency sensitive.
-
ID computation requires a deterministic representation of maps. Hence, we need to convert the map into a slice and sort that by key (since maps have no deterministic ordering in go) 👉 code.
-
Since sorting can be a non-trivial amount of computation, my preference would be to keep that out of the consensus hot path. In other words:
- within the protocol state, I would be inclined to turn this into a slice of pair
- The governance smart contract or the protocol state machine that manage updates can convert the values into
[]pair
- If the fvm really needs this as a map, they can convert it locally - or we can keep the map representation of
[]pair
in a higher level.
Long story short: I'd like to keep that work outside of the protocol state.
Since @janezpodhostnik is OOO currently, lets make that change already. I don't expect there to be any fundamental problem why we can't represent the maps as []pair
in the protocol state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point but depending on the size of these maps, the sort overhead could easily be negligable. If the maps are expected to be large, I'm onboard with optimizing this now. If not, I feel this is premature and would add complexity that isn't worth it.
For reference, the current hash time is ~0.1ms when both maps contain 1000 elements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing to keep in mind is that the FVM needs to know when any of the parameters (weights, limits, versions) changed because we need to hook up the cross block programs cache invalidation to that.
I think it would be great if a ID of the execution part would be provided as a separate thing.
The map size is currently in the 1e2
range and might go into the 1e3
range in the future, but its never going to be larger then that. I think if we want to get rid of the sorting it also needs to be represented as a list of pairs in cadence, because cadence does not offer guarantees that a map will be read in the same order in different blocks (even if the contents of the map are the same).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing to keep in mind is that the FVM needs to know when any of the parameters (weights, limits, versions) changed because we need to hook up the cross block programs cache invalidation to that.
I think it would be great if a ID of the execution part would be provided as a separate thing.
Let me try to re-phrase that to make sure I understood your point:
- You want to query the "ID of the Execution Parameters" to decide whether the parameters changes compared to some previous configuration you have cached.
- If the ID is different, you know that some of the parameters changed and invalidate the cache
- If the ID is unchanged, you know the Execution Parameters are unchanged; hence you continue to work with the cache
We already have an ID for the protocol state overall. But that may change a bit more frequently than the execution state parameters ... but probably most of the changes will be due to Execution changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sort overhead could easily be negligable.
map size is currently in the 1e2 range and might go into the 1e3 range in the future, but its never going to be larger then that
Admittedly, I am quite ocd about those aspects ... In my experience, it is generally quite time intensive to figure out where to optimize a complex system when it is desired to be faster ... if it is a lot of little aspects, you might as well just do a lot of those optimizations from the start, unless you are pretty sure that this aspect will never be too slow.
I guess my question is: why doesn't the smart contract do the sorting when it receives a transaction updating the Execution Parameters? Does it make a difference whether we implement the sorting here or in the smart contract? And once the smart contract sorted its internal representation of the Execution Parameters, it emits them as a list in the service event?
We don't depend on the ordering for safety, we only depend on the ordering for performance. Therefore, it makes sense to do the ordering once in the beginning.
But yea, I get the argument 😓 😅
// UndefinedMagnitudeOfChangeVersion represents the zero or unset value for a MagnitudeOfChangeVersion. | ||
var UndefinedMagnitudeOfChangeVersion = MagnitudeOfChangeVersion{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend to go with a uniform design
- generally, we are using
nil
to represent unset fields (incl. theViewBasedActivator
) - but here, we are using an element of the value range and explicitly define it to represent "unset"
My recommendation would be either of the following:
- we use a pointer instead of a value type
now,
protocol.UpdatableField[*protocol.MagnitudeOfChangeVersion]
nil
represents the "unset" parameter. - we remove
UndefinedMagnitudeOfChangeVersion
and haveModelv2.GetExecutionComponentVersion()
always return the value. In other words, we would remove lines 320-322 fromflow-go/state/protocol/protocol_state/kvstore/models.go
Lines 317 to 324 in c44ea8e
// GetExecutionComponentVersion returns the current Execution component version from Modelv2. // Returns kvstore.ErrKeyNotSet if the key has no value func (model *Modelv2) GetExecutionComponentVersion() (protocol.MagnitudeOfChangeVersion, error) { if model.ExecutionComponentVersion.CurrentValue == protocol.UndefinedMagnitudeOfChangeVersion { return protocol.UndefinedMagnitudeOfChangeVersion, ErrKeyNotSet } return model.ExecutionComponentVersion.CurrentValue, nil }
Reasoning:
-
In essence, with either approach, the Execution Node has to deal with the "unset" scenario:
- in the current implementation the EN gets
ErrKeyNotSet
- In case 1. the execution node gets
nil
- with solution 2. the EN gets the zero value
In the end, the EN is interpreting the return value. Therefore, it seems logical that the EN makes the determination what a particular value actually means instead of the protocol state defining that one specific value has a special meaning.
- in the current implementation the EN gets
-
I have a very soft inclination towards option 1, because I think the intention is that there should always be meaningful values. I think we can leave that edge case of what happens during the upgrade to the Execution Node.
Nevertheless, I have no big concerns with your approach ... but I also don't know how confusing this will be when we have 20 more entries in the protocol state and there is a wild mix between zero values with proper meaning, zero values representing "unset", pointers, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with using nil for possibly empty values. Given that we are going to wait on merging this, I will revisit this decision later:
- if we can align on meaningful initial values to use when merging with the Execution team, then we can use non-nilable values from the start and omit the
ErrKeyNotSet
- otherwise, we can use nil-able values per this suggestion
Converting to a draft as we don't plan to include this prior to merging EFM Recovery and deploying protocol HCU v2. Please feel free to provide feedback now; I will re-base the PR against master following the Protocol HCU. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
Say we spin up a new network (like localnet) what would be the process of changing the ExecutionEfortWeights
(and other parameters) right after or during bootstrap? Would this just mean setting values in a smart contract, or are more steps needed?
// TODO docs | ||
ExecutionEffortParameters map[uint]uint64 | ||
// TODO docs | ||
ExecutionMemoryParameters map[uint]uint64 | ||
// TODO docs | ||
ExecutionMemoryLimit uint64 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing to keep in mind is that the FVM needs to know when any of the parameters (weights, limits, versions) changed because we need to hook up the cross block programs cache invalidation to that.
I think it would be great if a ID of the execution part would be provided as a separate thing.
The map size is currently in the 1e2
range and might go into the 1e3
range in the future, but its never going to be larger then that. I think if we want to get rid of the sorting it also needs to be represented as a list of pairs in cadence, because cadence does not offer guarantees that a map will be read in the same order in different blocks (even if the contents of the map are the same).
// TODO should this live in fvm package? | ||
type ExecutionMeteringParameters struct { | ||
// TODO docs | ||
ExecutionEffortParameters map[uint]uint64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized, that we can take this opportunity to rename Execution(.*)Parameters
to a more appropriate name: Execution(.*)Weights
// teams working with this version. The engineers in those teams must commit to diligently documenting | ||
// all relevant changes, details regarding magnitude of changes and if applicable “imperfect but | ||
// good-enough downwards compatibility”. | ||
type MagnitudeOfChangeVersion struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are your thoughts using MagnitudeVersion
instead of MagnitudeOfChangeVersion
or maybe something like FeatureVersion
?
// Error Returns: | ||
// - kvstore.ErrKeyNotSupported if invoked on a KVStore instance before v2. | ||
// - kvstore.ErrKeyNotSet if the key has no value | ||
GetCadenceComponentVersion() (MagnitudeOfChangeVersion, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we condense it and instead of returning version GetCadenceComponentVersion
and GetDacdenceComponentVersionUpgrade
we implement GetCadenceComponentVersion
with return type UpdatableField[MagnitudeOfChangeVersion]
?
ExecutionEffortParameters map[uint]uint64 | ||
// TODO docs | ||
ExecutionMemoryParameters map[uint]uint64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we change this to uint64
? Because uint
is platform dependent ... and by setting it to uint64
we avoid ambiguity.
Adds execution versioning and metering parameters to the v2 KVStore. The goal of this PR was to minimally add the desired fields to v2 to avoid an additional model change down the road. This PR adds getters which always return appropriate errors but no logic for setting the new fields.
We've decided to instead wait and add these fields to the v3 KVStore. This PR will be updated accordingly and re-based against master following the v2 upgrade. Feel free to provide feedback in the meantime.
Ref: #7000