Add Maps to documentation #856

Merged
merged 5 commits into from
Mar 26, 2025

Conversation

igor-aptos
Contributor

Description

Generated in the UI. Not sure whether I need to autogenerate anything, or whether that happens automatically.

Checklist

  • If any existing pages were renamed or removed:
    • Were redirects added to next.config.mjs?
    • Did you update any relative links that pointed to the renamed / removed pages?
  • Do all Lints pass?
    • Have you run pnpm fmt?
    • Have you run pnpm lint?

@igor-aptos igor-aptos requested a review from a team as a code owner March 18, 2025 23:23

vercel bot commented Mar 18, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

| Name | Status | Preview | Comments | Updated (UTC) |
|------|--------|---------|----------|---------------|
| developer-docs-nextra | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Mar 26, 2025 4:31pm |

Also, if we want to parallelize transactions and a few elements are modified extremely often, `Table` can provide that.
Note that `Table` cannot be destroyed, because it doesn't know whether it is empty.
- `TableWithLength` is a wrapper around `Table` that tracks its `length`, allowing `length`, `empty` and `destroy_empty`
operations on top of the `Table`. Adding elements to or removing them from a `TableWithLength` cannot be done in parallel.
Contributor

highlight this "cannot"

shall we discourage people from using TableWithLength?

Contributor Author

you cannot use Table if you want to be able to destroy it.

we probably need TableWithLength variant that uses aggregators. not sure if it is worth copying for that :) maybe TableWithParLength ? :)
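
The parallelism point in this exchange can be illustrated with a toy model. This is a minimal Python sketch, not Aptos code: it assumes two transactions conflict whenever their write sets share a storage location, and the function names and the `("length",)` location label are hypothetical.

```python
def table_add_writes(key):
    """Write set of adding `key` to a plain Table: only that key's own slot."""
    return {("entry", key)}


def table_with_length_add_writes(key):
    """Write set for a TableWithLength add: the key's slot plus the shared length counter."""
    return {("entry", key), ("length",)}


def conflicts(writes_a, writes_b):
    """Two transactions conflict if they write to at least one common location."""
    return bool(writes_a & writes_b)


# Adds of different keys to a Table touch disjoint slots.
print(conflicts(table_add_writes("alice"), table_add_writes("bob")))  # False
# Every add to a TableWithLength also writes the shared counter.
print(conflicts(table_with_length_add_writes("alice"),
                table_with_length_add_writes("bob")))  # True
```

Under this model, any two adds to a `TableWithLength` collide on the length counter, which is why a variant that tracks length with aggregators is being floated here.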

Comment on lines 12 to 24
State on the Aptos Blockchain is managed as a set of resources. Transaction
performance depends heavily on how resources are read and written.
Storage gas costs are paid based on the number of resources that exist, and their sizes.
IO gas costs are paid based on the number of resources read and modified, and their sizes,
but are generally significantly smaller than storage gas costs.
That means that writing to a new resource has the highest (storage) gas cost, and deleting
an existing resource gives the largest refund.
Additionally, transactions modifying the same resource conflict with one another, and cannot be
executed in parallel.

One useful analogy is to think of each resource as a file on a disk:
the performance of a smart contract then correlates well with that of a program that
operates on files in the same way.
Collaborator

I wonder if this should go in a separate page on gas since this is a page about maps, but if you want to keep it here as an overview, i have some suggestions below

Suggested change
Aptos Blockchain state is managed through on-chain **resources**. Transaction performance and gas costs are heavily influenced by how these resources are read and written.
Breaking down the gas costs further, we have:
1. Storage gas costs, which are determined by the number and size of resources that exist (for example, writing to a new resource incurs the highest storage gas cost, whereas deleting an existing resource provides the largest refund).
2. IO gas costs, which are generally much lower and depend on the number and size of resources read and modified.

Transactions that modify the same resource cannot be executed in parallel, as they conflict with one another.

Comment on lines 28 to 47
- `OrderedMap` is a struct and, similar to `vector`, is fully contained within the resource that stores it.
As a result, it is bounded in size to the size of a single resource.
It provides regular map functions, as well as accessing elements in order, like front/back or prev/next.
When you need an inline mapping that will fit in a resource, this is the option to choose.
Its implementation is SortedVectorMap, but because of limited size and efficiency of memcpy, all main operations are practically O(log(n)).
- `Table` is unbounded in size, and puts each (key, value) pair in a separate resource. You can `add` or `remove` elements,
or check if it `contains` some key, but it cannot be iterated over. When keys or values are large / unbounded, we can use the `Table`.
Also, if we want to parallelize transactions and a few elements are modified extremely often, `Table` can provide that.
Note that `Table` cannot be destroyed, because it doesn't know whether it is empty.
- `TableWithLength` is a wrapper around `Table` that tracks its `length`, allowing `length`, `empty` and `destroy_empty`
operations on top of the `Table`. Adding elements to or removing them from a `TableWithLength` cannot be done in parallel.
- `BigOrderedMap` groups multiple (key, value) pairs in a single resource, but is unbounded in size, using more resources as needed.
Its implementation is a BPlusTreeMap, where each node is a resource containing an OrderedMap, with inner nodes containing only keys, while leaves contain values as well.
It is opportunistically parallel: if the map is large enough to use multiple resources, modifying the map for keys that are not close
to each other should generally be a parallel operation.
It is configured so that each resource containing an internal node has the same capacity in number of keys,
and each resource containing a leaf node has the same capacity in number of (key, value) pairs.
The capacity of nodes (both leaf and inner degree) is configurable, to allow a tradeoff between storage gas cost on one end,
and other gas costs and parallelism on the other.
It provides regular map functions, as well as accessing elements in order, like front/back or prev/next.
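
The "opportunistically parallel" behaviour described for `BigOrderedMap` can be pictured with a simplified model. The Python sketch below is an illustration only: it assumes leaves are filled left to right to a fixed capacity (a real B+ tree does not keep every leaf full), and `leaf_of` and `leaf_capacity` are hypothetical names, not part of the Aptos API.

```python
from bisect import bisect_left


def leaf_of(sorted_keys, key, leaf_capacity):
    """Index of the leaf holding `key`, assuming leaves are filled left to right."""
    position = bisect_left(sorted_keys, key)
    if position == len(sorted_keys) or sorted_keys[position] != key:
        raise KeyError(key)
    return position // leaf_capacity


keys = list(range(0, 100, 10))  # ten keys: 0, 10, ..., 90
capacity = 4                    # leaves: [0..30], [40..70], [80, 90]

# Neighbouring keys share a leaf, so updates to them would touch the same node.
print(leaf_of(keys, 10, capacity) == leaf_of(keys, 30, capacity))  # True
# Distant keys live in different leaves, so updates to them can proceed independently.
print(leaf_of(keys, 10, capacity) == leaf_of(keys, 80, capacity))  # False
```

A smaller leaf capacity spreads keys over more nodes, which improves the chance of independent updates at the price of more storage slots; that is the tradeoff the configurable degrees expose.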
Collaborator

Suggested change
| Implementation | Size Limit | Storage Structure | Key Features |
|--------------------|------------|------------------|--------------|
| **OrderedMap** | Bounded (fits in a single resource) | Stored entirely within the resource | Supports ordered access (front/back, prev/next), implemented as `SortedVectorMap`, O(log(n)) operations |
| **Table** | Unbounded | Each (key, value) stored in a separate resource | Supports `add`, `remove`, `contains`, but **not iteration**; useful for large/unbounded keys/values and high-parallelism cases |
| **TableWithLength** | Unbounded | `Table` with additional length tracking | Supports `length`, `empty`, and `destroy_empty`; cannot modify in parallel |
| **BigOrderedMap** | Unbounded | Uses multiple resources dynamically | Implemented as `BPlusTreeMap`; **opportunistically parallel** for non-adjacent keys; supports ordered access (front/back, prev/next); configurable node capacities to balance storage and performance |
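
For readers unfamiliar with the sorted-vector idea behind `OrderedMap`, here is a minimal Python sketch of such a structure. It is an illustration under assumptions, not the Move implementation: the class and method names are hypothetical, and error handling is reduced to `KeyError`.

```python
from bisect import bisect_left, bisect_right


class SortedVectorMap:
    """Toy ordered map that keeps keys sorted in one contiguous list."""

    def __init__(self):
        self._keys = []
        self._values = []

    def _find(self, key):
        """Binary search: returns (index, found) in O(log n) comparisons."""
        i = bisect_left(self._keys, key)
        return i, i < len(self._keys) and self._keys[i] == key

    def add(self, key, value):
        i, found = self._find(key)
        if found:
            raise KeyError(f"key already exists: {key!r}")
        # Insertion shifts the tail of the list; cheap while the map stays small.
        self._keys.insert(i, key)
        self._values.insert(i, value)

    def get(self, key):
        i, found = self._find(key)
        if not found:
            raise KeyError(key)
        return self._values[i]

    def remove(self, key):
        i, found = self._find(key)
        if not found:
            raise KeyError(key)
        self._keys.pop(i)
        return self._values.pop(i)

    def front(self):
        return self._keys[0], self._values[0]

    def back(self):
        return self._keys[-1], self._values[-1]

    def next_key(self, key):
        """Smallest key strictly greater than `key`, or None."""
        i = bisect_right(self._keys, key)
        return self._keys[i] if i < len(self._keys) else None


m = SortedVectorMap()
for k, v in [(3, "c"), (1, "a"), (2, "b")]:
    m.add(k, v)
print(m.front(), m.back(), m.next_key(1))  # (1, 'a') (3, 'c') 2
```

Lookups are logarithmic, while inserts and removals shift elements in memory, which stays cheap only because the whole structure is bounded to a single resource.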

Comment on lines 57 to 59
#### Creating Tables

- `new<K, V>(): Self`: creates an empty map
Collaborator

Semantics are a bit confusing here — Tables and maps are the same?

Contributor Author

I'll update the title. I'll use map throughout as a general term, and Table is just an implementation

Comment on lines +151 to +156
- `new<K, V>(): Self<K, V>`: Returns a new `BigOrderedMap` with the default configuration. It can only be called with constant-size types. For variable-size types, another constructor is needed, to explicitly choose between automatic and specific degree selection.
- `new_with_type_size_hints<K, V>(avg_key_bytes: u64, max_key_bytes: u64, avg_value_bytes: u64, max_value_bytes: u64): Self<K, V>`: Returns a map that is configured to perform best when keys and values are of the given `avg` sizes, and is guaranteed to fit elements up to the given `max` sizes.
- `new_with_config<K, V>(inner_max_degree: u16, leaf_max_degree: u16, reuse_slots: bool): Self<K, V>`: Returns a new `BigOrderedMap` with the provided max degree constants (the maximum number of children a node can have, both inner and leaf). If 0 is passed for either, it is dynamically computed based on the size of the first key and value, and keys and values up to 100 times larger will be accepted.
If a non-zero value is passed, the sizes of all elements must respect the following (or their additions will be rejected):
- `key_size * inner_max_degree <= MAX_NODE_BYTES`
- `entry_size * leaf_max_degree <= MAX_NODE_BYTES`
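
The two inequalities above can be checked by hand before picking degrees. The Python sketch below is a rough illustration, not the framework's logic: the `MAX_NODE_BYTES` value is a placeholder rather than the real constant, `entry_size` is assumed to be key size plus value size, and `config_accepts` is a hypothetical helper.

```python
MAX_NODE_BYTES = 1024  # placeholder for illustration; not the actual framework constant


def config_accepts(key_size, value_size, inner_max_degree, leaf_max_degree,
                   max_node_bytes=MAX_NODE_BYTES):
    """Check both size constraints for explicitly chosen (non-zero) degrees."""
    entry_size = key_size + value_size  # assumption: an entry is one key plus one value
    fits_inner = key_size * inner_max_degree <= max_node_bytes
    fits_leaf = entry_size * leaf_max_degree <= max_node_bytes
    return fits_inner and fits_leaf


print(config_accepts(8, 8, 16, 16))     # True: 128 and 256 bytes both fit
print(config_accepts(32, 100, 16, 16))  # False: leaf needs 132 * 16 = 2112 bytes
```

With larger values, lowering `leaf_max_degree` is the lever that brings the second constraint back within bounds.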
Collaborator

Maybe make a header for these methods like you did above for Table methods

## Different Map implementations

- `OrderedMap` is a struct and, similar to `vector`, is fully contained within the resource that stores it.
As a result, it is bounded in size to the size of a single resource.
Contributor

@lightmark lightmark Mar 20, 2025

I would use blob to replace resource here. People will be confused by resource & resource groups... let's avoid those in this doc and use blob as the basic unit of a leaf in a merkle tree.

It provides regular map functions, as well as accessing elements in order, like front/back or prev/next.
When you need an inline mapping that will fit in a resource, this is the option to choose.
Its implementation is SortedVectorMap, but because of limited size and efficiency of memcpy, all main operations are practically O(log(n)).
- `Table` is unbounded in size, and puts each (key, value) pair in a separate resource. You can `add` or `remove` elements,
Contributor

single blob, since resource is a Move concept. We are discussing storage.

@igor-aptos
Contributor Author

addressed comments, and added perf table

@igor-aptos
Contributor Author

I took @hariria's suggestion to have a table instead of a list for comparison, and to shorten it. Let me know if it looks better, or if you prefer the older version.

Collaborator

@hariria hariria left a comment

LGTM

@igor-aptos
Contributor Author

@lightmark term we use in code is storage slot, so I'll use that.

made a change to differentiate storage slot and resource (there are still some references to the "resource" where applicable)

@igor-aptos
Contributor Author

I'll land so I have the URL, and to see how it looks, but feel free to continue commenting, and I'll send another PR to adjust

@igor-aptos igor-aptos enabled auto-merge (squash) March 20, 2025 22:01

We measured performance at small scale, measuring the microseconds taken for a single pair of `insert` + `remove` operations on maps of varying size.

| num elements | OrderedMap | BigOrderedMap all inlined | BigOrderedMap max_degree=16 |
Contributor

probably needs a bit of elaboration on the setup, explaining "all inlined" vs "max_degree=16"

maybe describe the inlining in the comparison table above. (nice if a confused reader can find something by searching "inline")

Contributor Author

removed inline, and added section for max_degree


#### Destroying Maps

All except `Table` support:
Contributor

The exceptions are covered under a few of the titles below, so it does not seem necessary to call them out in this section.

@igor-aptos igor-aptos disabled auto-merge March 20, 2025 22:34
@igor-aptos igor-aptos enabled auto-merge (squash) March 20, 2025 22:34
Comment on lines -78 to -79
"smart-table": {
title: "Smart Table",
Collaborator

Why are we removing the smart table documentation altogether?

@igor-aptos
Contributor Author

@gregnazario added the SmartTable section to "Maps" page, don't think we need two pages - as most of the functions are consistent across maps.

Added caveats on why SmartTable should be avoided there.

- `SimpleMap` has been deprecated, and replaced with `OrderedMap`.
- `SmartTable` has been deprecated, and replaced with `BigOrderedMap`.

#### Performance comparison
Contributor

@manudhundi manudhundi Mar 24, 2025

Maybe a perf table that highlights the message that creating a "slot" is costly? That is, Table is costlier than BigOrderedMap.

Contributor Author

I need to add appropriate tests to do that, so will leave this for some later PR

@gregnazario gregnazario dismissed their stale review March 26, 2025 16:23

Removing blocker

@igor-aptos igor-aptos merged commit 0d603e4 into main Mar 26, 2025
4 of 5 checks passed
@igor-aptos igor-aptos deleted the igor/maps_docs branch March 26, 2025 16:31
7 participants