Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Maps to documentation #856

Merged
merged 5 commits into from
Mar 26, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions apps/nextra/next.config.mjs
Original file line number Diff line number Diff line change
Expand Up @@ -1866,6 +1866,11 @@ export default withBundleAnalyzer(
destination: "/en/network/faucet",
permanent: true,
},
{
source: "/en/build/smart-contracts/smart-table",
destination: "/en/build/smart-contracts/maps",
permanent: true,
},
],
}),
);
4 changes: 2 additions & 2 deletions apps/nextra/pages/en/build/smart-contracts/_meta.tsx
Original file line number Diff line number Diff line change
Expand Up @@ -75,8 +75,8 @@ export default {
"smart-vector": {
title: "Smart Vector",
},
"smart-table": {
title: "Smart Table",
Comment on lines -78 to -79
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we removing smart table documentation altogether

maps: {
title: "Maps",
},
"---examples---": {
type: "separator",
Expand Down
196 changes: 196 additions & 0 deletions apps/nextra/pages/en/build/smart-contracts/maps.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,196 @@
---
title: "Maps"
---

# Overview

There are multiple different implementations of key-value maps inside the framework, suited for different usecases.
We will go over their differences and similarities, and how to choose which one to use.

## Aptos Blockchain performance and gas cost considerations

Aptos Blockchain state is kept in **storage slots**.
Furthermore, transaction performance and gas cost is heavily influenced by how these **slots** are read and written.
Breaking down the gas costs further, we have:
1. Storage fee, which are determined by the number and size of **slots** (i.e., writing to a new **slot** incurs the highest storage fee, whereas deleting an existing **slot** provides the largest refund.)
2. IO gas costs —generally much lower— which depend on the number and size of resources read and modified.
3. execution gas costs are based on the computation needed, and are generally in the similar scale as io gas costs.

Transactions that modify the same **slot** cannot be executed concurrently (with some exceptions, like aggregators and resources as a part of the same resource group), as they conflict with one another.

One useful analogy is thinking about each **slot** being a file on a disk,
then performance of smart contract would correlate well to a program that
operates on files in the same way.

## Different Map implementations

| Implementation | Size Limit | Storage Structure | Key Features |
|--------------------|------------|------------------|--------------|
| `OrderedMap` | Bounded (fits in a single **slot**) | Stored entirely within the resource that contains it | Supports ordered access (front/back, prev/next), implemented as sorted vector, but operations are effectively O(log(n)) due to internal optimizations |
| `Table` | Unbounded | Each (key, value) stored in a separate **slot** | Supports basic operations, like `add`, `remove`, `contains`, but **not iteration**, and **cannot be destroyed**; useful for large/unbounded keys/values and where high-concurrency is needed |
| `TableWithLength` | Unbounded | same as `Table` | Variant of `Table`, with additional length tracking, which adds `length`, `empty`, and `destroy_empty` methods; Adding or removing elements **cannot** be done concurrently, modifying existing elements can. |
| `BigOrderedMap` | Unbounded | Combines multiple keys into a single **slot**, initially stored within resource that contains it, and grows into multiple **slots** dynamically | Implemented as B+ tree; **opportunistically concurrent** for non-adjacent keys; supports ordered access (front/back, prev/next); configurable node capacities to balance storage and performance |

Note:
- `SimpleMap` has been deprecated, and replaced with `OrderedMap`.
- `SmartTable` has been deprecated, and replaced with `BigOrderedMap`.

#### Performance comparison
Copy link
Contributor

@manudhundi manudhundi Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a perf table that highlights the message that creating a "slot" is costly ? That is, Table is costlier than BigOrderedMap

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to add appropriate tests to do that, so will leave this for some later PR


We measured performance at small scale, measuring microseconds taken for a single pair of `insert` + `remove` operation, into a map of varied size.

| num elements | OrderedMap | BigOrderedMap max_degree>10000 | BigOrderedMap max_degree=16 |
|--------------|------------|---------------------------|-----------------------------|
| 10 | 65 | 123 | 123 |
| 100 | 85 | 146 | 455 |
| 1000 | 105 | 168 | 567 |
| 10000 | 142 | 210 | 656 |

You can see that overhead of `BigOrderedMap` compared to `OrderedMap`, when both are in the single **slot**, is around 1.5-2x.
So you can generally used `BigOrdredMap` when it is unknown if data will be too large to be stored in a single **slot**.

## Common map operations:

Most maps above support the same set of functions (for actual signatures and restrictions, check out the corresponding implementations):

#### Creating Maps

- `new<K, V>(): Self`: creates an empty map

#### Destroying Maps

- `destroy_empty<K, V>(self: Self<K, V>)`: Destroys an empty map. (**not** supported by `Table`)
- `destroy<K, V>(self: Self<K, V>, dk: |K|, dv: |V|)`: Destroys a map with given functions that destroy correponding elements. (**not** supported by `Table` and `TableWithLength`)

#### Managing Entries

- `add<K, V>(self: &mut Self<K, V>, key: K, value: V)`: Adds a key-value pair to the map.
- `remove<K, V>(self: &mut Self<K, V>, key: K): V`: Removes and returns the value associated with a key.
- `upsert<K, V>(self: &mut Self<K, V>, key: K, value: V): Option<V>`: Inserts or updates a key-value pair.
- `add_all<K, V>(self: &mut Self<K, V>, keys: vector<K>, values: vector<V>)`: Adds multiple key-value pairs to the map. (**not** supported by `Table` and `TableWithLength`)

#### Retrieving Entries

- `contains<K, V>(self: &Self<K, V>, key: &K): bool`: Checks whether key exists in the map.
- `borrow<K, V>(self: &Self<K, V>, key: &K): &V`: Returns an immutable reference to the value associated with a key.
- `borrow_mut<K: drop, V>(self: &mut Self<K, V>, key: K): &mut V`: Returns a mutable reference to the value associated with a key.
(`BigOrderedMap` only allows `borrow_mut` when value type has a static constant size, due to modification being able to break it's invariants otherwise. Use `remove()` and `add()` combination instead)

#### Order-dependant functions

These set of functions are only implemented by `OrderedMap` and `BigOrderedMap`.

- `borrow_front<K, V>(self: &Self<K, V>): (&K, &V)`
- `borrow_back<K, V>(self: &Self<K, V>): (&K, &V)`
- `pop_front<K, V>(self: &mut Self<K, V>): (K, V)`
- `pop_back<K, V>(self: &mut Self<K, V>): (K, V)`
- `prev_key<K: copy, V>(self: &Self<K, V>, key: &K): Option<K>`
- `next_key<K: copy, V>(self: &Self<K, V>, key: &K): Option<K>`

#### Utility Functions

- `length<K, V>(self: &Self<K, V>): u64`: Returns the number of entries in the map. (not supported by `Table`)

#### Traversal Functions

These set of functions are not implemented by `Table` and `TableWithLength`.

- `keys<K: copy, V>(self: &Self<K, V>): vector<K>`
- `values<K, V: copy>(self: &Self<K, V>): vector<V>`
- `to_vec_pair<K, V>(self: Self<K, V>): (vector<K>, vector<V>)`
- `for_each_ref<K, V>(self: &Self<K, V>, f: |&K, &V|)`

- `to_ordered_map<K, V>(self: &BigOrderedMap<K, V>): OrderedMap<K, V>`: Converts `BigOrderedMap` into `OrderedMap`

## Example Usage

### Creating and Using a OrderedMap

```move filename="map_usage.move"
module 0x42::map_usage {
use aptos_framework::ordered_map;

public entry fun main() {
let map = ordeded_map::new<u64, u64>();
map.add(1, 100);
map.add(2, 200);

let length = map.length();
assert!(length == 2, 0);

let value1 = map.borrow(&1);
assert!(*value1 == 100, 0);

let value2 = map.borrow(&2);
assert!(*value2 == 200, 0);

let removed_value = map.remove(&1);
assert!(removed_value == 100, 0);

map.destroy_empty();
}
}
```

## Additional details for `BigOrderedMap`

Its current implementation is B+ tree, which is chosen as it is best suited for the onchain storage layout - where the majority of cost comes from loading and writing to storage items, and there is no partial read/write of them.

Implementation has few characteristics that make it very versatile and useful across wide range of usecases:

- When it has few elements, it stores all of them within the resource that contains it, providing comparable performance to OrderedMap itself, while then dynamically growing to multiple resources as more and more elements are added
- It reduces amount of conflicts: modifications to a different part of the key-space can be generally done concurrently, and it provides knobs for tuning between concurrency and size
- All operations have guaranteed upper-bounds on performance (how long they take, as well as how much execution and io gas they consume), allowing for safe usage across a variety of use cases.
- One caveat, is refundable storage fee. By default, operation that requires map to grow to more resources needs to pay for storage fee for it. Implementation here has an option to pre-pay for storage slots, and to reuse them as elements are added/removed, allowing applications to achieve fully predictable overall gas charges, if needed.
- If key/value is within the size limits map was configured with, inserts will never fail unpredictably, as map internally understands and manages maximal **slot** size limits.

### `BigOrderedMap` structure

`BigOrderedMap` is represented as a tree, where inner nodes split the "key-space" into separate ranges for each of it's children, and leaf nodes contain the actual key-value pairs.
Internally it has `inner_max_degree` representing largest number of children an inner node can have, and `leaf_max_degree` representing largest number of key-value pairs leaf node can have.

#### Creating `BigOrderedMap`

Because it's layout affects what can be inserted and performance, there are a few ways to create and configure it:

- `new<K, V>(): Self<K, V>`: Returns a new `BigOrderedMap` with the default configuration. Only allowed to be called with constant size types. For variable sized types, another constructor is needed, to explicitly select automatic or specific degree selection.
- `new_with_type_size_hints<K, V>(avg_key_bytes: u64, max_key_bytes: u64, avg_value_bytes: u64, max_value_bytes: u64): Self<K, V>`: Returns a map that is configured to perform best when keys and values are of given `avg` sizes, and guarantees to fit elements up to given `max` sizes.
- `new_with_config<K, V>(inner_max_degree: u16, leaf_max_degree: u16, reuse_slots: bool): Self<K, V>`: Returns a new `BigOrderedMap` with the provided max degree consts (the maximum # of children a node can have, both inner and leaf). If 0 is passed for either, then it is dynamically computed based on size of first key and value, and keys and values up to 100x times larger will be accepted.
If non-0 is passed, sizes of all elements must respect (or their additions will be rejected):
- `key_size * inner_max_degree <= MAX_NODE_BYTES`
- `entry_size * leaf_max_degree <= MAX_NODE_BYTES`
Comment on lines +156 to +161
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe make a header for these methods like you did above for Table methods


`reuse_slots` means that removing elements from the map doesn't free the storage slots and returns the refund.
Together with `allocate_spare_slots`, it allows to preallocate slots and have inserts have predictable gas costs.
(otherwise, inserts that require map to add new nodes, cost significantly more, compared to the rest)

## Source Code

- [ordered_map.move](https://github.com/aptos-labs/aptos-core/blob/main/aptos-move/framework/aptos-framework/sources/datastructures/ordered_map.move)
- [table.move](https://github.com/aptos-labs/aptos-core/blob/6f5872b567075fe3615e1363d35f89dc5eb45b0d/aptos-move/framework/aptos-stdlib/sources/table.move)
- [table_with_length.move](https://github.com/aptos-labs/aptos-core/blob/6f5872b567075fe3615e1363d35f89dc5eb45b0d/aptos-move/framework/aptos-stdlib/sources/table.move)
- [big_ordered_map.move](https://github.com/aptos-labs/aptos-core/blob/main/aptos-move/framework/aptos-framework/sources/datastructures/big_ordered_map.move)

## Additional details of (deprecated) SmartTable

The Smart Table is a scalable hash table implementation based on linear hashing.
This data structure aims to optimize storage and performance by utilizing linear hashing, which splits one bucket at a time instead of doubling the number of buckets, thus avoiding unexpected gas costs.
Unfortunatelly, it's implementation makes every addition/removal be a conflict, making such transactions fully sequential.
The Smart Table uses the SipHash function for faster hash computations while tolerating collisions. Unfortunatelly, this also means that collisions are predictable, which means that if end users can control the keys being inserted, it can have large number of collisions in a single bucket.

### SmartTable Structure

The `SmartTable` struct is designed to handle dynamic data efficiently:

- `buckets`: A table with a length that stores vectors of entries.
- `num_buckets`: The current number of buckets.
- `level`: The number of bits representing `num_buckets`.
- `size`: The total number of items in the table.
- `split_load_threshold`: The load threshold percentage that triggers bucket splits.
- `target_bucket_size`: The target size of each bucket, which is not strictly enforced.

### SmartTable usage examples

- [Move Spiders Smart Table](https://movespiders.com/courses/modules/datastructures/lessonId/7)
- [Move Spiders Querying Smart Table via FullNode APIs](https://movespiders.com/courses/modules/datastructures/lessonId/9)
- [Move Spiders Querying Smart Table via View Function](https://movespiders.com/courses/modules/datastructures/lessonId/10)
Loading
Loading