Added new topic: serialization_of_multi_member_constructs.adoc.
stoobie committed Dec 24, 2023
1 parent 7c5ffce commit e377a76
Showing 2 changed files with 116 additions and 0 deletions.
@@ -31,6 +31,7 @@
*** xref:Smart_Contracts/contract-syntax.adoc[Migrating a contract from Cairo 0 to Cairo]
*** xref:Smart_Contracts/cairo-and-sierra.adoc[Cairo and Sierra]
*** xref:Smart_Contracts/system-calls-cairo1.adoc[System calls]
*** xref:Smart_Contracts/serialization_of_multi_member_constructs.adoc[Serialization of Cairo types]

** Cryptography
*** xref:Cryptography/p-value.adoc[The STARK field]
@@ -0,0 +1,115 @@
[id="serialization_of_cairo_types"]
= Serialization of multi-member constructs in Cairo

== Which types are affected?

* Addresses: `ContractAddress`, `EthAddress`, `StorageAddress`, `ClassHash`
* All unsigned and signed integer types
* Arrays
* Enums
* Structs
* `ByteArray` (strings)


== Who is the audience for this topic?

People who want to know how to interact with contracts, especially library and SDK developers who want to create transactions.

== What is the issue?

Field elements (`felt252`), which contain 252 bits, are the only actual type in the Cairo VM. All high-level Cairo types that are larger than 252 bits, such as `u256` or arrays, are therefore ultimately represented by a list of felts.

To interact with a contract, you need to know exactly how the types in the contract’s function signature are serialized, so that you can formulate the calldata in the transaction correctly. This calldata is usually encapsulated by an SDK, such as `starknet.js`. So if you use an SDK, you don’t necessarily need to know that `u256` is represented by two felts. You specify a number, and the SDK properly encodes it.

However, when you have a structure that includes multiple members, you need to represent each member as a serialized set of field elements, where each field element can hold up to 31 bytes (248 bits). This 31-byte chunk is referred to in this context as a _word_.

For example, a string is represented in Cairo as a `ByteArray` type. Its data is stored as full 31-byte words followed by a shorter _pending word_ of at most 30 bytes, and the first byte of each word is the most significant byte in the word.

A `ByteArray` is serialized with the following structure:

[horizontal,labelwidth="20"]
1st member:: The number of 31-byte words in the array construct.
middle members:: The data. One or more field elements, where the last, or only, element is up to 30 bytes. An element of 30 bytes or less is a _pending word_.
last member:: The number of bytes of the pending word.
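
The layout described here can be sketched off-chain in a few lines of Python. This is a minimal illustration, not the code of any particular SDK: the function name `serialize_byte_array` is hypothetical, and the sketch assumes that the string is UTF-8 encoded before it is split into words.

```python
from __future__ import annotations

WORD_SIZE = 31  # bytes per full word, as described in this topic


def serialize_byte_array(text: str) -> list[int]:
    """Return the felts for a string: [word count, words..., pending word, pending length]."""
    raw = text.encode("utf-8")  # assumption: strings are UTF-8 encoded
    full_len = len(raw) - len(raw) % WORD_SIZE

    # Full 31-byte words. The first byte is the most significant, so read big-endian.
    words = [
        int.from_bytes(raw[i:i + WORD_SIZE], "big")
        for i in range(0, full_len, WORD_SIZE)
    ]

    # Whatever is left (0 to 30 bytes) is the pending word.
    pending = raw[full_len:]
    return [len(words), *words, int.from_bytes(pending, "big"), len(pending)]
```

For the string `"hello"`, this sketch returns three values: `0`, `0x68656c6c6f`, and `5`.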

.Example 1: A string shorter than 31 bytes

Consider the string `"hello"`, which is represented by the 5-byte hex value `0x68656c6c6f`. The resulting ByteArray is serialized as follows:

[source,cairo]
----
...
0, // Number of 31-byte words in the array construct.
0x68656c6c6f, // Pending word
5 // Length of the pending word, in bytes
...
----


// [horizontal,labelwidth="20"]
// 1st member:: `0`, the number of 31-byte chunks
// middle member:: `0x68656c6c6f`, 5-byte pending word. One member, which is also the pending word.
// last member:: `5`, the number of bytes in the pending word.
//

.Example 2: A string longer than 31 bytes

Consider the string `"Long string, more than 31 characters."`, which is represented by the following hex values:

* 0x4c6f6e6720737472696e672c206d6f7265207468616e203331206368617261 (31-byte word)
* 0x63746572732e (6-byte pending word)

The resulting ByteArray is serialized as follows:

[source,cairo]
----
...
1, // Number of 31-byte words in the array construct.
0x4c6f6e6720737472696e672c206d6f7265207468616e203331206368617261, // 31-byte word.
0x63746572732e, // Pending word
6 // Length of the pending word, in bytes
...
----
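
Reading the serialized values back is a useful check on the layout. The following Python sketch reverses the serialization. The function name `deserialize_byte_array` is hypothetical, and the sketch assumes that the input list contains exactly one serialized `ByteArray` holding UTF-8 text.

```python
from __future__ import annotations

WORD_SIZE = 31  # bytes per full word


def deserialize_byte_array(felts: list[int]) -> str:
    """Rebuild a string from [word count, words..., pending word, pending length]."""
    num_words = felts[0]
    if len(felts) != num_words + 3:
        raise ValueError("unexpected number of felts for a single ByteArray")

    words = felts[1:1 + num_words]
    pending_word = felts[1 + num_words]
    pending_len = felts[2 + num_words]

    # Each full word is exactly 31 bytes, most significant byte first.
    raw = b"".join(word.to_bytes(WORD_SIZE, "big") for word in words)
    # The pending word contributes only `pending_len` bytes.
    raw += pending_word.to_bytes(pending_len, "big")
    return raw.decode("utf-8")  # assumption: the bytes are UTF-8 text
```

Passing the four values from Example 2 to this sketch yields the original string, `"Long string, more than 31 characters."`.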

// [horizontal,labelwidth="20"]
// 1st member:: `1`, the number of 31-byte chunks
// middle members:: `0x4c6f6e6720737472696e672c206d6f7265207468616e203331206368617261` +
// `0x63746572732e`, // pending_word
// last member:: `6`, the number of bytes in the pending word



For reference, the `ByteArray` type is defined as the following struct:

[source,cairo]
----
pub struct ByteArray {
    // Full "words" of 31 bytes each. The first byte of each word in the byte array
    // is the most significant byte in the word.
    pub(crate) data: Array<bytes31>,
    // This felt252 actually represents a bytes31, with < 31 bytes.
    // It is represented as a felt252 to improve performance of building the byte array.
    // The number of bytes in here is specified in `pending_word_len`.
    // The first byte is the most significant byte among the `pending_word_len` bytes in the word.
    pub(crate) pending_word: felt252,
    // Should be in range [0, 30].
    pub(crate) pending_word_len: usize,
}
----

All of the integer types previously mentioned fit into a single `felt252`, except for `u256`, which needs 4 more bits than a `felt252` can hold. Under the hood, `u256` is a struct with two `u128` members: `u256 { low: u128, high: u128 }`.
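
The split into two 128-bit halves can be sketched in Python as follows. The function name `serialize_u256` is hypothetical, and the sketch assumes that the `low` member is emitted before the `high` member, following the order of the struct's members.

```python
from __future__ import annotations

U128_MASK = (1 << 128) - 1  # keeps the lowest 128 bits


def serialize_u256(value: int) -> list[int]:
    """Split a 256-bit unsigned integer into two felts: [low, high]."""
    if not 0 <= value < 1 << 256:
        raise ValueError("value does not fit in 256 bits")
    low = value & U128_MASK   # least significant 128 bits
    high = value >> 128       # most significant 128 bits
    return [low, high]
```

With this sketch, the value `1` is serialized as `[1, 0]`, and `2**128` as `[0, 1]`.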

== Additional resources

* link:https://book.cairo-lang.org/ch02-02-data-types.html#integer-types[Integer types] in _The Cairo Programming Language_.
