feat: improve type.hashTreeRoot() using batch #409

Open
wants to merge 7 commits into
base: master

twoeths (Contributor) commented Oct 15, 2024

Motivation

  • improve type.hashTreeRoot() using batch

Description

  • Instead of calling getRoots() and computing the root from the returned chunks, this PR implements getChunkBytes()
    • the root is then computed with merkleizeInto(), which hashes in batches
    • the type reuses its chunkBytesBuffer memory, so there are almost no intermediate Uint8Array allocations
  • New hashTreeRootInto() API. This is needed when consumers want to reuse their own memory allocation
  • Use allocUnsafe() from as-sha256 where it makes sense
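
To illustrate the buffer-reuse idea in the description above, here is a minimal sketch. It is not the code in this PR. It merkleizes a flat buffer of 32-byte chunks in place and writes the root into a caller-provided output array, which is the pattern an "Into"-style API enables. The names merkleizeInPlace and hash64 are hypothetical, the chunk count is assumed to be a power of two, and Node's crypto module stands in for the batch hasher (it hashes one block at a time and allocates each digest, so only the in-place layout is shown, not the batching).

```typescript
import {createHash} from "node:crypto";

/** SHA-256 of one 64-byte block, i.e. two concatenated 32-byte chunks. */
function hash64(block: Uint8Array): Uint8Array {
  return createHash("sha256").update(block).digest();
}

/**
 * Hypothetical helper: merkleize `chunkCount` 32-byte chunks stored at the
 * start of `buf`, overwriting `buf` level by level, then copy the 32-byte
 * root into `output` at `offset`.
 * Assumes chunkCount is a power of two and buf.length >= chunkCount * 32.
 */
function merkleizeInPlace(
  buf: Uint8Array,
  chunkCount: number,
  output: Uint8Array,
  offset: number
): void {
  if (chunkCount < 1 || (chunkCount & (chunkCount - 1)) !== 0) {
    throw new Error("chunkCount must be a power of two");
  }
  if (buf.length < chunkCount * 32) {
    throw new Error("buffer too small for chunkCount");
  }
  // Each pass halves the number of nodes. Parent i is written to bytes
  // [32i, 32i + 32), which earlier pairs have already consumed.
  for (let count = chunkCount; count > 1; count >>= 1) {
    for (let i = 0; i < count / 2; i++) {
      const parent = hash64(buf.subarray(i * 64, (i + 1) * 64));
      buf.set(parent, i * 32);
    }
  }
  output.set(buf.subarray(0, 32), offset);
}

// Usage: the caller owns both the scratch buffer and the output.
const scratch = new Uint8Array(4 * 32); // four chunks, reused across calls
const root = new Uint8Array(32);
merkleizeInPlace(scratch, 4, root, 0);
```

Because the input buffer is overwritten, a caller that still needs the original chunks has to refill the scratch buffer before each call.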

cherry picked from #378

github-actions bot commented Oct 15, 2024

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite Current: 6351e2b Previous: 5041139 Ratio
digestTwoHashObjects 50023 times 48.772 ms/op 48.248 ms/op 1.01
digest2Bytes32 50023 times 54.866 ms/op 54.794 ms/op 1.00
digest 50023 times 54.151 ms/op 53.430 ms/op 1.01
input length 32 1.1620 us/op 1.1670 us/op 1.00
input length 64 1.3050 us/op 1.3180 us/op 0.99
input length 128 2.3190 us/op 2.2320 us/op 1.04
input length 256 3.3840 us/op 3.3300 us/op 1.02
input length 512 5.6140 us/op 5.5630 us/op 1.01
input length 1024 10.721 us/op 10.681 us/op 1.00
digest 1000000 times 863.76 ms/op 852.84 ms/op 1.01
hashObjectToByteArray 50023 times 1.2288 ms/op 1.2289 ms/op 1.00
byteArrayToHashObject 50023 times 1.5485 ms/op 1.5426 ms/op 1.00
digest64 200092 times 213.57 ms/op 214.18 ms/op 1.00
hash 200092 times using batchHash4UintArray64s 264.06 ms/op 250.07 ms/op 1.06
digest64HashObjects 200092 times 193.14 ms/op 191.43 ms/op 1.01
hash 200092 times using batchHash4HashObjectInputs 234.19 ms/op 197.77 ms/op 1.18
getGindicesAtDepth 3.4720 us/op 3.4480 us/op 1.01
iterateAtDepth 6.2800 us/op 6.3750 us/op 0.99
getGindexBits 367.00 ns/op 356.00 ns/op 1.03
gindexIterator 873.00 ns/op 896.00 ns/op 0.97
HashComputationLevel.push then loop 25.666 ms/op 25.761 ms/op 1.00
HashComputation[] push then loop 38.016 ms/op 47.920 ms/op 0.79
hash 2 Uint8Array 500000 times - hashtree 215.55 ms/op 213.19 ms/op 1.01
hashTwoObjects 500000 times - hashtree 217.74 ms/op 214.65 ms/op 1.01
executeHashComputations - hashtree 9.1150 ms/op 9.4258 ms/op 0.97
hash 2 Uint8Array 500000 times - as-sha256 558.77 ms/op 560.79 ms/op 1.00
hashTwoObjects 500000 times - as-sha256 508.70 ms/op 509.59 ms/op 1.00
executeHashComputations - as-sha256 48.588 ms/op 44.935 ms/op 1.08
hash 2 Uint8Array 500000 times - noble 1.2271 s/op 1.2219 s/op 1.00
hashTwoObjects 500000 times - noble 1.6578 s/op 1.6307 s/op 1.02
executeHashComputations - noble 36.628 ms/op 36.118 ms/op 1.01
getHashComputations 2.5400 ms/op 2.0905 ms/op 1.22
executeHashComputations 10.339 ms/op 10.248 ms/op 1.01
get root 15.560 ms/op 15.171 ms/op 1.03
getNodeH() x7812.5 avg hindex 12.378 us/op 12.186 us/op 1.02
getNodeH() x7812.5 index 0 7.4860 us/op 7.5320 us/op 0.99
getNodeH() x7812.5 index 7 7.6530 us/op 7.4730 us/op 1.02
getNodeH() x7812.5 index 7 with key array 6.3940 us/op 6.2860 us/op 1.02
new LeafNode() x7812.5 306.22 us/op 298.94 us/op 1.02
getHashComputations 250000 nodes 14.673 ms/op 14.464 ms/op 1.01
batchHash 250000 nodes 86.618 ms/op 85.554 ms/op 1.01
get root 250000 nodes 118.20 ms/op 115.63 ms/op 1.02
getHashComputations 500000 nodes 41.078 ms/op 33.095 ms/op 1.24
batchHash 500000 nodes 162.21 ms/op 165.80 ms/op 0.98
get root 500000 nodes 234.81 ms/op 230.95 ms/op 1.02
getHashComputations 1000000 nodes 56.923 ms/op 61.868 ms/op 0.92
batchHash 1000000 nodes 330.58 ms/op 360.79 ms/op 0.92
get root 1000000 nodes 476.00 ms/op 468.00 ms/op 1.02
multiproof - depth 15, 1 requested leaves 7.9890 us/op 8.0560 us/op 0.99
tree offset multiproof - depth 15, 1 requested leaves 17.623 us/op 17.674 us/op 1.00
compact multiproof - depth 15, 1 requested leaves 2.9870 us/op 2.9060 us/op 1.03
multiproof - depth 15, 2 requested leaves 11.478 us/op 11.648 us/op 0.99
tree offset multiproof - depth 15, 2 requested leaves 21.192 us/op 21.182 us/op 1.00
compact multiproof - depth 15, 2 requested leaves 2.9900 us/op 2.9630 us/op 1.01
multiproof - depth 15, 3 requested leaves 16.425 us/op 16.292 us/op 1.01
tree offset multiproof - depth 15, 3 requested leaves 27.727 us/op 27.060 us/op 1.02
compact multiproof - depth 15, 3 requested leaves 3.6250 us/op 3.5790 us/op 1.01
multiproof - depth 15, 4 requested leaves 21.536 us/op 21.682 us/op 0.99
tree offset multiproof - depth 15, 4 requested leaves 34.521 us/op 33.787 us/op 1.02
compact multiproof - depth 15, 4 requested leaves 4.2400 us/op 4.1570 us/op 1.02
packedRootsBytesToLeafNodes bytes 4000 offset 0 5.5790 us/op 5.5150 us/op 1.01
packedRootsBytesToLeafNodes bytes 4000 offset 1 5.6720 us/op 5.4370 us/op 1.04
packedRootsBytesToLeafNodes bytes 4000 offset 2 5.5830 us/op 5.4920 us/op 1.02
packedRootsBytesToLeafNodes bytes 4000 offset 3 5.6140 us/op 5.4890 us/op 1.02
subtreeFillToContents depth 40 count 250000 49.610 ms/op 46.753 ms/op 1.06
setRoot - gindexBitstring 20.825 ms/op 21.228 ms/op 0.98
setRoot - gindex 21.474 ms/op 22.040 ms/op 0.97
getRoot - gindexBitstring 2.5954 ms/op 2.5335 ms/op 1.02
getRoot - gindex 3.0795 ms/op 3.1068 ms/op 0.99
getHashObject then setHashObject 21.978 ms/op 22.187 ms/op 0.99
setNodeWithFn 19.477 ms/op 19.955 ms/op 0.98
getNodeAtDepth depth 0 x100000 280.80 us/op 282.81 us/op 0.99
setNodeAtDepth depth 0 x100000 2.5836 ms/op 2.6916 ms/op 0.96
getNodesAtDepth depth 0 x100000 313.56 us/op 315.41 us/op 0.99
setNodesAtDepth depth 0 x100000 830.44 us/op 791.01 us/op 1.05
getNodeAtDepth depth 1 x100000 342.08 us/op 343.13 us/op 1.00
setNodeAtDepth depth 1 x100000 8.3838 ms/op 8.5334 ms/op 0.98
getNodesAtDepth depth 1 x100000 435.63 us/op 440.75 us/op 0.99
setNodesAtDepth depth 1 x100000 6.8704 ms/op 6.8871 ms/op 1.00
getNodeAtDepth depth 2 x100000 918.55 us/op 740.07 us/op 1.24
setNodeAtDepth depth 2 x100000 15.166 ms/op 15.720 ms/op 0.96
getNodesAtDepth depth 2 x100000 19.070 ms/op 18.316 ms/op 1.04
setNodesAtDepth depth 2 x100000 22.060 ms/op 22.968 ms/op 0.96
tree.getNodesAtDepth - gindexes 9.3007 ms/op 8.7213 ms/op 1.07
tree.getNodesAtDepth - push all nodes 2.2438 ms/op 2.4908 ms/op 0.90
tree.getNodesAtDepth - navigation 320.72 us/op 311.60 us/op 1.03
tree.setNodesAtDepth - indexes 725.59 us/op 731.42 us/op 0.99
set at depth 8 662.00 ns/op 801.00 ns/op 0.83
set at depth 16 1.2220 us/op 1.1460 us/op 1.07
set at depth 32 2.1560 us/op 1.9810 us/op 1.09
iterateNodesAtDepth 8 256 15.038 us/op 14.435 us/op 1.04
getNodesAtDepth 8 256 3.8810 us/op 3.7610 us/op 1.03
iterateNodesAtDepth 16 65536 4.7826 ms/op 5.1401 ms/op 0.93
getNodesAtDepth 16 65536 2.1091 ms/op 1.5996 ms/op 1.32
iterateNodesAtDepth 32 250000 16.296 ms/op 15.931 ms/op 1.02
getNodesAtDepth 32 250000 6.7817 ms/op 4.4646 ms/op 1.52
iterateNodesAtDepth 40 250000 15.491 ms/op 15.824 ms/op 0.98
getNodesAtDepth 40 250000 4.6561 ms/op 4.5864 ms/op 1.02
250000 validators root getter 118.76 ms/op 115.88 ms/op 1.02
250000 validators batchHash() 82.807 ms/op 91.401 ms/op 0.91
250000 validators hashComputations 18.866 ms/op 19.528 ms/op 0.97
bitlist bytes to struct (120,90) 793.00 ns/op 743.00 ns/op 1.07
bitlist bytes to tree (120,90) 2.6650 us/op 2.5500 us/op 1.05
bitlist bytes to struct (2048,2048) 990.00 ns/op 990.00 ns/op 1.00
bitlist bytes to tree (2048,2048) 4.0470 us/op 4.0390 us/op 1.00
ByteListType - deserialize 7.5467 ms/op 7.2279 ms/op 1.04
BasicListType - deserialize 14.501 ms/op 14.356 ms/op 1.01
ByteListType - serialize 7.9186 ms/op 7.7393 ms/op 1.02
BasicListType - serialize 10.295 ms/op 9.7326 ms/op 1.06
BasicListType - tree_convertToStruct 26.558 ms/op 26.077 ms/op 1.02
List[uint8, 68719476736] len 300000 ViewDU.getAll() + iterate 4.6499 ms/op 4.7901 ms/op 0.97
List[uint8, 68719476736] len 300000 ViewDU.get(i) 4.6046 ms/op 4.4103 ms/op 1.04
Array.push len 300000 empty Array - number 6.7940 ms/op 6.6991 ms/op 1.01
Array.set len 300000 from new Array - number 1.7463 ms/op 1.7371 ms/op 1.01
Array.set len 300000 - number 5.6890 ms/op 5.5399 ms/op 1.03
Uint8Array.set len 300000 487.38 us/op 480.11 us/op 1.02
Uint32Array.set len 300000 559.79 us/op 539.82 us/op 1.04
Container({a: uint8, b: uint8}) getViewDU x300000 25.251 ms/op 24.820 ms/op 1.02
ContainerNodeStruct({a: uint8, b: uint8}) getViewDU x300000 10.768 ms/op 10.455 ms/op 1.03
List(Container) len 300000 ViewDU.getAllReadonly() + iterate 211.73 ms/op 197.86 ms/op 1.07
List(Container) len 300000 ViewDU.getAllReadonlyValues() + iterate 244.02 ms/op 226.89 ms/op 1.08
List(Container) len 300000 ViewDU.get(i) 6.4973 ms/op 6.3845 ms/op 1.02
List(Container) len 300000 ViewDU.getReadonly(i) 6.3111 ms/op 6.3471 ms/op 0.99
List(ContainerNodeStruct) len 300000 ViewDU.getAllReadonly() + iterate 40.030 ms/op 38.556 ms/op 1.04
List(ContainerNodeStruct) len 300000 ViewDU.getAllReadonlyValues() + iterate 5.0000 ms/op 5.2337 ms/op 0.96
List(ContainerNodeStruct) len 300000 ViewDU.get(i) 6.1139 ms/op 6.1641 ms/op 0.99
List(ContainerNodeStruct) len 300000 ViewDU.getReadonly(i) 6.0587 ms/op 6.0643 ms/op 1.00
Array.push len 300000 empty Array - object 6.4613 ms/op 6.4074 ms/op 1.01
Array.set len 300000 from new Array - object 2.0504 ms/op 1.9925 ms/op 1.03
Array.set len 300000 - object 6.0209 ms/op 6.0039 ms/op 1.00
cachePermanentRootStruct no cache 3.4010 us/op 5.0960 us/op 0.67
cachePermanentRootStruct with cache 194.00 ns/op 199.00 ns/op 0.97
epochParticipation len 250000 rws 7813 2.3420 ms/op 2.3262 ms/op 1.01
Deneb BeaconBlock.hashTreeRoot(), numTransaction=200 5.3556 ms/op
BeaconState ViewDU hashTreeRoot() vc=200000 112.57 ms/op 112.48 ms/op 1.00
BeaconState ViewDU recursive hash - commit step vc=200000 4.9000 ms/op 4.5695 ms/op 1.07
BeaconState ViewDU validator tree creation vc=10000 39.953 ms/op 39.100 ms/op 1.02
BeaconState ViewDU batchHashTreeRoot vc=200000 102.09 ms/op 98.243 ms/op 1.04
BeaconState ViewDU hashTreeRoot - commit step vc=200000 89.630 ms/op 88.615 ms/op 1.01
BeaconState ViewDU hashTreeRoot - hash step vc=200000 15.659 ms/op 15.568 ms/op 1.01
deserialize Attestation - tree 3.7780 us/op 3.5980 us/op 1.05
deserialize Attestation - struct 2.0660 us/op 1.9260 us/op 1.07
deserialize SignedAggregateAndProof - tree 4.9640 us/op 5.0130 us/op 0.99
deserialize SignedAggregateAndProof - struct 3.2500 us/op 3.0720 us/op 1.06
deserialize SyncCommitteeMessage - tree 1.4340 us/op 1.4160 us/op 1.01
deserialize SyncCommitteeMessage - struct 1.1270 us/op 1.0680 us/op 1.06
deserialize SignedContributionAndProof - tree 3.0600 us/op 2.9390 us/op 1.04
deserialize SignedContributionAndProof - struct 2.4190 us/op 2.3790 us/op 1.02
deserialize SignedBeaconBlock - tree 287.40 us/op 294.19 us/op 0.98
deserialize SignedBeaconBlock - struct 131.66 us/op 123.51 us/op 1.07
BeaconState vc 300000 - deserialize tree 642.68 ms/op 626.16 ms/op 1.03
BeaconState vc 300000 - serialize tree 158.09 ms/op 127.41 ms/op 1.24
BeaconState.historicalRoots vc 300000 - deserialize tree 860.00 ns/op 849.00 ns/op 1.01
BeaconState.historicalRoots vc 300000 - serialize tree 627.00 ns/op 636.00 ns/op 0.99
BeaconState.validators vc 300000 - deserialize tree 611.36 ms/op 600.20 ms/op 1.02
BeaconState.validators vc 300000 - serialize tree 112.12 ms/op 99.082 ms/op 1.13
BeaconState.balances vc 300000 - deserialize tree 28.313 ms/op 27.416 ms/op 1.03
BeaconState.balances vc 300000 - serialize tree 4.2633 ms/op 4.5367 ms/op 0.94
BeaconState.previousEpochParticipation vc 300000 - deserialize tree 1.0718 ms/op 892.01 us/op 1.20
BeaconState.previousEpochParticipation vc 300000 - serialize tree 352.51 us/op 312.54 us/op 1.13
BeaconState.currentEpochParticipation vc 300000 - deserialize tree 1.1048 ms/op 902.60 us/op 1.22
BeaconState.currentEpochParticipation vc 300000 - serialize tree 354.09 us/op 334.63 us/op 1.06
BeaconState.inactivityScores vc 300000 - deserialize tree 29.351 ms/op 28.336 ms/op 1.04
BeaconState.inactivityScores vc 300000 - serialize tree 4.0009 ms/op 3.9001 ms/op 1.03
hashTreeRoot Attestation - struct 14.354 us/op 19.866 us/op 0.72
hashTreeRoot Attestation - tree 9.2970 us/op 9.1020 us/op 1.02
hashTreeRoot SignedAggregateAndProof - struct 16.609 us/op 24.204 us/op 0.69
hashTreeRoot SignedAggregateAndProof - tree 13.889 us/op 13.693 us/op 1.01
hashTreeRoot SyncCommitteeMessage - struct 4.1560 us/op 6.2360 us/op 0.67
hashTreeRoot SyncCommitteeMessage - tree 3.6440 us/op 3.6920 us/op 0.99
hashTreeRoot SignedContributionAndProof - struct 9.8970 us/op 14.727 us/op 0.67
hashTreeRoot SignedContributionAndProof - tree 9.4890 us/op 9.3310 us/op 1.02
hashTreeRoot SignedBeaconBlock - struct 837.67 us/op 1.2232 ms/op 0.68
hashTreeRoot SignedBeaconBlock - tree 827.92 us/op 825.90 us/op 1.00
hashTreeRoot Validator - struct 4.7250 us/op 7.7160 us/op 0.61
hashTreeRoot Validator - tree 7.3990 us/op 6.8910 us/op 1.07
BeaconState vc 300000 - hashTreeRoot tree 2.2643 s/op 2.2144 s/op 1.02
BeaconState vc 300000 - batchHashTreeRoot tree 4.1751 s/op 4.0718 s/op 1.03
BeaconState.historicalRoots vc 300000 - hashTreeRoot tree 1.0080 us/op 998.00 ns/op 1.01
BeaconState.validators vc 300000 - hashTreeRoot tree 2.5026 s/op 2.4050 s/op 1.04
BeaconState.balances vc 300000 - hashTreeRoot tree 35.911 ms/op 35.352 ms/op 1.02
BeaconState.previousEpochParticipation vc 300000 - hashTreeRoot tree 4.4475 ms/op 4.3136 ms/op 1.03
BeaconState.currentEpochParticipation vc 300000 - hashTreeRoot tree 4.4131 ms/op 4.3078 ms/op 1.02
BeaconState.inactivityScores vc 300000 - hashTreeRoot tree 38.786 ms/op 38.126 ms/op 1.02
hash64 x18 8.8540 us/op 8.7420 us/op 1.01
hashTwoObjects x18 7.9440 us/op 8.3110 us/op 0.96
hash64 x1740 792.61 us/op 796.95 us/op 0.99
hashTwoObjects x1740 741.84 us/op 746.70 us/op 0.99
hash64 x2700000 1.2402 s/op 1.2300 s/op 1.01
hashTwoObjects x2700000 1.1554 s/op 1.1605 s/op 1.00
get_exitEpoch - ContainerType 256.00 ns/op 231.00 ns/op 1.11
get_exitEpoch - ContainerNodeStructType 244.00 ns/op 229.00 ns/op 1.07
set_exitEpoch - ContainerType 245.00 ns/op 252.00 ns/op 0.97
set_exitEpoch - ContainerNodeStructType 267.00 ns/op 232.00 ns/op 1.15
get_pubkey - ContainerType 826.00 ns/op 927.00 ns/op 0.89
get_pubkey - ContainerNodeStructType 249.00 ns/op 223.00 ns/op 1.12
hashTreeRoot - ContainerType 393.00 ns/op 392.00 ns/op 1.00
hashTreeRoot - ContainerNodeStructType 401.00 ns/op 358.00 ns/op 1.12
createProof - ContainerType 3.9500 us/op 3.8400 us/op 1.03
createProof - ContainerNodeStructType 21.561 us/op 21.848 us/op 0.99
serialize - ContainerType 1.8240 us/op 1.7040 us/op 1.07
serialize - ContainerNodeStructType 1.4440 us/op 1.4030 us/op 1.03
set_exitEpoch_and_hashTreeRoot - ContainerType 2.9720 us/op 2.9300 us/op 1.01
set_exitEpoch_and_hashTreeRoot - ContainerNodeStructType 7.5090 us/op 7.3560 us/op 1.02
Array - for of 6.7880 us/op 7.4930 us/op 0.91
Array - for(;;) 5.7160 us/op 6.7180 us/op 0.85
basicListValue.readonlyValuesArray() 4.2301 ms/op 4.2245 ms/op 1.00
basicListValue.readonlyValuesArray() + loop all 4.3291 ms/op 4.2879 ms/op 1.01
compositeListValue.readonlyValuesArray() 31.139 ms/op 28.068 ms/op 1.11
compositeListValue.readonlyValuesArray() + loop all 33.099 ms/op 32.030 ms/op 1.03
Number64UintType - get balances list 4.4722 ms/op 4.6109 ms/op 0.97
Number64UintType - set balances list 10.170 ms/op 10.105 ms/op 1.01
Number64UintType - get and increase 10 then set 40.175 ms/op 46.189 ms/op 0.87
Number64UintType - increase 10 using applyDelta 17.432 ms/op 16.394 ms/op 1.06
Number64UintType - increase 10 using applyDeltaInBatch 17.351 ms/op 16.492 ms/op 1.05
tree_newTreeFromUint64Deltas 22.553 ms/op 20.557 ms/op 1.10
unsafeUint8ArrayToTree 35.789 ms/op 37.853 ms/op 0.95
bitLength(50) 239.00 ns/op 217.00 ns/op 1.10
bitLengthStr(50) 218.00 ns/op 208.00 ns/op 1.05
bitLength(8000) 239.00 ns/op 209.00 ns/op 1.14
bitLengthStr(8000) 256.00 ns/op 255.00 ns/op 1.00
bitLength(250000) 241.00 ns/op 210.00 ns/op 1.15
bitLengthStr(250000) 295.00 ns/op 291.00 ns/op 1.01
merkleize 32 chunks 14.792 us/op
merkleizeBlocksBytes 32 chunks 3.4760 us/op
merkleizeBlockArray 32 chunks 6.6260 us/op
merkleize 128 chunks 57.843 us/op
merkleizeBlocksBytes 128 chunks 7.7980 us/op
merkleizeBlockArray 128 chunks 18.434 us/op
merkleize 512 chunks 230.94 us/op
merkleizeBlocksBytes 512 chunks 23.365 us/op
merkleizeBlockArray 512 chunks 64.513 us/op
merkleize 1024 chunks 484.61 us/op
merkleizeBlocksBytes 1024 chunks 43.020 us/op
merkleizeBlockArray 1024 chunks 121.11 us/op
floor - Math.floor (53) 1.2598 ns/op 1.2468 ns/op 1.01
floor - << 0 (53) 1.2429 ns/op 1.2440 ns/op 1.00
floor - Math.floor (512) 1.2436 ns/op 1.2439 ns/op 1.00
floor - << 0 (512) 1.2435 ns/op 1.2669 ns/op 0.98
fnIf(0) 1.5557 ns/op 1.5551 ns/op 1.00
fnSwitch(0) 2.1760 ns/op 2.1827 ns/op 1.00
fnObj(0) 1.5761 ns/op 1.5592 ns/op 1.01
fnArr(0) 1.5540 ns/op 1.5557 ns/op 1.00
fnIf(4) 2.1772 ns/op 2.1812 ns/op 1.00
fnSwitch(4) 2.1773 ns/op 2.1777 ns/op 1.00
fnObj(4) 1.5547 ns/op 1.5568 ns/op 1.00
fnArr(4) 1.5559 ns/op 1.5777 ns/op 0.99
fnIf(9) 3.1070 ns/op 3.1089 ns/op 1.00
fnSwitch(9) 2.1783 ns/op 2.1768 ns/op 1.00
fnObj(9) 1.5746 ns/op 1.5626 ns/op 1.01
fnArr(9) 1.5550 ns/op 1.5543 ns/op 1.00
Container {a,b,vec} - as struct x100000 124.63 us/op 125.02 us/op 1.00
Container {a,b,vec} - as tree x100000 559.75 us/op 559.98 us/op 1.00
Container {a,vec,b} - as struct x100000 155.52 us/op 155.71 us/op 1.00
Container {a,vec,b} - as tree x100000 560.34 us/op 561.24 us/op 1.00
get 2 props x1000000 - rawObject 315.37 us/op 312.25 us/op 1.01
get 2 props x1000000 - proxy 73.799 ms/op 73.895 ms/op 1.00
get 2 props x1000000 - customObj 311.43 us/op 313.10 us/op 0.99
Simple object binary -> struct 950.00 ns/op 574.00 ns/op 1.66
Simple object binary -> tree_backed 2.5540 us/op 1.7190 us/op 1.49
Simple object struct -> tree_backed 2.9310 us/op 2.1960 us/op 1.33
Simple object tree_backed -> struct 2.5230 us/op 1.4930 us/op 1.69
Simple object struct -> binary 1.2330 us/op 811.00 ns/op 1.52
Simple object tree_backed -> binary 2.1750 us/op 1.2420 us/op 1.75
aggregationBits binary -> struct 939.00 ns/op 480.00 ns/op 1.96
aggregationBits binary -> tree_backed 2.8230 us/op 2.0790 us/op 1.36
aggregationBits struct -> tree_backed 3.1940 us/op 2.4660 us/op 1.30
aggregationBits tree_backed -> struct 1.2960 us/op 922.00 ns/op 1.41
aggregationBits struct -> binary 823.00 ns/op 650.00 ns/op 1.27
aggregationBits tree_backed -> binary 1.0810 us/op 825.00 ns/op 1.31
List(uint8) 100000 binary -> struct 1.6817 ms/op 1.7624 ms/op 0.95
List(uint8) 100000 binary -> tree_backed 280.70 us/op 273.22 us/op 1.03
List(uint8) 100000 struct -> tree_backed 1.3992 ms/op 1.4638 ms/op 0.96
List(uint8) 100000 tree_backed -> struct 1.1186 ms/op 1.1198 ms/op 1.00
List(uint8) 100000 struct -> binary 1.1027 ms/op 1.0992 ms/op 1.00
List(uint8) 100000 tree_backed -> binary 112.44 us/op 114.74 us/op 0.98
List(uint64Number) 100000 binary -> struct 1.3945 ms/op 1.4048 ms/op 0.99
List(uint64Number) 100000 binary -> tree_backed 4.6367 ms/op 5.0190 ms/op 0.92
List(uint64Number) 100000 struct -> tree_backed 6.3875 ms/op 6.1059 ms/op 1.05
List(uint64Number) 100000 tree_backed -> struct 2.4515 ms/op 2.4342 ms/op 1.01
List(uint64Number) 100000 struct -> binary 1.4951 ms/op 1.5826 ms/op 0.94
List(uint64Number) 100000 tree_backed -> binary 1.0972 ms/op 1.0330 ms/op 1.06
List(Uint64Bigint) 100000 binary -> struct 3.8732 ms/op 3.7095 ms/op 1.04
List(Uint64Bigint) 100000 binary -> tree_backed 5.5355 ms/op 5.0164 ms/op 1.10
List(Uint64Bigint) 100000 struct -> tree_backed 6.7228 ms/op 8.1592 ms/op 0.82
List(Uint64Bigint) 100000 tree_backed -> struct 4.7862 ms/op 5.1796 ms/op 0.92
List(Uint64Bigint) 100000 struct -> binary 2.1055 ms/op 2.0661 ms/op 1.02
List(Uint64Bigint) 100000 tree_backed -> binary 1.1334 ms/op 1.2860 ms/op 0.88
Vector(Root) 100000 binary -> struct 36.980 ms/op 37.432 ms/op 0.99
Vector(Root) 100000 binary -> tree_backed 40.261 ms/op 42.356 ms/op 0.95
Vector(Root) 100000 struct -> tree_backed 53.055 ms/op 49.278 ms/op 1.08
Vector(Root) 100000 tree_backed -> struct 52.010 ms/op 49.389 ms/op 1.05
Vector(Root) 100000 struct -> binary 2.9548 ms/op 2.7343 ms/op 1.08
Vector(Root) 100000 tree_backed -> binary 6.6796 ms/op 6.5106 ms/op 1.03
List(Validator) 100000 binary -> struct 106.76 ms/op 107.22 ms/op 1.00
List(Validator) 100000 binary -> tree_backed 347.07 ms/op 320.95 ms/op 1.08
List(Validator) 100000 struct -> tree_backed 384.61 ms/op 373.91 ms/op 1.03
List(Validator) 100000 tree_backed -> struct 221.99 ms/op 202.44 ms/op 1.10
List(Validator) 100000 struct -> binary 29.793 ms/op 28.992 ms/op 1.03
List(Validator) 100000 tree_backed -> binary 111.15 ms/op 105.77 ms/op 1.05
List(Validator-NS) 100000 binary -> struct 117.94 ms/op 117.19 ms/op 1.01
List(Validator-NS) 100000 binary -> tree_backed 169.68 ms/op 159.64 ms/op 1.06
List(Validator-NS) 100000 struct -> tree_backed 210.38 ms/op 208.31 ms/op 1.01
List(Validator-NS) 100000 tree_backed -> struct 172.93 ms/op 169.86 ms/op 1.02
List(Validator-NS) 100000 struct -> binary 29.412 ms/op 28.812 ms/op 1.02
List(Validator-NS) 100000 tree_backed -> binary 34.947 ms/op 34.085 ms/op 1.03
get epochStatuses - MutableVector 112.88 us/op 98.762 us/op 1.14
get epochStatuses - ViewDU 208.24 us/op 213.39 us/op 0.98
set epochStatuses - ListTreeView 2.0275 ms/op 2.1305 ms/op 0.95
set epochStatuses - ListTreeView - set() 466.24 us/op 446.33 us/op 1.04
set epochStatuses - ListTreeView - commit() 722.56 us/op 731.49 us/op 0.99
bitstring 514.77 ns/op 513.77 ns/op 1.00
bit mask 13.651 ns/op 13.799 ns/op 0.99
struct - increase slot to 1000000 942.26 us/op 933.62 us/op 1.01
UintNumberType - increase slot to 1000000 28.730 ms/op 27.711 ms/op 1.04
UintBigintType - increase slot to 1000000 174.83 ms/op 171.34 ms/op 1.02
UintBigint8 x 100000 tree_deserialize 5.7077 ms/op 4.8651 ms/op 1.17
UintBigint8 x 100000 tree_serialize 1.1297 ms/op 1.1287 ms/op 1.00
UintBigint16 x 100000 tree_deserialize 5.3366 ms/op 6.0631 ms/op 0.88
UintBigint16 x 100000 tree_serialize 1.3820 ms/op 1.3632 ms/op 1.01
UintBigint32 x 100000 tree_deserialize 5.5350 ms/op 5.5379 ms/op 1.00
UintBigint32 x 100000 tree_serialize 1.9077 ms/op 1.8300 ms/op 1.04
UintBigint64 x 100000 tree_deserialize 5.8962 ms/op 6.2323 ms/op 0.95
UintBigint64 x 100000 tree_serialize 2.4838 ms/op 2.4893 ms/op 1.00
UintBigint8 x 100000 value_deserialize 435.68 us/op 437.44 us/op 1.00
UintBigint8 x 100000 value_serialize 774.09 us/op 771.83 us/op 1.00
UintBigint16 x 100000 value_deserialize 469.17 us/op 466.78 us/op 1.01
UintBigint16 x 100000 value_serialize 807.99 us/op 814.65 us/op 0.99
UintBigint32 x 100000 value_deserialize 502.87 us/op 497.55 us/op 1.01
UintBigint32 x 100000 value_serialize 856.17 us/op 851.55 us/op 1.01
UintBigint64 x 100000 value_deserialize 568.44 us/op 561.32 us/op 1.01
UintBigint64 x 100000 value_serialize 1.0445 ms/op 1.0346 ms/op 1.01
UintBigint8 x 100000 deserialize 3.4295 ms/op 3.1338 ms/op 1.09
UintBigint8 x 100000 serialize 1.6446 ms/op 1.5236 ms/op 1.08
UintBigint16 x 100000 deserialize 3.5563 ms/op 3.1959 ms/op 1.11
UintBigint16 x 100000 serialize 1.5437 ms/op 1.5486 ms/op 1.00
UintBigint32 x 100000 deserialize 3.3850 ms/op 3.2662 ms/op 1.04
UintBigint32 x 100000 serialize 2.8506 ms/op 2.8154 ms/op 1.01
UintBigint64 x 100000 deserialize 4.4418 ms/op 4.2641 ms/op 1.04
UintBigint64 x 100000 serialize 1.6602 ms/op 1.7528 ms/op 0.95
UintBigint128 x 100000 deserialize 6.4605 ms/op 5.5805 ms/op 1.16
UintBigint128 x 100000 serialize 14.780 ms/op 15.764 ms/op 0.94
UintBigint256 x 100000 deserialize 7.9071 ms/op 8.9411 ms/op 0.88
UintBigint256 x 100000 serialize 43.432 ms/op 45.903 ms/op 0.95
Slice from Uint8Array x25000 1.3018 ms/op 1.3955 ms/op 0.93
Slice from ArrayBuffer x25000 17.060 ms/op 15.878 ms/op 1.07
Slice from ArrayBuffer x25000 + new Uint8Array 18.501 ms/op 16.324 ms/op 1.13
Copy Uint8Array 100000 iterate 2.6403 ms/op 2.7272 ms/op 0.97
Copy Uint8Array 100000 slice 97.777 us/op 95.870 us/op 1.02
Copy Uint8Array 100000 Uint8Array.prototype.slice.call 93.596 us/op 87.975 us/op 1.06
Copy Buffer 100000 Uint8Array.prototype.slice.call 90.363 us/op 90.034 us/op 1.00
Copy Uint8Array 100000 slice + set 195.84 us/op 177.75 us/op 1.10
Copy Uint8Array 100000 subarray + set 96.497 us/op 88.467 us/op 1.09
Copy Uint8Array 100000 slice arrayBuffer 96.102 us/op 91.189 us/op 1.05
Uint64 deserialize 100000 - iterate Uint8Array 1.9120 ms/op 1.9793 ms/op 0.97
Uint64 deserialize 100000 - by Uint32A 1.8220 ms/op 2.0172 ms/op 0.90
Uint64 deserialize 100000 - by DataView.getUint32 x2 1.8059 ms/op 1.8379 ms/op 0.98
Uint64 deserialize 100000 - by DataView.getBigUint64 4.9341 ms/op 4.8271 ms/op 1.02
Uint64 deserialize 100000 - by byte 41.480 ms/op 40.847 ms/op 1.02

by benchmarkbot/action

twoeths (Contributor, Author) commented Oct 21, 2024

tested this on feat1, see ChainSafe/lodestar#7171 (comment)
ready to review

@twoeths twoeths marked this pull request as ready for review October 21, 2024 03:18
@twoeths twoeths requested a review from a team as a code owner October 21, 2024 03:18
@twoeths twoeths marked this pull request as draft October 31, 2024 03:39
twoeths (Contributor, Author) commented Oct 31, 2024

SHA-256 works on 64-byte blocks, so it would be more meaningful for the chunkBytesBuffer variable to reflect that.

Also, on Holesky there are 1.7M validators. For every 8 deposits we would have to reallocate the whole 1.7M * 8 bytes = 13.6MB for BeaconState.balances, which is not ideal. Instead we should allocate just another 64 bytes in this case. This applies to all list types.

Update:

  • the hash of BeaconState.balances and everything else inside BeaconState is computed through ViewDU, so it's not relevant here
  • it's more relevant to the hash of BeaconBlock, for example transaction data and lists like the number of transactions
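
The arithmetic in the comment above can be checked with a short sketch. It assumes 8-byte uint64 items and 64-byte SHA-256 blocks, as stated in the comment. The helper names blockCount and growthCost are hypothetical and only compare the two allocation strategies; they are not part of the PR.

```typescript
const BLOCK_SIZE = 64; // SHA-256 input block size in bytes
const UINT64_SIZE = 8; // size of one balance entry in bytes

/** Number of 64-byte blocks needed to hold byteLen bytes, rounding up. */
function blockCount(byteLen: number): number {
  return Math.ceil(byteLen / BLOCK_SIZE);
}

/**
 * Bytes allocated when a list grows by `addedItems`:
 * - fullReallocBytes: a fresh block-aligned buffer for the whole list
 * - incrementalBytes: only the newly required blocks
 */
function growthCost(
  itemCount: number,
  addedItems: number,
  itemSize: number = UINT64_SIZE
): {fullReallocBytes: number; incrementalBytes: number} {
  const oldBlocks = blockCount(itemCount * itemSize);
  const newBlocks = blockCount((itemCount + addedItems) * itemSize);
  return {
    fullReallocBytes: newBlocks * BLOCK_SIZE,
    incrementalBytes: (newBlocks - oldBlocks) * BLOCK_SIZE,
  };
}

// 1.7M balances plus 8 new deposits: one extra 64-byte block is needed,
// versus roughly 13.6MB if the whole buffer is reallocated.
const cost = growthCost(1700000, 8);
console.log(cost.fullReallocBytes, cost.incrementalBytes);
```

Eight uint64 values fill exactly one 64-byte block, which is why the comment talks about a reallocation every 8 deposits.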

@twoeths twoeths force-pushed the te/improve_type_dot_hash_tree_root branch 2 times, most recently from 9e32c5c to 7ed3ced Compare November 9, 2024 02:19
@philknows philknows modified the milestones: v1.0, v1.1 Jan 22, 2025
@nazarhussain nazarhussain force-pushed the te/improve_type_dot_hash_tree_root branch from d3821ee to cbb30a2 Compare February 12, 2025 12:18
@nazarhussain nazarhussain marked this pull request as ready for review February 12, 2025 14:05
matthewkeil (Member) left a comment

I left a few comments but I think this PR really needs to be reviewed by @wemeetagain

// Merkleization

protected getRoots(value: ByteArray): Uint8Array[] {
return splitIntoRootChunks(value);
protected getBlocksBytes(value: ByteArray): Uint8Array {
Member

I'm confused by what BlocksBytes represents. I understand there is a shared buffer that is being reused, but why the name Blocks, and thus the bytes of a block? Perhaps we can talk about the naming conventions here?

Member

It seems like a Block is the unit used for two 32-byte roots?

Member

Perhaps call 32-byte chunks Bytes32 and 64-byte "blocks" Bytes64? Not sure, what do you think?


blocksBuffer.set(value);
const valueLen = value.length;
const blockByteLen = Math.ceil(valueLen / 64) * 64;
Member

It might be helpful to break out a helper function so it's clear why this is happening everywhere.

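// Number of 32-byte chunks needed to hold buf, rounding up
export function getPaddedByte32Count(buf: ArrayBuffer): number {
    return Math.ceil(buf.byteLength / 32);
}

// Number of 64-byte blocks needed to hold buf, rounding up
export function getPaddedByte64Count(buf: ArrayBuffer): number {
    return Math.ceil(buf.byteLength / 64);
}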

const blockDiff = newBlockCount - oldBlockCount;
const newBlocksBytes = new Uint8Array(blockDiff * 64);
for (let i = 0; i < blockDiff; i++) {
this.blockArray.push(newBlocksBytes.subarray(i * 64, (i + 1) * 64));
Member

Why loop and push in chunks? Why not just push all at once and only trigger one resize to the blockArray?
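
One way to read the suggestion above is sketched here, under assumptions: growBlockArray is a hypothetical standalone function that takes the block array as a parameter rather than the method from the diff. It allocates the backing bytes once, builds all 64-byte views first, and appends them with a single push call.

```typescript
const BLOCK_BYTES = 64;

/**
 * Hypothetical helper: ensure `blockArray` holds at least `newBlockCount`
 * 64-byte views. New views share one backing allocation and are appended
 * with a single push call.
 */
function growBlockArray(blockArray: Uint8Array[], newBlockCount: number): void {
  const blockDiff = newBlockCount - blockArray.length;
  if (blockDiff <= 0) return; // already large enough

  // One allocation for all new blocks.
  const newBlocksBytes = new Uint8Array(blockDiff * BLOCK_BYTES);

  // Build the views up front, then append them in one call.
  const newBlocks = Array.from({length: blockDiff}, (_, i) =>
    newBlocksBytes.subarray(i * BLOCK_BYTES, (i + 1) * BLOCK_BYTES)
  );
  blockArray.push(...newBlocks);
}

// Usage: grow from 0 to 3 blocks, then request fewer (a no-op).
const blocks: Uint8Array[] = [];
growBlockArray(blocks, 3);
growBlockArray(blocks, 2);
```

Spreading a very large array into push can exceed the engine's argument limit, so for large block counts the original loop, or pushing in bounded slices, may be the safer choice. Whether a single push is measurably faster would need a benchmark.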
