feat: improve type.hashTreeRoot() using batch #409

twoeths · 2024-10-15T03:25:40Z

Motivation

improve type.hashTreeRoot() using batch

Description

Instead of calling getRoots() and computing the root from the result, this PR implements getChunkBytes()
- the root is then computed with merkleizeInto(), which hashes in batches
- the chunkBytesBuffer memory is reused within the type, so there are almost no intermediate Uint8Array allocations
New hashTreeRootInto() API. This is needed when consumers want to reuse an existing memory allocation
Use allocUnsafe() of as-sha256 where it makes sense
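
A minimal sketch of the idea described above, not the code in this PR: a type keeps one buffer, copies the value into it, merkleizes it in place, and writes the root into a caller-supplied output. Everything here is an assumption for illustration: `hash64`, `merkleizeIntoSketch` and `FixedBytesTypeSketch` are hypothetical names, Node's `crypto` module stands in for as-sha256, and the sketch only handles fixed-size values padded with zero bytes to a power-of-two number of 32-byte chunks.

```typescript
import {createHash} from "node:crypto";

// Hypothetical helper (not the as-sha256 API): SHA-256 of a 64-byte input.
function hash64(block: Uint8Array): Uint8Array {
  return createHash("sha256").update(block).digest();
}

// Hypothetical stand-in for merkleizeInto(). Reduces `blocks` level by level,
// writing each 32-byte parent back into the front of the same buffer, then
// copies the final root into `output` at `offset`. Mutates `blocks`.
function merkleizeIntoSketch(blocks: Uint8Array, output: Uint8Array, offset: number): void {
  if (blocks.length === 0 || blocks.length % 64 !== 0) {
    throw new Error("expected a whole number of 64-byte blocks");
  }
  let chunkCount = blocks.length / 32;
  if ((chunkCount & (chunkCount - 1)) !== 0) {
    throw new Error("sketch only handles power-of-two chunk counts");
  }
  while (chunkCount > 1) {
    for (let i = 0; i < chunkCount / 2; i++) {
      // Parent i is read from bytes [64i, 64i + 64) and written to [32i, 32i + 32),
      // which never overlaps input that has not been hashed yet.
      blocks.set(hash64(blocks.subarray(i * 64, (i + 1) * 64)), i * 32);
    }
    chunkCount /= 2;
  }
  output.set(blocks.subarray(0, 32), offset);
}

// Hypothetical fixed-size byte type that reuses one buffer across calls.
class FixedBytesTypeSketch {
  readonly byteLength: number;
  private readonly blocksBuffer: Uint8Array;

  constructor(byteLength: number) {
    this.byteLength = byteLength;
    let chunkCount = 2;
    while (chunkCount * 32 < byteLength) chunkCount *= 2;
    this.blocksBuffer = new Uint8Array(chunkCount * 32);
  }

  // Writes the 32-byte root into `output` without allocating a result array.
  hashTreeRootInto(value: Uint8Array, output: Uint8Array, offset: number): void {
    if (value.length !== this.byteLength) throw new Error("unexpected value length");
    this.blocksBuffer.fill(0); // clear leftovers from the previous call
    this.blocksBuffer.set(value);
    merkleizeIntoSketch(this.blocksBuffer, output, offset);
  }

  hashTreeRoot(value: Uint8Array): Uint8Array {
    const root = new Uint8Array(32);
    this.hashTreeRootInto(value, root, 0);
    return root;
  }
}

// Usage: several roots written side by side into one caller-owned buffer.
const rootsOut = new Uint8Array(64);
const bytes64 = new FixedBytesTypeSketch(64);
bytes64.hashTreeRootInto(new Uint8Array(64).fill(1), rootsOut, 0);
bytes64.hashTreeRootInto(new Uint8Array(64).fill(2), rootsOut, 32);
```

Because the shared buffer is overwritten on every call, a design like this is not re-entrant: a caller must not hold on to a view of the internal buffer between calls.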

cherry picked from #378

github-actions · 2024-10-15T03:57:41Z

Performance Report

✔️ no performance regression detected

Full benchmark results

Benchmark suite	Current: `6351e2b`	Previous: `5041139`	Ratio
digestTwoHashObjects 50023 times	48.772 ms/op	48.248 ms/op	1.01
digest2Bytes32 50023 times	54.866 ms/op	54.794 ms/op	1.00
digest 50023 times	54.151 ms/op	53.430 ms/op	1.01
input length 32	1.1620 us/op	1.1670 us/op	1.00
input length 64	1.3050 us/op	1.3180 us/op	0.99
input length 128	2.3190 us/op	2.2320 us/op	1.04
input length 256	3.3840 us/op	3.3300 us/op	1.02
input length 512	5.6140 us/op	5.5630 us/op	1.01
input length 1024	10.721 us/op	10.681 us/op	1.00
digest 1000000 times	863.76 ms/op	852.84 ms/op	1.01
hashObjectToByteArray 50023 times	1.2288 ms/op	1.2289 ms/op	1.00
byteArrayToHashObject 50023 times	1.5485 ms/op	1.5426 ms/op	1.00
digest64 200092 times	213.57 ms/op	214.18 ms/op	1.00
hash 200092 times using batchHash4UintArray64s	264.06 ms/op	250.07 ms/op	1.06
digest64HashObjects 200092 times	193.14 ms/op	191.43 ms/op	1.01
hash 200092 times using batchHash4HashObjectInputs	234.19 ms/op	197.77 ms/op	1.18
getGindicesAtDepth	3.4720 us/op	3.4480 us/op	1.01
iterateAtDepth	6.2800 us/op	6.3750 us/op	0.99
getGindexBits	367.00 ns/op	356.00 ns/op	1.03
gindexIterator	873.00 ns/op	896.00 ns/op	0.97
HashComputationLevel.push then loop	25.666 ms/op	25.761 ms/op	1.00
HashComputation[] push then loop	38.016 ms/op	47.920 ms/op	0.79
hash 2 Uint8Array 500000 times - hashtree	215.55 ms/op	213.19 ms/op	1.01
hashTwoObjects 500000 times - hashtree	217.74 ms/op	214.65 ms/op	1.01
executeHashComputations - hashtree	9.1150 ms/op	9.4258 ms/op	0.97
hash 2 Uint8Array 500000 times - as-sha256	558.77 ms/op	560.79 ms/op	1.00
hashTwoObjects 500000 times - as-sha256	508.70 ms/op	509.59 ms/op	1.00
executeHashComputations - as-sha256	48.588 ms/op	44.935 ms/op	1.08
hash 2 Uint8Array 500000 times - noble	1.2271 s/op	1.2219 s/op	1.00
hashTwoObjects 500000 times - noble	1.6578 s/op	1.6307 s/op	1.02
executeHashComputations - noble	36.628 ms/op	36.118 ms/op	1.01
getHashComputations	2.5400 ms/op	2.0905 ms/op	1.22
executeHashComputations	10.339 ms/op	10.248 ms/op	1.01
get root	15.560 ms/op	15.171 ms/op	1.03
getNodeH() x7812.5 avg hindex	12.378 us/op	12.186 us/op	1.02
getNodeH() x7812.5 index 0	7.4860 us/op	7.5320 us/op	0.99
getNodeH() x7812.5 index 7	7.6530 us/op	7.4730 us/op	1.02
getNodeH() x7812.5 index 7 with key array	6.3940 us/op	6.2860 us/op	1.02
new LeafNode() x7812.5	306.22 us/op	298.94 us/op	1.02
getHashComputations 250000 nodes	14.673 ms/op	14.464 ms/op	1.01
batchHash 250000 nodes	86.618 ms/op	85.554 ms/op	1.01
get root 250000 nodes	118.20 ms/op	115.63 ms/op	1.02
getHashComputations 500000 nodes	41.078 ms/op	33.095 ms/op	1.24
batchHash 500000 nodes	162.21 ms/op	165.80 ms/op	0.98
get root 500000 nodes	234.81 ms/op	230.95 ms/op	1.02
getHashComputations 1000000 nodes	56.923 ms/op	61.868 ms/op	0.92
batchHash 1000000 nodes	330.58 ms/op	360.79 ms/op	0.92
get root 1000000 nodes	476.00 ms/op	468.00 ms/op	1.02
multiproof - depth 15, 1 requested leaves	7.9890 us/op	8.0560 us/op	0.99
tree offset multiproof - depth 15, 1 requested leaves	17.623 us/op	17.674 us/op	1.00
compact multiproof - depth 15, 1 requested leaves	2.9870 us/op	2.9060 us/op	1.03
multiproof - depth 15, 2 requested leaves	11.478 us/op	11.648 us/op	0.99
tree offset multiproof - depth 15, 2 requested leaves	21.192 us/op	21.182 us/op	1.00
compact multiproof - depth 15, 2 requested leaves	2.9900 us/op	2.9630 us/op	1.01
multiproof - depth 15, 3 requested leaves	16.425 us/op	16.292 us/op	1.01
tree offset multiproof - depth 15, 3 requested leaves	27.727 us/op	27.060 us/op	1.02
compact multiproof - depth 15, 3 requested leaves	3.6250 us/op	3.5790 us/op	1.01
multiproof - depth 15, 4 requested leaves	21.536 us/op	21.682 us/op	0.99
tree offset multiproof - depth 15, 4 requested leaves	34.521 us/op	33.787 us/op	1.02
compact multiproof - depth 15, 4 requested leaves	4.2400 us/op	4.1570 us/op	1.02
packedRootsBytesToLeafNodes bytes 4000 offset 0	5.5790 us/op	5.5150 us/op	1.01
packedRootsBytesToLeafNodes bytes 4000 offset 1	5.6720 us/op	5.4370 us/op	1.04
packedRootsBytesToLeafNodes bytes 4000 offset 2	5.5830 us/op	5.4920 us/op	1.02
packedRootsBytesToLeafNodes bytes 4000 offset 3	5.6140 us/op	5.4890 us/op	1.02
subtreeFillToContents depth 40 count 250000	49.610 ms/op	46.753 ms/op	1.06
setRoot - gindexBitstring	20.825 ms/op	21.228 ms/op	0.98
setRoot - gindex	21.474 ms/op	22.040 ms/op	0.97
getRoot - gindexBitstring	2.5954 ms/op	2.5335 ms/op	1.02
getRoot - gindex	3.0795 ms/op	3.1068 ms/op	0.99
getHashObject then setHashObject	21.978 ms/op	22.187 ms/op	0.99
setNodeWithFn	19.477 ms/op	19.955 ms/op	0.98
getNodeAtDepth depth 0 x100000	280.80 us/op	282.81 us/op	0.99
setNodeAtDepth depth 0 x100000	2.5836 ms/op	2.6916 ms/op	0.96
getNodesAtDepth depth 0 x100000	313.56 us/op	315.41 us/op	0.99
setNodesAtDepth depth 0 x100000	830.44 us/op	791.01 us/op	1.05
getNodeAtDepth depth 1 x100000	342.08 us/op	343.13 us/op	1.00
setNodeAtDepth depth 1 x100000	8.3838 ms/op	8.5334 ms/op	0.98
getNodesAtDepth depth 1 x100000	435.63 us/op	440.75 us/op	0.99
setNodesAtDepth depth 1 x100000	6.8704 ms/op	6.8871 ms/op	1.00
getNodeAtDepth depth 2 x100000	918.55 us/op	740.07 us/op	1.24
setNodeAtDepth depth 2 x100000	15.166 ms/op	15.720 ms/op	0.96
getNodesAtDepth depth 2 x100000	19.070 ms/op	18.316 ms/op	1.04
setNodesAtDepth depth 2 x100000	22.060 ms/op	22.968 ms/op	0.96
tree.getNodesAtDepth - gindexes	9.3007 ms/op	8.7213 ms/op	1.07
tree.getNodesAtDepth - push all nodes	2.2438 ms/op	2.4908 ms/op	0.90
tree.getNodesAtDepth - navigation	320.72 us/op	311.60 us/op	1.03
tree.setNodesAtDepth - indexes	725.59 us/op	731.42 us/op	0.99
set at depth 8	662.00 ns/op	801.00 ns/op	0.83
set at depth 16	1.2220 us/op	1.1460 us/op	1.07
set at depth 32	2.1560 us/op	1.9810 us/op	1.09
iterateNodesAtDepth 8 256	15.038 us/op	14.435 us/op	1.04
getNodesAtDepth 8 256	3.8810 us/op	3.7610 us/op	1.03
iterateNodesAtDepth 16 65536	4.7826 ms/op	5.1401 ms/op	0.93
getNodesAtDepth 16 65536	2.1091 ms/op	1.5996 ms/op	1.32
iterateNodesAtDepth 32 250000	16.296 ms/op	15.931 ms/op	1.02
getNodesAtDepth 32 250000	6.7817 ms/op	4.4646 ms/op	1.52
iterateNodesAtDepth 40 250000	15.491 ms/op	15.824 ms/op	0.98
getNodesAtDepth 40 250000	4.6561 ms/op	4.5864 ms/op	1.02
250000 validators root getter	118.76 ms/op	115.88 ms/op	1.02
250000 validators batchHash()	82.807 ms/op	91.401 ms/op	0.91
250000 validators hashComputations	18.866 ms/op	19.528 ms/op	0.97
bitlist bytes to struct (120,90)	793.00 ns/op	743.00 ns/op	1.07
bitlist bytes to tree (120,90)	2.6650 us/op	2.5500 us/op	1.05
bitlist bytes to struct (2048,2048)	990.00 ns/op	990.00 ns/op	1.00
bitlist bytes to tree (2048,2048)	4.0470 us/op	4.0390 us/op	1.00
ByteListType - deserialize	7.5467 ms/op	7.2279 ms/op	1.04
BasicListType - deserialize	14.501 ms/op	14.356 ms/op	1.01
ByteListType - serialize	7.9186 ms/op	7.7393 ms/op	1.02
BasicListType - serialize	10.295 ms/op	9.7326 ms/op	1.06
BasicListType - tree_convertToStruct	26.558 ms/op	26.077 ms/op	1.02
List[uint8, 68719476736] len 300000 ViewDU.getAll() + iterate	4.6499 ms/op	4.7901 ms/op	0.97
List[uint8, 68719476736] len 300000 ViewDU.get(i)	4.6046 ms/op	4.4103 ms/op	1.04
Array.push len 300000 empty Array - number	6.7940 ms/op	6.6991 ms/op	1.01
Array.set len 300000 from new Array - number	1.7463 ms/op	1.7371 ms/op	1.01
Array.set len 300000 - number	5.6890 ms/op	5.5399 ms/op	1.03
Uint8Array.set len 300000	487.38 us/op	480.11 us/op	1.02
Uint32Array.set len 300000	559.79 us/op	539.82 us/op	1.04
Container({a: uint8, b: uint8}) getViewDU x300000	25.251 ms/op	24.820 ms/op	1.02
ContainerNodeStruct({a: uint8, b: uint8}) getViewDU x300000	10.768 ms/op	10.455 ms/op	1.03
List(Container) len 300000 ViewDU.getAllReadonly() + iterate	211.73 ms/op	197.86 ms/op	1.07
List(Container) len 300000 ViewDU.getAllReadonlyValues() + iterate	244.02 ms/op	226.89 ms/op	1.08
List(Container) len 300000 ViewDU.get(i)	6.4973 ms/op	6.3845 ms/op	1.02
List(Container) len 300000 ViewDU.getReadonly(i)	6.3111 ms/op	6.3471 ms/op	0.99
List(ContainerNodeStruct) len 300000 ViewDU.getAllReadonly() + iterate	40.030 ms/op	38.556 ms/op	1.04
List(ContainerNodeStruct) len 300000 ViewDU.getAllReadonlyValues() + iterate	5.0000 ms/op	5.2337 ms/op	0.96
List(ContainerNodeStruct) len 300000 ViewDU.get(i)	6.1139 ms/op	6.1641 ms/op	0.99
List(ContainerNodeStruct) len 300000 ViewDU.getReadonly(i)	6.0587 ms/op	6.0643 ms/op	1.00
Array.push len 300000 empty Array - object	6.4613 ms/op	6.4074 ms/op	1.01
Array.set len 300000 from new Array - object	2.0504 ms/op	1.9925 ms/op	1.03
Array.set len 300000 - object	6.0209 ms/op	6.0039 ms/op	1.00
cachePermanentRootStruct no cache	3.4010 us/op	5.0960 us/op	0.67
cachePermanentRootStruct with cache	194.00 ns/op	199.00 ns/op	0.97
epochParticipation len 250000 rws 7813	2.3420 ms/op	2.3262 ms/op	1.01
Deneb BeaconBlock.hashTreeRoot(), numTransaction=200	5.3556 ms/op
BeaconState ViewDU hashTreeRoot() vc=200000	112.57 ms/op	112.48 ms/op	1.00
BeaconState ViewDU recursive hash - commit step vc=200000	4.9000 ms/op	4.5695 ms/op	1.07
BeaconState ViewDU validator tree creation vc=10000	39.953 ms/op	39.100 ms/op	1.02
BeaconState ViewDU batchHashTreeRoot vc=200000	102.09 ms/op	98.243 ms/op	1.04
BeaconState ViewDU hashTreeRoot - commit step vc=200000	89.630 ms/op	88.615 ms/op	1.01
BeaconState ViewDU hashTreeRoot - hash step vc=200000	15.659 ms/op	15.568 ms/op	1.01
deserialize Attestation - tree	3.7780 us/op	3.5980 us/op	1.05
deserialize Attestation - struct	2.0660 us/op	1.9260 us/op	1.07
deserialize SignedAggregateAndProof - tree	4.9640 us/op	5.0130 us/op	0.99
deserialize SignedAggregateAndProof - struct	3.2500 us/op	3.0720 us/op	1.06
deserialize SyncCommitteeMessage - tree	1.4340 us/op	1.4160 us/op	1.01
deserialize SyncCommitteeMessage - struct	1.1270 us/op	1.0680 us/op	1.06
deserialize SignedContributionAndProof - tree	3.0600 us/op	2.9390 us/op	1.04
deserialize SignedContributionAndProof - struct	2.4190 us/op	2.3790 us/op	1.02
deserialize SignedBeaconBlock - tree	287.40 us/op	294.19 us/op	0.98
deserialize SignedBeaconBlock - struct	131.66 us/op	123.51 us/op	1.07
BeaconState vc 300000 - deserialize tree	642.68 ms/op	626.16 ms/op	1.03
BeaconState vc 300000 - serialize tree	158.09 ms/op	127.41 ms/op	1.24
BeaconState.historicalRoots vc 300000 - deserialize tree	860.00 ns/op	849.00 ns/op	1.01
BeaconState.historicalRoots vc 300000 - serialize tree	627.00 ns/op	636.00 ns/op	0.99
BeaconState.validators vc 300000 - deserialize tree	611.36 ms/op	600.20 ms/op	1.02
BeaconState.validators vc 300000 - serialize tree	112.12 ms/op	99.082 ms/op	1.13
BeaconState.balances vc 300000 - deserialize tree	28.313 ms/op	27.416 ms/op	1.03
BeaconState.balances vc 300000 - serialize tree	4.2633 ms/op	4.5367 ms/op	0.94
BeaconState.previousEpochParticipation vc 300000 - deserialize tree	1.0718 ms/op	892.01 us/op	1.20
BeaconState.previousEpochParticipation vc 300000 - serialize tree	352.51 us/op	312.54 us/op	1.13
BeaconState.currentEpochParticipation vc 300000 - deserialize tree	1.1048 ms/op	902.60 us/op	1.22
BeaconState.currentEpochParticipation vc 300000 - serialize tree	354.09 us/op	334.63 us/op	1.06
BeaconState.inactivityScores vc 300000 - deserialize tree	29.351 ms/op	28.336 ms/op	1.04
BeaconState.inactivityScores vc 300000 - serialize tree	4.0009 ms/op	3.9001 ms/op	1.03
hashTreeRoot Attestation - struct	14.354 us/op	19.866 us/op	0.72
hashTreeRoot Attestation - tree	9.2970 us/op	9.1020 us/op	1.02
hashTreeRoot SignedAggregateAndProof - struct	16.609 us/op	24.204 us/op	0.69
hashTreeRoot SignedAggregateAndProof - tree	13.889 us/op	13.693 us/op	1.01
hashTreeRoot SyncCommitteeMessage - struct	4.1560 us/op	6.2360 us/op	0.67
hashTreeRoot SyncCommitteeMessage - tree	3.6440 us/op	3.6920 us/op	0.99
hashTreeRoot SignedContributionAndProof - struct	9.8970 us/op	14.727 us/op	0.67
hashTreeRoot SignedContributionAndProof - tree	9.4890 us/op	9.3310 us/op	1.02
hashTreeRoot SignedBeaconBlock - struct	837.67 us/op	1.2232 ms/op	0.68
hashTreeRoot SignedBeaconBlock - tree	827.92 us/op	825.90 us/op	1.00
hashTreeRoot Validator - struct	4.7250 us/op	7.7160 us/op	0.61
hashTreeRoot Validator - tree	7.3990 us/op	6.8910 us/op	1.07
BeaconState vc 300000 - hashTreeRoot tree	2.2643 s/op	2.2144 s/op	1.02
BeaconState vc 300000 - batchHashTreeRoot tree	4.1751 s/op	4.0718 s/op	1.03
BeaconState.historicalRoots vc 300000 - hashTreeRoot tree	1.0080 us/op	998.00 ns/op	1.01
BeaconState.validators vc 300000 - hashTreeRoot tree	2.5026 s/op	2.4050 s/op	1.04
BeaconState.balances vc 300000 - hashTreeRoot tree	35.911 ms/op	35.352 ms/op	1.02
BeaconState.previousEpochParticipation vc 300000 - hashTreeRoot tree	4.4475 ms/op	4.3136 ms/op	1.03
BeaconState.currentEpochParticipation vc 300000 - hashTreeRoot tree	4.4131 ms/op	4.3078 ms/op	1.02
BeaconState.inactivityScores vc 300000 - hashTreeRoot tree	38.786 ms/op	38.126 ms/op	1.02
hash64 x18	8.8540 us/op	8.7420 us/op	1.01
hashTwoObjects x18	7.9440 us/op	8.3110 us/op	0.96
hash64 x1740	792.61 us/op	796.95 us/op	0.99
hashTwoObjects x1740	741.84 us/op	746.70 us/op	0.99
hash64 x2700000	1.2402 s/op	1.2300 s/op	1.01
hashTwoObjects x2700000	1.1554 s/op	1.1605 s/op	1.00
get_exitEpoch - ContainerType	256.00 ns/op	231.00 ns/op	1.11
get_exitEpoch - ContainerNodeStructType	244.00 ns/op	229.00 ns/op	1.07
set_exitEpoch - ContainerType	245.00 ns/op	252.00 ns/op	0.97
set_exitEpoch - ContainerNodeStructType	267.00 ns/op	232.00 ns/op	1.15
get_pubkey - ContainerType	826.00 ns/op	927.00 ns/op	0.89
get_pubkey - ContainerNodeStructType	249.00 ns/op	223.00 ns/op	1.12
hashTreeRoot - ContainerType	393.00 ns/op	392.00 ns/op	1.00
hashTreeRoot - ContainerNodeStructType	401.00 ns/op	358.00 ns/op	1.12
createProof - ContainerType	3.9500 us/op	3.8400 us/op	1.03
createProof - ContainerNodeStructType	21.561 us/op	21.848 us/op	0.99
serialize - ContainerType	1.8240 us/op	1.7040 us/op	1.07
serialize - ContainerNodeStructType	1.4440 us/op	1.4030 us/op	1.03
set_exitEpoch_and_hashTreeRoot - ContainerType	2.9720 us/op	2.9300 us/op	1.01
set_exitEpoch_and_hashTreeRoot - ContainerNodeStructType	7.5090 us/op	7.3560 us/op	1.02
Array - for of	6.7880 us/op	7.4930 us/op	0.91
Array - for(;;)	5.7160 us/op	6.7180 us/op	0.85
basicListValue.readonlyValuesArray()	4.2301 ms/op	4.2245 ms/op	1.00
basicListValue.readonlyValuesArray() + loop all	4.3291 ms/op	4.2879 ms/op	1.01
compositeListValue.readonlyValuesArray()	31.139 ms/op	28.068 ms/op	1.11
compositeListValue.readonlyValuesArray() + loop all	33.099 ms/op	32.030 ms/op	1.03
Number64UintType - get balances list	4.4722 ms/op	4.6109 ms/op	0.97
Number64UintType - set balances list	10.170 ms/op	10.105 ms/op	1.01
Number64UintType - get and increase 10 then set	40.175 ms/op	46.189 ms/op	0.87
Number64UintType - increase 10 using applyDelta	17.432 ms/op	16.394 ms/op	1.06
Number64UintType - increase 10 using applyDeltaInBatch	17.351 ms/op	16.492 ms/op	1.05
tree_newTreeFromUint64Deltas	22.553 ms/op	20.557 ms/op	1.10
unsafeUint8ArrayToTree	35.789 ms/op	37.853 ms/op	0.95
bitLength(50)	239.00 ns/op	217.00 ns/op	1.10
bitLengthStr(50)	218.00 ns/op	208.00 ns/op	1.05
bitLength(8000)	239.00 ns/op	209.00 ns/op	1.14
bitLengthStr(8000)	256.00 ns/op	255.00 ns/op	1.00
bitLength(250000)	241.00 ns/op	210.00 ns/op	1.15
bitLengthStr(250000)	295.00 ns/op	291.00 ns/op	1.01
merkleize 32 chunks	14.792 us/op
merkleizeBlocksBytes 32 chunks	3.4760 us/op
merkleizeBlockArray 32 chunks	6.6260 us/op
merkleize 128 chunks	57.843 us/op
merkleizeBlocksBytes 128 chunks	7.7980 us/op
merkleizeBlockArray 128 chunks	18.434 us/op
merkleize 512 chunks	230.94 us/op
merkleizeBlocksBytes 512 chunks	23.365 us/op
merkleizeBlockArray 512 chunks	64.513 us/op
merkleize 1024 chunks	484.61 us/op
merkleizeBlocksBytes 1024 chunks	43.020 us/op
merkleizeBlockArray 1024 chunks	121.11 us/op
floor - Math.floor (53)	1.2598 ns/op	1.2468 ns/op	1.01
floor - << 0 (53)	1.2429 ns/op	1.2440 ns/op	1.00
floor - Math.floor (512)	1.2436 ns/op	1.2439 ns/op	1.00
floor - << 0 (512)	1.2435 ns/op	1.2669 ns/op	0.98
fnIf(0)	1.5557 ns/op	1.5551 ns/op	1.00
fnSwitch(0)	2.1760 ns/op	2.1827 ns/op	1.00
fnObj(0)	1.5761 ns/op	1.5592 ns/op	1.01
fnArr(0)	1.5540 ns/op	1.5557 ns/op	1.00
fnIf(4)	2.1772 ns/op	2.1812 ns/op	1.00
fnSwitch(4)	2.1773 ns/op	2.1777 ns/op	1.00
fnObj(4)	1.5547 ns/op	1.5568 ns/op	1.00
fnArr(4)	1.5559 ns/op	1.5777 ns/op	0.99
fnIf(9)	3.1070 ns/op	3.1089 ns/op	1.00
fnSwitch(9)	2.1783 ns/op	2.1768 ns/op	1.00
fnObj(9)	1.5746 ns/op	1.5626 ns/op	1.01
fnArr(9)	1.5550 ns/op	1.5543 ns/op	1.00
Container {a,b,vec} - as struct x100000	124.63 us/op	125.02 us/op	1.00
Container {a,b,vec} - as tree x100000	559.75 us/op	559.98 us/op	1.00
Container {a,vec,b} - as struct x100000	155.52 us/op	155.71 us/op	1.00
Container {a,vec,b} - as tree x100000	560.34 us/op	561.24 us/op	1.00
get 2 props x1000000 - rawObject	315.37 us/op	312.25 us/op	1.01
get 2 props x1000000 - proxy	73.799 ms/op	73.895 ms/op	1.00
get 2 props x1000000 - customObj	311.43 us/op	313.10 us/op	0.99
Simple object binary -> struct	950.00 ns/op	574.00 ns/op	1.66
Simple object binary -> tree_backed	2.5540 us/op	1.7190 us/op	1.49
Simple object struct -> tree_backed	2.9310 us/op	2.1960 us/op	1.33
Simple object tree_backed -> struct	2.5230 us/op	1.4930 us/op	1.69
Simple object struct -> binary	1.2330 us/op	811.00 ns/op	1.52
Simple object tree_backed -> binary	2.1750 us/op	1.2420 us/op	1.75
aggregationBits binary -> struct	939.00 ns/op	480.00 ns/op	1.96
aggregationBits binary -> tree_backed	2.8230 us/op	2.0790 us/op	1.36
aggregationBits struct -> tree_backed	3.1940 us/op	2.4660 us/op	1.30
aggregationBits tree_backed -> struct	1.2960 us/op	922.00 ns/op	1.41
aggregationBits struct -> binary	823.00 ns/op	650.00 ns/op	1.27
aggregationBits tree_backed -> binary	1.0810 us/op	825.00 ns/op	1.31
List(uint8) 100000 binary -> struct	1.6817 ms/op	1.7624 ms/op	0.95
List(uint8) 100000 binary -> tree_backed	280.70 us/op	273.22 us/op	1.03
List(uint8) 100000 struct -> tree_backed	1.3992 ms/op	1.4638 ms/op	0.96
List(uint8) 100000 tree_backed -> struct	1.1186 ms/op	1.1198 ms/op	1.00
List(uint8) 100000 struct -> binary	1.1027 ms/op	1.0992 ms/op	1.00
List(uint8) 100000 tree_backed -> binary	112.44 us/op	114.74 us/op	0.98
List(uint64Number) 100000 binary -> struct	1.3945 ms/op	1.4048 ms/op	0.99
List(uint64Number) 100000 binary -> tree_backed	4.6367 ms/op	5.0190 ms/op	0.92
List(uint64Number) 100000 struct -> tree_backed	6.3875 ms/op	6.1059 ms/op	1.05
List(uint64Number) 100000 tree_backed -> struct	2.4515 ms/op	2.4342 ms/op	1.01
List(uint64Number) 100000 struct -> binary	1.4951 ms/op	1.5826 ms/op	0.94
List(uint64Number) 100000 tree_backed -> binary	1.0972 ms/op	1.0330 ms/op	1.06
List(Uint64Bigint) 100000 binary -> struct	3.8732 ms/op	3.7095 ms/op	1.04
List(Uint64Bigint) 100000 binary -> tree_backed	5.5355 ms/op	5.0164 ms/op	1.10
List(Uint64Bigint) 100000 struct -> tree_backed	6.7228 ms/op	8.1592 ms/op	0.82
List(Uint64Bigint) 100000 tree_backed -> struct	4.7862 ms/op	5.1796 ms/op	0.92
List(Uint64Bigint) 100000 struct -> binary	2.1055 ms/op	2.0661 ms/op	1.02
List(Uint64Bigint) 100000 tree_backed -> binary	1.1334 ms/op	1.2860 ms/op	0.88
Vector(Root) 100000 binary -> struct	36.980 ms/op	37.432 ms/op	0.99
Vector(Root) 100000 binary -> tree_backed	40.261 ms/op	42.356 ms/op	0.95
Vector(Root) 100000 struct -> tree_backed	53.055 ms/op	49.278 ms/op	1.08
Vector(Root) 100000 tree_backed -> struct	52.010 ms/op	49.389 ms/op	1.05
Vector(Root) 100000 struct -> binary	2.9548 ms/op	2.7343 ms/op	1.08
Vector(Root) 100000 tree_backed -> binary	6.6796 ms/op	6.5106 ms/op	1.03
List(Validator) 100000 binary -> struct	106.76 ms/op	107.22 ms/op	1.00
List(Validator) 100000 binary -> tree_backed	347.07 ms/op	320.95 ms/op	1.08
List(Validator) 100000 struct -> tree_backed	384.61 ms/op	373.91 ms/op	1.03
List(Validator) 100000 tree_backed -> struct	221.99 ms/op	202.44 ms/op	1.10
List(Validator) 100000 struct -> binary	29.793 ms/op	28.992 ms/op	1.03
List(Validator) 100000 tree_backed -> binary	111.15 ms/op	105.77 ms/op	1.05
List(Validator-NS) 100000 binary -> struct	117.94 ms/op	117.19 ms/op	1.01
List(Validator-NS) 100000 binary -> tree_backed	169.68 ms/op	159.64 ms/op	1.06
List(Validator-NS) 100000 struct -> tree_backed	210.38 ms/op	208.31 ms/op	1.01
List(Validator-NS) 100000 tree_backed -> struct	172.93 ms/op	169.86 ms/op	1.02
List(Validator-NS) 100000 struct -> binary	29.412 ms/op	28.812 ms/op	1.02
List(Validator-NS) 100000 tree_backed -> binary	34.947 ms/op	34.085 ms/op	1.03
get epochStatuses - MutableVector	112.88 us/op	98.762 us/op	1.14
get epochStatuses - ViewDU	208.24 us/op	213.39 us/op	0.98
set epochStatuses - ListTreeView	2.0275 ms/op	2.1305 ms/op	0.95
set epochStatuses - ListTreeView - set()	466.24 us/op	446.33 us/op	1.04
set epochStatuses - ListTreeView - commit()	722.56 us/op	731.49 us/op	0.99
bitstring	514.77 ns/op	513.77 ns/op	1.00
bit mask	13.651 ns/op	13.799 ns/op	0.99
struct - increase slot to 1000000	942.26 us/op	933.62 us/op	1.01
UintNumberType - increase slot to 1000000	28.730 ms/op	27.711 ms/op	1.04
UintBigintType - increase slot to 1000000	174.83 ms/op	171.34 ms/op	1.02
UintBigint8 x 100000 tree_deserialize	5.7077 ms/op	4.8651 ms/op	1.17
UintBigint8 x 100000 tree_serialize	1.1297 ms/op	1.1287 ms/op	1.00
UintBigint16 x 100000 tree_deserialize	5.3366 ms/op	6.0631 ms/op	0.88
UintBigint16 x 100000 tree_serialize	1.3820 ms/op	1.3632 ms/op	1.01
UintBigint32 x 100000 tree_deserialize	5.5350 ms/op	5.5379 ms/op	1.00
UintBigint32 x 100000 tree_serialize	1.9077 ms/op	1.8300 ms/op	1.04
UintBigint64 x 100000 tree_deserialize	5.8962 ms/op	6.2323 ms/op	0.95
UintBigint64 x 100000 tree_serialize	2.4838 ms/op	2.4893 ms/op	1.00
UintBigint8 x 100000 value_deserialize	435.68 us/op	437.44 us/op	1.00
UintBigint8 x 100000 value_serialize	774.09 us/op	771.83 us/op	1.00
UintBigint16 x 100000 value_deserialize	469.17 us/op	466.78 us/op	1.01
UintBigint16 x 100000 value_serialize	807.99 us/op	814.65 us/op	0.99
UintBigint32 x 100000 value_deserialize	502.87 us/op	497.55 us/op	1.01
UintBigint32 x 100000 value_serialize	856.17 us/op	851.55 us/op	1.01
UintBigint64 x 100000 value_deserialize	568.44 us/op	561.32 us/op	1.01
UintBigint64 x 100000 value_serialize	1.0445 ms/op	1.0346 ms/op	1.01
UintBigint8 x 100000 deserialize	3.4295 ms/op	3.1338 ms/op	1.09
UintBigint8 x 100000 serialize	1.6446 ms/op	1.5236 ms/op	1.08
UintBigint16 x 100000 deserialize	3.5563 ms/op	3.1959 ms/op	1.11
UintBigint16 x 100000 serialize	1.5437 ms/op	1.5486 ms/op	1.00
UintBigint32 x 100000 deserialize	3.3850 ms/op	3.2662 ms/op	1.04
UintBigint32 x 100000 serialize	2.8506 ms/op	2.8154 ms/op	1.01
UintBigint64 x 100000 deserialize	4.4418 ms/op	4.2641 ms/op	1.04
UintBigint64 x 100000 serialize	1.6602 ms/op	1.7528 ms/op	0.95
UintBigint128 x 100000 deserialize	6.4605 ms/op	5.5805 ms/op	1.16
UintBigint128 x 100000 serialize	14.780 ms/op	15.764 ms/op	0.94
UintBigint256 x 100000 deserialize	7.9071 ms/op	8.9411 ms/op	0.88
UintBigint256 x 100000 serialize	43.432 ms/op	45.903 ms/op	0.95
Slice from Uint8Array x25000	1.3018 ms/op	1.3955 ms/op	0.93
Slice from ArrayBuffer x25000	17.060 ms/op	15.878 ms/op	1.07
Slice from ArrayBuffer x25000 + new Uint8Array	18.501 ms/op	16.324 ms/op	1.13
Copy Uint8Array 100000 iterate	2.6403 ms/op	2.7272 ms/op	0.97
Copy Uint8Array 100000 slice	97.777 us/op	95.870 us/op	1.02
Copy Uint8Array 100000 Uint8Array.prototype.slice.call	93.596 us/op	87.975 us/op	1.06
Copy Buffer 100000 Uint8Array.prototype.slice.call	90.363 us/op	90.034 us/op	1.00
Copy Uint8Array 100000 slice + set	195.84 us/op	177.75 us/op	1.10
Copy Uint8Array 100000 subarray + set	96.497 us/op	88.467 us/op	1.09
Copy Uint8Array 100000 slice arrayBuffer	96.102 us/op	91.189 us/op	1.05
Uint64 deserialize 100000 - iterate Uint8Array	1.9120 ms/op	1.9793 ms/op	0.97
Uint64 deserialize 100000 - by Uint32A	1.8220 ms/op	2.0172 ms/op	0.90
Uint64 deserialize 100000 - by DataView.getUint32 x2	1.8059 ms/op	1.8379 ms/op	0.98
Uint64 deserialize 100000 - by DataView.getBigUint64	4.9341 ms/op	4.8271 ms/op	1.02
Uint64 deserialize 100000 - by byte	41.480 ms/op	40.847 ms/op	1.02

by benchmarkbot/action

twoeths · 2024-10-21T03:18:16Z

Tested this on feat1, see ChainSafe/lodestar#7171 (comment).
Ready for review.

twoeths · 2024-10-31T03:46:21Z

SHA-256 works on blocks of 64 bytes each, so it is perhaps more meaningful for the name of the chunkBytesBuffer variable to reflect that.

Also, Holesky has 1.7M validators. For every 8 deposits we have to reallocate the whole 1.7M * 8 bytes = 13.6MB for BeaconState.balances, which is not ideal. We should instead allocate only another 64 bytes in this case. This applies to all list types.

Update:

The hashing of BeaconState.balances and everything else inside BeaconState goes through ViewDU, so this is not relevant there.
It is more relevant to hashing a BeaconBlock, for example transaction data and lists sized by the number of transactions.
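
The arithmetic in the comment above can be checked with a small sketch. The function names and the `BLOCK_BYTES` constant are hypothetical and only compare the two growth strategies in bytes allocated. The balances figures are reused purely as a worked example, since per the update that field is hashed through ViewDU.

```typescript
const BLOCK_BYTES = 64;

// Number of 64-byte blocks needed to hold `byteLength` bytes.
function blockCount(byteLength: number): number {
  return Math.ceil(byteLength / BLOCK_BYTES);
}

// Bytes allocated if the whole buffer is recreated whenever the list grows.
function bytesIfReallocated(newByteLength: number): number {
  return blockCount(newByteLength) * BLOCK_BYTES;
}

// Bytes allocated if only the missing blocks are appended.
function bytesIfAppended(oldByteLength: number, newByteLength: number): number {
  return Math.max(0, blockCount(newByteLength) - blockCount(oldByteLength)) * BLOCK_BYTES;
}

// 1.7M uint64 balances, then 8 more deposits (8 * 8 bytes = one 64-byte block).
const oldBytes = 1_700_000 * 8;
const newBytes = (1_700_000 + 8) * 8;
const reallocated = bytesIfReallocated(newBytes);
const appended = bytesIfAppended(oldBytes, newBytes);
console.log({oldBytes, reallocated, appended});
```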

matthewkeil

I left a few comments but I think this PR really needs to be reviewed by @wemeetagain

matthewkeil · 2025-02-21T18:52:07Z

packages/ssz/src/type/byteArray.ts

  // Merkleization

-  protected getRoots(value: ByteArray): Uint8Array[] {
-    return splitIntoRootChunks(value);
+  protected getBlocksBytes(value: ByteArray): Uint8Array {


I'm confused by what BlocksBytes represents. I understand there is a shared buffer that is being reused, but why the name Blocks, and thus the bytes of a block? Perhaps we can talk about the naming conventions here?

It seems like a Block is the unit used for two 32-byte roots?

Perhaps call the 32-byte chunks Bytes32 and the 64-byte "blocks" Bytes64? Not sure, what do you think?

matthewkeil · 2025-02-21T19:04:41Z

packages/ssz/src/type/byteArray.ts

+
+  blocksBuffer.set(value);
+  const valueLen = value.length;
+  const blockByteLen = Math.ceil(valueLen / 64) * 64;


It might be helpful to break out a helper function so it's clear why this is happening everywhere.

// Number of 32-byte chunks needed to hold buf, rounding up
export function getPaddedByte32Count(buf: ArrayBuffer): number {
  return Math.ceil(buf.byteLength / 32);
}

// Number of 64-byte blocks needed to hold buf, rounding up
export function getPaddedByte64Count(buf: ArrayBuffer): number {
  return Math.ceil(buf.byteLength / 64);
}

matthewkeil · 2025-02-21T19:13:33Z

packages/ssz/src/type/byteList.ts

+      const blockDiff = newBlockCount - oldBlockCount;
+      const newBlocksBytes = new Uint8Array(blockDiff * 64);
+      for (let i = 0; i < blockDiff; i++) {
+        this.blockArray.push(newBlocksBytes.subarray(i * 64, (i + 1) * 64));


Why loop and push in chunks? Why not just push all at once and only trigger one resize to the blockArray?
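
One possible reading of this suggestion, as a sketch only: keep the single backing allocation from the diff, build all the 64-byte views first, and append them with one push call. `growBlockArray` and `BLOCK_BYTES` are hypothetical names, and `blockArray` is assumed to be a plain `Uint8Array[]`.

```typescript
const BLOCK_BYTES = 64;

// Hypothetical helper: grows `blockArray` to `newBlockCount` entries using one
// backing allocation and a single push call.
function growBlockArray(blockArray: Uint8Array[], newBlockCount: number): void {
  const blockDiff = newBlockCount - blockArray.length;
  if (blockDiff <= 0) return;

  // One allocation covering all new blocks.
  const newBlocksBytes = new Uint8Array(blockDiff * BLOCK_BYTES);

  // Each entry is a 64-byte view into the shared allocation, not a copy.
  const views = Array.from({length: blockDiff}, (_, i) =>
    newBlocksBytes.subarray(i * BLOCK_BYTES, (i + 1) * BLOCK_BYTES)
  );

  blockArray.push(...views);
}

const exampleBlocks: Uint8Array[] = [];
growBlockArray(exampleBlocks, 3);
growBlockArray(exampleBlocks, 5);
```

Spreading a very large array into `push` can exceed the engine's argument limit, so for large growth steps the original loop may still be the safer choice.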

github-actions bot added as-sha256 ssz labels Oct 15, 2024

twoeths mentioned this pull request Oct 15, 2024

Use allocUnsafe where possible #278

Closed

twoeths added the status-do-not-merge label Oct 15, 2024

twoeths marked this pull request as ready for review October 21, 2024 03:18

twoeths requested a review from a team as a code owner October 21, 2024 03:18

twoeths mentioned this pull request Oct 21, 2024

feat: getAll() apis to support output parameters #417

Draft

twoeths marked this pull request as draft October 31, 2024 03:39

twoeths force-pushed the te/improve_type_dot_hash_tree_root branch 2 times, most recently from 9e32c5c to 7ed3ced Compare November 9, 2024 02:19

philknows modified the milestones: v1.0, v1.1 Jan 22, 2025

twoeths added 3 commits February 12, 2025 13:10

feat: improve type.hashTreeRoot() using batch

f1bdd86

feat: consume merkleizeBlockArray

95a52bf

fix: lint in ssz package

cbb30a2

nazarhussain force-pushed the te/improve_type_dot_hash_tree_root branch from d3821ee to cbb30a2 Compare February 12, 2025 12:18

nazarhussain added 4 commits February 12, 2025 13:22

Fix old perf test usage

18bd9bf

Fiz types

6014e37

Fix lint errors

876a1ab

Fix lint errors

74dff5b

nazarhussain marked this pull request as ready for review February 12, 2025 14:05

matthewkeil reviewed Feb 21, 2025
