Skip to content
This repository has been archived by the owner on Sep 16, 2024. It is now read-only.

#65: Replace CRC32 with BLAKE3 Hash Function #73

Closed
wants to merge 6 commits into from

Conversation

tareknaser
Copy link
Collaborator

Description

This pull request updates the hashing method for the ResourceId struct, replacing CRC32 with BLAKE3 hash function.
Related Issue: #65

Changes

The changes include:

  • Addition of benchmarks for the ResourceId::compute_bytes method using the criterion crate with the html_reports feature enabled. Run cargo bench to generate benchmarks in the /target/criterion folder
  • Removing collision handling as BLAKE3 is a cryptographic hash function

Refer to individual commit messages and bodies for detailed information

This commit introduces a new benchmark file, benches/hash_benchmark.rs. The file includes the compute_bytes_benchmark function, which has three benchmarks for different use cases:
- compute_bytes_small: Benchmarks with small input data.
- compute_bytes_medium: Evaluates medium-sized input data.
- compute_bytes_large: Measures large input data performance.

criterion is configured as a dev dependency with the html_reports feature for detailed HTML reports in target/criterion/.

Signed-off-by: Tarek <tareknaser360@gmail.com>
This commit includes the following changes:

- Update the hashing method in the ResourceId struct from crc32fast to blake3
- Due to the change in hash representation, ResourceId no longer implements Copy but remains Clone.
- Adjust code throughout arklib to accommodate the new hashing method.
- Update tests that utilize ResourceId to reflect the changes.

There will be a follow up commit to remove the collision counter from the code since Blake3 is a cryptographic hash function

Signed-off-by: Tarek <tareknaser360@gmail.com>
The hash function was updated from crc32 to blake3 in the previous commit. This commit removes references to collisions from the code, as Blake3 is a cryptographic hash function and does not produce collisions.

Signed-off-by: Tarek <tareknaser360@gmail.com>
Signed-off-by: Tarek <tareknaser360@gmail.com>
This commit updates the storage format of the BLAKE3 hash in the ResourceId struct from a hex-encoded string to a fixed-size array [u8; 32].

Storing the hash as a byte array eliminates the need for hex decoding and encoding operations, potentially improving performance.

Signed-off-by: Tarek <tareknaser360@gmail.com>
@kirillt
Copy link
Member

kirillt commented Jan 28, 2024

Let's finish the benchmarks first:

After we have index benchmark, we'll be able to judge better about performance penalty of Blake3. Collision tracking, necessary for CRC32, could eat some performance up, too.

This commit improves readability for ResourceId struct by base64 encoding the blake3 hash for printing and parsing.

It also removes data_size field from ResourceId and removes its references from other parts of the code

Signed-off-by: Tarek <tareknaser360@gmail.com>
@kirillt kirillt changed the title Replace CRC32 with BLAKE3 Hash Function #65: Replace CRC32 with BLAKE3 Hash Function Feb 6, 2024
@tareknaser
Copy link
Collaborator Author

This PR should be rewritten when #90 is merged here and into ark-rust as well.

I can re rewrite the logic in ark-rust directly so that it implements ResourceIdTrait

@tareknaser
Copy link
Collaborator Author

Closing this in favor of ARK-Builders/ark-core#32

@tareknaser tareknaser closed this Apr 21, 2024
@tareknaser tareknaser deleted the betterhash branch April 21, 2024 17:41
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants