fix: validate Baggage key by W3C standards #2804

Merged: 6 commits merged into open-telemetry:main on Mar 19, 2025

Conversation

gruebel
Member

@gruebel gruebel commented Mar 14, 2025

Relates to #2717

Changes

  • Also made some small performance improvements, so the code is faster overall even with the added cost of key validation

Merge requirement checklist

  • CONTRIBUTING guidelines followed
  • Unit tests added/updated (if applicable)
  • Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
  • Changes in public API reviewed (if applicable)

@gruebel gruebel requested a review from a team as a code owner March 14, 2025 08:28

codecov bot commented Mar 14, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.5%. Comparing base (31b494b) to head (bbffd71).
Report is 1 commit behind head on main.

Additional details and impacted files
@@          Coverage Diff          @@
##            main   #2804   +/-   ##
=====================================
  Coverage   80.4%   80.5%           
=====================================
  Files        124     124           
  Lines      23390   23411   +21     
=====================================
+ Hits       18828   18849   +21     
  Misses      4562    4562           

@scottgerring scottgerring changed the title fix: validate Baggage key by W3C standards fix: validate Baggage key by W3C standards [performance] Mar 14, 2025
@scottgerring scottgerring changed the title fix: validate Baggage key by W3C standards [performance] fix: validate Baggage key by W3C standards Mar 14, 2025
@scottgerring scottgerring reopened this Mar 14, 2025
!key.is_empty()
    && key
        .iter()
        .all(|b| b.is_ascii() && !INVALID_ASCII_KEY_CHARS.contains(b))
Member

Each character in the key is sequentially validated against the INVALID_ASCII_KEY_CHARS array. Using a bitmask for INVALID_ASCII_KEY_CHARS can optimize this check by enabling constant-time lookups instead of an iterative scan.
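
A minimal sketch of what such a bitmask lookup could look like, not the code in this PR. A 128-bit constant holds one bit per ASCII byte, so each byte is checked with a shift and a mask instead of a scan. The names build_valid_key_mask, VALID_KEY_MASK and is_key_valid_bitmask are hypothetical, and the delimiter set is an assumption, since the actual contents of INVALID_ASCII_KEY_CHARS are not shown in this conversation.

```rust
/// Builds a 128-bit mask with one bit set for every allowed key byte.
/// Assumption: allowed bytes are graphic ASCII minus the delimiters below.
const fn build_valid_key_mask() -> u128 {
    // Assumed delimiter set (hypothetical stand-in for INVALID_ASCII_KEY_CHARS).
    const DELIMITERS: &[u8] = b"\"(),/:;<=>?@[\\]{}";
    let mut mask: u128 = 0;
    let mut b: u8 = 0x21; // '!' is the first graphic ASCII byte
    while b <= 0x7e {
        // '~' (0x7e) is the last graphic ASCII byte
        let mut is_delimiter = false;
        let mut i = 0;
        while i < DELIMITERS.len() {
            if DELIMITERS[i] == b {
                is_delimiter = true;
            }
            i += 1;
        }
        if !is_delimiter {
            mask |= 1u128 << (b as u32);
        }
        b += 1;
    }
    mask
}

/// Computed at compile time, so no set-up work happens at runtime.
const VALID_KEY_MASK: u128 = build_valid_key_mask();

/// Returns true if the key is non-empty and every byte has its bit set in the mask.
fn is_key_valid_bitmask(key: &[u8]) -> bool {
    !key.is_empty()
        && key
            .iter()
            // The `b < 128` guard rejects non-ASCII bytes and keeps the shift in range.
            .all(|&b| b < 128 && ((VALID_KEY_MASK >> b) & 1) == 1)
}
```

Whether this beats a scan over a short array in practice would need to be measured.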

Contributor

ooh nice idea

Member

@fraillt Please see if you have some perf suggestions.

Contributor

I'm not sure how to do it using bitmasks (these characters are quite scattered and there's no easy binary pattern to match/mask), but when there aren't many values to check, it's usually very efficient to do it with switch/match, like this:

fn is_key_valid2(key: &[u8]) -> bool {
    !key.is_empty()
        && key.iter().all(|b| match b {
            b'!' => true,
            b'"' => false,
            b'#' => true,
            b'$' => true,
            b'%' => true,
            b'&' => true,
            b'\'' => true,
            b'(' => false,
            b')' => false,
            b'*' => true,
            b'+' => true,
            b',' => false,
            b'-' => true,
            b'.' => true,
            b'/' => false,
            b'0' => true,
            b'1' => true,
            b'2' => true,
            b'3' => true,
            b'4' => true,
            b'5' => true,
            b'6' => true,
            b'7' => true,
            b'8' => true,
            b'9' => true,
            b':' => false,
            b';' => false,
            b'<' => false,
            b'=' => false,
            b'>' => false,
            b'?' => false,
            b'@' => false,
            b'A' => true,
            b'B' => true,
            b'C' => true,
            b'D' => true,
            b'E' => true,
            b'F' => true,
            b'G' => true,
            b'H' => true,
            b'I' => true,
            b'J' => true,
            b'K' => true,
            b'L' => true,
            b'M' => true,
            b'N' => true,
            b'O' => true,
            b'P' => true,
            b'Q' => true,
            b'R' => true,
            b'S' => true,
            b'T' => true,
            b'U' => true,
            b'V' => true,
            b'W' => true,
            b'X' => true,
            b'Y' => true,
            b'Z' => true,
            b'[' => false,
            b'\\' => false,
            b']' => false,
            b'^' => true,
            b'_' => true,
            b'`' => true,
            b'a' => true,
            b'b' => true,
            b'c' => true,
            b'd' => true,
            b'e' => true,
            b'f' => true,
            b'g' => true,
            b'h' => true,
            b'i' => true,
            b'j' => true,
            b'k' => true,
            b'l' => true,
            b'm' => true,
            b'n' => true,
            b'o' => true,
            b'p' => true,
            b'q' => true,
            b'r' => true,
            b's' => true,
            b't' => true,
            b'u' => true,
            b'v' => true,
            b'w' => true,
            b'x' => true,
            b'y' => true,
            b'z' => true,
            b'{' => false,
            b'|' => true,
            b'}' => false,
            b'~' => true,
            _ => false,
        })
}

This is equivalent to b.is_ascii_graphic() && !INVALID_ASCII_KEY_CHARS.contains(b). Note that I also reject control characters (is_ascii_graphic(), not is_ascii()).
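
The claimed equivalence can be checked exhaustively, since there are only 256 possible byte values. The sketch below compares a compact, range-pattern version of the match with the is_ascii_graphic() form. The contents of INVALID_ASCII_KEY_CHARS are an assumption here (copied from the arms marked false above), and the function names are hypothetical.

```rust
// Assumed contents: the bytes whose match arms return `false` in is_key_valid2.
const INVALID_ASCII_KEY_CHARS: [u8; 17] = [
    b'"', b'(', b')', b',', b'/', b':', b';', b'<', b'=', b'>', b'?', b'@', b'[', b'\\', b']',
    b'{', b'}',
];

// Per-byte check in the `is_ascii_graphic() && !contains` form.
fn is_key_byte_valid_scan(b: u8) -> bool {
    b.is_ascii_graphic() && !INVALID_ASCII_KEY_CHARS.contains(&b)
}

// Per-byte check in match form, with the `true` arms folded into range patterns.
fn is_key_byte_valid_match(b: u8) -> bool {
    matches!(
        b,
        b'!' | b'#'..=b'\''
            | b'*'
            | b'+'
            | b'-'
            | b'.'
            | b'0'..=b'9'
            | b'A'..=b'Z'
            | b'^'..=b'z'
            | b'|'
            | b'~'
    )
}

// Returns true if both checks agree on every possible byte value.
fn implementations_agree() -> bool {
    (0u8..=255).all(|b| is_key_byte_valid_scan(b) == is_key_byte_valid_match(b))
}
```

A check like implementations_agree() fits naturally in a unit test, so any later change to either form is caught if the two drift apart.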

I tested this is_key_valid2 function locally on my machine and measured a +3x performance improvement.
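
For anyone who wants to reproduce a comparison like this, here is a rough timing sketch using only the standard library. It is not the benchmark used for the number above. The delimiter set, the function names and time_validator are all hypothetical, and the match version is compacted with range patterns.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Assumed delimiter set, taken from the `false` arms of is_key_valid2.
const DELIMITERS: &[u8] = b"\"(),/:;<=>?@[\\]{}";

// Scan-based check: graphic ASCII and not contained in the delimiter list.
fn is_key_valid_scan(key: &[u8]) -> bool {
    !key.is_empty()
        && key
            .iter()
            .all(|b| b.is_ascii_graphic() && !DELIMITERS.contains(b))
}

// Match-based check, with the allowed bytes written as range patterns.
fn is_key_valid_match(key: &[u8]) -> bool {
    !key.is_empty()
        && key.iter().all(|b| {
            matches!(
                b,
                b'!' | b'#'..=b'\''
                    | b'*'
                    | b'+'
                    | b'-'
                    | b'.'
                    | b'0'..=b'9'
                    | b'A'..=b'Z'
                    | b'^'..=b'z'
                    | b'|'
                    | b'~'
            )
        })
}

// Runs `validator` over all keys `rounds` times.
// Returns the elapsed wall-clock time and how many calls returned true.
fn time_validator(validator: fn(&[u8]) -> bool, keys: &[&[u8]], rounds: u32) -> (Duration, u64) {
    let start = Instant::now();
    let mut accepted = 0u64;
    for _ in 0..rounds {
        for &key in keys {
            // black_box discourages the optimizer from hoisting or removing the call.
            if validator(black_box(key)) {
                accepted += 1;
            }
        }
    }
    (start.elapsed(), accepted)
}
```

Numbers from a loop like this depend heavily on build flags (use --release), key lengths and the machine. A statistical harness such as Criterion gives more trustworthy results.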

Member

@cijothomas cijothomas left a comment

Thanks. Would be nice to incorporate the perf optimizations in this or in a follow up.

@gruebel
Member Author

gruebel commented Mar 19, 2025

Thanks. Would be nice to incorporate the perf optimizations in this or in a follow up.

@cijothomas I created a new ticket for it: #2835. I tested the proposed solution, but locally it had no measurable effect for me 😢

@cijothomas cijothomas merged commit bece03b into open-telemetry:main Mar 19, 2025
23 checks passed
5 participants