Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement word-break and overflow-wrap style properties #315

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

valadaptive
Copy link
Contributor

@valadaptive valadaptive commented Mar 25, 2025

Requires upstream support in Swash (dfrg/swash#89).

This implements the CSS word-break and overflow-wrap style properties (including the latter's influence on the layout's min-content size).

This replaces the "emergency break" logic with something that should be simpler.

Different browsers seem to disagree on exactly which ranges of text the properties apply to. The behavior implemented here should be reasonable.

In the table below, red is the default wrapping settings, blue is overflow-wrap: anywhere, and green is word-break: break-all:

Gecko Blink WebKit Parley
image image image image
image image image image
image image image image
image image image image
image image image image
image image image image *
image image image word_break_break_all_second_half-0

* This doesn't match browsers, but the behavior is officially unspecified per w3c/csswg-drafts#3897

@DJMcNab
Copy link
Member

DJMcNab commented Mar 25, 2025

Wow, this is fantastic. I'm not really best placed to review this, but I do have a few initial thoughts.

That table is excellent! One thing which immediately jumps out at me is that using red/green for significant information is potentially challenging for colourblind users. I don't know of anyone on the team which this would impact, and I'm not suggesting that you should re-make the table (it looks like a lot of work...), but I'd recommend it as something to bear in mind in the future. (My understanding is that the general advice for having "side-channel" important information of this kind is to use e.g. patterns for differentiation; I'm not sure how practical that would be in this case...)

I'm not sure what to do about the new Korean tests. I guess showing that we lay things out correctly is quite cool, even without the actual glyphs being visible. If we can find a "sufficiently" small font with a suitable license, adding it to the repo is probably reasonable.

Are the tests standard? I see you link to wpt.fyi for one of them. Should the rest have a similar source provided/linked?

@valadaptive
Copy link
Contributor Author

valadaptive commented Mar 25, 2025

I've added links to the WPT page for the word-break: keep-all tests. Perhaps I should try implementing the other tests here as well? I think it would involve a lot of tedious manual porting.

As far as I can tell, the other tests (regarding which spans of text the CSS property "attaches" to) are original. I did just glance over at some of the WPT tests, and it looks like there's one (http://wpt.live/css/css-text/word-break/word-break-break-all-inline-007.tentative.html) that fails in all the browsers, and I think corresponds to this case:
image

I had to implement some weird behavior to replicate the browsers in that case (see the comment in layout/context.rs about applying each character's line breaking property to the next character), but actually just reverted it to the behavior that seems more sensible since WPT says it's the browsers' fault.

Regarding the Korean/Japanese tests, I decided not to include any CJK fonts since they're huge (Noto Sans JP is 4.5MB per weight). I added a comment about that.

Regarding accessibility, I changed the tests so that the bit of text with the relevant line-wrapping style property is underlined.

@valadaptive valadaptive reopened this Mar 25, 2025
@valadaptive valadaptive force-pushed the wrapping-styles branch 2 times, most recently from 849faf6 to e739ac0 Compare March 25, 2025 18:58
@xorgy
Copy link
Member

xorgy commented Mar 25, 2025

Thank you! I'm wondering how this will work with hyphen insertion (particularly with explicit hyphenation/break opportunities through U+00AD SOFT HYPHEN). Seems like it'll work well once we have that, and these properties (somewhat) affect that.

Copy link
Member

@xorgy xorgy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to consolidate the definitions of WordBreakStrength and just reuse/reexport that here.
Thank you!

@nicoburns
Copy link
Contributor

nicoburns commented Mar 26, 2025

Regarding testing, it's worth noting that we can run some WPT tests via Blitz (which is using Parley for text layout). We can't run everything though, because some tests require JavaScript (and some are designed for manual verification), so it may well be worth porting some of them.


For running word-break WPT tests with Blitz (basically anything that's screenshot based):

  • Clone https://github.com/DioxusLabs/blitz
  • Run cargo run --release --package wpt -- css/css-text/word-break
  • (if you have just installed then you can use just wpt css/css-text/word-break)
  • You can patch in a local version of Parley using the top-level (workspace) Cargo.toml

Currently we can run 5799/103 css/css-text/word-break (of which 2325 pass)
And 1079/1774 of all css/css-text tests (of which 314 pass)

} else {
self.state.append_cluster_to_line(next_x);
self.state.cluster_idx += 1;
else if let Some(prev_emergency) =
Copy link
Contributor

@nicoburns nicoburns Mar 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it always correct to try "prev_boundary" first and then fallback to "prev_emergency"? Are there cases (OverflowWrap::Anywhere?) where we ought to use the later "emergency" breakpoint even if an earlier "regular" breakpoint is available?

I wonder whether we ought to just have one "previous break point" state that just gets set in different conditions depending on the line-breaking styles?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there cases (OverflowWrap::Anywhere?) where we ought to use the later "emergency" breakpoint even if an earlier "regular" breakpoint is available?

Text with OverflowWrap::Anywhere should only use an "emergency" breakpoint if there are no other opportunities to break the line.

The name "anywhere" is a bit confusing since it may give the impression that it behaves like word-break: break-all. but it's just break-word with the difference that it affects the min content size.

I don't think there's any way to get around having two different previous break points since we need that fallback behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name "anywhere" is a bit confusing since it may give the impression that it behaves like word-break: break-all

I see. Part of me feels like we ought not to stick to the CSS property names here and come up with something more sensible (but using the CSS properties is also very convenient for Blitz which can translate them 1:1 from Stylo)

Comment on lines +105 to 130
let mut word_break = Default::default();
let mut style_idx = 0;

let mut char_indices = text.char_indices();
loop {
let Some((char_idx, _)) = char_indices.next() else {
break;
};

// Find the style for this character. If the text is empty, we may not have any styles. Otherwise,
// self.styles should span the entire range of the text.
while let Some(style) = self.styles.get(style_idx) {
if style.range.end > char_idx {
word_break = style.style.word_break;
break;
}
style_idx += 1;
}
a.set_break_strength(word_break);

let Some((properties, boundary)) = a.next() else {
break;
};

self.info.push((CharInfo::new(properties, boundary), 0));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems pretty awkward, but I guess fine...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's a bit unfortunate. I thought about other ways to do it in Swash, e.g. passing in the word break strength with each call to next, but that would be a breaking change there and make it no longer an iterator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could make the analysis take an iterator over (char, WordBreakStrength) rather than just char? And you could have a different constructor for backwards compat.

@nicoburns
Copy link
Contributor

Blitz branch pointed at this PR: https://github.com/DioxusLabs/blitz/tree/word-break-overflow-wrap

This PR is currently only making 1 extra test pass. But I wouldn't read too much into that. I can see some tests failing like https://wpt.live/css/css-text/word-break/word-break-break-all-010.html. But that test uses ch units which Blitz doesn't currently support (need to get font metrics out of fontique for that!).


Notes on debugging WPT tests using Blitz:

To open a specific test in Blitz: cargo run --release --package readme -- https://wpt.live/css/css-text/word-break/word-break-break-all-010.html
To inspect:

  • Press alt-h to enable inspector mode
  • Click on a node to have some debug info logged to console.

For inline formatting contexts you'll get something like the following:

Text content: " XXXXX"
Inline Boxes:

Lines:
Line 0:
  RUN (x: 0, w: 80)
Line 1:
  RUN (x: 0, w: 80)
Line 2:
  RUN (x: 0, w: 80)

@nicoburns nicoburns mentioned this pull request Mar 7, 2025
21 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants