-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add InvalidCharError::char
#124
Conversation
Returns a `char` if the invalid byte is ascii, otherwise returns the raw `u8`. Deprecates `InvalidCharError::invalid_char`.
This does not handle multi-byte characters yet.
|
Is |
In this case the returned |
At the moment we only return a char if it is a single-byte character right ? I agree that multi-byte chars would need more work. |
No, we currently do not return a |
Right I was talking about this current commit here all good. |
Oh, I see! I misunderstood your initial approach. This is clever but I am still going to concept-NACK it. If users want to do this (treat ASCII specially) it is easy for them to do it themselves -- but this change forces them to do so, which adds extra code complexity. It's also an ugly API. While the motivation makes sense, in terms of our existing code layout which prevents us from unconditionally getting a |
Sounds great I agree. Was there another approach you had in mind for #100 ? If I understand you correctly we would not address this issue. |
What I would imagine is that if we read an invalid u8, we attempt to continue reading u8s until we've parsed a complete UTF-8 codepoint. (We would not attempt to deal with combining characters or anything like that because Rust itself makes it hard.) And if we fail that...hmm. Maybe we would want to fall back to your API anyway? Or I think there's a standard UTF "unknown character" character we could just sub in. |
@apoelstra we chatted about this, bump so you can update your position if you want to. |
Yeah -- I think we don't want to target this issue at all now for 1.0. If there is some noninvasive non-performance-affecting way to do it, that's great, but I don't believe there is. Meanwhile, if a user has a hex string with a bad character in it, and wants to learn more specifically what that character is (and where it is, etc), it's not hard to do this without help from the crate. ( |
Sounds good, I close this PR. Whenever appropriate, I let you close #100 as well. |
Returns a
char
if the invalid byte is ascii, otherwise returns the rawu8
.Deprecates
InvalidCharError::invalid_char
.Partially addresses #100