Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Handle errors during constant folding in expression compilation #12376

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

pramodsatya
Copy link
Collaborator

@pramodsatya pramodsatya commented Feb 18, 2025

Expression compilation currently ignores any errors encountered during constant folding. This change enables the expression compiler to return a FailFunction in place of the constant expression which throws a VeloxException when evaluated. This is consistent with the behavior of Presto's expression compiler, please see this issue for more context.

Resolves #12414.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 18, 2025
Copy link

netlify bot commented Feb 18, 2025

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 11310d6
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/67b91644a636460008bd9123

@@ -1753,9 +1753,11 @@ std::vector<common::Subfield> Expr::extractSubfields() const {
ExprSet::ExprSet(
const std::vector<core::TypedExprPtr>& sources,
core::ExecCtx* execCtx,
bool enableConstantFolding)
bool enableConstantFolding,
bool ignoreConstantFoldError)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pramodsatya : Maybe say 'constantFoldErrorWithFailFunction' instead to make the config more clear about what it does.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @aditi-pandit, updated accordingly and added a unit test. Could you please take another look?

@pramodsatya pramodsatya marked this pull request as ready for review February 21, 2025 23:51
Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pramodsatya High-level question. ExprCompiler takes core::TypedExprPtr and returns an executable expression, exec::Expr. If you plan to use this or constant folding during query optimization, I assume you need core::TypedExprPtr as a result of constant folding. Since there is no way to get core::TypedExprPtr from exec::Expr, how do you plan to proceed?

I assume we would need something like #12306.

@pramodsatya
Copy link
Collaborator Author

@pramodsatya High-level question. ExprCompiler takes core::TypedExprPtr and returns an executable expression, exec::Expr. If you plan to use this or constant folding during query optimization, I assume you need core::TypedExprPtr as a result of constant folding. Since there is no way to get core::TypedExprPtr from exec::Expr, how do you plan to proceed?

I assume we would need something like #12306.

@mbasmanova, once the constant folded exec::Expr is returned by ExprCompiler, it will be converted to a Presto protocol RowExpression using the RowExpressionConverter class (to be added in prestodb/presto#22927, please refer to the Presto issue for more details: prestodb/presto#24591). So I don't need core::TypedExprPtr as a result of constant folding. Please let me know if I am missing something.

@mbasmanova
Copy link
Contributor

@pramodsatya I don't think it is a good idea to convert Expr instance to RowExpression (the equivalent of TypedExpr). Instead, I suggest to enhance evaluateConstantExpression to return result or error and use that.

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments.

@aditi-pandit
Copy link
Collaborator

@pramodsatya High-level question. ExprCompiler takes core::TypedExprPtr and returns an executable expression, exec::Expr. If you plan to use this or constant folding during query optimization, I assume you need core::TypedExprPtr as a result of constant folding. Since there is no way to get core::TypedExprPtr from exec::Expr, how do you plan to proceed?
I assume we would need something like #12306.

@mbasmanova, once the constant folded exec::Expr is returned by ExprCompiler, it will be converted to a Presto protocol RowExpression using the RowExpressionConverter class (to be added in prestodb/presto#22927, please refer to the Presto issue for more details: prestodb/presto#24591). So I don't need core::TypedExprPtr as a result of constant folding. Please let me know if I am missing something.

@mbasmanova : Thanks for your comments. Wanted to give broader context of why we came up with this design. Would be really happy to hear any alternatives you suggest.

The expectation was that Presto optimizer was more aggressive with constant folding and some other cases that Velox Expression Compiler doesn't.
e.g. i) For Coalesce, Presto would remove null values in the beginning parameters and evaluate intermediate constants.
ii) For OR, if the first argument evaluated to NULL, then return the second argument.

So Tim changed some code in the Presto optimizer and gave us an API for batch evaluation of expressions. The intent was to go from Presto Row Expressions -> Velox expression, do the Presto compatible changes in presto-native invoking ExpressionCompiler at different points for evaluating constants and return Presto Row Expressions ([prestodb/presto#22927]). We have also given Presto all the function signatures of Presto registry in Velox so that a function catalog (conforming to FunctionNamespaceManager) can be populated. So my expectation would be that will cause Presto optimizer to evaluate consistently with Velox signatures.

He had a bunch of test cases for us and we ended up with tons of failures on account of the Velox behavior for constant folding exception cases. So that's why we came up with this proposal.

Would be great to hear your thoughts about all this and really welcome alternative designs.

@mbasmanova
Copy link
Contributor

@aditi-pandit Aditi, thank you for sharing additional context. Some questions.

The expectation was that Presto optimizer was more aggressive with constant folding and some other cases that Velox Expression Compiler doesn't.

A natural question here would be whether it makes sense for Velox to implement these. Have we already looked into that? What did we decide and why?

He had a bunch of test cases for us and we ended up with tons of failures on account of the Velox behavior for constant folding exception cases. So that's why we came up with this proposal.

I'm not sure this explains why we decided to convert Velox Expr to Presto RowExpression. Would you take a look at the alternative I mentioned in #12376 (comment)

@aditi-pandit
Copy link
Collaborator

aditi-pandit commented Feb 25, 2025

A natural question here would be whether it makes sense for Velox to implement these. Have we already looked into that? What did we decide and why?

@mbasmanova : Implementing these optimizations in Velox is an option. From our side we wanted to mature our concept and implementation before making a proposal to the Velox team.

I'm not sure this explains why we decided to convert Velox Expr to Presto RowExpression. Would you take a look at the alternative I mentioned in #12376 (comment)

The Presto Optimizer code had parsers for on the wire Presto RowExpressions. If we chose Velox Expr, or TypedExpr we have to add new parsing code for that. And since we had the right function signatures from Velox our conversion using Presto RowExpression seemed lossless and correct.

Just saw your new code in #12306. We can consider using it.

@pramodsatya

@mbasmanova
Copy link
Contributor

The Presto Optimizer code had parsers for on the wire Presto RowExpressions. If we chose Velox Expr, or TypedExpr we have to add new parsing code for that.

I'm confused. ExprCompiler produces Velox Expr, not presto RowExpression. Hence, in the current design, we need to somehow convert Velox Expr to RowExpression, no?

@aditi-pandit
Copy link
Collaborator

aditi-pandit commented Feb 25, 2025

The Presto Optimizer code had parsers for on the wire Presto RowExpressions. If we chose Velox Expr, or TypedExpr we have to add new parsing code for that.

I'm confused. ExprCompiler produces Velox Expr, not presto RowExpression. Hence, in the current design, we need to somehow convert Velox Expr to RowExpression, no?

@mbasmanova : Oh I got the confusion.

Yes. We wrote C++ code to convert Velox Expr to Presto RowExpression in presto-native.

@mbasmanova
Copy link
Contributor

We wrote C++ code to convert Velox Expr to Presto RowExpression in presto-native.

@aditi-pandit Got it. My original comment was to point out that this is non-ideal. Expr is an implementation detail of the expression execution engine, not a public API. Hence, we should not take dependency on that.

@aditi-pandit
Copy link
Collaborator

@aditi-pandit Got it. My original comment was to point out that this is non-ideal. Expr is an implementation detail of the expression execution engine, not a public API. Hence, we should not take dependency on that.

@mbasmanova : We discussed this internally. It would be hard to achieve constant folding as expected by the presto native API Tim wanted without parsing Expr. So we would mostly abandon that approach. Though we will decide that only after he returns from his parental leave.

In the meanwhile we will work on moving the optimizations to Velox if you are open to that.

@mbasmanova
Copy link
Contributor

mbasmanova commented Feb 27, 2025

@aditi-pandit Sounds good. It would be nice to create a GitHub issue or a Google doc to describe the optimizations.

@aditi-pandit
Copy link
Collaborator

@aditi-pandit Sounds good. It would be nice to create a GitHub issue or a Google doc to describe the optimizations.

@mbasmanova : Yes, there was some code prototyped for that already. We will share early next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handling constant folding errors
4 participants