Adapt leafCallToSubfieldFilter to Spark #11093

Open
rui-mo opened this issue Sep 25, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

Collaborator

rui-mo commented Sep 25, 2024

Description

Gluten implements its own logic to convert call expressions into subfield filters. To avoid duplication, we would like to reuse the existing 'leafCallToSubfieldFilter' logic in Velox. We found two incompatibilities. First, the current implementation matches against Presto function names, while Spark function names can differ. Second, 'isnotnull' is frequently used in Spark but is not currently supported. Here is a draft of the changes we would like to propose: rui-mo@7340903.
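To make the name mismatch concrete, here is a minimal, self-contained sketch. It is not Velox code: the enum, the two tables, and classifyLeafCall are hypothetical names invented for illustration, and the function names in the tables are assumptions about each dialect's convention rather than a complete or authoritative list. It only shows why a lookup keyed on Presto-style names fails to recognize a Spark-style call such as 'isnotnull'.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical filter kinds for illustration only; these are not Velox types.
enum class LeafFilterKind { kEq, kLt, kGt, kIsNull, kIsNotNull };

using FunctionNameTable = std::unordered_map<std::string, LeafFilterKind>;

// Assumed Presto-style names. Note there is no entry for "is not null".
inline const FunctionNameTable& prestoNames() {
  static const FunctionNameTable table = {
      {"eq", LeafFilterKind::kEq},
      {"lt", LeafFilterKind::kLt},
      {"gt", LeafFilterKind::kGt},
      {"is_null", LeafFilterKind::kIsNull},
  };
  return table;
}

// Assumed Spark-style names, including 'isnotnull'.
inline const FunctionNameTable& sparkNames() {
  static const FunctionNameTable table = {
      {"equalto", LeafFilterKind::kEq},
      {"lessthan", LeafFilterKind::kLt},
      {"greaterthan", LeafFilterKind::kGt},
      {"isnull", LeafFilterKind::kIsNull},
      {"isnotnull", LeafFilterKind::kIsNotNull},
  };
  return table;
}

// Returns the filter kind for a call name, or std::nullopt when the name is
// not recognized under the given table.
inline std::optional<LeafFilterKind> classifyLeafCall(
    const FunctionNameTable& names,
    const std::string& callName) {
  auto it = names.find(callName);
  if (it == names.end()) {
    return std::nullopt;
  }
  return it->second;
}
```

With these tables, classifyLeafCall(prestoNames(), "isnotnull") yields std::nullopt, while the same call classified against sparkNames() is recognized.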

Contributor

Yuhta commented Sep 25, 2024

Yeah, leafCallToSubfieldFilter is Presto-specific (similar to Tokenizer). Mixing in Spark names is probably not a good way to do it. A related question is how we parse this expression in ExprCompiler: do we have an isnotnull function in sparksql?

Contributor

Yuhta commented Sep 25, 2024

One way to fix it is to have a parser class such as:

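// Interface sketch: each dialect (Presto, Spark) can supply its own parser.
class ExprToSubfieldFilterParser {
 public:
  virtual ~ExprToSubfieldFilterParser() = default;

  // Converts a leaf call expression into a filter on 'subfield'.
  virtual std::unique_ptr<common::Filter> leafCallToSubfieldFilter(
      const core::CallTypedExpr& call,
      common::Subfield& subfield,
      core::ExpressionEvaluator* evaluator,
      bool negated = false) = 0;
};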

Then store an instance of it as a shared_ptr inside ExecCtx, or in another context object with query scope.
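As a rough illustration of this suggestion, the sketch below shows a per-dialect parser held by a query-scoped context. Everything in it is an assumption made for the example: the sketch namespace, the stand-in Filter and CallTypedExpr structs, the Presto and Spark subclasses, and QueryScopedContext are hypothetical, and the method signature is simplified (no Subfield or ExpressionEvaluator) so the block compiles without Velox headers. It is not how Velox or Gluten implement this.

```cpp
#include <memory>
#include <string>
#include <utility>

namespace sketch {

// Stand-in for common::Filter, reduced to a description string.
struct Filter {
  std::string description;
};

// Stand-in for core::CallTypedExpr, reduced to the function name.
struct CallTypedExpr {
  std::string name;
};

// One implementation per SQL dialect.
class ExprToSubfieldFilterParser {
 public:
  virtual ~ExprToSubfieldFilterParser() = default;

  // In this sketch, returns nullptr when the call is not recognized.
  virtual std::unique_ptr<Filter> leafCallToSubfieldFilter(
      const CallTypedExpr& call,
      bool negated = false) = 0;
};

// Hypothetical Presto-flavored parser: only knows 'is_null'.
class PrestoExprToSubfieldFilterParser : public ExprToSubfieldFilterParser {
 public:
  std::unique_ptr<Filter> leafCallToSubfieldFilter(
      const CallTypedExpr& call,
      bool negated = false) override {
    if (call.name == "is_null") {
      return std::make_unique<Filter>(
          Filter{negated ? "IS NOT NULL" : "IS NULL"});
    }
    return nullptr;
  }
};

// Hypothetical Spark-flavored parser: knows 'isnull' and 'isnotnull'.
class SparkExprToSubfieldFilterParser : public ExprToSubfieldFilterParser {
 public:
  std::unique_ptr<Filter> leafCallToSubfieldFilter(
      const CallTypedExpr& call,
      bool negated = false) override {
    if (call.name == "isnull") {
      return std::make_unique<Filter>(
          Filter{negated ? "IS NOT NULL" : "IS NULL"});
    }
    if (call.name == "isnotnull") {
      return std::make_unique<Filter>(
          Filter{negated ? "IS NULL" : "IS NOT NULL"});
    }
    return nullptr;
  }
};

// Hypothetical query-scoped context; plays the role suggested for ExecCtx.
class QueryScopedContext {
 public:
  explicit QueryScopedContext(
      std::shared_ptr<ExprToSubfieldFilterParser> parser)
      : parser_(std::move(parser)) {}

  const std::shared_ptr<ExprToSubfieldFilterParser>& filterParser() const {
    return parser_;
  }

 private:
  std::shared_ptr<ExprToSubfieldFilterParser> parser_;
};

} // namespace sketch
```

Under these assumptions, a Spark-based caller would construct the context with a SparkExprToSubfieldFilterParser, and code that builds filters would go through ctx.filterParser() without hard-coding either dialect's function names.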
