Handle empty result sets in CTEs #92

camuthig · 2024-05-17T18:57:01Z

If Django finds a CTE during SQL generation that will produce no results, it raises an EmptyResultSet immediately. Because the CTEQueryCompiler compiles the CTEs to SQL before the base query, the base compiler is then missing core information Django needs to handle this situation gracefully, such as col_count and klass_info.

To resolve this, when CTE compilation raises EmptyResultSet, the base as_sql is called as well before re-raising.

The creation of the base queryset cannot be moved before the explain behaviors in generate_sql, or the generated EXPLAIN will throw an error, so the try/except pattern is used instead.

Resolves #84

I think this will also work for #64, but I couldn't figure out a way to trigger that particular error.
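
As a rough illustration of the try/except pattern described above, here is a minimal, self-contained sketch. It does not use Django: `EmptyResultSet`, `ToyBaseCompiler`, `compile_cte`, and `generate_sql` are hypothetical stand-ins written for this example, not the code in django_cte/query.py.

```python
class EmptyResultSet(Exception):
    """Stand-in for django.core.exceptions.EmptyResultSet."""


class ToyBaseCompiler:
    """Hypothetical stand-in for the base query compiler."""

    def __init__(self):
        # Metadata that is only populated once as_sql() has run.
        self.col_count = None

    def as_sql(self):
        self.col_count = 3
        return "SELECT a, b, c FROM orders", ()


def compile_cte(is_empty):
    """Pretend to compile one CTE; a knowably empty CTE raises."""
    if is_empty:
        raise EmptyResultSet
    return "SELECT region_id FROM orders", ()


def generate_sql(base, cte_is_empty):
    """Compile the CTE first, then the base query."""
    try:
        cte_sql, cte_params = compile_cte(cte_is_empty)
    except EmptyResultSet:
        # Run the base compiler anyway so its metadata is populated,
        # then let the original exception continue to the caller.
        base.as_sql()
        raise
    base_sql, base_params = base.as_sql()
    return f"WITH cte AS ({cte_sql}) {base_sql}", cte_params + base_params
```

With this shape, a caller that catches the exception can still read `col_count` from the base compiler, which is the property the description relies on.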


millerdev · 2024-05-20T17:30:32Z

The wrong result is returned when using LEFT OUTER JOIN on the CTE. Here's a test to demonstrate:

    def test_left_outer_join_on_empty_result_set_cte(self):
        totals = With(
            Order.objects
            .filter(id__in=[])
            .values("region_id")
            .annotate(total=Sum("amount")),
            name="totals",
        )
        orders = (
            totals.join(Order, region=totals.col.region_id, _join_type=LOUTER)
            .with_cte(totals)
            .annotate(region_total=totals.col.total)
            .order_by("amount")
        )

        self.assertEqual(len(orders), 22)

This explicitly passes the `elide_empty` parameter to each CTE compiler based on whether the `join` that uses the CTE is an outer or inner join. When `elide_empty` is false, the compiler does not raise an `EmptyResultSet` error; it instead alters the query to always return no results by setting the where clause to a condition that is always false.

camuthig · 2024-05-20T19:15:09Z

@millerdev I believe the latest commit should handle that case. It requires more introspection to handle the join, since the join itself isn't what is throwing the EmptyResultSet; that is coming from the CTE. I leveraged the alias map to understand how the CTE is being used in the rest of the query, and then the existing elide_empty parameter on the get_compiler function to not throw those errors if those CTEs were joined with left outer joins.

In the master branch, the left outer joins still caused an empty result set and resulted in the klass_info access error. In this branch, that was originally altered to silently return no results after the empty result set was handled. Now it actually runs the query, but forces it to just not have results without throwing any errors.

millerdev · 2024-05-21T10:56:40Z

django_cte/query.py

+            elide_empty = True
+            alias = query.alias_map.get(cte.name)
+            if isinstance(alias, QJoin) and alias.join_type == LOUTER:
+                elide_empty = False


Why is this conditional value of elide_empty needed? Would it work to always pass elide_empty=False? Is there a disadvantage to doing that?

I'm kind of going with the flow of what Django does as default behavior in the core compiler. The default behavior in the compiler is to raise an EmptyResultSet and not run the query when the result set is knowably empty. The alternate flow, with elide_empty set to false, is used when queries need to be run even though they will not return results.
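
For readers following the thread, the conditional in the diff above can be sketched in isolation as below. This is a hedged illustration only: `ToyJoin` and `should_elide_empty` are hypothetical names, and the two constants are local copies assumed to mirror the `INNER` and `LOUTER` values in `django.db.models.sql.constants`.

```python
from dataclasses import dataclass

# Local copies of the join-type constants (assumed to match Django's).
INNER = "INNER JOIN"
LOUTER = "LEFT OUTER JOIN"


@dataclass
class ToyJoin:
    """Hypothetical stand-in for the join object stored in the alias map."""
    join_type: str


def should_elide_empty(alias_map, cte_name):
    """Return False only when the CTE is the target of a left outer join.

    A left outer join must still produce rows from the left side, so the
    empty CTE has to be compiled into SQL rather than short-circuited.
    """
    alias = alias_map.get(cte_name)
    if isinstance(alias, ToyJoin) and alias.join_type == LOUTER:
        return False
    return True
```

In every other case (inner join, or the CTE not appearing in the alias map) the default of eliding the empty query is kept, which matches the default behavior described in the reply.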

millerdev · 2024-05-21T18:46:50Z

Tests are failing. Unfortunately elide_empty is not available on all supported versions of Django. There is also a lint error.

Before Django 4.0, the `elide_empty` argument on the compiler did not exist. To ensure that left outer joins in these versions of Django are not optimized away by the SQLCompiler, essentially the same behavior is added to the error handling for those versions. This creates a copy of the CTE query, forces the where clause to have an always-false condition that the SQLCompiler cannot optimize away, and then builds the SQL for that query instead.

camuthig · 2024-05-22T00:14:53Z

It looks like elide_empty was added in 4.0.

I have fixed the lint errors and cleaned up the code a bit to enable specifically handling the situation in the except block for the older versions of Django. This is basically the same way that elide_empty works under the hood; it just isn't as clean an implementation in library space.
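
A toy sketch of that except-block fallback, under stated assumptions: `ToyQuery` and `compile_cte_legacy` are hypothetical, the "optimizer" is reduced to a string check, and `1 = 0` merely stands in for whatever always-false condition the real change uses. It is meant to show the shape of the idea, not the library's implementation.

```python
import copy


class EmptyResultSet(Exception):
    """Stand-in for django.core.exceptions.EmptyResultSet."""


class ToyQuery:
    """Hypothetical query: a table plus a list of WHERE conditions."""

    def __init__(self, table, conditions=()):
        self.table = table
        self.conditions = list(conditions)

    def as_sql(self):
        # Mimic the optimizer: an empty IN list is knowably empty.
        if "id IN ()" in self.conditions:
            raise EmptyResultSet
        where = " AND ".join(self.conditions)
        suffix = f" WHERE {where}" if where else ""
        return f"SELECT * FROM {self.table}{suffix}"


def compile_cte_legacy(query, is_left_outer):
    """Compile a CTE on a version without elide_empty (toy model)."""
    try:
        return query.as_sql()
    except EmptyResultSet:
        if not is_left_outer:
            raise
        # Work on a copy so the caller's query is left untouched, and
        # swap in a condition the toy optimizer does not short-circuit.
        fallback = copy.deepcopy(query)
        fallback.conditions = ["1 = 0"]
        return fallback.as_sql()
```

The copy matters here: the original query object is still the one the rest of the compilation refers to, so only the throwaway copy gets the forced always-false condition.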

millerdev

Looking good. I made a few suggestions.

django_cte/query.py

camuthig · 2024-05-23T14:25:56Z

Thanks for the feedback. It looks like the connector has always been optional, so I have removed it as requested and pulled in the other feedback.

millerdev

Sorry for dragging this out, but I noticed one more thing that I'd like to see changed. Otherwise this is looking great. Thanks for contributing to Django CTE!

django_cte/raw.py

millerdev · 2024-06-03T18:09:58Z

Thank you for the contribution @camuthig!

camuthig · 2024-06-05T19:57:33Z

No problem. Thanks for the awesome project @millerdev . My team is starting to lean on it more to improve our more out of control queries. Do you know when you might cut a release that includes this change? We are pinned to my branch until we can get back onto official releases, but we have a few behaviors that fall over without this fix.

millerdev · 2024-06-07T12:53:15Z

Thanks for the prompt. New release: https://pypi.org/project/django-cte/1.3.3/

SupImDos · 2024-06-25T04:00:22Z

@millerdev Not a huge deal, but would it be possible to do a tag / release on GitHub for v1.3.3 as well?

millerdev · 2024-06-25T10:44:19Z

Yes, it had been tagged, but I forgot to push it. https://github.com/dimagi/django-cte/releases/tag/v1.3.3

millerdev reviewed May 21, 2024

millerdev reviewed May 22, 2024

django_cte/query.py: 4 comments (outdated, resolved)

PR feedback

a3645e3

millerdev reviewed May 23, 2024

django_cte/raw.py: 1 comment (outdated, resolved)

Add explicit optional kwargs to raw CTE get_compiler

bd6fc3f

millerdev approved these changes Jun 3, 2024

millerdev merged commit c02b8e4 into dimagi:master Jun 3, 2024
15 checks passed
