Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Value Class Option [Arrow 2] #2950

Closed
wants to merge 6 commits into from
Closed

Value Class Option [Arrow 2] #2950

wants to merge 6 commits into from

Conversation

kyay10
Copy link
Collaborator

@kyay10 kyay10 commented Feb 26, 2023

See #2780 for prior discussion on the implementation details and drawbacks of this proposal. This time around, I've done some benchmarking to see what the performance improvement really is like.
The benchmarking code is available at branches option-benchmark for the current arrow impl and value-option-benchmark. The option-benchmark repo has the detailed benchmark results for each iteration in XML format which can be imported into Intellij.

The benchmarks focused on what I considered as regular usage of Option. nestedOptions creates an Option<Option<Int>> which can arise when mixing multiple Option-returning functions. nestedOptionsBoxed does the same thing but it stores the Options in a list because that forces the value class impl to box those Options, which could impact performance. The rest of the benchmarks would take the index of the current iteration, convert it to a string, then try to parse it back to an integer, as a representation of some workload. There's noneComprehension and someComprehension which uses an option comprehension combined with suspending functions. The XComprehensionBlocking variations are as before except that it uses regular functions instead of suspending ones. The noneTryCatchComprehensionBlocking and someSingleComprehensionBlocking both use a single comprehension instead of using comprehensions in a loop. Finally, XWithoutComprehension does the same operations but without relying on comprehensions at all.
Summary of the results:

data class Option:
Benchmark                                          (iterations)   Mode  Cnt      Score      Error  Units
OptionBenchmark.nestedOptions                              1000  thrpt    5  98943.392 ± 1582.071  ops/s
OptionBenchmark.nestedOptions                             10000  thrpt    5   9645.179 ±  191.599  ops/s
OptionBenchmark.nestedOptions                            100000  thrpt    5    982.886 ±   13.778  ops/s
OptionBenchmark.nestedOptionsBoxed                         1000  thrpt    5  40217.388 ±  998.822  ops/s
OptionBenchmark.nestedOptionsBoxed                        10000  thrpt    5   3887.956 ±  149.642  ops/s
OptionBenchmark.nestedOptionsBoxed                       100000  thrpt    5    425.593 ±   11.568  ops/s
OptionBenchmark.noneComprehension                          1000  thrpt    5  10752.354 ±  182.969  ops/s
OptionBenchmark.noneComprehension                         10000  thrpt    5   1048.075 ±   11.245  ops/s
OptionBenchmark.noneComprehension                        100000  thrpt    5    100.972 ±    2.167  ops/s
OptionBenchmark.noneComprehensionBlocking                  1000  thrpt    5  16619.324 ±  347.248  ops/s
OptionBenchmark.noneComprehensionBlocking                 10000  thrpt    5   1612.584 ±   83.146  ops/s
OptionBenchmark.noneComprehensionBlocking                100000  thrpt    5    156.700 ±    6.206  ops/s
OptionBenchmark.noneTryCatchComprehensionBlocking          1000  thrpt    5  26794.639 ±  312.442  ops/s
OptionBenchmark.noneTryCatchComprehensionBlocking         10000  thrpt    5   2458.319 ±  114.065  ops/s
OptionBenchmark.noneTryCatchComprehensionBlocking        100000  thrpt    5    215.947 ±    7.421  ops/s
OptionBenchmark.noneWithoutComprehension                   1000  thrpt    5  37276.589 ±  840.129  ops/s
OptionBenchmark.noneWithoutComprehension                  10000  thrpt    5   3163.258 ±   83.314  ops/s
OptionBenchmark.noneWithoutComprehension                 100000  thrpt    5    269.873 ±   14.460  ops/s
OptionBenchmark.someComprehension                          1000  thrpt    5   9705.450 ±  116.473  ops/s
OptionBenchmark.someComprehension                         10000  thrpt    5    948.898 ±   30.872  ops/s
OptionBenchmark.someComprehension                        100000  thrpt    5     90.398 ±    3.330  ops/s
OptionBenchmark.someComprehensionBlocking                  1000  thrpt    5  23228.768 ±  465.442  ops/s
OptionBenchmark.someComprehensionBlocking                 10000  thrpt    5   2219.911 ±  157.663  ops/s
OptionBenchmark.someComprehensionBlocking                100000  thrpt    5    202.700 ±    6.555  ops/s
OptionBenchmark.someSingleComprehensionBlocking            1000  thrpt    5  35067.601 ±  968.436  ops/s
OptionBenchmark.someSingleComprehensionBlocking           10000  thrpt    5   3135.560 ±  178.330  ops/s
OptionBenchmark.someSingleComprehensionBlocking          100000  thrpt    5    292.889 ±    6.322  ops/s
OptionBenchmark.someWithoutComprehension                   1000  thrpt    5  36958.048 ± 1697.719  ops/s
OptionBenchmark.someWithoutComprehension                  10000  thrpt    5   3203.423 ±  216.732  ops/s
OptionBenchmark.someWithoutComprehension                 100000  thrpt    5    282.825 ±    9.438  ops/s
value class Option:
Benchmark                                          (iterations)   Mode  Cnt       Score      Error  Units
OptionBenchmark.nestedOptions                              1000  thrpt    5  121050.701 ± 4445.625  ops/s
OptionBenchmark.nestedOptions                             10000  thrpt    5   10676.386 ±  288.812  ops/s
OptionBenchmark.nestedOptions                            100000  thrpt    5    1147.537 ±   49.080  ops/s
OptionBenchmark.nestedOptionsBoxed                         1000  thrpt    5   42691.150 ±  899.174  ops/s
OptionBenchmark.nestedOptionsBoxed                        10000  thrpt    5    4068.500 ±   44.628  ops/s
OptionBenchmark.nestedOptionsBoxed                       100000  thrpt    5     461.864 ±    9.269  ops/s
OptionBenchmark.noneComprehension                          1000  thrpt    5    9994.827 ±  304.826  ops/s
OptionBenchmark.noneComprehension                         10000  thrpt    5     988.170 ±   27.019  ops/s
OptionBenchmark.noneComprehension                        100000  thrpt    5      95.010 ±    2.123  ops/s
OptionBenchmark.noneComprehensionBlocking                  1000  thrpt    5   16315.570 ±  480.630  ops/s
OptionBenchmark.noneComprehensionBlocking                 10000  thrpt    5    1547.726 ±   42.745  ops/s
OptionBenchmark.noneComprehensionBlocking                100000  thrpt    5     150.565 ±    5.414  ops/s
OptionBenchmark.noneTryCatchComprehensionBlocking          1000  thrpt    5   26333.820 ±  714.362  ops/s
OptionBenchmark.noneTryCatchComprehensionBlocking         10000  thrpt    5    2424.888 ±  102.567  ops/s
OptionBenchmark.noneTryCatchComprehensionBlocking        100000  thrpt    5     215.527 ±   15.542  ops/s
OptionBenchmark.noneWithoutComprehension                   1000  thrpt    5   33729.492 ± 1298.667  ops/s
OptionBenchmark.noneWithoutComprehension                  10000  thrpt    5    3224.646 ±   38.023  ops/s
OptionBenchmark.noneWithoutComprehension                 100000  thrpt    5     266.426 ±    5.774  ops/s
OptionBenchmark.someComprehension                          1000  thrpt    5    9416.257 ±  293.389  ops/s
OptionBenchmark.someComprehension                         10000  thrpt    5     928.702 ±   22.118  ops/s
OptionBenchmark.someComprehension                        100000  thrpt    5      87.540 ±    2.544  ops/s
OptionBenchmark.someComprehensionBlocking                  1000  thrpt    5   24039.296 ±  748.190  ops/s
OptionBenchmark.someComprehensionBlocking                 10000  thrpt    5    2265.340 ±  123.046  ops/s
OptionBenchmark.someComprehensionBlocking                100000  thrpt    5     210.640 ±    4.187  ops/s
OptionBenchmark.someSingleComprehensionBlocking            1000  thrpt    5   35795.118 ±  771.840  ops/s
OptionBenchmark.someSingleComprehensionBlocking           10000  thrpt    5    3225.734 ±  127.685  ops/s
OptionBenchmark.someSingleComprehensionBlocking          100000  thrpt    5     294.222 ±   14.506  ops/s
OptionBenchmark.someWithoutComprehension                   1000  thrpt    5   36877.766 ±  447.130  ops/s
OptionBenchmark.someWithoutComprehension                  10000  thrpt    5    3334.449 ±   42.788  ops/s
OptionBenchmark.someWithoutComprehension                 100000  thrpt    5     301.686 ±    8.844  ops/s

Interesting observations here include that nestedOptions saw anywhere from a 10-20% increase in throughput, with the boxed version seeing more modest increases of about 5-10% (which is marginal). The basic comprehension saw a (marginal) decrease in throughput of about 5%, which I suspect is because in coroutines, when a value object is created before a suspending call and is used after a suspending all, it obviously needs to survive inside the Continuation, but for some reason the compiler decides to box any such surviving values. This, combined with the overhead of boxing and unboxing, likely caused that decrease in throughput. This is supported by the Blocking variations, where throughput was comprabable for noneComprehensionBlocking and saw an increase of 3-4% for someComprehensionBlocking. The throughput difference is lower likely because the aforementioned comprehension functions all create a comprehension in each iteration of the loop, so any benefit from using a value class Option is marginal compared to the cost of creating a Raise. Also, the benefit of value class Option is further marginalised by the cost of throwing exceptions in noneComprehensionBlocking, hence why throughput was comparable.
Because of the cost of creating comprehensions, I then tried using a single comprehension (and using try-catch in the none case to recover from errors). I also ran a benchmark with no comprehensions whatsoever. The results were in value class Option's favour, but always by a very small percentage of extra throughput.

In conclusion, benchmarking Options alone seemed to show a significant performance benefit from using a value class impl, but it seems that those benefits become marginal when the Option code is part of some workload. In fact, there is extra overhead when combined with suspending functions due to a missing optimization for value classes surviving through suspending calls. Don't forget either that turning Option into a value class and optimising its nesting within itself requires inlining and using reified types practically everywhere, which could be an ergonomic disadvantage. I can change the code to make it so that it is less optimised for nested Options but doesn't use inline and reified everywhere, and that could decrease the invasiveness of this PR (Edit: see #2951 for a less-intrusive implementation of value class Option, but with less performance benefits). Ultimately, the decision is in your hands, and I'm more than happy to discuss any of the details of this and make changes until the PR makes sense for the library.

@serras
Copy link
Member

serras commented Feb 27, 2023

This seems quite promising. However, I'm a bit concerned about the requirement of making almost every function inline and almost every type parameter reified.

@myuwono
Copy link
Collaborator

myuwono commented Feb 27, 2023

I have to try this branch against arrow-integrations-jackson.. I have a bad feeling that Jackson might not play well with generic value classes

@nomisRev
Copy link
Member

In addition to the concerns of @serras and @myuwono I am wondering if this is worth it. The only good use-case IMO for Option is the nested nullability problem. Which arises when using Option within A either when writing generic code, or when using it in combination with Project Reactor/RxJava. In these cases the boxed version will always be used with is getting marginal performance increase in contrast to the downside of using a lot of inline reified. Breaking the possibility to use when is also big downside for me.

I took a look at the benchmarks, and the ones that seem to give the most performance benefits are the least used pattern.

There is also the ticket to support inline sealed class, which could already bring a lot of these enhancements without requiring this encoding. @kyay10 do you know if there is any mention of a timeline on those? And how would that potentially impact these changes?

@kyay10
Copy link
Collaborator Author

kyay10 commented Feb 27, 2023

@nomisRev with inline sealed class, we'd be able to keep pretty much the current Option encoding and preserve the use of when. It seems like there's an active PR for it, but with no timeline estimate.

Here's an interesting idea to address both your and @serras' concerns. The sister PR #2951 doesn't require the usage of inline reified, and it seems to offer similar-ish performance (the benchmarks probably need to be more representative). I can add an API to #2951 that allows access to the raw value inside of the Option, thus allowing it to be unboxed on one side, and then reboxed (but without an actual object creation) on the other. This would allow users writing generic/RxJava code to optimize it when need be, while still providing access to the fluent API that Option has. This would basically be a general-purpose replacement for EmptyValue.

The only downside of this idea would be the inability to use when (although do note that when(option) { None -> ... else -> ... } does work, but inside the else you can't access the Option's value safely. Perhaps we could allow users to call Option.value on any Option and throw if it is None?)

One final note, if/when KT-51417 Restricted Types get introduced into the language, Option could then be implemented using them. It might be best to wait until the value classes, sealed value classes, and restricted types ideas are clarified before committing to a new Option encoding.

@serras
Copy link
Member

serras commented Feb 28, 2023

Although I applaud the great work you've put into this (honestly, having a PR with a performance review is just awesome!), I think the disadvantages are more than the potential advantages here. For me, the main tipping point is that Option is not so used in practice, since Kotlin already has nullable types.

@kyay10
Copy link
Collaborator Author

kyay10 commented Feb 28, 2023

Agreed. Should I close this PR then? And what about #2951 then? Should I close that one as well, or could it have some (minimal) merit?
Thank you for taking the time to consider this PR!

@nomisRev
Copy link
Member

nomisRev commented Mar 1, 2023

Hey @kyay10,

Thank you for referencing the resources to sealed inline class and your YouTrack issue on restrictive types! +215,277 −517 changes in the sealed inline classes 🤯

It might be best to wait until the value classes, sealed value classes, and restricted types ideas are clarified before committing to a new Option encoding.

Like you already said yourself, I think it's best to wait. So I am sad to say I think we should also close #2951.

This would basically be a general-purpose replacement for EmptyValue.

The use-case for this is so rare IMO, that I am personally happy to redefine it on a per-use-case basis. Actually so much I've thought about having it defined a couple times in Arrow, so that I can make it completely private...

I feel bad closing all this work, after you've done such an amazing job with all the benchmarks and everything @kyay10 😞 I hope you learned a lot, and had a good time working on this.

@kyay10 kyay10 closed this Mar 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants