Gradle tasks executing benchmarks succeed even if some benchmarks fail #186

fzhinkin · 2024-02-06T10:05:40Z

Currently, Gradle tasks that execute benchmarks don't fail when some benchmarks fail. That might not be a problem when benchmarks are run from the IDE, where the failure status is reported explicitly, but in other scenarios failures may go unnoticed: the generated reports contain no hint of a failure, and the only way to find out that something went wrong is to inspect the logs.

For example, if benchmarks are executed in CI, then most likely nobody will check the logs unless a build fails. Since the benchmarking task succeeds in any case, and the report still contains every benchmark except the failed one, it may take a long time until somebody notices the failure.

Here's a reproducer: https://github.com/fzhinkin/kotlinx-benchmark-success-on-benchmark-failure

./gradlew benchmark
> Task :jvmBenchmark
...
<failure>

java.lang.RuntimeException
        at org.example.FaultyBenchmark.thisOneIsNoBetter(FaultyBenchmark.kt:14)
        at org.example.generated.FaultyBenchmark_thisOneIsNoBetter_jmhTest.thisOneIsNoBetter_thrpt_jmhStub(FaultyBenchmark_thisOneIsNoBetter_jmhTest.java:121)
        at org.example.generated.FaultyBenchmark_thisOneIsNoBetter_jmhTest.thisOneIsNoBetter_Throughput(FaultyBenchmark_thisOneIsNoBetter_jmhTest.java:83)
...
> Task :macosArm64Benchmark
...
… org.example.FaultyBenchmark.faulty
  EXCEPTION: kotlin.RuntimeException
0   macosArm64Benchmark.kexe            0x102b6fc73        kfun:org.example.FaultyBenchmark#faulty(){} + 99 
1   macosArm64Benchmark.kexe            0x102b71edb        kfun:kotlinx.benchmark.generated.org.example.FaultyBenchmark_Descriptor.$faulty$FUNCTION_REFERENCE$5.invoke#internal + 23 
...
BUILD SUCCESSFUL in 4m 25s

The build is successful, and the reports contain some results (there's one non-failing benchmark in the demo project), so without inspecting the logs it's hard to detect failures. And even with the logs, one may conclude that everything is fine because the task succeeded.

I suggest making the Gradle tasks fail if at least one benchmark fails.
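To make the suggestion concrete, here is a minimal sketch of the intended semantics: record the outcome of every benchmark, let the whole run finish, then fail if anything failed. The class `BenchmarkRunSummary` and its methods are hypothetical names made up for illustration; they are not part of kotlinx-benchmark, JMH, or Gradle. The benchmark name `org.example.HealthyBenchmark.ok` is also a placeholder for the one non-failing benchmark in the demo project.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical collector of per-benchmark outcomes.
 * Illustrates "fail the task if at least one benchmark failed";
 * this is not the plugin's actual implementation.
 */
public class BenchmarkRunSummary {
    private final List<String> failed = new ArrayList<>();
    private int total = 0;

    /** Records the outcome of a single benchmark. */
    public void record(String benchmarkName, boolean succeeded) {
        total++;
        if (!succeeded) {
            failed.add(benchmarkName);
        }
    }

    /**
     * Throws if any recorded benchmark failed. Called once, after all
     * benchmarks have run, so results of the passing ones are still reported.
     */
    public void failIfAnyFailed() {
        if (!failed.isEmpty()) {
            throw new IllegalStateException(
                failed.size() + " of " + total + " benchmarks failed: "
                    + String.join(", ", failed));
        }
    }

    public static void main(String[] args) {
        BenchmarkRunSummary summary = new BenchmarkRunSummary();
        // The first name is a placeholder; the other two come from the logs above.
        summary.record("org.example.HealthyBenchmark.ok", true);
        summary.record("org.example.FaultyBenchmark.faulty", false);
        summary.record("org.example.FaultyBenchmark.thisOneIsNoBetter", false);
        try {
            summary.failIfAnyFailed();
        } catch (IllegalStateException e) {
            // A real task would let this propagate and fail the build.
            System.out.println(e.getMessage());
        }
    }
}
```

In an actual Gradle task, the exception would typically be an `org.gradle.api.GradleException` thrown from the task action, which marks the task as failed. Deferring the check until the end keeps the report for the successful benchmarks while still turning the build red.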
