Extension with new subcommands: status, approve, rerun #400

mjambon · 2023-11-21T21:45:42Z

Hi. I'm looking into adding functionality to our test suite. We'll be porting tests from Pytest to OCaml and we'll need a convenient way to check stdout and stderr output. I'm not sure what the implementation would look like. At best, it extends the command-line interface generated by alcotest without breaking existing code. At worst, it's some custom code in our (open-source) project that nobody else can use conveniently.

Here's what I have in mind:

Subcommands offered by the generated test program

- run (today known as test): run the tests
- list: list the tests - shows the names, identifiers, or tags that can be used to filter tests
- status*: show the status of the last run for each test - OK, FAIL, XFAIL, XPASS, or Unknown
- approve*: approve the output of the last run for the selected tests, similar to dune promote and pytest --snapshot-update without rerunning the tests
- rerun*: rerun the failed tests (and those that haven't run)

* new subcommand
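To make the status values concrete, here is a minimal sketch of how a test's status could be derived from its expected outcome and the result of its last run. All type and function names are hypothetical and exist in neither Alcotest nor any other library. Treating XPASS as something `rerun` should pick up is my assumption; the proposal only says "failed tests (and those that haven't run)".

```ocaml
(* Hypothetical types illustrating the proposal; not an existing API. *)
type expected_outcome = Should_succeed | Should_fail
type outcome = Succeeded | Failed
type status = OK | FAIL | XFAIL | XPASS | Unknown

(* [last_run = None] means that no result is recorded for the test. *)
let classify (expected : expected_outcome) (last_run : outcome option) : status =
  match expected, last_run with
  | _, None -> Unknown
  | Should_succeed, Some Succeeded -> OK
  | Should_succeed, Some Failed -> FAIL
  | Should_fail, Some Failed -> XFAIL
  | Should_fail, Some Succeeded -> XPASS

(* Selection rule for a hypothetical [rerun]: tests that failed or have
   not run. Including XPASS here is an assumption. *)
let needs_rerun (s : status) : bool =
  match s with
  | FAIL | XPASS | Unknown -> true
  | OK | XFAIL -> false
```

With this shape, `status` only needs to read stored results and call `classify`, while `rerun` filters on `needs_rerun` before running anything.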

All these subcommands support filters that restrict the selection of tests. Filtering can be done:

- by specifying a regexp (like today) to match the test or test case name
- by specifying an exact substring to match the test or test case name
- by selecting a combination of tags e.g. some_product and not todo
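The substring and tag filters could be combined in a small boolean query type, sketched below. This is only an illustration: the `test` record, the `query` constructors, and `select` are hypothetical names, and regexp matching is left out because it would need a regexp library.

```ocaml
(* Hypothetical, reduced view of a test: only what filtering needs. *)
type test = { name : string; tags : string list }

(* True if [sub] occurs anywhere in [s]. *)
let contains_substring ~sub s =
  let n = String.length sub and m = String.length s in
  let rec at i = i + n <= m && (String.sub s i n = sub || at (i + 1)) in
  at 0

(* A tiny boolean query language over tags and names. *)
type query =
  | Has_tag of string
  | Name_contains of string
  | Not of query
  | And of query * query
  | Or of query * query

let rec matches (q : query) (t : test) : bool =
  match q with
  | Has_tag tag -> List.mem tag t.tags
  | Name_contains sub -> contains_substring ~sub t.name
  | Not q -> not (matches q t)
  | And (a, b) -> matches a t && matches b t
  | Or (a, b) -> matches a t || matches b t

(* Keep only the tests selected by the query. *)
let select (q : query) (tests : test list) : test list =
  List.filter (matches q) tests
```

The example from the list above, "some_product and not todo", would be written `And (Has_tag "some_product", Not (Has_tag "todo"))`.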

Library changes

- A test definition comes with more options than the tuple (name, quick/slow, test function). New options include:
  - check against expected output on stdout, stderr, or both (separately or interleaved)
  - check against expected output files
  - a list of tags (similar to pytest markers) for selecting/deselecting groups of tests - generalizes the existing quick/slow tagging
  - expected test status: success or failure ("xfail") - essential for writing tests ahead of time
- A compact, unique, stable identifier exists for each test. These IDs can be generated. They're important for selecting individual tests simply and unambiguously.
- Support for the legacy way of creating a test.
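As a rough sketch of what such a test definition might look like, the record below carries the new options, and a constructor with optional arguments keeps simple tests short. Every name here (`create_test`, `of_legacy_tuple`, the field and constructor names) is hypothetical. Mapping quick/slow to tags is one possible reading of "generalizes the existing quick/slow tagging".

```ocaml
(* Hypothetical types illustrating the proposal; not an existing API. *)
type output_check =
  | Ignore_output
  | Stdout
  | Stderr
  | Merged_stdout_stderr    (* both streams, interleaved *)
  | Separate_stdout_stderr  (* both streams, checked separately *)

type expected_outcome = Should_succeed | Should_fail

type test = {
  name : string;
  func : unit -> unit;
  checked_output : output_check;
  tags : string list;
  expected_outcome : expected_outcome;
}

(* Only the name and the test function are required. *)
let create_test ?(checked_output = Ignore_output) ?(tags = [])
    ?(expected_outcome = Should_succeed) name func =
  { name; func; checked_output; tags; expected_outcome }

(* Legacy triple (name, quick/slow, function): the speed becomes a tag. *)
let of_legacy_tuple (name, speed, func) =
  let tags = match speed with `Quick -> [ "quick" ] | `Slow -> [ "slow" ] in
  create_test ~tags name func
```

A test written ahead of its implementation could then be declared as `create_test ~expected_outcome:Should_fail ~tags:["todo"] "new feature" f`, while existing triples go through `of_legacy_tuple` unchanged.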

Please let me know if there's already something we can use that supports capturing stdout/stderr and the status and approve subcommands.

Other requirements

Independence from the build system (dune) or from any framework whose primary job is not to manage tests.
Only one kind of tool/framework/library may be used to manage all test suites for OCaml code.


I want to add support for capturing and checking the output of tests. We want the test framework to manage the output snapshots, show us the differences with expectations, and let us approve new output. This will require specifying options such as "capture stdout for this test and compare it against the expectation".

This PR creates a richer type for specifying tests and provides a function `create_test` with optional arguments. The legacy way of specifying a test using just a pair (name, function) is still supported and used in many places, but it requires a call to `simple_test` or `simple_tests` to convert to the new type.

Future steps will involve implementing what I mentioned in mirage/alcotest#400. If things work out, it may be integrated into Alcotest or become a standalone library that wraps around Alcotest. For now, we're keeping it in semgrep.

The next steps are:

* figure out how to add subcommands to the test program (`status`, `approve`, `rerun`)
* implement stdout/stderr capturing
* implement the new subcommands

These initial changes will break semgrep-proprietary but it should only be a matter of inserting calls to `Testutil.simple_test` like I did on the semgrep code base. I'm not a fan of the name `Testutil`; I'm thinking of renaming it to something that alludes more to an experimental extension of Alcotest.

Update: I renamed `Testutil` and isolated it in its own library named `Alcotest_ext`.

mjambon · 2023-12-21T22:18:34Z

I did a bit of work toward these goals. This is a lot of code (~2000 lines), all of it as a layer on top of the Alcotest library. The code, for now, is kept within the semgrep project at https://github.com/semgrep/semgrep/tree/develop/libs/alcotest_ext. Given its size and complexity, it should probably not be merged into the main Alcotest project but I'm still up for it. The current status is "alpha" in the sense that many features will need to be adjusted after we use it. The visual output is a bit of a mess.

Here's what was implemented and works:

- 3 subcommands: run, status, approve. All 3 write or read the status of tests stored on disk.
- "xfail" results = tests that are expected to fail. A completed test is in 1 of 4 classes: PASS, XFAIL, FAIL, XPASS (same terminology as pytest).
- capturing and comparing output against expectations (stdout, stderr, merged stdout/stderr, separate stdout/stderr, none).
- a test is now a record created with a function that takes a bunch of optional arguments. The required arguments are the test name and the test function.
- hierarchical categories. These are nice to keep things organized and easy to filter, as well as to avoid name clashes.
- run --lazy: only reruns the tests that previously failed. Useful when working on fixing a small number of tests.
- unique test IDs: these are hashes of the full test name (category + name) with a low chance of accidental collision. Makes file names meaningless but compact. Allows easy filtering by specifying the test ID e.g. run -s d84498857f5e.
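For illustration, a test ID of this kind could be computed as below. This is a sketch under assumptions: the hash function actually used isn't stated above, so MD5 from the standard library's `Digest` module stands in for it, and the `" > "` separator between category levels and the 12-character length (matching the example ID) are my choices. `test_id` is a hypothetical name.

```ocaml
(* Assumption: MD5 via Stdlib.Digest, truncated to 12 hex characters.
   The real implementation may use a different hash or length. *)
let test_id ~(category : string list) ~(name : string) : string =
  (* Full test name = hierarchical category path followed by the name. *)
  let full_name = String.concat " > " (category @ [ name ]) in
  String.sub (Digest.to_hex (Digest.string full_name)) 0 12
```

Because the ID depends only on the category and the name, it stays stable across runs and can serve both as a file name for stored results and as a filter argument.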

Unchanged:

- the tests are written with Alcotest.check and accompanying functions (or whatever exception-raising checks the user likes).
- running the tests is still delegated to Alcotest. This will probably change so as to make the output less confusing, notably with respect to tests that are expected to fail.

Partially implemented:

- tags equivalent to pytest markers (I just prefer the term "tag"). Filtering by tag isn't implemented yet.
- test filtering: currently, only filtering by substring occurring in the test category, name or ID is supported. --lazy is another form of filtering that's usable. Filtering by tags and combining all this in a unified boolean query language may be a good idea in addition to shortcut options such as --lazy or --tags.
- readability and rich output formatting: needs improvement.
- JUnit output (somewhere in a private project of ours, using https://github.com/Khady/ocaml-junit)

Unavailable:

- many Alcotest options I haven't used, which may or may not be important to some users.
- Lwt support: only the conversion to a regular Alcotest test suite is provided. I have questions about the need for such support.
- Async support


mjambon · 2024-01-18T03:44:33Z

The project is now known as "Testo" and has its own repo: https://github.com/semgrep/testo. We're using it in Semgrep.

mjambon mentioned this issue Nov 22, 2023

Turn the test type from a simple tuple into a record semgrep/semgrep#9322

Merged
