Extension with new subcommands: status, approve, rerun #400

Open
mjambon opened this issue Nov 21, 2023 · 2 comments

mjambon commented Nov 21, 2023

Hi. I'm looking into adding functionality to our test suite. We'll be porting tests from Pytest to OCaml and we'll need a convenient way to check stdout and stderr outputs. I'm not sure what the implementation would look like. At best, it extends the command-line interface generated by Alcotest without breaking existing code. At worst, it's some custom code in our (open-source) project that nobody else can use conveniently.

Here's what I have in mind:

Subcommands offered by the generated test program

  • run (today known as test): run the tests
  • list: list the tests - shows the names, identifiers, or tags that can be used to filter tests
  • status*: show the status of the last run for each test - OK, FAIL, XFAIL, XPASS, or Unknown
  • approve*: approve the output of the last run for the selected tests, similar to dune promote and pytest --snapshot-update without rerunning the tests
  • rerun*: rerun the failed tests (and those that haven't run)

* new subcommand

All these subcommands support filters that restrict the selection of tests. Filtering can be done:

  • by specifying a regexp (like today) to match the test or test case name
  • by specifying an exact substring to match the test or test case name
  • by selecting a combination of tags e.g. some_product and not todo
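To make the filtering idea concrete, here is a rough sketch of how substring and tag filters could be combined. Everything in it (`test_info`, `tag_query`, `select`) is a hypothetical name made up for illustration, not something Alcotest provides, and regexp matching is left out to keep it dependency-free.

```ocaml
(* Hypothetical sketch: none of these names exist in Alcotest. *)
type test_info = {
  name : string;
  tags : string list;
}

(* A boolean query over tags, e.g. "some_product and not todo". *)
type tag_query =
  | Has of string
  | Not of tag_query
  | And of tag_query * tag_query
  | Or of tag_query * tag_query

let rec matches_tags query test =
  match query with
  | Has tag -> List.mem tag test.tags
  | Not q -> not (matches_tags q test)
  | And (a, b) -> matches_tags a test && matches_tags b test
  | Or (a, b) -> matches_tags a test || matches_tags b test

(* True if [sub] occurs somewhere in [s] (exact substring match). *)
let contains_substring ~sub s =
  let n = String.length sub and len = String.length s in
  let rec loop i = i + n <= len && (String.sub s i n = sub || loop (i + 1)) in
  loop 0

(* Keep the tests that satisfy every filter that was provided. *)
let select ?substring ?tag_query tests =
  List.filter
    (fun test ->
      (match substring with
       | None -> true
       | Some sub -> contains_substring ~sub test.name)
      && (match tag_query with
          | None -> true
          | Some q -> matches_tags q test))
    tests

(* Example: "some_product and not todo" *)
let tests = [
  { name = "parse python"; tags = [ "some_product" ] };
  { name = "parse java"; tags = [ "some_product"; "todo" ] };
  { name = "lint"; tags = [] };
]

let selected =
  select ~tag_query:(And (Has "some_product", Not (Has "todo"))) tests
```

Representing the query as a small variant type would let the same evaluator serve every subcommand, whatever command-line syntax ends up producing it.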

Library changes

  • A test definition comes with more options than the tuple (name, quick/slow, test function). New options include:
    • check against expected output on stdout, stderr, or both (separately or interleaved)
    • check against expected output files
    • a list of tags (similar to pytest markers) for selecting/deselecting groups of tests - generalizes the existing quick/slow tagging
    • expected test status: success or failure ("xfail") - essential for writing tests ahead of time
  • A compact, unique, stable identifier exists for each test. These IDs can be generated. They're important for selecting individual tests simply and unambiguously.
  • Support for the legacy way of creating a test.
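As a possible shape for this, a test could become a record built by a function with optional arguments. The sketch below is only an assumption about what that might look like: the type names, field names, `create`, `of_legacy`, and the choice of a truncated MD5 hash for the ID are all invented here. The legacy tuple mirrors the (name, quick/slow, function) shape of an Alcotest test case but the code does not depend on Alcotest.

```ocaml
(* Hypothetical sketch: these types and functions are not part of Alcotest. *)
type expected_outcome =
  | Should_succeed
  | Should_fail of string (* reason, e.g. "not implemented yet" *)

type checked_output =
  | Ignore_output
  | Stdout
  | Stderr
  | Stdout_and_stderr_separately
  | Merged_stdout_stderr

type test = {
  id : string; (* compact identifier derived from the name *)
  name : string;
  speed : [ `Quick | `Slow ];
  tags : string list;
  expected_outcome : expected_outcome;
  checked_output : checked_output;
  func : unit -> unit;
}

(* Assumption: the first 12 hex digits of an MD5 digest of the name.
   The ID is stable as long as the name does not change. *)
let make_id name = String.sub (Digest.to_hex (Digest.string name)) 0 12

let create ?(speed = `Quick) ?(tags = []) ?(expected_outcome = Should_succeed)
    ?(checked_output = Ignore_output) name func =
  { id = make_id name; name; speed; tags; expected_outcome; checked_output; func }

(* Legacy way: convert a (name, speed, function) tuple to the new type. *)
let of_legacy (name, speed, func) = create ~speed name func
```

A call such as `create ~tags:["todo"] ~expected_outcome:(Should_fail "not implemented yet") "parser > empty input" (fun () -> ())` only spells out the options that differ from the defaults, which keeps simple tests as short as they are today.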

Please let me know if there's already something we can use that supports capturing stdout/stderr and the status and approve subcommands.

Other requirements

  • Independence from the build system (dune) or from any framework whose primary job is not to manage tests.
  • Only one kind of tool/framework/library may be used to manage all test suites for OCaml code.
mjambon added a commit to semgrep/semgrep that referenced this issue Dec 2, 2023
I want to add support for capturing and checking the output of tests. We
want the test framework to manage the output snapshots, show us the
differences with expectations, and let us approve new output. This will
require specifying options such as "capture stdout for this test and
compare it against the expectation".

This PR creates a richer type for specifying tests and provides a
function `create_test` with optional arguments. The legacy way of
specifying a test using just a pair (name, function) is still supported
and used in many places but it requires a call to `simple_test` or
`simple_tests` to convert to the new type.

Future steps will involve implementing what I mentioned in
mirage/alcotest#400. If things work out, it
may be integrated into Alcotest or become a standalone library that
wraps around Alcotest. For now, we're keeping it in semgrep. The next
steps are:
* figure out how to add subcommands to the test program (`status`,
`approve`, `rerun`)
* implement stdout/stderr capturing
* implement the new subcommands

These initial changes will break semgrep-proprietary but it should only
be a matter of inserting calls to `Testutil.simple_test` like I did on
the semgrep code base. I'm not a fan of the name `Testutil`, I'm
thinking of renaming it to something that alludes more to an
experimental extension of Alcotest. Update: I renamed `Testutil` and
isolated it in its own library named `Alcotest_ext`.

mjambon commented Dec 21, 2023

I did a bit of work toward these goals. This is a lot of code (~2000 lines), all of it as a layer on top of the Alcotest library. The code, for now, is kept within the semgrep project at https://github.com/semgrep/semgrep/tree/develop/libs/alcotest_ext. Given its size and complexity, it should probably not be merged into the main Alcotest project but I'm still up for it. The current status is "alpha" in the sense that many features will need to be adjusted after we use it. The visual output is a bit of a mess.

Here's what was implemented and works:

  • 3 subcommands: run, status, approve. All 3 write or read the status of tests stored on disk.
  • "xfail" results = tests that are expected to fail. A completed test is in 1 of 4 classes: PASS, XFAIL, FAIL, XPASS (same terminology as pytest).
  • capturing and comparing output against expectations (stdout, stderr, merged stdout/stderr, separate stdout/stderr, none).
  • a test is now a record created with a function that takes a bunch of optional arguments. The required arguments are the test name and the test function.
  • hierarchical categories. These are nice to keep things organized and easy to filter, as well as to avoid name clashes.
  • run --lazy: only reruns the tests that previously failed. Useful when working on fixing a small number of tests.
  • unique test IDs: these are hashes of the full test name (category + name) with a low chance of accidental collision. Makes file names meaningless but compact. Allows easy filtering by specifying the test ID e.g. run -s d84498857f5e.
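For readers unfamiliar with the pytest terminology, the four classes come from crossing the expected outcome with what actually happened. The following is a minimal standalone sketch of that classification and of a `--lazy`-style selection. The names are hypothetical and are not taken from the actual implementation, and treating XPASS as a failure that needs a rerun is an assumption.

```ocaml
(* Hypothetical sketch, not the actual implementation. *)
type expected = Should_succeed | Should_fail
type run_result = Succeeded | Raised_exception
type status = PASS | FAIL | XFAIL | XPASS

let classify expected result =
  match expected, result with
  | Should_succeed, Succeeded -> PASS
  | Should_succeed, Raised_exception -> FAIL
  | Should_fail, Raised_exception -> XFAIL (* failed as expected *)
  | Should_fail, Succeeded -> XPASS (* unexpectedly passed *)

(* Run a test function; any exception counts as a failed run. *)
let run_and_classify expected (func : unit -> unit) =
  let result =
    match func () with
    | () -> Succeeded
    | exception _ -> Raised_exception
  in
  classify expected result

(* Lazy selection: rerun a test if it has no recorded status or if its
   last status was not the expected one.
   Assumption: XPASS is treated like FAIL here. *)
let needs_rerun (last_status : status option) =
  match last_status with
  | None -> true
  | Some (FAIL | XPASS) -> true
  | Some (PASS | XFAIL) -> false
```

For instance, `run_and_classify Should_fail (fun () -> failwith "boom")` evaluates to `XFAIL`, and such a test would be skipped by the lazy selection on the next run.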

Unchanged:

  • the tests are written with Alcotest.check and accompanying functions (or whatever exception-raising checks the user likes).
  • running the tests is still delegated to Alcotest. This will probably change so as to make the output less confusing, notably with respect to tests that are expected to fail.

Partially implemented:

  • tags equivalent to pytest markers (I just prefer the term "tag"). Filtering by tag isn't implemented yet.
  • test filtering: currently, only filtering by substring occurring in the test category, name or ID is supported. --lazy is another form of filtering that's usable. Filtering by tags and combining all this in a unified boolean query language may be a good idea in addition to shortcut options such as --lazy or --tags.
  • readability and rich output formatting: needs improvement.
  • JUnit output (somewhere in a private project of ours, using https://github.com/Khady/ocaml-junit)

Unavailable:

  • many Alcotest options I haven't used, which may or may not be important to some users.
  • Lwt support: only the conversion to a regular Alcotest test suite is provided. I have questions about the need for such support.
  • Async support

mjambon added a commit to semgrep/testo that referenced this issue Jan 14, 2024

mjambon commented Jan 18, 2024

The project is now known as "Testo" and has its own repo: https://github.com/semgrep/testo. We're using it in Semgrep.
