Release v0.5.20221013

github-actions released this 13 Oct 20:26
bb765d4

0.5.20221013 (2022-10-13)

Features

  • add dummy mappings for english and mohawk (eaf2c70)
  • add dummy mappings for english and mohawk (85d06f0)
  • add iku-ipa to hamming-eng-ipa mapping (9ae5484)
  • add und and str dummy mappings as well as distance specification for mapping alignment (7f93447)
  • mappings: added more dummy mappings (d89f5b7)
  • add und-ipa to hamming-eng-ipa (7cf4118)
  • basic Finnish mapping (36d17e0)
  • check Python version (9e086e1)
  • do arpabet checking for hamming-eng-arpabet too (11728ec)
  • include NFC/D normalization in g2p graph (f3b918c), closes #158
  • und now maps colon to an empty string (0b8c8d9)
  • mappings: allow abbreviations to be declared recursively (e1a270f)
  • show-mappings: add --csv option (43573ac)
  • show-mappings: added cli cmd g2p show-mappings (9b2e489)
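
For readers unfamiliar with the NFC/NFD normalization mentioned in the features above, here is a minimal sketch of what the two forms do to a string. It uses only Python's standard `unicodedata` module and does not show how g2p wires normalization into its graph; the variable names are illustrative.

```python
import unicodedata

# The same visible character "é" in two encodings.
composed = "\u00e9"      # single precomposed code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent

# NFC composes sequences into precomposed code points where possible.
nfc = unicodedata.normalize("NFC", decomposed)
# NFD splits precomposed code points into base + combining marks.
nfd = unicodedata.normalize("NFD", composed)

print(len(nfc), len(nfd))
print(composed == decomposed, nfc == composed, nfd == decomposed)
```

The two inputs compare unequal as raw strings even though they render identically, which is why a mapping written in one form can fail to match text supplied in the other.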

Bug Fixes

  • studio: sort nodes in language echart (11b92f2)
  • accept single or multiple mappings in config.yaml (0c4961f)
  • always declare your file encoding, or Windows barfs (67de22f)
  • always declare your file encoding, or Windows barfs (56e327b)
  • avoid failure on corrupted pickles (6f30f0e)
  • catch same input and output g2p mapping bug (0a9d141)
  • correct fin diphthong mappings slightly (b8ae37d)
  • doh! always run the test suites before pushing your changes... (6af7d5e)
  • edit fin-ipa to eng-ipa mappings to fix some vowels (19a9189)
  • ensure python g2p/mappings/langs/__init__.py can always run (ea45afa)
  • find language_name robustly (4e0ccb9)
  • g2p show-mappings -v show in, out, rest, in that order (da06abd)
  • generated mappings should prevent feedback and apply-longest-first (d10944a)
  • grammar (e14cbcd)
  • lock click==8.0.4 since we support Python 3.6 (f941c64)
  • make make_tokenizer disambiguate in_lang and tok_path (9e7f986)
  • make Mohawk tokenizer recognize colon as a letter (767fed4)
  • make Mohawk tokenizer recognize colon as a letter (a32a3a7)
  • make und work in g2p studio (0dd3e25), closes #165
  • mic "o" didn't get mapped to proper eng-arpabet (1373b41)
  • name g2p in package.json, not readalongs (b18eea8)
  • recreate langs.pkl to allow merging (387c6fc)
  • remove stray BOMs everywhere (3c0f13b)
  • support renamed dolgo/dogol distance in panphon (0c73399)
  • indices: handle orphan characters with a heuristic: attach to the index of the previous character if one exists, otherwise to the index of the following character; if there is no character before or after, return None. Fixes #172 (b1ca2cb)
  • moe: add self-mappings for k, m, n, p, s, t (a82d098)
  • regenerate mappings and configs (0d71872)
  • remove UTF-8 BOM and CRLF (will fix code separately) (5806ab9)
  • rules with alternations should tokenize correctly (9fd6407)
  • show default directory in help (679ffe1)
  • tell user to rerun g2p update (they can) (eeac7dd)
  • tli_equiv and tce_equiv had BOMs, update to remove them (5b5d0b5)
  • use an automatically generated mapping for moe-ipa -> eng-ipa (1bc9b05)
  • ci: trailing space for json (a58c205)
  • git: make fixes to ejective mappings (27003e2)
  • indices: fix numerous errors within the indices functionality (7a6c631)
  • mappings: fix normalization issues in win and eng mappings (fd2bb0f)
  • moe: remove two more duplicate rules (bd36902)
  • studio: updated reverse initialization and rule ordering values (2c9c61f)
  • win: use the \u02D0, not :; use prevent-feeding (80f80da), closes #100
  • update mappings (1cba84f)
  • use longest mapping for fin (12dc3da)
  • warn of missing language_name before caching (3eb0ed7)
  • write compact json rules with in+out first, then rest (e9cb0e6)
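
The orphan-character heuristic described in the indices fix for #172 above could be sketched roughly as follows. This is an illustration only, not g2p's actual code: `resolve_orphan` and the list-of-optional-indices representation are hypothetical, and "previous character" is interpreted here as the nearest earlier character that already has an index.

```python
from typing import List, Optional


def resolve_orphan(indices: List[Optional[int]], pos: int) -> Optional[int]:
    """Hypothetical helper: choose an index for the orphan character at `pos`.

    `indices[i]` is the index already assigned to character i, or None
    if that character is itself an orphan.
    """
    # Prefer the nearest preceding character that has an index.
    for i in range(pos - 1, -1, -1):
        if indices[i] is not None:
            return indices[i]
    # Otherwise fall back to the nearest following character.
    for i in range(pos + 1, len(indices)):
        if indices[i] is not None:
            return indices[i]
    # Nothing before or after: there is no index to attach to.
    return None


print(resolve_orphan([0, None, 2], 1))
print(resolve_orphan([None, 1], 0))
print(resolve_orphan([None], 0))
```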

Performance Improvements

  • dockerfile: bump the OS to bullseye and optimize the build (20f1926)
  • test: speed up test_studio by minimizing keyup events (4dad4d5)

Reverts

  • revert accidental removal of moe generated mapping (e11db2f)

Build Systems

  • Dockerfile should update pip before using it (3ca1ea2)
  • move flake8 config to setup.cfg (b637ac8)

Continuous Integration

  • add CI job to run tests on Windows (f4e5dc3)
  • adjust coverage configuration for CI and manual testing (907e4f1)
  • before a release, do full os/pyversion matrix testing (e01319f)
  • publish with build instead of calling setup.py directly (a1a024f)
  • quote python versions so 3.10 works (528ea8c)
  • run full matrix CI only on PRs to release (888404c)
  • run the CI tests on pull requests too (bb88b9b)
  • skip CI tests for commits with #no-ci in the message (5393045)
  • tests workflow should update pip (49abd45)
  • test-studio: adjust to test-studio CI workflow (bb03712)
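
The "quote python versions so 3.10 works" entry above refers to a common pitfall: an unquoted `3.10` in a YAML file is read as a number, so the trailing zero is lost. The sketch below imitates that behaviour with Python's own float parsing as a stand-in; it does not invoke a YAML parser, and the variable names are illustrative.

```python
# Unquoted 3.10 is treated as a float, which collapses to 3.1.
unquoted = float("3.10")

# A quoted value stays a string and keeps its trailing zero.
quoted = "3.10"

print(str(unquoted))  # 3.1
print(quoted)         # 3.10
```

Quoting every version in the matrix avoids silently requesting Python 3.1 when 3.10 was intended.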

Tests

  • tokenizer: more unit testing for refactored tok (119d2b7)
  • fix typo testing create_multi_mapping (3d7d23d)
  • show-mappings: unit testing for show-mappings (36f4996)
  • add eng-ipa and eng-arpabet tests (e5ed8b8)
  • add fin tests (6c608af)
  • allow testing update on input/output directories (eacd5de)
  • exercise win ejectives being removed from eng (6a69f60)
  • fix fin test (3639f69)
  • hopefully fully test everything (699ba10)
  • proper e2e tests for studio (5692a93)
  • silence expected warnings by asserting them (5fd2be8)
  • update fin tests (9ac849f)
  • studio: make the test_studio.py test suite more robust (303b794), closes #195

Documentation

  • add citation (7416775)
  • add citation to table of contents (1713e0c)
  • add CITATION.cff file (a9abc07)
  • add minimal coding docs to __init__.py (8593aa2)
  • add plain text citation and update name (bf5889c)
  • add See also section in README.md TOC (517b190)
  • fix help text for g2p update -o (c23fcf3)
  • let run_studio.py log a bit on screen where it's launching (2e38153)
  • update docstring (2ab8313)
  • update to stable version and link to blog (6478537)

Styles

  • add pre-commit hooks for black, etc and commitlint (6297bee)
  • blackify all Python files (9c26df4)
  • blackify and isort a few more files (cd31657)
  • blackify g2p/mappings/tokenizer.py, and globals should be lowercase, not uppercase (94561df)
  • blackify test_create_mapping.py (460deb0)
  • bump dev dependencies (93ea50f)
  • convert moe_to_ipa.json to our new, compact format (9545832)
  • isort all Python files (1fa3d9d)
  • rewrite und_to_ipa.json in new compact format (95e25af)
  • show-mappings: black my work, and an extra test (4738921)
  • remove husky commitlint (replace by python gitlint) (e40a06f)
  • use gitlint to enforce conventional commits (29c4b05)
  • use gitlint to validate commit messages (904fb4c)

Code Refactoring

  • tokenizer: get_tokenizer still works, with deprecated warning (d6aa3ab)
  • tokenizer: refactor to same import strategy as make_g2p (24557b7)
  • apply a lot of flake8 recommendations (1bf8ae0)
  • black and isort and mypy, oh my (ac4ce57)
  • clean up imports (df56d8f)
  • clean up normalize with indices code (9a5bc3c)
  • factor out loading public test data (6cc278e)
  • further readability refactorizations (2b7d777)
  • general refactoring (bea2c09)
  • make test_fallback unit test easier to read (940af15)
  • spruce up distance metric code (876021e)
  • ci: move tests out of suite because they require 3.8 (cff8d1e)
  • show-mappings: -v as synonym for --verbose (f18220f)
  • show-mappings: remove old mapping viz generator (57b5aa1)
  • show-mappings: suggested edits (95653ed)
  • move cache_langs to utils (1aef962)
  • replace function attributes by module globals (e5199f0)
  • use else block (85fad34)