Versioning

Overview and Requirements

To ensure consistency and stability across SHR definitions and implementations, SHR must adopt a versioning strategy.

The SHR versioning strategy must support the following features:

Versioning of SHR data element definitions, terminology definitions, and mapping definitions
Identification/understanding of versions for referenced base elements, values, and fields
Easy serialization of versioning information for persistence and exchange

The SHR versioning strategy should support the following features:

Understanding of backwards-compatibility between versions
Independently versioned mappings (to allow mappings to evolve separately from definitions)

The SHR versioning strategy may support the following features:

Independently versioned terminologies (to allow terminologies to evolve separately from data element definitions)

Out of Scope: Maturity Levels

Maturity levels are closely related to versioning, in that they indicate the degree of change that can be expected for a given definition. Very mature definitions are not expected to change in significant ways, whereas less mature definitions may be subject to significant change.

While identification of a maturity model for SHR definitions is important, it will be addressed separately. Given the compositional nature of SHR, there are complications to consider -- for example, can a definition with a high maturity level contain a field with a lower maturity level? If so, what does maturity really indicate?

For now, we are tabling these questions and proposing only how versioning should work, independent of any maturity model.

Versioning Approaches

While there may be many possible approaches to versioning, a significant factor is the granularity at which you version. Consider these two extremes:

Extreme #1: Monolithic Versioning

In this approach, the entire set of SHR data elements shares a single version number. If any element is changed, the version number for the entire set is incremented. This is the approach followed by HL7 FHIR.

This has the advantage of simplicity and consistency, since there are never any inter-version references. On the other hand, it requires a very high level of coordination between all authors. Every author must work in lock-step, which leads to long and complicated release schedules.

Extreme #2: Hyper-Specific Versioning

In this approach, every single data element is versioned on its own, and its version changes only if its own definition has changed. This also requires that data elements declare the versions of every element they reference.

This has the advantage of allowing every author to edit and release data elements on their own schedule. Managing large sets of independently versioned data elements, however, is very tedious -- and there can be any number of combinatorial possibilities of interacting versions. In addition, when a data element version changes, all other elements that reference it must also change (giving this approach a viral property).
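The viral property can be illustrated with a small sketch. It assumes element references are available as a simple mapping from each element to the elements it references; the function name and the reference graph below are hypothetical and only loosely borrow element names used elsewhere in this document.

```python
from collections import deque
from typing import Dict, List, Set


def elements_to_rerelease(references: Dict[str, List[str]], changed: str) -> Set[str]:
    """Return every element that directly or transitively references `changed`."""
    # Invert the graph: for each element, record which elements reference it.
    referrers: Dict[str, Set[str]] = {}
    for element, targets in references.items():
        for target in targets:
            referrers.setdefault(target, set()).add(element)

    # Breadth-first walk outward from the changed element.
    affected: Set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for referrer in referrers.get(current, ()):
            if referrer != changed and referrer not in affected:
                affected.add(referrer)
                queue.append(referrer)
    return affected


# Hypothetical reference graph (not taken from the actual SHR definitions).
example_references = {
    "Problem": ["CodeableConcept"],
    "Procedure": ["CodeableConcept"],
    "PersonOfRecord": ["Name"],
    "CancerDiagnosis": ["Problem"],
}
```

In this made-up graph, a new version of CodeableConcept forces new versions of Problem and Procedure, and then of CancerDiagnosis, even though none of their own definitions changed in any other way.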

Proposed Approach: Package-Level Versioning

The proposed approach (naturally) falls between the two extremes: while not versioning everything together, it versions at a high enough level to reduce the burden of hyper-specific versioning. In this approach, we introduce the concept of a "package", which is a set of namespaces that are versioned and released together.

The SHR Core Package

The SHR Core package contains the core namespaces and definitions that every other package will be based upon. This includes common elements like CodeableConcept, Quantity, and Name. It also includes PersonOfRecord and all of its fields, as well as base elements like Problem, Procedure, etc. It does not, however, include elements related to specific domains (e.g., Oncology).

The SHR Core package is analogous to FHIR's un-profiled resources, and cannot depend on any other packages.

Domain Packages

Specific domains, like Oncology, are each their own package. They can contain multiple, unique namespaces and data elements, specific to their own domain. They may also declare dependencies on other packages, and must indicate the package versions that they rely upon. To allow some flexibility, we will enable an approach to declaring dependency versions similar to NPM Semver for Consumers.
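As a rough illustration of dependency declarations, the sketch below checks installed package versions against declared ranges. It is not the SHR tooling: the manifest layout, the package names, and the helper functions are all hypothetical, and only a small subset of npm-style range syntax is handled (exact versions, caret ranges, and tilde ranges on full MAJOR.MINOR.PATCH versions).

```python
from typing import Dict, List, Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse a plain MAJOR.MINOR.PATCH string (no pre-release or build labels)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def satisfies(version: str, spec: str) -> bool:
    """Check a version against an exact, caret (^), or tilde (~) range."""
    actual = parse_version(version)
    if spec.startswith("^"):
        low = parse_version(spec[1:])
        # Caret: allow changes that keep the left-most non-zero part fixed.
        if low[0] > 0:
            high = (low[0] + 1, 0, 0)
        elif low[1] > 0:
            high = (0, low[1] + 1, 0)
        else:
            high = (0, 0, low[2] + 1)
        return low <= actual < high
    if spec.startswith("~"):
        low = parse_version(spec[1:])
        # Tilde: allow patch-level changes only.
        high = (low[0], low[1] + 1, 0)
        return low <= actual < high
    return actual == parse_version(spec)


def unmet_dependencies(manifest: Dict, installed: Dict[str, str]) -> List[str]:
    """List dependencies that are missing or outside their declared range."""
    return [
        name
        for name, spec in manifest.get("dependencies", {}).items()
        if name not in installed or not satisfies(installed[name], spec)
    ]


# Hypothetical manifest for a domain package; field names are illustrative only.
oncology_manifest = {
    "name": "shr-oncology",
    "version": "0.3.1",
    "dependencies": {"shr-core": "^1.2.0"},
}
```

Under these assumptions, the Oncology package would accept any SHR Core release from 1.2.0 up to, but not including, 2.0.0, so SHR Core can ship MINOR and PATCH releases without forcing a new Oncology release.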

This is somewhat analogous to FHIR IGs / profiles: an IG can declare a dependency on a specific version of FHIR and on other IGs, but it increments its own version independently.

Value Sets

Value sets that enumerate answers with different semantic meanings (e.g., a status value set) are considered definitional, and may be included in and versioned with their relevant packages. They may, however, also be externally sourced from FHIR or VSAC.

Value sets that are simply a grouping of similar concepts (e.g., an immunizations value set) may be managed inside or outside of the SHR tooling, depending on how often the definition of the value set is expected to change. Value sets whose definition is likely to change frequently (for example, as new drugs come on the market) are better managed outside of SHR packages, in a registry like VSAC. This avoids frequent versioning of entire SHR packages.

Package-Level Versioning Advantages

This approach allows SHR Core releases to happen on their own schedule, independent of domain-specific packages. It also allows domain-specific package authors to release at their own pace, and supports the concept of de-centralized, open-world authoring. In addition, this approach provides a mechanism for cleanly delineating IGs from SHR Core and other packages they depend on.

Package-Level Versioning Disadvantages

This approach, however, still requires some level of coordination between authors. If domain-level authors wish to release on the same schedule as SHR Core, they'll need to remain in sync with SHR Core between releases. In addition, if one domain-level package relies on another domain-level package, their authors must ensure that both packages use compatible versions of SHR Core.

A second disadvantage, which we must seriously consider, is that some tasks become very difficult in this framework. For example, there is not a simple way for a domain-level author to add a domain-specific field to PersonOfRecord. They could create a new element based on PersonOfRecord, but that would then require instances to declare their type as the non-core variant of PersonOfRecord. This becomes even more problematic if two domain-level authors both want to add fields to PersonOfRecord. In FHIR, this use case would be covered by extensions -- but SHR doesn't have the concept of arbitrary late-bound extensions that FHIR has.

Mappings

Mappings to other representations, such as FHIR, do not necessarily need to be developed in tandem with data element definitions (although sometimes target representations may help to inform the development of a data element itself).

This being the case, mappings should be packaged and versioned separately from the definitional aspects of SHR. These mapping packages should have their own versions, but must also identify the version of SHR data elements they provide mappings for. This allows mappings to evolve over time independent of the data elements.
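One way to picture this is a mapping package record that carries two versions: its own, and the version of the definitions it maps. The sketch below is only an illustration; the class, its field names, the package names, and the lookup helper are hypothetical and not part of any existing SHR tooling.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class MappingPackage:
    name: str
    version: str        # the mapping package's own version
    target: str         # the representation being mapped to, e.g. "FHIR"
    maps_package: str   # the definition package the mappings apply to
    maps_version: str   # the version of those definitions


def version_key(text: str) -> Tuple[int, ...]:
    """Turn "1.10.0" into (1, 10, 0) so versions sort numerically."""
    return tuple(int(part) for part in text.split("."))


def latest_mapping_for(
    mappings: Iterable[MappingPackage], package: str, version: str
) -> Optional[MappingPackage]:
    """Pick the newest mapping release written for the given definitions version."""
    candidates = [
        m for m in mappings
        if m.maps_package == package and m.maps_version == version
    ]
    return max(candidates, key=lambda m: version_key(m.version), default=None)


# Hypothetical releases: the mappings move from 1.0.0 to 1.1.0 while the
# definitions they describe stay at 1.2.0.
example_mappings = [
    MappingPackage("shr-core-fhir-map", "1.0.0", "FHIR", "shr-core", "1.2.0"),
    MappingPackage("shr-core-fhir-map", "1.1.0", "FHIR", "shr-core", "1.2.0"),
    MappingPackage("shr-core-fhir-map", "2.0.0", "FHIR", "shr-core", "2.0.0"),
]
```

Here the lookup matches the definitions version exactly. Whether a mapping package should instead be allowed to declare a range of definition versions is a design choice this sketch does not settle.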

What's Needed to Implement This

To implement this approach, several significant changes to the tooling and definitions are needed.

First, we need a way for authors to indicate a package version, and for that version to be persisted in the canonical JSON.

Second, we need a way to support individual "packages" and the dependencies between them.

Third, we need a way to support "mapping packages" separately.

Fourth, we need to determine which of the existing definitions go into "SHR Core" and split the others into meaningful packages.

Semantic Versioning

Regardless of the approach used (version together, version separately, or hybrid), semantic versioning is recommended. This will allow insight into the impact/significance of a release. According to the Semantic Versioning 2.0.0 specification:

Given a version number MAJOR.MINOR.PATCH, increment the:

MAJOR version when you make incompatible API changes,

MINOR version when you add functionality in a backwards-compatible manner, and

PATCH version when you make backwards-compatible bug fixes.

Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.

While semantic versioning is usually applied to software releases, it is abstract enough to apply to versioning of SHR definitions as well. What constitutes MAJOR, MINOR, or PATCH changes must be defined in terms of the item that is being modified.

For a given system built on SHR version sMAJOR.sMINOR.sPATCH and data instances based on SHR version dMAJOR.dMINOR.dPATCH:

If sMAJOR != dMAJOR or sMINOR < dMINOR, they are incompatible
If sMAJOR == dMAJOR and sMINOR >= dMINOR, the system can process the data losslessly
PATCH versions do not affect compatibility
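The compatibility rules above can be written as a short check. This is a minimal sketch that assumes plain MAJOR.MINOR.PATCH strings without pre-release or build labels; the function names are illustrative.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse a plain MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def can_process_losslessly(system_version: str, data_version: str) -> bool:
    """Apply the rules above: same MAJOR, and system MINOR >= data MINOR."""
    s_major, s_minor, _ = parse_version(system_version)
    d_major, d_minor, _ = parse_version(data_version)
    # PATCH is deliberately ignored.
    return s_major == d_major and s_minor >= d_minor
```

For example, a system built on 1.4.0 can process data based on 1.2.7, but a system built on 1.2.0 cannot be assumed to process data based on 1.4.0, since that data may use elements or fields the system has never seen.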

SEMVER for Packages

Given a version number MAJOR.MINOR.PATCH, increment the:

MAJOR version when
- data element changes render existing instances invalid
  - renaming or removing an element
  - changing an element's inheritance tree
  - adding a new required or modifier field
  - renaming or removing a field
  - adding a new constraint to a field
  - narrowing cardinality of a field
  - narrowing existing constraints on a field
- value set changes might exclude a code that was previously included in the value set
MINOR version when
- data element changes do not render existing instances invalid, but rather, loosen or expand their definition
  - adding a new element
  - adding a new optional field
  - widening the cardinality of a field
  - widening existing constraints on a field
  - removing existing constraints on a field
- value set changes might include a code that was previously excluded from the value set
PATCH version when
- data element changes do not change the structure or meaning of the element
  - changing the text "Description"
  - updating the "Concept" (as long as semantic meaning is not changed)
  - changing comments in the source code
- value set changes do not change actual code membership
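The rules above amount to a lookup from kind of change to version part, with the most significant change in a release deciding the increment. The sketch below shows one possible encoding; the change-kind identifiers are made up for illustration (one per bullet above) and do not correspond to anything in the existing tooling.

```python
from enum import IntEnum
from typing import Iterable


class Bump(IntEnum):
    PATCH = 0
    MINOR = 1
    MAJOR = 2


# Hypothetical change-kind names, one per rule listed above.
CHANGE_LEVELS = {
    "rename_or_remove_element": Bump.MAJOR,
    "change_inheritance": Bump.MAJOR,
    "add_required_or_modifier_field": Bump.MAJOR,
    "rename_or_remove_field": Bump.MAJOR,
    "add_field_constraint": Bump.MAJOR,
    "narrow_cardinality": Bump.MAJOR,
    "narrow_constraint": Bump.MAJOR,
    "value_set_may_exclude_code": Bump.MAJOR,
    "add_element": Bump.MINOR,
    "add_optional_field": Bump.MINOR,
    "widen_cardinality": Bump.MINOR,
    "widen_constraint": Bump.MINOR,
    "remove_constraint": Bump.MINOR,
    "value_set_may_include_code": Bump.MINOR,
    "change_description": Bump.PATCH,
    "update_concept_same_meaning": Bump.PATCH,
    "change_source_comment": Bump.PATCH,
    "value_set_same_membership": Bump.PATCH,
}


def next_version(current: str, changes: Iterable[str]) -> str:
    """Return the next package version for a non-empty set of changes."""
    major, minor, patch = (int(part) for part in current.split("."))
    # The most significant change decides which part is incremented.
    level = max(CHANGE_LEVELS[change] for change in changes)
    if level == Bump.MAJOR:
        return f"{major + 1}.0.0"
    if level == Bump.MINOR:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

A release of a package at 1.2.3 that adds an optional field and edits a description would become 1.3.0 under this scheme; if it also narrowed the cardinality of a field, it would become 2.0.0.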

SEMVER for Mapping Packages

TODO: Is this too strict? Should mapping versioning be a little looser?