Information Model

draft

This document explains what constitutes the information model of a data product, how it is represented, and how the life cycles of the model and its versions are managed.


Table of Contents

  1. Definition
  2. Identification
    1. Version identifier
  3. Life cycle management
    1. Information model
    2. Information model version
  4. Representation
    1. Attachments
  5. Retrieval

Definition

Definitions

Information model
Specification of semantics and constraints of the use case represented by the data product.
Information model version
Expression of the information model, including documentation and supporting attachments.
Attachment
Resources which support the model, e.g. example data or additional schemas.

Information models are abstract [1] resources of which one or more versions exist; only information model versions are information resources, with representations suited to humans and machines alike. Each version may have supporting attachments, which its representations reference through hyperlinks.

Colloquial speech

Throughout this documentation the term (information) model may be used more colloquially.

For example, I’m unlikely to say that I have opened in my text editor “a YAML representation of version 1.2 of the information model Capaciteitskaart”; I may instead refer to it as “Capaciteitskaart model 1.2”, or in some contexts even as “the Capaciteitskaart model”. This is perfectly fine and human; just be aware of it.

Identification

Each information model and its versions are assigned URIs so they can be identified.


URI templates [RFC6570]

Information model
https://modellen.netbeheernederland.nl/{name}
Information model version
https://modellen.netbeheernederland.nl/{name}/{version}

where:

name
meaningful name in kebab-case
version
version identifier
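
The two templates above can be expanded with a few lines of Python. This is a minimal sketch, not the site's own tooling: the function names are hypothetical, and the percent-encoding follows simple string expansion as defined in RFC 6570, which encodes every character outside the unreserved set. As a consequence a slash inside a value (as in a draft/* version identifier) becomes %2F; whether the server expects that form is not stated here and is an assumption.

```python
from urllib.parse import quote

# Base of both URI templates, taken from the templates above.
BASE_URI = "https://modellen.netbeheernederland.nl"


def model_uri(name: str) -> str:
    """Expand the template https://modellen.netbeheernederland.nl/{name}."""
    # safe="" makes quote() encode everything except unreserved characters,
    # which matches RFC 6570 simple string expansion.
    return f"{BASE_URI}/{quote(name, safe='')}"


def model_version_uri(name: str, version: str) -> str:
    """Expand the template https://modellen.netbeheernederland.nl/{name}/{version}."""
    return f"{model_uri(name)}/{quote(version, safe='')}"
```

For instance, `model_version_uri("capaciteitskaart", "1.2")` expands the second template for a model assumed to be named `capaciteitskaart`.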

Version identifier

Given an information model, to discern one version from another we need a version identifier.

Version control systems like Git track versions and assign unique hashes that identify them. Although these hashes serve as version identifiers, we want a logical versioning scheme that is decoupled from the physical implementation.

Besides identifying a specific version, a version identifier also informs you about the life cycle state of the version.

Currently, the following version identification scheme is maintained:

Version identifier and the associated life cycle state

draft/*
draft
main
accepted
X.Y
released
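
The mapping above can be expressed as a small lookup function. The sketch below is illustrative only: the function name is hypothetical, and it assumes that X and Y in X.Y are non-negative integers and that draft/* requires at least one character after the slash. Neither assumption is spelled out by the scheme itself. Note that the retired state cannot be derived from the identifier alone.

```python
import re

# Assumption: a release identifier is two non-negative integers, e.g. "1.2".
_RELEASE_PATTERN = re.compile(r"\d+\.\d+")


def life_cycle_state(version: str) -> str:
    """Return the life cycle state implied by a version identifier."""
    # draft/* identifies a draft; assume the part after the slash is non-empty.
    if version.startswith("draft/") and len(version) > len("draft/"):
        return "draft"
    if version == "main":
        return "accepted"
    if _RELEASE_PATTERN.fullmatch(version):
        return "released"
    raise ValueError(f"unrecognised version identifier: {version!r}")
```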

Life cycle management

The (abstract) information model and a version thereof have distinct life cycles which are described in the following sections.

Information model

The life cycle of an information model is, naturally, intimately related to the life cycle of the associated data product. With the birth of a new data product, a new information model will be required as well. Similarly, if a data product is retired, so is its information model.

We discern the following possible states during the life cycle of an information model:

Possible states

active
in use by a data product
retired
no longer supported

Information model version

The life cycle of a version of an information model is described by the following state diagram:

Possible states

draft
under (active) development
accepted
accepted, so that its changes are expected to be part of the upcoming release
released
stable and ready to be used by end-users
retired
once released, but no longer supported
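
The four states can be modelled as a small state machine. Be aware that the transitions in this sketch are an assumption inferred from the state descriptions (for example, retired is described as "once released"); they are not taken from an authoritative definition, and the actual life cycle may allow other paths, such as a draft being abandoned. All names are hypothetical.

```python
from enum import Enum


class VersionState(Enum):
    DRAFT = "draft"
    ACCEPTED = "accepted"
    RELEASED = "released"
    RETIRED = "retired"


# Assumed linear progression: draft -> accepted -> released -> retired.
ALLOWED_TRANSITIONS = {
    VersionState.DRAFT: {VersionState.ACCEPTED},
    VersionState.ACCEPTED: {VersionState.RELEASED},
    VersionState.RELEASED: {VersionState.RETIRED},
    VersionState.RETIRED: set(),
}


def can_transition(current: VersionState, target: VersionState) -> bool:
    """Check whether moving from current to target is allowed in this sketch."""
    return target in ALLOWED_TRANSITIONS[current]
```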

Representation

An information model version is expressed in the LinkML data modeling language [2], i.e. each information model version is a LinkML schema.

Currently, the following representations are provided for a model version:

Representations of LinkML schema

Media type                For human or machine
YAML (application/yaml)   machine
HTML (text/html)          human

The YAML representation is intended for processing by machines, enabling the parsing and validation of data, as well as the generation of additional schemas or artifacts.

The HTML representation is intended for humans. It contains:

  • reference documentation generated from the formal structure defined in the LinkML YAML
  • additional informative documentation, usually written by humans
  • references to attachments

The machine-processable representation can be downloaded from the human-readable documentation.

Attachments

Information model versions may refer to attachments such as example data and additional generated schemas (e.g. JSON Schema, SHACL, etc.) which support the model.

Unless specified otherwise, attachments are informal.

Every representation of the information model version should provide hyperlinks to these attachments.

Retrieval

Retrieving the HTML representation of an information model version can be achieved simply by dereferencing its URI. The YAML representation can be downloaded from there. [3]
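
Dereferencing a version URI amounts to a plain HTTP GET. The following sketch uses only the Python standard library; the function names are hypothetical, the name and version values must already be URL-safe, and no Accept header is sent because content negotiation is not available. It has not been verified against the live service.

```python
from urllib.request import Request, urlopen

BASE_URI = "https://modellen.netbeheernederland.nl"


def build_request(name: str, version: str) -> Request:
    """Build a GET request for the URI of an information model version."""
    return Request(f"{BASE_URI}/{name}/{version}", method="GET")


def fetch_html(name: str, version: str, timeout: float = 10.0) -> str:
    """Dereference the version URI and return the HTML representation as text."""
    with urlopen(build_request(name, version), timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)
```

Calling `fetch_html("capaciteitskaart", "1.2")` would request the page for version 1.2 of a model assumed to be named `capaciteitskaart`.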


References

RFC6570
For the specification of URI templates, see: https://datatracker.ietf.org/doc/html/rfc6570.

Footnotes

  1. Abstract resources are non-informational, i.e. they have no representations. 

  2. If a different modeling language is chosen at some point, different versions of an information model may use different metamodels while still being versions of the same abstract information model. 

  3. Standard practice would be to use content negotiation, but this has not been implemented yet.