Information Model
draft
This document explains what constitutes the information model of a data product, how it is represented, and how its life cycle and that of its versions are managed.
Definition
Definitions
- Information model
- Specification of semantics and constraints of the use case represented by the data product.
- Information model version
- Expression of the information model, including documentation and supporting attachments.
- Attachment
- Resources which support the model, e.g. example data or additional schemas.
Information models are abstract[1] resources of which one or more versions exist; only information model versions are information resources, with representations suited to humans and machines alike. For each version, supporting attachments may be present, which the representations reference through hyperlinks.
Colloquial speech
Throughout this documentation the term (information) model may be used more colloquially.
For example, I am unlikely to say that I have opened in my text editor “a YAML representation of version 1.2 of the information model Capaciteitskaart”, and may instead refer to it as “Capaciteitskaart model 1.2”, or in some contexts even as “the Capaciteitskaart model”. This is perfectly fine and human; just be aware of it.
Identification
Each information model and its versions are assigned URIs so they can be identified.
URI templates [RFC6570]
- Information model
- https://modellen.netbeheernederland.nl/{name}
- Information model version
- https://modellen.netbeheernederland.nl/{name}/{version}
where:
- name
- meaningful name in kebab-case
- version
- version identifier
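The two templates above can be expanded with a few lines of Python. This is a minimal sketch, not the project's own tooling: the function name `model_uri`, the kebab-case pattern, and the example values `capaciteitskaart` and `1.2` are assumptions made for illustration. Only simple string expansion (RFC 6570 Level 1) is handled.

```python
import re
from typing import Optional
from urllib.parse import quote

BASE = "https://modellen.netbeheernederland.nl"

# Assumed reading of "kebab-case": lowercase alphanumeric words joined by single hyphens.
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def model_uri(name: str, version: Optional[str] = None) -> str:
    """Expand the URI template for an information model or one of its versions."""
    if not KEBAB_CASE.fullmatch(name):
        raise ValueError(f"name is not kebab-case: {name!r}")
    # Simple expansion: percent-encode everything outside the unreserved set.
    uri = f"{BASE}/{quote(name, safe='')}"
    if version is not None:
        uri += f"/{quote(version, safe='')}"
    return uri


if __name__ == "__main__":
    print(model_uri("capaciteitskaart"))         # the (abstract) information model
    print(model_uri("capaciteitskaart", "1.2"))  # one version of that model
```

Leaving out `version` yields the URI of the abstract information model; supplying it yields the URI of a specific version.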
Version identifier
Given an information model, to discern one version from another we need a version identifier.
Version control systems such as Git track versions and assign unique hashes that identify them. Although these hashes serve as version identifiers, we are looking for a logical versioning scheme that is decoupled from the physical implementation.
Besides identifying a specific version, a version identifier also informs you about the life cycle state of the version.
Currently, the following version identification scheme is maintained:
Version identifier and the associated life cycle state
Life cycle management
The (abstract) information model and a version thereof have distinct life cycles which are described in the following sections.
Information model
The life cycle of an information model is, naturally, intimately related to the life cycle of the associated data product. With the birth of a new data product, a new information model will be required as well. Similarly, if a data product is retired, so is its information model.
We discern the following possible states during the life cycle of an information model:
Possible states
- active
- in use by a data product
- retired
- no longer supported
Information model version
The life cycle of a version of an information model is described by the following state diagram:
Possible states
- draft
- under (active) development
- accepted
- accepted at some point in time, so that its changes are anticipated to be part of the upcoming release
- released
- stable and ready to be used by end-users
- retired
- once released, but no longer supported
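The states listed in this section can be captured as enumerations. This is a minimal sketch: the class names `ModelState` and `VersionState` and the helper `ready_for_end_users` are hypothetical, and the sketch deliberately does not encode the allowed transitions between states.

```python
from enum import Enum


class ModelState(Enum):
    """Life cycle states of an (abstract) information model."""
    ACTIVE = "active"    # in use by a data product
    RETIRED = "retired"  # no longer supported


class VersionState(Enum):
    """Life cycle states of an information model version."""
    DRAFT = "draft"        # under (active) development
    ACCEPTED = "accepted"  # changes anticipated for the upcoming release
    RELEASED = "released"  # stable and ready to be used by end-users
    RETIRED = "retired"    # once released, but no longer supported


def ready_for_end_users(state: VersionState) -> bool:
    """Only a released version is described as stable and ready for end-users."""
    return state is VersionState.RELEASED


if __name__ == "__main__":
    for state in VersionState:
        print(state.value, ready_for_end_users(state))
```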
Representation
An information model version is expressed in the LinkML data modeling language[2], i.e. each information model version is a LinkML schema.
Currently, the following representations are provided for a model version:
Representations of LinkML schema
| Media type | For human or machine |
|---|---|
| YAML (application/yaml) | machine |
| HTML (text/html) | human |
The YAML representation is intended for processing by machines, enabling the parsing and validation of data, as well as the generation of additional schemas or artifacts.
The HTML representation is intended for humans. It contains:
- reference documentation generated from the formal structure defined in the LinkML YAML
- additional informative documentation (usually) written by humans
- references to attachments
The machine-processable representation can be downloaded from the human-readable documentation.
Attachments
Information model versions may refer to attachments such as example data and additional generated schemas (e.g. JSON Schema, SHACL, etc.) which support the model.
Unless specified otherwise, attachments are informal.
Every representation of the information model version should provide hyperlinks to these attachments.
Retrieval
Retrieving the HTML representation of an information model version can be achieved simply by dereferencing its URI. The YAML representation can be downloaded from there.[3]
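Dereferencing a version URI can be sketched with Python's standard library. This is a minimal illustration under assumptions: the function names `html_request` and `fetch_html` and the example URI are hypothetical, and the `Accept` header only documents intent, since a plain GET is expected to return the HTML representation.

```python
from urllib.request import Request, urlopen


def html_request(model_version_uri: str) -> Request:
    """Build a GET request for the HTML representation of a model version."""
    # The Accept header is optional here; it merely states which representation we expect.
    return Request(model_version_uri, headers={"Accept": "text/html"}, method="GET")


def fetch_html(model_version_uri: str, timeout: float = 10.0) -> str:
    """Dereference the URI and return the HTML document as text."""
    with urlopen(html_request(model_version_uri), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


if __name__ == "__main__":
    # Hypothetical example URI; requires network access.
    html = fetch_html("https://modellen.netbeheernederland.nl/capaciteitskaart/1.2")
    print(html[:200])
```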
References
- RFC6570
- For the specification of URI templates, see: https://datatracker.ietf.org/doc/html/rfc6570.
Footnotes
- [1] Abstract resources are non-informational, i.e. they have no representations.
- [2] If at some point a different modeling language is chosen, different versions of an information model may use different metamodels while still being versions of the same abstract information model.
- [3] Standard practice would be to implement content negotiation, but this has not been implemented at this time.