Skip to content

unitsml/unitsdb

Repository files navigation

UnitsDB

Purpose

This data repository is used in conjunction with UnitsML.

Sources include:

  • BIPM SI Brochure

  • NIST SP 811

  • ISO 80000 series

  • (in some cases, "Encyclopaedia of Scientific Units, Weights and Measures, Their SI Equivalences and Origins" by François Cardarelli, is consulted)

Note
Conversion factors here are not updated with the revised SI.

Structure

This repository contains the following YAML files:

dimensions.yaml

Contains the base dimensions and their symbols.

prefixes.yaml

Contains the SI prefixes and their symbols.

scales.yaml

Contains the measurement scale types and their properties.

units.yaml

Contains the units and their symbols, with English and French names. French translations come from BIPM.

unit_systems.yaml

Contains the unit systems and their symbols.

quantities.yaml

Contains the quantities and their symbols, with English and French names. French translations come from BIPM.

Supported languages

The database currently supports the following languages:

  • English (en): Primary language for all entries

  • French (fr): Provided for units and quantities, sourced from the BIPM SI Digital Framework

Versioning

Version scheme

UnitsDB follows semantic versioning (SemVer) using a MAJOR.MINOR.PATCH structure.

Where,

MAJOR

Incremented for incompatible changes that may require updates to consuming applications

MINOR

Incremented for backwards-compatible additions or changes

PATCH

Incremented for backwards-compatible bug fixes

Version field

Each YAML file in the database contains a schema_version field (string) at the top level.

schema_version: "2.0.0"
dimensions:
  # content follows

The version number is consistent across all files in a given release, ensuring the database is treated as a cohesive unit.

Version changes

Version level Types of changes

MAJOR

  • Structural changes to the data model

  • Removal of units, dimensions, or quantities

  • Changes to identifier schemes

  • Any other backwards-incompatible changes

MINOR

  • Addition of new units, dimensions, or quantities

  • Addition of new properties to existing entities

  • Other backwards-compatible enhancements

PATCH

  • Corrections to symbols, names, or other properties

  • Fixes to references between entities

  • Other backwards-compatible fixes

Version usage

Applications consuming UnitsDB should:

  1. Check the schema_version field to ensure compatibility, ensure all files are consistent with the same version

  2. Handle version transitions appropriately based on the semantic versioning rules

  3. Document which version(s) of UnitsDB they support

Content classes

Basic structures

General

Several shared content models are used across the different entities.

Identifier

The Identifier object represents a unique identifier within the identifier type.

Every entity in the database includes an identifiers array that contains one or more identifiers.

Each identifier has:

id

A unique string that identifies the entity

type

The system or authority that defines the identifier

The type: unitsml identifier is mandatory for every entry, where it uniquely identifies the entity within the UnitsML framework.

Other types of identifier types are also supported, such as type: nist for legacy UnitsML identifiers that start with a prefix of NIST.

The UnitsML identifier follows a specific format:

id: {entity-type}:{entity-name}
type: unitsml

Where,

entity-type

Abbreviated code for the type of entity.

u

Unit

d

Dimension

p

Prefix

q

Quantity

s

Scale

us

Unit system

entity-name

The short name of the entity, typically identical to the short: attribute of the entity.

The legacy UnitsML identifier ("NIST" prefix) is used for backward compatibility and is not required for new entries.

id: NIST{entity-type}{entity-code}
type: nist

Where,

entity-type

Abbreviated code for the type of entity.

u

Unit

d

Dimension

p

Prefix

q

Quantity

entity-code

A unique code for the entity. It is a number for units, quantities and dimensions, and a string pattern of {base_number}_{power} for prefixes.

identifiers:
- id: u:meter
  type: unitsml
- id: NISTu1
  type: nist

Symbols

The Symbol object represents the notation used to represent the entity in different contexts and formats.

The Symbol object is used for units, prefixes, certain quantities and certain dimensions.

A Symbol object contains the following attributes which allow for rendering in various formats:

id

A unique identifier identifying the symbol across UnitsDB

ascii

Plain text ASCII representation

unicode

Unicode representation, used in UnicodeMath

html

HTML markup representation

mathml

MathML markup representation

latex

LaTeX markup representation

This multi-format approach ensures that the symbols can be rendered correctly across various applications and documents.

symbols:
- latex: "\\ensuremath{\\mathrm{m}}"
  unicode: m
  ascii: m
  html: m
  id: m
  mathml: "<mi mathvariant='normal'>m</mi>"

Localized string

The localized string object represents a human-readable string that can be localized into different languages.

It is used for names, descriptions, and other text fields that may require translation or localization. Multiple values are allowed per language.

A localized string object contains the following attributes:

value

The localized string value

lang

The language code (e.g., "en" for English, "fr" for French)

names:
- value: metre
  lang: en
- value: meter
  lang: en
- value: mètre
  lang: fr

References

A reference object represents a link to an external resource or standard that provides additional context or information about the entity.

Currently, UnitsDB units, quantities and prefixes link to the SI Digital Framework only.

A reference contains the following attributes:

uri

The URI or URL of the reference

type

The type of reference (e.g., "normative", "informative")

authority

The authority responsible for the reference

references:
- uri: http://si-digital-framework.org/SI/units/metre
  type: normative
  authority: si-digital-framework

Dimensions

General

Dimensions represent the fundamental physical quantities that form the basis of any measurement system. They are the abstract properties that define the nature of a physical quantity, independent of any specific unit of measurement.

  • The SI system defines seven base dimensions (length, mass, time, electric current, thermodynamic temperature, amount of substance, and luminous intensity).

  • A special dimensionless dimension is also defined, which has no physical significance but is used in certain contexts.

  • Derived dimensions are formed by combining base dimensions through multiplication, division, or exponentiation. For example, velocity is a derived dimension formed from length and time (L/T), and force is derived from mass and acceleration (ML/T²).

A dimension object in dimensions.yaml typically has the following structure:

- identifiers:
  - id: dimension_id
    type: type_identifier
  short: short_name
  {dimension_name}:
    power: power_numerator
    symbol: ASCII representation
    symbols:
    - ascii: ASCII representation
      html: "HTML representation"
      id: dimension_symbol_id
      latex: "LaTeX representation"
      mathml: "MathML representation"
      unicode: "Unicode representation"

For example:

- identifiers:
  - id: NISTd1
    type: nist
  length:
    power: 1
    symbol: L
    symbols:
    - ascii: L
      html: "&#x1D5AB;"
      id: dim_L
      latex: "\\ensuremath{\\mathsf{L}}"
      mathml: "<mi mathvariant='sans-serif'>L</mi>"
      unicode: "\U0001D5AB"
  short: length
- identifiers:
  - id: NISTd2
    type: nist
  mass:
    power: 1
    symbol: M
    symbols:
    - ascii: M
      html: "&#x1D5AC;"
      id: dim_M
      latex: "\\ensuremath{\\mathsf{M}}"
      mathml: "<mi mathvariant='sans-serif'>M</mi>"
      unicode: "\U0001D5AC"
  short: mass

Notes

The following dimensions are identified by NIST but they are excluded from dimensions.yaml since they are unused:

- identifiers:
  - id: NISTd87
    type: nist
  dimensionless: true
  short: dimensionless

- identifiers:
  - id: NISTd86
    type: nist
  dimensionless: true
  short: dimensionless

- identifiers:
  - id: NISTd81
    type: nist
  dimensionless: true
  short: dimensionless

Prefixes

General

Prefixes are modifiers that can be attached to units to indicate decimal multiples or submultiples. They provide a concise way to express very large or small quantities without resorting to scientific notation.

The prefix "kilo" (k) multiplies a unit by 1000, so 1 kilometer equals 1000 meters.

The SI system defines a set of standard prefixes included in UnitsDB.

A prefix object in prefixes.yaml typically has the following structure:

- identifiers:
  - id: prefix_id
    type: type_identifier
  base: base
  names:
  - prefix_name
  short: prefix_short_name
  power: power
  symbol:
    id: symbol_id
    ascii: ASCII representation
    html: "HTML representation"
    latex: "LaTeX representation"
    unicode: "Unicode representation"
    mathml: "MathML representation"

For example:

- identifiers:
  - id: NISTp10_2
    type: nist
  base: 10
  names:
  - hecto
  short: hecto
  power: 2
  symbol:
    latex: h
    unicode: h
    ascii: h
    html: h
    id: hecto
    mathml: "<mi>h</mi>"
- identifiers:
  - id: NISTp10_1
    type: nist
  base: 10
  names:
  - deka
  short: deka
  power: 1
  symbol:
    latex: da
    unicode: da
    ascii: da
    html: da
    id: deka
    mathml: "<mi>da</mi>"

Notes

Decimal prefixes are identified by their power of 10, e.g. NISTp10_1

The prefix NISTp10_0 is a placeholder for unity.

Binary prefixes are identified by their power of 2, e.g. NISTp2_10

Quantities

General

Quantities are measurable properties of physical phenomena, objects, or substances. They represent the attributes that can be measured and expressed with numbers. Each quantity is associated with a specific dimension (or combination of dimensions) and can be measured using various units.

The SI system categorizes quantities into two main types:

  • Base quantities: The seven fundamental quantities from which all others are derived (length, mass, time, electric current, thermodynamic temperature, amount of substance, and luminous intensity)

  • Derived quantities: Quantities formed by algebraic combinations of base quantities (e.g., area, volume, velocity, force)

Each quantity has a unique identity and can be measured in specific units that share the same dimension.

"length" is a quantity that can be measured in meters, feet, or kilometers.

A quantity can be combined with other quantities to form compound quantities, such as velocity (length/time) or force (mass × acceleration).

A quantity object in quantities.yaml typically has the following structure:

- identifiers:
  - id: quantity_id
    type: type_identifier
  names:
  - names
  quantity_type: {base|derived}
  short: short_name
  unit_references:
  - id: unit_id
    type: type_identifier
  dimension_reference:
    id: dimension_id
    type: type_identifier
  references:
  - uri: http://si-digital-framework.org/quantities/QUANTITY_ID
    type: normative
    authority: si-digital-framework

For example:

- identifiers:
  - id: NISTq8
    type: nist
  names:
  - value: area
    lang: en
  - value: superficie
    lang: fr
  quantity_type: base
  unit_references:
  - id: NISTu164
    type: nist
  - id: NISTu165
    type: nist
  - id: NISTu1e2/1
    type: nist
  - id: NISTu283
    type: nist
  - id: NISTu317
    type: nist
  - id: NISTu42
    type: nist
  - id: NISTu43
    type: nist
  - id: NISTu44
    type: nist
  - id: NISTu45
    type: nist
  - id: NISTu46
    type: nist
  dimension_reference:
    id: NISTd8
    type: nist
  references:
  - uri: http://si-digital-framework.org/quantities/AREA
    type: normative
    authority: si-digital-framework
- identifiers:
  - id: NISTq166
    type: nist
  names:
  - electric potential
  quantity_type: derived
  unit_references:
  - id: NISTu261
    type: nist
  - id: NISTu268
    type: nist
  dimension_reference:
    id: NISTd18
    type: nist
- identifiers:
  - id: NISTq7
    type: nist
  names:
  - luminous intensity
  quantity_type: base
  short: luminous_intensity
  unit_references:
  - id: NISTu7
    type: nist
  dimension_reference:
    id: NISTd7
    type: nist

Units

General

Units are standardized quantities used to express measurements of physical properties. They provide a reference scale for measuring quantities, allowing values to be communicated uniformly. Each unit is associated with a specific physical quantity and dimension, and has a defined relationship to other units in the same dimension.

The SI system classifies units into several categories:

  • Base units: The seven fundamental units corresponding to the base quantities (meter, kilogram, second, ampere, kelvin, mole, and candela)

  • Derived units: Units formed from combinations of base units (e.g., newton, joule, pascal)

  • Special named units: Derived units that have been given unique names for convenience (e.g., newton for kg·m/s²)

  • Units outside the SI: Non-SI units that are accepted for use with the SI (e.g., minute, hour, degree)

Units can be combined with prefixes to express multiples or submultiples, and can be combined algebraically (through multiplication and division) to form compound units.

A unit object in units.yaml typically has the following structure:

- identifiers:
  - id: unit_id
    type: type_identifier
  quantity_references:
  - id: quantity_id
    type: type_identifier
  names:
  - names
  root: true|false
  short: unit_short_name
  symbols:
  - id: symbol_id
    ascii: ASCII representation
    html: "HTML representation"
    mathml: "MathML representation"
    latex: "LaTeX representation"
    unicode: "Unicode representation"
  dimension_reference:
    id: dimension_id
    type: type_identifier
  unit_system_reference:
  - id: unit_system_id
    type: type_identifier
  # Optional fields for derived units:
  prefixed: true|false

  # Note: Use either root_units OR si_derived_bases, but not both together

  # Use root_units for composite units that include any non-SI or non-SI-derived units
  # root_units can also contain SI units
  root_units:
  - power: power_numerator
    unit_reference:
      id: unit_id
      type: type_identifier
    prefix_reference:
      id: prefix_id
      type: type_identifier

  # Use si_derived_bases only for composite units that use exclusively SI base units
  si_derived_bases:
  - power: power_numerator
    unit_reference:
      id: unit_id
      type: type_identifier
    prefix_reference:
      id: prefix_id
      type: type_identifier

For example:

- identifiers:
  - id: NISTu5
    type: nist
  quantity_references:
  - id: NISTq5
    type: nist
  names:
  - value: kelvin
    lang: en
  - value: kelvin
    lang: fr
  root: true
  short: kelvin
  symbols:
  - latex: "\\ensuremath{\\mathrm{K}}"
    unicode: K
    ascii: K
    html: K
    id: K
    mathml: "<mi mathvariant='normal'>K</mi>"
  - latex: "\\ensuremath{\\mathrm{^{\\circ}K}}"
    unicode: "°K"
    ascii: degK
    html: "&#176;K"
    id: degK
    mathml: "<mi mathvariant='normal'>&#176;K</mi>"
  dimension_reference:
    id: NISTd5
    type: nist
  unit_system_reference:
  - id: si-base
    type: unitsml

Scales

General

Scales define the mathematical properties of measurement systems and how values relate to each other. They specify important characteristics like whether a scale has a true zero point, whether equal intervals represent equal differences, or whether the scale uses logarithmic relationships.

Understanding the scale type is crucial for proper interpretation of measurements and for determining what mathematical operations can legitimately be performed on the values.

On a ratio scale (like length), you can say that 10 meters is twice as long as 5 meters. On an interval scale (like Celsius temperature), 20°C is not "twice as hot" as 10°C.

UnitsDB defines several scale types including continuous ratio scales, continuous interval scales, logarithmic scales, and discrete scales, each with specific properties that determine how measurements should be interpreted and what mathematical operations are valid.

A scale object in scales.yaml typically has the following structure:

- identifiers:
  - id: scale_id
    type: type_identifier
  names:
  - scale_name
  short: short_name
  description:
  - description of scale type
  properties:
    continuous: true|false
    ordered: true|false
    logarithmic: true|false
    interval: true|false
    ratio: true|false

For example:

- identifiers:
  - id: continuous_ratio
    type: unitsml
  names:
  - continuous ratio scale
  short: continuous_ratio
  description:
  - A measurement scale with continuous values and a true zero point, allowing for ratio comparisons
  properties:
    continuous: true
    ordered: true
    logarithmic: false
    interval: true
    ratio: true
- identifiers:
  - id: logarithmic_ratio
    type: unitsml
  names:
  - logarithmic ratio scale
  short: logarithmic_ratio
  description:
  - A scale where equal ratios appear as equal intervals, with a zero point representing a reference value
  properties:
    continuous: true
    ordered: true
    logarithmic: true
    interval: true
    ratio: true

Units refer to scales using a scale_reference object:

scale_reference:
  id: scale_id
  type: type_identifier

Notes

The following scale types are currently defined:

continuous_ratio

A measurement scale with continuous values and a true zero point (most physical quantities)

continuous_interval

A measurement scale with continuous values but an arbitrary zero point (e.g., temperature in °C)

logarithmic_ratio

A logarithmic scale where equal ratios appear as equal intervals (e.g., decibels)

logarithmic_field

A logarithmic scale used for field quantities that uses natural logarithms (e.g., neper)

discrete

A scale consisting of discrete, countable values (e.g., bits)

Unit systems

General

Unit systems are coherent collections of units that are used together to express measurements in a consistent way. Each unit system has its own set of base units and derived units, organized according to specific principles.

The most widely used unit system is the International System of Units (SI), but the database also includes information about other unit systems and their relationships to SI.

Unit systems help provide context for units, indicating whether they are part of the official SI system, acceptable for use with SI (like minutes or degrees), or belong to entirely different measurement frameworks (like imperial units). This context is important for interoperability and conversion between different measurement standards.

In the UnitsDB, units are associated with their respective unit systems through references, allowing applications to identify which units belong to which systems and to determine compatibility between measurements expressed in different systems.

A unit system object in unit_systems.yaml typically has the following structure:

- acceptable: {true|false} # whether the unit system is SI acceptable
  short: short_name
  identifiers:
  - id: unit_system_id
    type: type_identifier
  name: unit_system_name

For example:

- acceptable: true
  short: si-compatible
  identifiers:
  - id: SI_compatible
    type: nist
  name: Units compatible with SI
- acceptable: true
  short: si-base
  identifiers:
  - id: SI_base
    type: nist
  name: SI base units

Notes

The following unit systems are currently defined:

si-base

Contains the seven SI base units (meter, kilogram, second, ampere, kelvin, mole, and candela). These form the foundation of the SI system.

si-derived-special

Contains SI-derived units with special names and symbols (e.g., newton, pascal, joule). These units are derived from base units but have been given unique names for convenience.

si-derived-nonspecial

Contains SI-derived units that use combinations of base unit symbols without special names (e.g., m/s, kg/m³).

si-compatible

Contains all units that are compatible with the SI system, including base units, derived units, and accepted non-SI units.

nonsi-acceptable

Contains non-SI units that are accepted for use with the SI system based on the BIPM SI Brochure (e.g., minute, hour, degree).

nonsi-unacceptable

Contains non-SI units that are not acceptable for use with the SI system (e.g., deprecated or obsolete units).

nonsi-nist-acceptable

Contains non-SI units that are acceptable according to NIST SP 811 but may not be officially accepted by BIPM (e.g., certain customary or traditional units).

In UnitsDB, a unit can belong to multiple unit systems. This classification helps applications determine the appropriate use contexts for each unit.

A unit might be categorized as both nonsi-unacceptable (meaning it is not acceptable for SI usage) but also as nonsi-nist-acceptable (meaning NIST SP 811 considers it acceptable in certain contexts).

When using unit systems in applications, the acceptable property can be used as a quick filter for SI-compliant units, while more specific categorization can be achieved using the individual unit system references.

Contributing

General

When contributing to this repository, please follow these guidelines.

Proposing a new unit, dimension, or quantity

  1. Check if the unit already exists in the database.

  2. If it does not exist, create a new YAML file in the appropriate directory (e.g., units.yaml, dimensions.yaml, etc.).

  3. Follow the structure outlined in the "Content classes" section.

  4. Ensure that the new unit is properly referenced in the relevant files.

  5. Add a test case for the new unit in the appropriate test file.

  6. Run the tests to ensure everything is working correctly.

  7. Submit a pull request with a clear description of the changes made.

  8. Include any relevant references or sources for the new unit.

  9. Ensure that the pull request is against the main branch of the repository.

  10. Wait for review and feedback from the repository maintainers.

  11. Address any comments or suggestions made during the review process.

  12. Once approved, the pull request will be merged into the main branch.

UnitsML validation

All YAML files must be validated against the UnitsML schema before submitting a pull request.

To validate UnitsDB content using UnitsML Ruby:

# Navigate to the scripts directory
cd scripts

# Install dependencies
bundle install

# Validate content using RSpec which uses UnitsML Ruby
BUNDLE_GEMFILE=scripts/Gemfile bundle exec rspec ../spec --format documentation

Copyright CalConnect. Incorporates public domain work from NIST.

This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit CC-BY 4.0