Data formats

Why do measurement data require specialized data formats?

All data in our world has a temporal reference. A movie is released at a specific time. A passport has an issue date and an expiration date. We open doors, observe the weather, or have conversations: everything happens along a timeline. But although time plays a role in every form of information, the requirements for recording and processing it vary considerably, especially when it comes to observing physical, mechanical, or electrical processes.

In measurement technology, we often observe processes that occur so quickly that humans cannot perceive them with the naked eye or ear. Examples of this include:

  • The vibration of a machine component, sampled at 1,000 measurements per second (1,000 Hz)
  • The current consumption of an electric motor during the start-up process (recorded at 10,000 Hz)
  • The ignition in a combustion engine with up to 100,000 measurement points per second
  • The measurement of structural stresses during a crash test (several million samples per second)

Such data is generated continuously and at high density. It must be stored reliably, reproducibly, and with as little loss as possible. This is not just a matter of simple numerical values such as temperature or voltage. Rather, there are different types of data: discrete signals such as the opening and closing of a door, analog measured variables with physical units, audio signals, image data, or entire data blocks, each with a specific meaning.
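To get a feeling for the volumes involved, the sample rates from the examples above can be turned into raw data rates. The following is only a rough sketch: the 8 bytes per sample (one 64-bit float, no timestamps or metadata) and the use of 1,000,000 Hz as a lower bound for "several million samples per second" are assumptions made for illustration.

```python
# Sample rates taken from the examples above.
SAMPLE_RATES_HZ = {
    "machine vibration": 1_000,
    "motor start-up current": 10_000,
    "engine ignition": 100_000,
    "crash test": 1_000_000,  # assumption: lower bound for "several million"
}

BYTES_PER_SAMPLE = 8  # assumption: one 64-bit float per sample, no timestamps


def raw_bytes_per_hour(rate_hz: int, bytes_per_sample: int = BYTES_PER_SAMPLE) -> int:
    """Raw payload of a single channel recorded for one hour."""
    return rate_hz * bytes_per_sample * 3600


for name, rate in SAMPLE_RATES_HZ.items():
    megabytes = raw_bytes_per_hour(rate) / 1e6
    print(f"{name}: {megabytes:,.1f} MB per hour and channel")
```

Even under these simplified assumptions, a single 1,000 Hz channel produces 28.8 MB of raw values per hour, and every factor of ten in the sample rate multiplies that figure accordingly.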

All of this requires data formats that can do more than the formats used in typical office applications. Formats such as CSV, JSON, or Excel are useful for structured data exchange, but they quickly reach their limits when it comes to:

  • High data rates
  • Different data types
  • Precise timestamps
  • Streaming and robustness against failures
  • Flexible expandability
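Two of these limits, data rate and timestamp precision, can be illustrated with a small comparison. This is a sketch under stated assumptions: the record layout (a 64-bit integer timestamp in nanoseconds plus a 64-bit float value), the start time, and the helper names `encode_csv` and `encode_binary` are hypothetical and do not describe any particular measurement format.

```python
import struct


def encode_csv(samples):
    """One 'timestamp_ns,value' text line per sample."""
    return "".join(f"{t},{v!r}\n" for t, v in samples).encode("ascii")


def encode_binary(samples):
    """Fixed 16 bytes per sample: int64 nanoseconds + float64 value."""
    return b"".join(struct.pack("<qd", t, v) for t, v in samples)


T0_NS = 1_700_000_000_000_000_000  # arbitrary start time, ns since the Unix epoch
samples = [(T0_NS + i * 100_000, 0.1 * i) for i in range(10_000)]  # 10 kHz, 1 s

csv_bytes = encode_csv(samples)
bin_bytes = encode_binary(samples)
print(len(csv_bytes), len(bin_bytes))  # the text form is clearly larger

# Timestamps stored as float64 seconds cannot resolve a 100 ns step
# at this magnitude, while integer nanoseconds can.
print(1_700_000_000.0 + 1e-7 == 1_700_000_000.0)  # True: the step is lost
print(T0_NS + 100 == T0_NS)                       # False: the step is kept
```

The text representation also has to be parsed number by number when reading, whereas the fixed-size binary records can be addressed directly by index.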

As a result, many specialized measurement data formats have become established over time, including:

  • MATLAB (.mat)

  • imc (.dat)

  • HBK / ASAM (.atfx)

  • NI, formerly National Instruments (.tdm / .tdms)

  • Gantner Instruments (.dat in UDBF format)

  • Vector / CSM (ASAM MDF 4)

  • HDF5 (a generic, widely used format for scientific data)

These formats are well suited to the specific requirements of their manufacturers and users, but are often proprietary, complex, or not ideal for applications with open, modular systems.

Why OSF? Why yet another new data format?

When we at optiMEAS were faced with the task of defining a robust and flexible data format for our own devices and platforms, we first examined closely whether an existing format could be integrated in a meaningful way. However, none of the existing solutions met all of our requirements at the same time:

  • Continuous, lossless writing during measurement (streaming)
  • Robustness against power failures or abrupt shutdowns
  • Support for equidistant and non-equidistant, time-stamped data
  • Ability to represent different data types: numerical values, images, audio signals, or other structured data blocks
  • Easy integration and implementation
  • Low complexity with open expandability
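The first two requirements, streaming and robustness against abrupt shutdowns, are often addressed with an append-only stream of self-contained blocks. The following minimal sketch shows that general technique only; it is not the OSF file layout. The header (payload length plus CRC32) and the function names `append_block` and `read_valid_blocks` are assumptions made for illustration.

```python
import io
import struct
import zlib

# Hypothetical block header: payload length and CRC32 of the payload.
HEADER = struct.Struct("<II")


def append_block(stream, payload: bytes) -> None:
    """Append one self-contained block; earlier blocks are never rewritten."""
    stream.write(HEADER.pack(len(payload), zlib.crc32(payload)))
    stream.write(payload)
    stream.flush()  # on a real file, os.fsync(stream.fileno()) would follow


def read_valid_blocks(data: bytes) -> list:
    """Return all complete, checksum-verified blocks from the start of data."""
    blocks, pos = [], 0
    while pos + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, pos)
        start = pos + HEADER.size
        end = start + length
        if end > len(data):
            break  # block was cut off, e.g. by a power failure
        payload = data[start:end]
        if zlib.crc32(payload) != crc:
            break  # block is damaged; stop at the last good one
        blocks.append(payload)
        pos = end
    return blocks


# Write three blocks of four float64 samples each.
buf = io.BytesIO()
for i in range(3):
    append_block(buf, struct.pack("<4d", *(float(i * 4 + k) for k in range(4))))

complete = buf.getvalue()
truncated = complete[:-5]  # simulate a shutdown in the middle of the last block
recovered = read_valid_blocks(truncated)
print(len(recovered), "of 3 blocks recovered")
```

Because each block carries its own length and checksum, a reader can stop at the last intact block, so an interrupted write costs at most the block that was being written.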

Some formats, such as MDF4 (ASAM Measurement Data Format Version 4), come very close to meeting these requirements from a technical standpoint. MDF4 is powerful and widely used in the automotive industry. However, the specification is several hundred pages long, not freely accessible, and only available to members of the ASAM association or for a fee. Implementation is also correspondingly complex.

HDF5 (Hierarchical Data Format) is another interesting, widely used format that offers a high degree of flexibility and broad support in science and technology. It can efficiently structure large amounts of data and offers an open ecosystem. However, HDF5 is very generic – and that is precisely where the challenge lies: without a concrete convention tailored to measurement data for structuring and interpreting content, implementation remains complex, inconsistent, and potentially error-prone.

That is why, after careful consideration, we have deliberately chosen our own path: the Open Streaming Format (OSF).

OSF meets the same high requirements as established industry formats – but is openly documented, freely usable, easy to understand, and can be implemented with little effort. It was developed specifically for practical use in measurement systems where data must be stored continuously and reliably on embedded systems – even under difficult conditions. At the same time, OSF enables efficient processing and analysis in the laboratory or on a PC, where higher system performance is available.

In the following, we explain the structure, philosophy, and practical advantages of OSF – and why it is the optimal choice for modern measurement systems.

From measurement data formats to open data exchange formats

Specialized formats such as OSF are ideal for recording physical measurement values efficiently, without loss, and in a structured manner. But measurement data is only the first step: real added value is created when it is combined with other information.

At a certain point, the recorded values must leave the measurement domain and flow into other systems:

  • Data lakes, where they merge with large data sets from other sources.
  • Billing and maintenance systems that derive business decisions from measured values.
  • Analysis platforms that identify trends, find anomalies, or calculate KPIs.

For this exchange, we need file formats that may not be as efficient or robust as OSF, but which have one decisive advantage: they can be read by almost any platform and software.

Formats such as CSV, TSV, JSON, and Parquet play an important role here.

  • CSV / TSV: Very simple, human-readable, suitable for small amounts of data.
  • JSON: Flexible, structured, good for API transport and smaller data blocks.
  • Parquet: Column-oriented, compressed format, often used in big data environments.

They are not intended for the permanent storage of high-frequency measurement data, but they are the link between the world of measurement technology and the world of all other "consumers" of data.
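As a minimal sketch of such a hand-over, the following example exports one already decoded, equidistant channel to CSV and JSON using only the Python standard library. The channel dictionary, its field names, and its values are hypothetical. Parquet is left out here because writing it requires a third-party library.

```python
import csv
import io
import json

# Hypothetical decoded channel; name, unit and values are made up.
channel = {
    "name": "motor_current",
    "unit": "A",
    "t0_ns": 0,          # start time in nanoseconds
    "dt_ns": 100_000,    # equidistant sampling at 10 kHz
    "values": [0.0, 1.5, 3.2, 2.9],
}


def to_rows(ch):
    """Expand the implicit time axis (t0 + i * dt) into explicit timestamps."""
    return [(ch["t0_ns"] + i * ch["dt_ns"], v) for i, v in enumerate(ch["values"])]


def to_csv(ch) -> str:
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["timestamp_ns", f"{ch['name']} [{ch['unit']}]"])
    writer.writerows(to_rows(ch))
    return out.getvalue()


def to_json(ch) -> str:
    return json.dumps({"name": ch["name"], "unit": ch["unit"], "samples": to_rows(ch)})


print(to_csv(channel))
print(to_json(channel))
```

Note that the export makes every timestamp explicit, which is convenient for consumers but is one reason these formats grow quickly with high-frequency data.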

Conclusion:
OSF ensures lossless and robust acquisition, while the open exchange formats make the data accessible to all other systems and ultimately lead to what matters: added value from information.