Architecture of the C++ implementation
This page describes the internal structure of the C++ implementation: the layering, the modules and how they interact, the data model and the central design decisions. It is aimed at developers who integrate the library and at those who want to contribute to it. The overview page gives a quick tour; the detail topics — reading, writing, error handling, the C ABI and building — have their own pages.
Guiding principles
The implementation follows four principles:
- Standalone C++17. Idiomatic modern C++ with no foreign-language bridges and no external runtime dependencies; its behaviour is defined solely by the OSF format specification.
- Exception-free core. Every operation that can fail returns osf::Result<T> (a tl::expected<T, osf::Error>). Exceptions exist only in the opt-in osf::throwing layer.
- Best-effort on read. Truncated files (power loss on the embedded writer) yield all fully readable blocks instead of an error; unknown future data types are skipped rather than aborting the load.
- Lean dependencies. Three vendored header libraries (tl::expected, nlohmann/json, pugixml) plus zlib (FetchContent or system). No Boost, no Qt dependency.
Layering
Most applications work exclusively at the high level (DataManager for
reading, one of the two writers for writing). The low level is public
and stable — anyone who wants to stream-read or build their own tools
uses BlockReader directly.
Modules and responsibilities
| Header | Contents | Layer |
|---|---|---|
| osf/error.h | Error (code + message), Result<T> | Foundation |
| osf/types.h | DataType, ChannelType, SpectrumType + parsers | Foundation |
| osf/header.h | Magic header: OsfVersion, MagicHeader, parseMagicHeader | Low |
| osf/metablock.h | MetaBlock/FileInfo/Channel/Info; JSON and XML parsers; JSON serialization | Low |
| osf/block.h | Block data model: Block, BlockKind, payload variants, control-byte decoder | Foundation |
| osf/reader.h | BlockReader: iterator over the block stream | Low |
| osf/stats.h | ReaderStats / ChannelStats: read telemetry | Low |
| osf/compression.h | DecompressingIStream, detectCompression: transparent OSFZ | Low |
| osf/datachannel.h | DataChannel variant (Equidistant / Timestamped / Variable), Segment, flat accessors | High |
| osf/manager.h | DataManager: load + typed channel list | High |
| osf/streamingwriter.h | StreamingWriter + ChannelDef | High |
| osf/blockwriter.h | BlockWriter + free functions writeToFile / writeTo | High |
| osf/stalevalueguard.h | StaleValueGuard: freshness layer over StreamingWriter | High |
| osf/binarysample.h | BinarySample: non-owning byte view (span substitute) | Foundation |
| osf/throwing.h | osf::Exception, throwing::unwrap/load/writeToFile (not in the umbrella) | Convenience |
| osf/capi.h | pure C99 ABI of the osf-c library (not in the umbrella) | Convenience |
| osf/osf.h | umbrella header (everything except throwing.h and capi.h) | - |
| osf/version.h | generated; osf::version() and OSF_VERSION_* | Foundation |
Private implementation building blocks (under src/, not installable):
blockencode_p.{h,cpp} (OSF5 block encoder), writercommon_p.{h,cpp}
(chunking maths + metablock assembly), durablefile_p.{h,cpp}
(RAII file with fsync), binaryio_p.h (little-endian helpers).
See Internals for details.
Three data models — who sees what
The library deliberately has three representations of the same data, depending on the level of abstraction:
- osf::MetaBlock (metablock.h): the definitions. File metadata (FileInfo), channel definitions (osf::Channel) and optional Info entries. OSF4 (XML) and OSF5 (JSON) differ only in serialization; both parsers fill the same model symmetrically.
- osf::Block (block.h): the stream view. A decoded block with channel index and BlockKind variant (StartData, ContinuedData, AbsTimestampData, ContinuedRelStampData, Skipped). Payloads are unpacked, typed vectors, not zero-copy (blocks are KB to a few MB; the simple lifetime semantics outweigh the allocation).
- osf::DataChannel (datachannel.h): the channel view. A std::variant over three storage layouts, because the storage genuinely differs:

  | Variant | Storage |
  |---|---|
  | EquidistantChannel | flat sample vector + std::vector<Segment> |
  | TimestampedChannel | parallel vectors timestampsNs + values |
  | VariableChannel | timestamps + string or binary samples |

Naming note: osf::Channel is the channel definition from the
metablock; osf::DataChannel is the assembled samples. Both share
the osf namespace, hence the different names.
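The channel view can be illustrated with a small sketch of a std::variant over three storage layouts, dispatched with std::visit. This is not the library's definition: the structs below are hypothetical stand-ins in a separate sketch namespace, the Segment fields are assumptions, and VariableChannel is reduced to string samples for brevity.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

namespace sketch {

// Hypothetical stand-in; every field here is an assumption for illustration.
struct Segment {
    std::int64_t startTimestampNs = 0;  // timestamp of the segment's first sample
    double sampleRateHz = 0.0;          // constant rate within the segment
    std::size_t sampleCount = 0;        // samples belonging to this segment
};

struct EquidistantChannel {
    std::vector<double> samples;    // flat sample vector
    std::vector<Segment> segments;  // describes how the flat vector is split
};

struct TimestampedChannel {
    std::vector<std::int64_t> timestampsNs;  // parallel to values
    std::vector<double> values;
};

struct VariableChannel {
    std::vector<std::int64_t> timestampsNs;  // parallel to values
    std::vector<std::string> values;         // binary samples omitted here
};

using DataChannel =
    std::variant<EquidistantChannel, TimestampedChannel, VariableChannel>;

// Returns the number of samples regardless of the storage layout.
inline std::size_t sampleCount(const DataChannel& channel) {
    return std::visit(
        [](const auto& c) -> std::size_t {
            using T = std::decay_t<decltype(c)>;
            if constexpr (std::is_same_v<T, EquidistantChannel>) {
                return c.samples.size();
            } else {
                return c.values.size();
            }
        },
        channel);
}

}  // namespace sketch
```

A generic lambda with if constexpr keeps the per-layout logic in one place, and the compiler checks that the lambda is valid for every alternative.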
Naming and API conventions
- Types in PascalCase (DataManager, BlockReader).
- Methods and free functions in camelCase (loadFromFile, channelName, asDoublesFlat, writeToFile).
- Public struct fields in camelCase without a prefix (blocksTotal, sizeOfLengthValue, startTimestampNs, compressionFormat).
- Private members with an m_ prefix + camelCase (m_channelData, m_writer).
- Constants in UPPER_SNAKE_CASE (MAX_MAGIC_HEADER_LEN, GPS_WIRE_SIZE).
- Header file names lowercase, no separators, extension .h (blockwriter.h, streamingwriter.h, datachannel.h). Internal headers in the src/ directory carry the _p.h suffix (blockencode_p.h, writercommon_p.h).
- The C ABI (osf_* symbols in osf/capi.h) follows the C-conventional snake_case and is exempt from the C++ rules.
- Variant discriminators are named kind (BlockKind, SkipReason::Kind, VariableValueRef::Kind).
- Everything fallible returns Result<T> and is [[nodiscard]].
- Construction via static factories (DataManager::loadFromFile) or builder-style configuration (writers: set* → addChannel → write phase).
- Fluent setters on BlockReader (withCaptureSkippedPayload, withFileSize) return BlockReader&.
- Timestamps are uniformly std::int64_t nanoseconds since the Unix epoch (UTC); sample rates are double in Hz.
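The timestamp convention above (std::int64_t nanoseconds since the Unix epoch, sample rates as double in Hz) maps onto std::chrono as in the following sketch. The helper names toUnixNs, fromUnixNs and sampleTimestampNs are hypothetical and not part of the library; the conversion assumes that std::chrono::system_clock counts from the Unix epoch, which is guaranteed from C++20 and holds in practice on mainstream C++17 toolchains.

```cpp
#include <chrono>
#include <cmath>
#include <cstdint>

namespace sketch {

// Converts a system_clock time point to nanoseconds since the clock's epoch.
// Assumption: the system_clock epoch is the Unix epoch.
inline std::int64_t toUnixNs(std::chrono::system_clock::time_point tp) {
    using namespace std::chrono;
    return duration_cast<nanoseconds>(tp.time_since_epoch()).count();
}

// Inverse conversion; precision is limited by system_clock::duration.
inline std::chrono::system_clock::time_point fromUnixNs(std::int64_t ns) {
    using namespace std::chrono;
    return system_clock::time_point(
        duration_cast<system_clock::duration>(nanoseconds(ns)));
}

// Hypothetical helper: timestamp of the sample at position `index` in an
// equidistant run that starts at `startTimestampNs` with rate `sampleRateHz`.
inline std::int64_t sampleTimestampNs(std::int64_t startTimestampNs,
                                      double sampleRateHz,
                                      std::uint64_t index) {
    const double offsetNs = static_cast<double>(index) * 1e9 / sampleRateHz;
    return startTimestampNs + static_cast<std::int64_t>(std::llround(offsetNs));
}

}  // namespace sketch
```

Computing each timestamp from the start and the index, rather than accumulating a per-sample increment, avoids rounding drift over long runs.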
Central design decisions
Result<T> instead of exceptions in the core
The library also targets embedded and industrial codebases where
exceptions are disabled or unwanted. The core therefore never throws;
tl::expected (vendored, CC0) provides the monad. Callers who prefer
exceptions take osf::throwing — a thin,
header-only layer that was deliberately not added to the umbrella
header, so that core users pull in no exception machinery.
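To make the calling pattern concrete, here is a minimal sketch of a Result-returning function and how a caller inspects it. It does not use the library or tl::expected: the Result template below is a deliberately small std::variant-based stand-in with a similar surface (has_value, value, error), and ErrorCode, Error and parseDigits are hypothetical names chosen for illustration.

```cpp
#include <string>
#include <utility>
#include <variant>

namespace sketch {

enum class ErrorCode { InvalidArgument };  // hypothetical error code

struct Error {
    ErrorCode code;
    std::string message;
};

// Minimal stand-in for tl::expected<T, Error>; not the library's Result<T>.
template <typename T>
class Result {
public:
    Result(T value) : m_state(std::move(value)) {}
    Result(Error error) : m_state(std::move(error)) {}

    bool has_value() const { return std::holds_alternative<T>(m_state); }
    explicit operator bool() const { return has_value(); }

    // Precondition: has_value() for value(), !has_value() for error().
    const T& value() const { return std::get<T>(m_state); }
    const Error& error() const { return std::get<Error>(m_state); }

private:
    std::variant<T, Error> m_state;
};

// Parses a short run of decimal digits without throwing.
[[nodiscard]] inline Result<int> parseDigits(const std::string& text) {
    if (text.empty() || text.size() > 9) {  // 9 digits always fit into int
        return Error{ErrorCode::InvalidArgument, "expected 1 to 9 digits"};
    }
    int value = 0;
    for (char ch : text) {
        if (ch < '0' || ch > '9') {
            return Error{ErrorCode::InvalidArgument,
                         std::string("not a digit: ") + ch};
        }
        value = value * 10 + (ch - '0');
    }
    return value;
}

}  // namespace sketch
```

A caller checks the result before using it, for example `if (auto r = sketch::parseDigits(input)) { use(r.value()); } else { log(r.error().message); }`, where use and log are placeholders for application code.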
Best-effort and forward compatibility
Real OSF files are produced on devices that can lose power at any time, and they may follow spec revisions the reader does not yet know about. Three behavioural rules follow:
- Truncation is not an error. If the file ends mid-block, the BlockReader yields all complete blocks, bumps stats().blocksTruncated to 1 and ends the iteration cleanly.
- Unknown is skipped, not swallowed. Channels with an unknown (future) data type parse as DataType::Unsupported; their blocks appear as BlockKind::Skipped (payload bytes are consumed so the stream stays aligned). The original spelling is preserved in Channel::dataTypeRaw.
- Removed spec elements are hard errors. Data types removed by the 2026-05-04 spec revision (pair, triple, candata, gpsdata) are rejected with Error::Code::RemovedInSpec: their payload layout cannot be reproduced from a current build, and silent guessing would be data corruption.
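The truncation rule can be sketched as a read loop that stops at the first incomplete block instead of failing. The wire format below (1 byte channel index, 4 byte little-endian payload length, payload) is invented for this example and is not the OSF block layout; Block, ReaderStats and readCompleteBlocks are likewise hypothetical stand-ins, not the library's types.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

namespace sketch {

struct Block {
    std::uint8_t channelIndex = 0;
    std::vector<std::uint8_t> payload;
};

struct ReaderStats {
    std::size_t blocksTotal = 0;
    std::size_t blocksTruncated = 0;
};

// Invented header layout: channel index (1 byte) + payload length (4 bytes, LE).
constexpr std::size_t HEADER_SIZE = 5;

// Returns every complete block; a trailing partial block is counted, not reported
// as an error.
inline std::vector<Block> readCompleteBlocks(const std::vector<std::uint8_t>& bytes,
                                             ReaderStats& stats) {
    std::vector<Block> blocks;
    std::size_t pos = 0;
    while (pos < bytes.size()) {
        const std::size_t remaining = bytes.size() - pos;
        if (remaining < HEADER_SIZE) {  // file ends inside the header
            stats.blocksTruncated = 1;
            break;
        }
        std::uint32_t length = 0;
        for (std::size_t i = 0; i < 4; ++i) {
            length |= static_cast<std::uint32_t>(bytes[pos + 1 + i]) << (8 * i);
        }
        if (remaining - HEADER_SIZE < length) {  // file ends inside the payload
            stats.blocksTruncated = 1;
            break;
        }
        Block block;
        block.channelIndex = bytes[pos];
        const auto first =
            bytes.begin() + static_cast<std::ptrdiff_t>(pos + HEADER_SIZE);
        block.payload.assign(first, first + static_cast<std::ptrdiff_t>(length));
        blocks.push_back(std::move(block));
        ++stats.blocksTotal;
        pos += HEADER_SIZE + length;
    }
    return blocks;
}

}  // namespace sketch
```

Both truncation cases end the loop the same way, so the caller keeps everything that was fully written before the power loss and can inspect the counter afterwards.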
Two writers instead of one
StreamingWriter (embedded: fsync per block, constant memory,
power-loss safe) and BlockWriter (analyst: accumulates in memory,
emits at the end, can auto-bump sizeOfLengthValue) have incompatible
invariants — a single writer would have diluted both profiles. Shared
building blocks (chunking, metablock assembly) live in
src/writercommon_p.*. Details on the Writing page.
Transparent OSFZ on read only
OSFZ (= gzip- or zlib-compressed OSF) is detected and decompressed
transparently on read (DecompressingIStream ahead of the magic
header parse). On write the library deliberately never compresses
inline: compression is a downstream step after the file is finalized,
so that write and compression failure modes stay decoupled.
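Detection of the two container formats can be done from the first two bytes of the stream. The sketch below shows one common heuristic and makes no claim about how the library's detectCompression is implemented; sniffCompression and CompressionFormat are hypothetical names. It relies on the gzip magic bytes 0x1F 0x8B (RFC 1952) and on the zlib header rules from RFC 1950: compression method 8, window size field at most 7, and the 16-bit header value divisible by 31.

```cpp
#include <cstddef>
#include <cstdint>

namespace sketch {

enum class CompressionFormat { None, Gzip, Zlib };

// Classifies a buffer by its first two bytes. This is a heuristic: an
// uncompressed file whose first bytes happen to satisfy the zlib header rules
// would be misclassified, so a real reader should know its own magic header.
inline CompressionFormat sniffCompression(const std::uint8_t* data,
                                          std::size_t size) {
    if (size < 2) {
        return CompressionFormat::None;
    }
    if (data[0] == 0x1F && data[1] == 0x8B) {
        return CompressionFormat::Gzip;
    }
    const bool deflateMethod = (data[0] & 0x0F) == 8;  // CM field
    const bool windowOk = (data[0] >> 4) <= 7;         // CINFO field
    const unsigned check = (static_cast<unsigned>(data[0]) << 8) | data[1];
    if (deflateMethod && windowOk && check % 31 == 0) {
        return CompressionFormat::Zlib;
    }
    return CompressionFormat::None;
}

}  // namespace sketch
```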
Thread safety
| Class | Contract |
|---|---|
| DataManager (loaded) | immutable → readable from any number of threads |
| BlockReader | not thread-safe; one instance per thread |
| StreamingWriter / BlockWriter / StaleValueGuard | not thread-safe; serialize calls externally (e.g. std::mutex) |
| different writers to different files | fine in parallel |
| osf-c | osf_last_error_message() is thread-local; do not share handles across threads without serializing |
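External serialization of a writer, as the table requires, can be as simple as routing every call through one mutex. The wrapper below is a sketch under that assumption: SerializedWriter and CountingWriter are hypothetical names, and CountingWriter merely stands in for a real writer so the example stays self-contained.

```cpp
#include <mutex>
#include <utility>

namespace sketch {

// Hypothetical stand-in for a writer that is not thread-safe.
struct CountingWriter {
    int samples = 0;
    void append(double /*value*/) { ++samples; }
};

// Owns the mutex and hands out the writer only while the lock is held.
template <typename Writer>
class SerializedWriter {
public:
    explicit SerializedWriter(Writer& writer) : m_writer(writer) {}

    // Runs fn(writer) under the lock and forwards its return value.
    template <typename Fn>
    decltype(auto) withLock(Fn&& fn) {
        std::lock_guard<std::mutex> lock(m_mutex);
        return std::forward<Fn>(fn)(m_writer);
    }

private:
    std::mutex m_mutex;
    Writer& m_writer;
};

}  // namespace sketch
```

Each producer thread would then call something like `shared.withLock([&](auto& w) { w.append(value); })`. Keeping the writer reachable only through withLock makes it hard to bypass the mutex by accident.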
Directory layout
implementations/cpp/
├── CMakeLists.txt — project, options, targets
├── BUILD.md — build guide (EN)
├── cmake/ — CompilerWarnings.cmake, version.h.in
├── include/osf/ — public headers (the API surface)
├── src/ — implementation + private headers
├── tests/
│ ├── unit/ — GoogleTest units (synthetic data)
│ ├── integration/ — tests against examples/*.osf(z)
│ └── capi/ — pure C99 test for osf-c
├── examples/ — inspect, dump, write, copy
└── third_party/ — tl::expected, nlohmann/json, pugixml (vendored)
Further reading
- Reading — DataManager, DataChannel, BlockReader, OSFZ
- Writing — StreamingWriter, BlockWriter, StaleValueGuard
- Error handling — Result, error catalogue, throwing
- C ABI — osf-c for C, C#, OCX
- Building & integrating — CMake, options, CI
- Cookbook — recipes for typical tasks
- Internals — encoder, chunking, builder state machine