Formats, Validation & Identity

XML ↔ JSON-LD conversion, multi-layer validation, identifier translation, idempotent event hashing.


Before an EPCIS document touches the event store it goes through this layer: it gets parsed, validated, canonicalised, hashed for deduplication, and its identifiers are normalised to Digital Link form. XML or JSON-LD goes in; a trusted, deduplicated, canonical event comes out. By the time anything is indexed, the platform has guaranteed that it's standards-conformant, semantically equivalent across XML and JSON-LD representations, and impossible to insert twice.
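As an illustration of the identifier step, here is a minimal sketch of turning an SGTIN EPC URN into GS1 Digital Link form. It is not the platform's code: the function names are hypothetical, the resolver domain `https://id.gs1.org` is assumed as the default, and only the SGTIN scheme is covered.

```python
from urllib.parse import quote, unquote


def gs1_check_digit(digits: str) -> str:
    """GS1 mod-10 check digit for a numeric string that lacks its check digit."""
    total = 0
    # Weights alternate 3, 1, 3, 1, ... starting from the rightmost digit.
    for position, char in enumerate(reversed(digits)):
        weight = 3 if position % 2 == 0 else 1
        total += int(char) * weight
    return str((10 - total % 10) % 10)


def sgtin_urn_to_digital_link(urn: str, resolver: str = "https://id.gs1.org") -> str:
    """Convert urn:epc:id:sgtin:<prefix>.<indicator+item>.<serial> to a Digital Link URI.

    Hypothetical helper for illustration; the resolver default is an assumption.
    """
    scheme = "urn:epc:id:sgtin:"
    if not urn.startswith(scheme):
        raise ValueError(f"not an SGTIN URN: {urn}")
    company_prefix, indicator_and_item, serial = urn[len(scheme):].split(".", 2)
    numeric = company_prefix + indicator_and_item
    if not numeric.isdigit() or len(numeric) != 13:
        raise ValueError("company prefix plus item reference must be 13 digits")
    # The URN stores the indicator digit in front of the item reference;
    # a GTIN-14 puts it in front of the company prefix instead.
    indicator, item_ref = indicator_and_item[0], indicator_and_item[1:]
    body = indicator + company_prefix + item_ref
    gtin = body + gs1_check_digit(body)
    # EPC URNs may carry percent-escapes in the serial, so decode before re-encoding.
    return f"{resolver}/01/{gtin}/21/{quote(unquote(serial), safe='')}"
```

For example, `sgtin_urn_to_digital_link("urn:epc:id:sgtin:0614141.107346.2017")` yields `https://id.gs1.org/01/10614141073464/21/2017`.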

Two converters ship, each aimed at a different scale. The open-source edition includes an XSLT-based converter that loads the document, runs the transform, and emits the result. This is a clean approach for single events, small batches, and plain event shapes. The Business edition adds a SAX-streaming converter for production-volume work. Multi-gigabyte EPCIS exports stream through at network speed with bounded memory, so the JVM heap stays flat as the document grows. Deep extension trees and sensor payloads survive intact, mixed 1.2 / 2.0 batches pass through cleanly, and the edge cases where load-then-transform either struggles or quietly drops information are handled correctly. A streamed conversion plugs into the same validation and event-hash stages, so it lands as a fully canonical, deduplicated event in one pass. For organisations moving production volumes, especially those migrating live 1.2 corpora to 2.0, this is the headline difference.
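The bounded-memory idea behind SAX streaming can be sketched as follows. This is an illustration in Python's standard `xml.sax`, not the Business edition converter (which runs on the JVM). The class and function names are hypothetical, and the flattened path-to-values dictionary stands in for real JSON-LD output. It also ignores XML attributes, which a faithful converter would not.

```python
import xml.sax

# Event element names assumed for this sketch.
EVENT_TAGS = {
    "ObjectEvent", "AggregationEvent", "TransactionEvent",
    "TransformationEvent", "AssociationEvent",
}


class EventStreamHandler(xml.sax.ContentHandler):
    """Holds one event in memory at a time and hands it off as soon as it closes."""

    def __init__(self, on_event):
        super().__init__()
        self.on_event = on_event
        self.current = None  # event being assembled, or None between events
        self.path = []       # element names below the event root
        self.text = []       # character chunks of the innermost open element

    def startElement(self, name, attrs):
        if self.current is None:
            if name in EVENT_TAGS:
                self.current = {"type": name}
                self.path = []
            return
        self.path.append(name)
        self.text = []

    def characters(self, content):
        if self.current is not None:
            self.text.append(content)

    def endElement(self, name):
        if self.current is None:
            return
        if not self.path:
            # The event element itself just closed: emit it and release it.
            self.on_event(self.current)
            self.current = None
            return
        value = "".join(self.text).strip()
        if value:
            self.current.setdefault("/".join(self.path), []).append(value)
        self.path.pop()
        self.text = []


def stream_events(source, on_event):
    """Parse a filename or binary file object, calling on_event once per event."""
    xml.sax.parse(source, EventStreamHandler(on_event))
```

Because each event is released as soon as its closing tag arrives, memory use depends on the largest single event rather than on the size of the document, which is the property a load-then-transform approach lacks.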

Validation runs in layers: JSON Schema first, then custom-extension shapes at every nesting level of the event (parent, readPoint, bizLocation, errorDeclaration, sensorElement, ILMD, bizStep, disposition), then sensor-element rules. Anything that fails any layer is rejected at the boundary. Custom namespaces (battery:, eudr:, textile:, customer extensions) only get validated when the request declares them via the GS1-Extensions HTTP header; without that declaration the validator lets them through untouched. The header is the explicit opt-in that activates regulation-specific or vendor-specific validation rules.
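The layering and the header opt-in can be sketched roughly as below. Everything here is illustrative: the function names, the rule registry, and the `battery:capacityWh` rule are hypothetical, the core-shape check is only a stand-in for real JSON Schema validation, and the header is assumed to carry comma-separated `prefix=namespaceURI` pairs.

```python
from typing import Optional


def parse_gs1_extensions(header: Optional[str]) -> dict:
    """Parse 'prefix=uri, prefix2=uri2' into {prefix: uri} (header format assumed)."""
    declared = {}
    for item in (header or "").split(","):
        prefix, sep, uri = item.strip().partition("=")
        if sep and prefix and uri:
            declared[prefix.strip()] = uri.strip()
    return declared


def _battery_rule(local_name, value):
    # Hypothetical rule, for illustration only.
    if local_name == "capacityWh" and not isinstance(value, (int, float)):
        return "capacityWh must be a number"
    return None


# Hypothetical registry of namespace-specific rules.
EXTENSION_RULES = {"battery": _battery_rule}


def iter_prefixed_fields(node, path=""):
    """Yield (path, key, value) for every 'prefix:name' key at any nesting level."""
    if isinstance(node, dict):
        for key, value in node.items():
            here = f"{path}/{key}"
            if ":" in key:
                yield here, key, value
            yield from iter_prefixed_fields(value, here)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from iter_prefixed_fields(item, f"{path}[{index}]")


def check_core_shape(event, declared):
    # Stand-in for the JSON Schema layer.
    return [f"missing required field: {name}"
            for name in ("type", "eventTime") if name not in event]


def check_extensions(event, declared):
    errors = []
    for path, key, value in iter_prefixed_fields(event):
        prefix, _, local_name = key.partition(":")
        rule = EXTENSION_RULES.get(prefix)
        # Undeclared namespaces pass through untouched.
        if prefix in declared and rule is not None:
            problem = rule(local_name, value)
            if problem:
                errors.append(f"{path}: {problem}")
    return errors


def check_sensor_elements(event, declared):
    # Illustrative sensor rule: every sensor element carries at least one report.
    return [f"sensorElementList[{i}]: needs at least one sensorReport"
            for i, element in enumerate(event.get("sensorElementList", []))
            if not element.get("sensorReport")]


LAYERS = (check_core_shape, check_extensions, check_sensor_elements)


def validate(event, gs1_extensions_header=None):
    """Run the layers in order and return the errors of the first failing layer."""
    declared = parse_gs1_extensions(gs1_extensions_header)
    for layer in LAYERS:
        errors = layer(event, declared)
        if errors:
            return errors
    return []
```

In this sketch the same event can pass without the header and fail with it, which mirrors the opt-in behaviour: a malformed `battery:` field is only reported once the request declares the `battery` namespace.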

Event hashes are computed against a canonicalised representation of the event content — field order, types, whitespace all ironed out per the EPCIS specification — not against the raw bytes that arrived. Two events that differ only in JSON whitespace or attribute order produce the same hash, so re-sends and round-trips through different serialisers produce the same event ID. Canonicalisation is CBV-version-aware: the rules evolve alongside the spec without breaking historical hashes.
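The principle of hashing a canonical form rather than raw bytes can be shown with a deliberately simplified sketch. It is not the canonicalisation defined by the EPCIS/CBV specification: it only sorts object keys and strips insignificant whitespace, and it does not normalise value types or the order of list entries. The function names are hypothetical.

```python
import hashlib
import json


def canonical_form(event: dict) -> str:
    """Serialise with sorted keys and no insignificant whitespace (simplified stand-in)."""
    return json.dumps(event, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def event_hash(event: dict) -> str:
    """SHA-256 hex digest of the canonical form, not of the bytes that arrived."""
    return hashlib.sha256(canonical_form(event).encode("utf-8")).hexdigest()


# Two serialisations of the same event: different bytes, same content.
first = json.loads('{"type": "ObjectEvent", "action": "OBSERVE"}')
second = json.loads('{\n  "action":"OBSERVE",\n  "type":"ObjectEvent"\n}')
same_id = event_hash(first) == event_hash(second)
```

Here `same_id` is `True`, so a re-send that passed through a different serialiser would map to the same identifier and could be recognised as a duplicate.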

EPCIS 1.2 ↔ 2.0 XML migration works in both directions, which matters when an organisation is still receiving 1.2 from upstream partners while shipping 2.0 downstream.

Capabilities by edition

Capability    OSS    Business
XML ↔ JSON-LD conversion (XSLT, load-then-transform)
Streaming XML ↔ JSON-LD conversion (SAX, bounded memory)
EPCIS 1.2 ↔ 2.0 XML migration
EPCIS document validation
Multi-level custom-extension validation
Sensor element validation
Pre-canonical event hash (idempotent IDs)
Web UI for format conversion
Hash generator as a service
