Formats, Validation & Identity
XML ↔ JSON-LD conversion, multi-layer validation, identifier translation, idempotent event hashing.
Before an EPCIS document touches the event store, it passes through this layer, which parses it, validates it, canonicalises it, hashes it for deduplication, and normalises its identifiers to Digital Link form. XML or JSON-LD goes in; a trusted, deduplicated, canonical event comes out. By the time anything is indexed, the platform has guaranteed that it is standards-conformant, semantically equivalent across its XML and JSON-LD representations, and impossible to insert twice.
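To make the identifier normalisation step concrete, here is a minimal sketch of one translation: an SGTIN EPC URN rewritten as a GS1 Digital Link URI. The function names and the default `https://id.gs1.org` domain are assumptions chosen for illustration, not the platform's API, and the serial number is passed through unchanged, which assumes it needs no re-escaping.

```python
def gtin_check_digit(digits13: str) -> str:
    """GS1 mod-10 check digit for the first 13 digits of a GTIN-14."""
    total = 0
    # Weights alternate 3, 1, 3, ... starting from the rightmost digit.
    for i, ch in enumerate(reversed(digits13)):
        total += int(ch) * (3 if i % 2 == 0 else 1)
    return str((10 - total % 10) % 10)


def sgtin_urn_to_digital_link(urn: str, domain: str = "https://id.gs1.org") -> str:
    """Translate urn:epc:id:sgtin:<prefix>.<indicator+item>.<serial> to Digital Link form."""
    scheme = "urn:epc:id:sgtin:"
    if not urn.startswith(scheme):
        raise ValueError(f"not an SGTIN URN: {urn}")
    # The serial may itself contain dots, so split at most twice.
    company_prefix, item_ref, serial = urn[len(scheme):].split(".", 2)
    # The first character of the item reference is the indicator digit.
    body = item_ref[0] + company_prefix + item_ref[1:]
    if len(body) != 13 or not body.isdigit():
        raise ValueError(f"company prefix and item reference must total 13 digits: {urn}")
    gtin = body + gtin_check_digit(body)
    # AI 01 carries the GTIN, AI 21 the serial number.
    return f"{domain}/01/{gtin}/21/{serial}"


print(sgtin_urn_to_digital_link("urn:epc:id:sgtin:0614141.107346.2017"))
# -> https://id.gs1.org/01/10614141073464/21/2017
```

Other identifier schemes (SSCC, SGLN and so on) follow the same pattern with different application identifiers; they are omitted here to keep the sketch short.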
Two converters ship, each aimed at a different scale. The open-source edition includes an XSLT-based converter that loads the whole document, runs the transform and emits the result. This works well for single events, small batches and plain event shapes. The Business edition adds a SAX-streaming converter for production-volume work. Multi-gigabyte EPCIS exports stream through at network speed with bounded memory, so the JVM heap stays flat as the document grows. Deep extension trees and sensor payloads survive intact, mixed 1.2 / 2.0 batches pass through cleanly, and the edge cases where load-then-transform either struggles or quietly drops information are handled correctly. A streamed conversion plugs into the same validation and event-hash stages, so it lands as a fully canonical, deduplicated event in one pass. For organisations moving production volumes, especially those migrating live 1.2 corpora to 2.0, this is the headline difference.
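The difference between load-then-transform and streaming can be illustrated with a small SAX sketch. It is written in Python for brevity (the platform itself runs on the JVM) and is not the Business converter: it captures only a handful of simple top-level fields and writes flat JSON lines rather than full JSON-LD. The class and function names are hypothetical. What it does show is the bounded-memory property: only the event currently being read is held in memory.

```python
import io
import json
import xml.sax

EVENT_TAGS = {"ObjectEvent", "AggregationEvent", "TransactionEvent",
              "TransformationEvent", "AssociationEvent"}
# Only these simple, top-level fields are captured in this sketch.
SIMPLE_FIELDS = {"eventTime", "eventTimeZoneOffset", "action", "bizStep", "disposition"}


class EventStreamHandler(xml.sax.ContentHandler):
    """Hands each event to `sink` as soon as its closing tag is seen."""

    def __init__(self, sink):
        super().__init__()
        self.sink = sink
        self.event = None   # the event currently being read, if any
        self.depth = 0      # nesting depth below the event element
        self.field = None   # simple field currently being read
        self.text = []

    def startElement(self, name, attrs):
        if self.event is None:
            if name in EVENT_TAGS:
                self.event, self.depth = {"type": name}, 0
            return
        self.depth += 1
        if self.depth == 1 and name in SIMPLE_FIELDS:
            self.field, self.text = name, []

    def characters(self, content):
        # Text may arrive in several chunks, so collect and join later.
        if self.field is not None:
            self.text.append(content)

    def endElement(self, name):
        if self.event is None:
            return
        if self.depth == 0:
            # Closing tag of the event itself: emit it and release it.
            self.sink(self.event)
            self.event = None
            return
        if self.depth == 1 and self.field == name:
            self.event[name] = "".join(self.text).strip()
            self.field = None
        self.depth -= 1


def convert_to_ndjson(source, out):
    """Stream `source` (a path or binary file object) to one JSON line per event."""
    parser = xml.sax.make_parser()
    parser.setContentHandler(
        EventStreamHandler(lambda e: out.write(json.dumps(e, sort_keys=True) + "\n")))
    parser.parse(source)
```

Because the sink is called per event, the same hook is where a streamed event could be handed to validation and hashing without waiting for the rest of the document.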
Validation runs in layers: JSON Schema first, then custom-extension shapes at every nesting level of the event (parent, readPoint, bizLocation, errorDeclaration, sensorElement, ILMD, bizStep, disposition), then sensor-element rules. The custom-extension and sensor-element layers are Business-edition capabilities, as the table below shows. An event that fails any layer is rejected at the boundary. Custom namespaces (battery:, eudr:, textile:, customer extensions) are validated only when the request declares them via the GS1-Extensions HTTP header; without that declaration the validator lets them through untouched. The header is the explicit opt-in that activates regulation-specific or vendor-specific validation rules.
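The layering and the header opt-in can be sketched as follows. Everything here is a stand-in: the core check is a three-field placeholder for real JSON Schema validation, the header is assumed to carry comma-separated `prefix=namespace` pairs, and the `battery` rule and its namespace URI are invented for illustration. Stopping at the first failing layer is also a choice made for this sketch.

```python
from typing import Callable, Dict, List, Optional

Rule = Callable[[str, object], List[str]]


def parse_extensions_header(value: Optional[str]) -> Dict[str, str]:
    """Parse 'prefix=namespace, prefix2=namespace2' (assumed format) into a dict."""
    declared = {}
    for part in (value or "").split(","):
        prefix, sep, namespace = part.strip().partition("=")
        if sep and prefix and namespace:
            declared[prefix.strip()] = namespace.strip()
    return declared


def check_core_shape(event: dict) -> List[str]:
    """Placeholder for the JSON Schema layer: required top-level fields only."""
    required = ("type", "eventTime", "eventTimeZoneOffset")
    return [f"missing field: {name}" for name in required if name not in event]


def make_extension_check(declared: Dict[str, str], rules: Dict[str, Rule]):
    """Build a layer that walks every nesting level and checks declared prefixes."""
    def check(node, path="event") -> List[str]:
        errors = []
        if isinstance(node, dict):
            for key, value in node.items():
                prefix, sep, _ = key.partition(":")
                # Undeclared prefixes pass through untouched.
                if sep and prefix in declared and prefix in rules:
                    errors += [f"{path}.{key}: {msg}" for msg in rules[prefix](key, value)]
                errors += check(value, f"{path}.{key}")
        elif isinstance(node, list):
            for i, item in enumerate(node):
                errors += check(item, f"{path}[{i}]")
        return errors
    return check


def validate(event: dict, extensions_header: Optional[str], rules: Dict[str, Rule]) -> List[str]:
    """Run the layers in order; return the errors of the first layer that fails."""
    declared = parse_extensions_header(extensions_header)
    for layer in (check_core_shape, make_extension_check(declared, rules)):
        errors = layer(event)
        if errors:
            return errors
    return []


def battery_rule(key: str, value: object) -> List[str]:
    """Hypothetical rule: battery: fields must be numeric."""
    return [] if isinstance(value, (int, float)) else ["expected a number"]
```

With an event whose `readPoint` carries `"battery:capacityKwh": "high"`, `validate(event, None, {"battery": battery_rule})` returns no errors, while passing a header such as `"battery=https://example.com/battery/"` makes the same call report the field.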
Event hashes are computed against a canonicalised representation of the event content — field order, types, whitespace all ironed out per the EPCIS specification — not against the raw bytes that arrived. Two events that differ only in JSON whitespace or attribute order produce the same hash, so re-sends and round-trips through different serialisers produce the same event ID. Canonicalisation is CBV-version-aware: the rules evolve alongside the spec without breaking historical hashes.
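The idea of hashing a canonical form rather than the received bytes can be shown in a few lines. This sketch substitutes a generic canonicalisation (sorted keys, no insignificant whitespace) for the rules the EPCIS/CBV specification defines, so its digests will not match real event hash IDs, and it does not normalise value types. `canonical_form`, `event_hash` and `EventStore` are hypothetical names.

```python
import hashlib
import json


def canonical_form(event: dict) -> str:
    """Serialise with sorted keys and no insignificant whitespace."""
    return json.dumps(event, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def event_hash(event: dict) -> str:
    """SHA-256 over the canonical form, not over the bytes that arrived."""
    return hashlib.sha256(canonical_form(event).encode("utf-8")).hexdigest()


class EventStore:
    """In-memory stand-in for a store keyed by event hash."""

    def __init__(self):
        self._events = {}

    def insert(self, event: dict) -> bool:
        """Return True if stored, False if an equivalent event already exists."""
        key = event_hash(event)
        if key in self._events:
            return False
        self._events[key] = event
        return True


# Same content, different whitespace and key order.
first = json.loads('{"type":"ObjectEvent","action":"OBSERVE"}')
resend = json.loads('{ "action" : "OBSERVE",\n  "type" : "ObjectEvent" }')

store = EventStore()
print(store.insert(first))   # True
print(store.insert(resend))  # False: same canonical form, same hash
```

Keying the store on the hash is what makes inserts idempotent: a re-send is recognised as the event already held, whichever serialiser produced it.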
EPCIS 1.2 ↔ 2.0 XML migration works in both directions, which matters when an organisation is still receiving 1.2 from upstream partners while shipping 2.0 downstream.
Capabilities by edition
| Capability | OSS | Business |
|---|---|---|
| XML ↔ JSON-LD conversion (XSLT, load-then-transform) | ✓ | ✓ |
| Streaming XML ↔ JSON-LD conversion (SAX, bounded memory) | — | ✓ |
| EPCIS 1.2 ↔ 2.0 XML migration | ✓ | ✓ |
| EPCIS document validation | ✓ | ✓ |
| Multi-level custom-extension validation | — | ✓ |
| Sensor element validation | — | ✓ |
| Canonical event hash (idempotent IDs) | ✓ | ✓ |
| Web UI for format conversion | — | ✓ |
| Hash generator as a service | — | ✓ |
See also
- Architecture → GS1 conformance contract — the discipline rules these modules enforce.
- Modules → EPCIS Events — where the validated, hashed events go next.
- Modules → Testdata — generators that respect the same conformance rules.