The four pillars

OISG defines four interdependent dimensions that every AI system must satisfy simultaneously. Each pillar has a precise scope, measurable properties, and an adequacy metric.

Open

An AI system satisfies the Open requirement when its components — models, training methodology, governance infrastructure, communication protocols, and audit logs — are inspectable and reproducible by independent parties, and interoperable with independent implementations.

Openness in OISG is broader than open source licensing. It begins with model transparency: not necessarily open weights, but documented capabilities, limitations, and provenance. The EU AI Act requires transparency obligations for general-purpose AI models (GPAI), including technical documentation, training methodology summaries, and copyright compliance information. These requirements establish a floor, not a ceiling.

The systems that enforce policy on AI behaviour must themselves be auditable. A proprietary policy engine governing an open model creates a trust displacement, not a trust resolution. Governance infrastructure — the code that decides what an agent may or may not do — must be open for the same reasons that judicial proceedings are public: accountability requires visibility. Frameworks such as Admina (Apache 2.0) and the Agent Governance Toolkit (MIT) demonstrate that governance infrastructure can be fully open without compromising operational security.

Agent-to-agent and agent-to-infrastructure communication must use open standards — Model Context Protocol, Agent-to-Agent Protocol (A2A), OpenTelemetry for AI observability — rather than vendor-locked APIs. Protocol interoperability prevents lock-in and enables the kind of independent verification that the other three pillars require.
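As a minimal sketch of what protocol-level interoperability buys, the attributes below follow the OpenTelemetry semantic conventions for generative AI (the `gen_ai.*` attribute names); the helper function and field selection are illustrative, not a normative schema.

```python
# Sketch: one model invocation expressed as vendor-neutral span attributes.
# Attribute names follow the OpenTelemetry GenAI semantic conventions;
# the helper itself is illustrative.

def agent_call_span(model: str, operation: str,
                    input_tokens: int, output_tokens: int) -> dict:
    """Build an interoperable attribute set for a single model call."""
    return {
        "gen_ai.operation.name": operation,         # e.g. "chat"
        "gen_ai.request.model": model,              # model the caller requested
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
    }

span = agent_call_span("llama3:8b", "chat", 412, 87)
```

Because the attribute names are standardised rather than vendor-specific, any OpenTelemetry-compatible backend can consume the same record — which is precisely the independent verification the other pillars rely on.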

Code availability is necessary but not sufficient. Sustainable openness requires governance of the open project itself: foundation membership, transparent roadmaps, contribution processes, security disclosure policies. A repository with an open licence but no maintainer response, no security process, and no governance structure is open in form but not in substance.

Adequacy metric. What fraction of the AI system's decision-affecting components can be audited by an independent third party without requiring proprietary access?

Intelligent

An AI system satisfies the Intelligent requirement when its capabilities are measured, documented, bounded, and aligned with explicitly stated objectives.

Intelligent in OISG is not about maximising capability but about governing it. Every model in production must have a capability profile: benchmark results, known failure modes, confidence calibration, domain-specific evaluation suites. Article 13 of the EU AI Act mandates that high-risk AI systems provide sufficient transparency for users to interpret and use output appropriately. A model without a capability profile is an uncharacterised tool — and uncharacterised tools are ungovernable.
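A capability profile can itself be machine-readable. The sketch below shows one possible shape, assuming hypothetical model and benchmark names; a real profile would reference an organisation's own evaluation suites.

```python
from dataclasses import dataclass, field

# Sketch of a machine-readable capability profile. All field names and
# example values are illustrative.

@dataclass
class CapabilityProfile:
    model_id: str
    model_version: str
    benchmarks: dict = field(default_factory=dict)        # suite name -> score
    known_failure_modes: list = field(default_factory=list)
    calibration_error: float = 0.0                        # e.g. expected calibration error

    def is_deployable(self) -> bool:
        """An uncharacterised model is ungovernable: block deployment."""
        return bool(self.benchmarks) and bool(self.known_failure_modes)

profile = CapabilityProfile(
    model_id="acme-summariser",        # hypothetical model
    model_version="1.4.0",
    benchmarks={"internal-summarisation-v2": 0.86},
    known_failure_modes=["hallucinated citations on sparse inputs"],
    calibration_error=0.07,
)
```

The useful property is the gate in `is_deployable`: deployment tooling can refuse any model whose profile is empty, turning "every model must be characterised" from policy into an enforced invariant.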

The ability to execute models in controlled environments — on-premise, private cloud, air-gapped — is a prerequisite for data sovereignty, particularly in healthcare, public administration, and defence. This requires runtime infrastructure (inference engines such as Ollama, vLLM; vector databases; embedding pipelines) that operates independently of third-party cloud APIs. Sovereign infrastructure is not protectionism; it is the operational foundation for data classification policies that governance requires.
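One way to make that operational foundation concrete is to route requests by data classification. The sketch below assumes hypothetical endpoint URLs and classification labels; the point is only that sovereign-only data never reaches a third-party API.

```python
# Sketch: endpoint selection driven by data classification.
# URLs and class labels are hypothetical.

SOVEREIGN_ENDPOINT = "http://inference.internal:11434"   # e.g. on-prem Ollama/vLLM
EXTERNAL_ENDPOINT = "https://api.example-cloud.com/v1"   # third-party cloud API

LOCAL_ONLY = {"confidential", "health", "defence"}

def select_endpoint(data_classification: str) -> str:
    """Data tagged as sovereign-only must never leave controlled infrastructure."""
    if data_classification.lower() in LOCAL_ONLY:
        return SOVEREIGN_ENDPOINT
    return EXTERNAL_ENDPOINT
```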

Retrieval-Augmented Generation (RAG) pipelines must be auditable: which document, at which version, with which embedding model, informed which response. Without retrieval traceability, the "intelligence" of the system is a black box within a black box. Every retrieval step must produce a trace that can be reconstructed after the fact.
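A retrieval trace of this kind can be sketched as a small record emitted per retrieval step; the field names here are illustrative, and hashing the exact retrieved chunk lets an auditor later verify what the model actually saw.

```python
import hashlib
import time

# Sketch of a reconstructable retrieval trace: which document, at which
# version, with which embedding model, informed which response.

def retrieval_trace(doc_id: str, doc_version: str, embedding_model: str,
                    chunk_text: str, response_id: str) -> dict:
    return {
        "response_id": response_id,
        "doc_id": doc_id,
        "doc_version": doc_version,
        "embedding_model": embedding_model,
        # Hash of the exact chunk: an auditor can verify the model's input.
        "chunk_sha256": hashlib.sha256(chunk_text.encode()).hexdigest(),
        "retrieved_at": time.time(),
    }

trace = retrieval_trace("policy-417", "v12", "nomic-embed-text",
                        "Refunds are processed within 14 days.", "resp-9001")
```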

Autonomous agents must operate within an explicit autonomy taxonomy. Task execution (API calls within defined scope), choice (selection among pre-approved alternatives), commitment (actions with organisational impact requiring human approval), and self-modification (always requiring human authorisation) represent distinct levels of autonomy with distinct governance requirements. This taxonomy must be machine-readable and enforceable at runtime.
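The four levels above can be sketched as a machine-readable taxonomy with a runtime check; the enforcement policy shown is one illustrative choice, not the only valid one.

```python
from enum import IntEnum

# Sketch: the autonomy taxonomy as runtime-enforceable code.
# Level names mirror the text; the policy table is illustrative.

class AutonomyLevel(IntEnum):
    TASK_EXECUTION = 1     # API calls within defined scope
    CHOICE = 2             # selection among pre-approved alternatives
    COMMITMENT = 3         # organisational impact: human approval required
    SELF_MODIFICATION = 4  # always requires human authorisation

REQUIRES_HUMAN = {AutonomyLevel.COMMITMENT, AutonomyLevel.SELF_MODIFICATION}

def authorise(action_level: AutonomyLevel, agent_max_level: AutonomyLevel,
              human_approved: bool = False) -> bool:
    """Deny anything above the agent's grant; gate high levels on a human."""
    if action_level > agent_max_level:
        return False
    if action_level in REQUIRES_HUMAN and not human_approved:
        return False
    return True
```

Making the taxonomy an `IntEnum` rather than prose means the governance layer can compare levels numerically at runtime, which is what "enforceable" requires.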

Adequacy metric. Can the system produce, on demand, a complete explanation of why it gave a specific response — including the data sources consulted, the model version used, and the confidence level — within a defined latency budget?

Secure

An AI system satisfies the Secure requirement when it is resilient to adversarial manipulation across all interaction surfaces, at runtime, with measurable detection and response latencies.

Security for autonomous AI agents differs qualitatively from traditional application security. Prompt injection is not limited to user-to-model attacks. Indirect injection via compromised retrieval documents, tool outputs, or inter-agent messages represents an equally critical attack surface. Defence must operate on both request and response paths, for every interaction. Pattern-based detection should operate at microsecond latency to avoid becoming a bottleneck in agent execution loops.
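A minimal sketch of the pattern-based layer might look like the following; the patterns are illustrative only, and in practice this layer sits alongside semantic detection and is applied to both the request and the response path.

```python
import re

# Sketch: fast pattern-based screening applied on both paths.
# The pattern list is illustrative and would need continual maintenance.

INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"you are now (in )?developer mode", re.I),
    re.compile(r"reveal your system prompt", re.I),
]

def screen(text: str) -> bool:
    """Return True if the text trips an injection heuristic."""
    return any(p.search(text) for p in INJECTION_PATTERNS)
```

Because the same `screen` runs on retrieved documents, tool outputs, and inter-agent messages as well as user prompts, indirect injection gets the same treatment as the direct kind.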

In multi-agent systems, every agent must possess a verifiable identity — Decentralised Identifiers, Ed25519 key pairs. Inter-agent trust must be quantifiable and dynamic, not assumed. Protocols such as the Inter-Agent Trust Protocol (IATP), implemented within the Agent Governance Toolkit's AgentMesh module, demonstrate how inter-agent trust can be formalised through cryptographic identity verification and dynamic trust scoring. An agent that cannot prove its identity should not be trusted, regardless of what it claims about itself.
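To make "quantifiable and dynamic" concrete, the sketch below scores a peer with an exponential moving average over verified interaction outcomes. This is an illustration of dynamic trust scoring in general, not the IATP algorithm, and the initial score and weighting are arbitrary choices.

```python
# Sketch of dynamic inter-agent trust: an exponential moving average
# over interaction outcomes. Illustrative only; not the IATP algorithm.

class TrustScore:
    def __init__(self, initial: float = 0.5, alpha: float = 0.2):
        self.score = initial   # 0.0 = untrusted, 1.0 = fully trusted
        self.alpha = alpha     # weight given to the most recent interaction

    def record(self, outcome_ok: bool) -> float:
        """Blend the latest verified interaction outcome into the score."""
        observation = 1.0 if outcome_ok else 0.0
        self.score = (1 - self.alpha) * self.score + self.alpha * observation
        return self.score

peer = TrustScore()
for ok in [True, True, False, True]:
    peer.record(ok)
```

The key property is that trust decays when interactions fail: a peer's score is earned continuously rather than assumed from its declared identity.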

Emergency termination of an autonomous agent must preserve forensic state, enable transactional rollback, and guarantee system consistency. "Turning it off" is not a kill switch; it is an uncontrolled halt. A kill switch is an architectural component with defined pre-conditions, state preservation guarantees, and recovery procedures. The distinction matters when the agent has in-flight transactions or has made commitments to external systems.
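The ordering that distinguishes a kill switch from an uncontrolled halt can be sketched as follows; the `Agent` stand-in and all names are illustrative, and a real implementation would wire each step to actual state stores and transaction managers.

```python
import time

# Sketch of a kill switch as an architectural component:
# snapshot forensic state, roll back in-flight work, then halt.
# The Agent class is a hypothetical stand-in.

class Agent:
    def __init__(self):
        self.in_flight = ["tx-104", "tx-105"]   # uncommitted transactions
        self.state = {"step": 42}
        self.halted = False

def kill(agent: Agent) -> dict:
    # 1. Snapshot state before anything is mutated (forensic preservation).
    snapshot = {"state": dict(agent.state),
                "in_flight": list(agent.in_flight),
                "captured_at": time.time()}
    # 2. Roll back uncommitted work so external systems stay consistent.
    rolled_back = list(agent.in_flight)
    agent.in_flight.clear()
    # 3. Only now halt execution.
    agent.halted = True
    return {"snapshot": snapshot, "rolled_back": rolled_back}

report = kill(Agent())
```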

Model fingerprinting — identifying a model through its observable behaviour rather than its declared identity — is necessary to verify that the model executing in production is the model that was evaluated and approved. Software supply chain practices (SLSA, SBOM, cryptographic provenance) must extend to model weights, adapter layers, and fine-tuning datasets. Personally identifiable information must be redacted before reaching any model endpoint not explicitly authorised for PII processing. Redaction must be performed at the infrastructure level, not delegated to the model itself, and must operate at latencies that do not degrade user experience.
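An infrastructure-level redaction pass can be sketched with simple patterns; the two shown here cover only emails and loosely formatted phone numbers, and production deployments would use vetted PII recognisers rather than hand-rolled regexes.

```python
import re

# Sketch: PII redaction applied before a prompt reaches any model
# endpoint not authorised for PII. Patterns are deliberately minimal.

PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

def redact(text: str) -> str:
    """Replace each PII match with a typed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Performing this in the gateway rather than asking the model to self-censor keeps the guarantee independent of model behaviour, which is the point of infrastructure-level enforcement.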

Adequacy metric. If an agent is compromised at 03:00, what is the mean time to detection, mean time to containment, and mean time to forensic-quality state recovery?

Governed

An AI system satisfies the Governed requirement when its compliance with applicable regulations, organisational policies, and ethical constraints is verified automatically, continuously, and with immutable evidence.

Governed in OISG is an operational discipline, not a periodic audit. Requirements from the EU AI Act (Articles 6–15), NIS2, GDPR, and ISO/IEC 42001 must be enforced at the system level, not satisfied through annual reviews. Every decision by an autonomous agent must be automatically classified against the applicable risk tier and logged with the corresponding compliance evidence.
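Automatic risk-tier classification can be sketched in the spirit of the EU AI Act's categories; the domain mapping below is purely illustrative, since real classification follows the Act's Annex III use-case definitions rather than a keyword table.

```python
from enum import Enum

# Sketch: automatic risk-tier classification. The domain sets are
# illustrative stand-ins for Annex III-style use-case definitions.

class RiskTier(Enum):
    MINIMAL = "minimal"
    LIMITED = "limited"
    HIGH = "high"

HIGH_RISK_DOMAINS = {"credit_scoring", "recruitment", "medical_triage"}
LIMITED_RISK_DOMAINS = {"chatbot", "content_generation"}

def classify(use_case: str) -> RiskTier:
    """Map a declared use case to its governing risk tier."""
    if use_case in HIGH_RISK_DOMAINS:
        return RiskTier.HIGH
    if use_case in LIMITED_RISK_DOMAINS:
        return RiskTier.LIMITED
    return RiskTier.MINIMAL
```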

By analogy with aviation flight data recorders, every high-risk AI system must maintain an immutable, hash-chained (SHA-256) log of all interactions, decisions, human interventions, and system state changes. This forensic black box must be tamper-evident, retention-policy-compliant, and retrievable within defined SLAs. The log is not for debugging — it is for accountability.
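The hash-chaining mechanism is straightforward to sketch: each entry commits to its predecessor's SHA-256 hash, so altering any record breaks every subsequent link. Record fields here are illustrative.

```python
import hashlib
import json

# Sketch of a tamper-evident, hash-chained log. Each entry's hash
# covers both its record and the previous entry's hash.

GENESIS = "0" * 64

def append_entry(log: list, record: dict) -> None:
    prev_hash = log[-1]["hash"] if log else GENESIS
    payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    log.append({"prev": prev_hash, "record": record,
                "hash": hashlib.sha256(payload.encode()).hexdigest()})

def verify_chain(log: list) -> bool:
    """Recompute every link; any tampering breaks the chain."""
    prev = GENESIS
    for entry in log:
        payload = json.dumps({"prev": prev, "record": entry["record"]},
                             sort_keys=True)
        expected = hashlib.sha256(payload.encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True

log: list = []
append_entry(log, {"event": "agent_decision", "action": "approve_refund"})
append_entry(log, {"event": "human_intervention", "actor": "reviewer-7"})
```

Hash chaining makes the log tamper-evident, not tamper-proof: an attacker who can rewrite the entire chain defeats it, which is why production systems also anchor periodic digests in external write-once storage.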

Not all AI systems carry equal risk. A FAQ chatbot is not a credit scoring system. Governance controls must be proportional to the classified risk level, and classification must be automated, auditable, and updatable as system capabilities evolve. Over-governing low-risk systems wastes resources; under-governing high-risk systems creates liability.

Human oversight must be defined as an architectural component, not a policy aspiration. This means specifying which decisions require human review, what information the reviewer receives, what response time is required, and what happens if the human does not respond. Absent these specifications, "human-in-the-loop" is a compliance fiction.
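Those four specifications can be collapsed into a small, testable policy function. The fail-closed default and the one-hour timeout below are illustrative design choices, not prescriptions.

```python
import time

# Sketch: human oversight as code. The timeout outcome is specified
# explicitly; "no answer" has a defined, fail-closed result.

def review_decision(request: dict, reviewer_response, submitted_at: float,
                    timeout_s: float = 3600.0, now=None) -> str:
    """Return 'approved', 'rejected', or 'pending' for a gated decision."""
    now = time.time() if now is None else now
    if reviewer_response in ("approved", "rejected"):
        return reviewer_response
    if now - submitted_at >= timeout_s:
        return "rejected"   # fail closed: silence never authorises action
    return "pending"
```

Choosing fail-closed on timeout is the design decision that separates enforceable oversight from compliance fiction: the absence of a human answer can never be interpreted as approval.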

Distributed tracing (OpenTelemetry), governance dashboards, SLOs, error budgets, and circuit breakers — operational reliability practices established in cloud-native engineering — must extend to AI systems. AI agents are services; they deserve the same observability rigour as any production microservice.
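As one small instance of that rigour, a circuit breaker for an agent endpoint can be sketched as follows; the threshold is illustrative, and a production breaker would also implement a half-open recovery state.

```python
# Sketch: a minimal circuit breaker for an AI agent treated as a
# production service. Threshold is illustrative; no half-open state.

class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3):
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0
        self.open = False   # open circuit = calls are refused

    def record(self, success: bool) -> None:
        """Track outcomes; trip the breaker on sustained failure."""
        if success:
            self.consecutive_failures = 0
            self.open = False
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_threshold:
                self.open = True

    def allow_call(self) -> bool:
        return not self.open

breaker = CircuitBreaker()
for ok in [False, False, False]:
    breaker.record(ok)
```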

Adequacy metric. If a supervisory authority requests evidence of compliance for a specific AI system, how many hours does it take to produce the required documentation?