Executive summary
The word "ontology" is widely misused in enterprise software discourse, where it most often stands in for database schema or data dictionary. In the formal sense established by Gruber (1993) and refined across three decades of work in artificial intelligence, the semantic web, and applied ontology engineering, the term denotes something architecturally distinct: a structured model of the entities that exist in a domain, the attributes those entities possess, the relationships they can sustain, and the axioms that constrain coherent inference over the resulting graph. The distinction is not pedantic. It is the difference between a system that stores data about a domain and a system that represents the domain itself.
This paper synthesizes the canonical literature on ontology, knowledge representation, and knowledge graphs, and applies the resulting framework to public safety, the operational domain that combines temporal urgency, multi-source ground truth, severe data heterogeneity, and consequential decisions made under uncertainty. It advances three claims. First, the schema-versus-ontology category error is responsible for the substantial majority of operational failures in public safety intelligence systems, and the failure mode is structural rather than tactical. Second, entity resolution (normalizing variant references to canonical entities) is the binding constraint on any working ontology, and the field's progress is gated by this problem rather than by representation formalism. Third, the convergence of large language models and structured knowledge graphs, articulated by Lewis et al. (2020) and matured since, is the architectural inflection point for the domain.
I. The category error
The contemporary use of "ontology" in enterprise software marketing has degraded the word to a substitute for schema, taxonomy, controlled vocabulary, or data catalog. The erosion has consequences: buyers evaluating intelligence systems frequently cannot distinguish products that genuinely operate on a formal ontology from products that map a schema to a graph database and claim the same lineage. The two architectures have materially different operational properties.
The canonical definition comes from Gruber (1993): "an ontology is an explicit specification of a conceptualization," refined by Studer, Benjamins, and Fensel (1998) to "a formal, explicit specification of a shared conceptualization." Each adjective bears weight. Formal requires machine-interpretability; explicit requires that concepts and constraints be enumerated rather than implicit in the data; shared requires consensus across a community rather than a single agent's private model. Neuhaus (2018) critiques the definition on the ground that "conceptualization" is murkier than the term it defines, but the three constraints it imposes remain operationally useful regardless.
A schema, by contrast, specifies how data is physically stored: tables, columns, types, referential constraints. It is a contract between a database and the systems reading and writing against it, and it says nothing about what the data means. Two systems can share an identical schema and refer to incompatible realities; two systems can have incompatible schemas and refer to the same reality. The schema is the storage layer. The ontology is the semantic layer.
An ontology in this sense comprises four layers: entities, properties, relationships, and axioms. Each layer constrains the one above it. The axiom layer holds the system's commitments to coherent inference; the relationship layer is where analytic leverage is realized; the property layer encodes the data quality contract; the entity layer declares what is real in the domain. These are independent of any storage backend, query language, or vendor. The same ontology can be materialized in OWL (Bechhofer et al. 2004), RDF (Cyganiak et al. 2014), a property graph, or a relational schema with constraint enforcement above it. The substrate is not the ontology. The ontology is the model. A schema can be ontology-compatible, serving as the storage substrate for entities specified independently, but it is not itself an ontology, and a system that mistakes one for the other will fail predictably when it tries to integrate heterogeneous sources or reason about relationships the schema does not encode.
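The four layers can be made concrete in a few lines. What follows is a minimal Python sketch under stated assumptions, not a reference implementation: the class names (EntityType, RelationType, Ontology), the Incident and Location types, and the example axiom are all hypothetical and are not drawn from OWL, BFO, or any vendor product. The point it illustrates is that the model is checked independently of where the data is stored.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class EntityType:
    """Entity layer plus property layer: a kind of thing and what it must carry."""
    name: str
    required_properties: frozenset


@dataclass(frozen=True)
class RelationType:
    """Relationship layer: a permitted link between two entity types."""
    name: str
    source_type: str
    target_type: str


@dataclass
class Entity:
    id: str
    type: str
    properties: dict


@dataclass
class Link:
    relation: str
    source_id: str
    target_id: str


@dataclass
class Ontology:
    entity_types: Dict[str, EntityType]
    relation_types: Dict[str, RelationType]
    # Axiom layer: named predicates over the whole graph that must hold.
    axioms: Dict[str, Callable] = field(default_factory=dict)

    def violations(self, entities: Dict[str, Entity], links: List[Link]) -> List[str]:
        problems = []
        for e in entities.values():
            etype = self.entity_types.get(e.type)
            if etype is None:
                problems.append(f"{e.id}: unknown entity type {e.type}")
                continue
            for prop in sorted(etype.required_properties - set(e.properties)):
                problems.append(f"{e.id}: missing property {prop}")
        for link in links:
            rtype = self.relation_types.get(link.relation)
            src, dst = entities.get(link.source_id), entities.get(link.target_id)
            if rtype is None or src is None or dst is None:
                problems.append(f"{link.relation}: unknown relation or endpoint")
            elif (src.type, dst.type) != (rtype.source_type, rtype.target_type):
                problems.append(f"{link.relation}: {src.type}->{dst.type} not permitted")
        for name, holds in self.axioms.items():
            if not holds(entities, links):
                problems.append(f"axiom violated: {name}")
        return problems


def every_incident_has_location(entities, links):
    """Illustrative axiom: each Incident must have an occurred_at link."""
    located = {l.source_id for l in links if l.relation == "occurred_at"}
    return all(e.id in located for e in entities.values() if e.type == "Incident")


# Hypothetical two-type ontology used only to exercise the sketch.
ONTOLOGY = Ontology(
    entity_types={
        "Incident": EntityType("Incident", frozenset({"opened_at"})),
        "Location": EntityType("Location", frozenset({"lat", "lon"})),
    },
    relation_types={"occurred_at": RelationType("occurred_at", "Incident", "Location")},
    axioms={"every incident has a location": every_incident_has_location},
)
```

Nothing in the sketch mentions tables, columns, or a query language; the same checks could sit above a relational store, a property graph, or an RDF triple store.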
II. The lineage
Ontology engineering descends from several traditions that converged in the late twentieth century. The philosophical tradition, from Aristotelian categorization through analytic metaphysics, supplied the central question: what categories of thing exist, and on what grounds. The AI research program of the 1970s and 1980s supplied the formalisms: description logics, frame systems, semantic networks. The Semantic Web initiative, articulated by Berners-Lee, Hendler, and Lassila (2001), produced the shared standards (RDF, OWL, SPARQL). The enterprise software wave, beginning with Palantir's founding in 2003 and accelerated by Google's Knowledge Graph in 2012 (Singhal 2012), demonstrated that ontology-driven architectures could operate at industrial scale.
Three reference points bear directly on the public safety case. First, top-level ontologies have matured into international standards: Basic Formal Ontology (BFO), developed by Barry Smith and colleagues, was published as ISO/IEC 21838-2:2021 and adopted by more than 500 projects across biomedicine, manufacturing, and the intelligence community, including the Department of Defense and the Office of the Director of National Intelligence (Smith 2024). The most rigorous framework available has been embraced precisely where the cost of incoherent representation is highest. Public safety has not been among the adopters.
Second, operational ontologies matured commercially through Palantir's Foundry and Gotham. The Foundry documentation (Palantir 2024) describes the ontology as "the operational layer for the organization" and distinguishes it from "data cataloging or schema design." Its four primitives (objects, properties, links, and actions) map onto the four ontology layers above, with actions adding a kinetic layer for decision-making: if the data elements are the nouns of the enterprise, the actions are the verbs. The category exists for public safety; the implementations, with few exceptions, do not.
Third, knowledge graph practice consolidated across the largest platforms. Hogan, Blomqvist, Cochez, et al. (2021) survey the convergence of academic knowledge representation with industrial knowledge graphs at Google, Amazon, Microsoft, and others, establishing the modern knowledge graph as the realization of the four-layer architecture at the scale of billions of facts. Google's "things, not strings" framing (Singhal 2012), launched with 500 million objects and 3.5 billion facts, marked the moment knowledge graphs entered consumer-visible deployment.
III. The binding constraint
One practical problem consistently surfaces as the determinative obstacle to deploying an ontology at scale. The ontology itself (the model of entities, properties, relationships, and axioms) is the tractable part. The hard problem is entity resolution: determining which references in the data refer to the same underlying entity in the world.
Entity resolution is hard because human records and speech are full of variation. A person referenced by full name in one record, initials in a second, badge number in a third, and a nickname in a fourth must be normalized to one canonical entity. An address spoken over radio ("one-two-three Main"), typed into CAD ("123 Main St"), and stored in records management ("123 N Main Street, Apt 4B") must resolve to one location entity with canonical coordinates. The literature, from Fellegi and Sunter (1969) through Köpcke and Rahm (2010) and Christen (2012), establishes the problem as foundational wherever structured data is assembled from heterogeneous sources.
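The probabilistic framing that Fellegi and Sunter (1969) introduced can be illustrated with a short Python sketch. For each compared field, m is the probability that the field agrees when two records describe the same entity and u is the probability that it agrees when they do not; agreement contributes log2(m/u) to the match weight and disagreement contributes log2((1-m)/(1-u)). The field names, the m and u values, and the two thresholds below are invented for illustration and would have to be estimated from real data.

```python
import math

# Hypothetical per-field probabilities (illustrative assumptions only):
#   m = P(field agrees | records refer to the same entity)
#   u = P(field agrees | records refer to different entities)
FIELD_PARAMS = {
    "last_name": (0.95, 0.01),
    "birth_year": (0.90, 0.05),
    "badge_number": (0.99, 0.001),
}


def match_weight(agreements: dict) -> float:
    """Sum the log-likelihood-ratio weights over the compared fields."""
    total = 0.0
    for field_name, agrees in agreements.items():
        m, u = FIELD_PARAMS[field_name]
        if agrees:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight: float, lower: float = 0.0, upper: float = 8.0) -> str:
    """Three-way decision; the thresholds here are assumed, not calibrated."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "possible match"  # routed to clerical or downstream review
```

A pair that agrees on last name and birth year but disagrees on badge number lands between the thresholds, which is the band where a production system would apply further evidence or human review.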
Geocoding is the most consequential special case in public safety. Addresses are dense, frequently referenced, and almost universally degraded, appearing across CAD, records management, radio, license plate readers, and citizen reports in different canonical forms with abbreviations, typos, missing apartment numbers, and idiosyncratic local conventions. The problem is not solved by purchasing a commercial geocoding API; it is solved by a resolution layer that combines geocoding with domain-specific normalization, fuzzy matching against historical incident locations, and progressive disambiguation against a canonical location entity.
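A resolution layer of this kind might begin as in the following Python sketch, which normalizes the three address variants from the previous section and matches them against a small table of known locations. It is a simplified illustration, not a production geocoder: the abbreviation table, the location identifiers, the scoring rule (exact house number, fuzzy street words via the standard library's difflib), and the 0.8 threshold are all assumptions.

```python
import re
from difflib import SequenceMatcher
from typing import Dict, List, Optional

DIGIT_WORDS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
               "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}
# Hypothetical local conventions; a real deployment would curate these per jurisdiction.
ABBREVIATIONS = {"st": "street", "ave": "avenue", "n": "north", "s": "south",
                 "e": "east", "w": "west", "apt": "unit"}
# Hypothetical canonical location entities accumulated from past incidents.
KNOWN_LOCATIONS = {
    "LOC-1": "123 North Main Street",
    "LOC-2": "123 Maple Street",
    "LOC-3": "450 Oak Avenue",
}


def normalize(raw: str) -> List[str]:
    """Lowercase, strip punctuation, expand abbreviations, merge spoken digits."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", raw.lower()).split()
    out: List[str] = []
    prev_spoken = False
    for tok in tokens:
        tok = ABBREVIATIONS.get(tok, tok)
        if tok in DIGIT_WORDS:
            if prev_spoken:
                out[-1] += DIGIT_WORDS[tok]  # "one two three" -> "123"
            else:
                out.append(DIGIT_WORDS[tok])
            prev_spoken = True
        else:
            out.append(tok)
            prev_spoken = False
    # Drop the unit designator; this sketch resolves to the building only.
    return out[:out.index("unit")] if "unit" in out else out


def score(query: List[str], candidate: List[str]) -> float:
    """House numbers must agree exactly; street words are compared fuzzily."""
    if not query or not candidate or query[0] != candidate[0]:
        return 0.0
    street = query[1:]
    if not street or len(candidate) < 2:
        return 0.0
    best = [max(SequenceMatcher(None, q, c).ratio() for c in candidate[1:])
            for q in street]
    return sum(best) / len(best)


def resolve(raw: str, known: Dict[str, str], threshold: float = 0.8) -> Optional[str]:
    """Return the id of the best-scoring known location, or None below threshold."""
    query = normalize(raw)
    scored = [(score(query, normalize(addr)), loc_id) for loc_id, addr in known.items()]
    best_score, best_id = max(scored, default=(0.0, None))
    return best_id if best_score >= threshold else None
```

Under these assumptions, the radio, CAD, and records-management variants all resolve to the same hypothetical entity, LOC-1. The sketch deliberately leaves out the harder parts the text describes: it would not distinguish North Main from South Main when the direction is omitted, which is where progressive disambiguation against incident history has to take over.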
The variant references are what every operational system actually receives; the canonical entity is what the ontology requires; the resolution layer between them is the locus of nearly all the engineering effort. A system can be elegant in its ontology design and its interface and still fail in production if resolution is incomplete. Most failures attributed to ontology design are, on inspection, failures of entity resolution. The conclusion that recurs across the literature is that no single technique solves the problem in isolation; production-grade resolution requires composing multiple approaches, calibrating against domain ground truth, and correcting continuously against operational feedback. Entity resolution is the single largest source of unmodeled engineering effort in ontology-driven systems.
IV. The public safety case
The public safety domain presents a structural problem the knowledge representation literature is well-suited to diagnose. The Bureau of Justice Statistics enumerates 17,541 state and local law enforcement agencies as of 2018, employing roughly 1.21 million full-time personnel (Bureau of Justice Statistics 2022); fire and EMS add thousands more. The operational picture in any jurisdiction is assembled from CAD, records management, radio, license plate readers, gunshot detection, drone telemetry, body-worn cameras, 911 records, and a growing tail of municipal sensors. Each is procured separately, on different cycles, from different vendors, with different schemas, and with no requirement that the data be interpretable across systems.
The 9/11 Commission Report (2004) articulated the consequence with force: the information needed to anticipate the attacks existed in the aggregate across agencies but could not be integrated into an actionable picture. The "connect the dots" metaphor shaped the post-2001 investment in information sharing, including fusion centers (DHS 2019) and the National Information Exchange Model (NIEM 2005). NIEM, formally instantiated by the CIOs of DOJ and DHS in April 2005, was the most serious federal investment in standardized exchange. Two decades later it has matured into an OASIS Open standard (NIEMOpen 2024), but its operational impact at the level of individual agency capability remains modest, for an instructive reason.
NIEM is a controlled vocabulary and exchange specification, not an operational ontology. It defines the canonical fields for information that crosses agency boundaries; it does not define the entity model, relationship structure, or inference rules that make exchanged information usable. An agency conforming to NIEM is exchanging data in a standard format; an agency with an operational ontology is reasoning over a unified representation of its world. The two are complementary, not substitutable, and the federal investment in NIEM has not been matched by investment in agency-level ontology construction. That absence is the structural reason most of the 17,541 agencies continue to operate on schema-first architectures that prevent the cross-source reasoning the Commission identified as critical.
The gap between federal intelligence practice and local public safety is wide. BFO's adoption by the DOD and intelligence community (Smith 2024) shows the architectural argument has been won at the highest tier. Yet roughly 0.3 percent of agencies operate a Real-Time Crime Center of the kind the largest metropolitan departments have built (Police Executive Research Forum 2023), and only a fraction of those run an ontology-first architecture rather than a schema-driven dashboard. The category error is architectural, not procurement-specific: systems are built schema-first because that is how enterprise software has historically been built, because procurement favors incremental modernization over rebuilds, and because the dominant vendors (Tyler Technologies, Motorola Solutions, Hexagon, CentralSquare) have built portfolios around schema-driven modules integrated through point-to-point exchanges rather than a unified semantic substrate. The result is a market of architectures that are individually defensible and collectively incapable of producing the integrated picture the Commission identified two decades ago.
V. The inflection
A pattern that emerged in the past half-decade materially changes the deployment math. It combines large language models, which process unstructured human content well but produce unreliable outputs in isolation, with structured knowledge graphs built against a formal ontology, which produce reliable structured queries but cannot parse natural language. The combination, articulated in its general form by Lewis et al. (2020) as retrieval-augmented generation (RAG) and since extended to graph-structured retrieval, is now the dominant deployment architecture for knowledge-intensive applications.
The argument is straightforward. Standalone language models exhibit well-characterized failure modes on knowledge-intensive tasks: hallucination, inability to access information outside the training distribution, and lack of provenance (Lewis et al. 2020; Ji et al. 2023). These are structural consequences of the architecture, not artifacts that scale eliminates. Knowledge graphs exhibit the inverse profile: they return reliable, provenance-tracked, structurally constrained outputs to structured queries, but they cannot accept natural-language input directly. RAG composes them so the language model handles the natural-language interface while the knowledge graph supplies the ground truth that constrains the model's output.
The pattern has converged across enterprise deployments over five years. The original formulation (Lewis et al. 2020) paired a sequence-to-sequence generator with a dense vector index over Wikipedia; the architecture has since been extended to structured knowledge graphs for ground truth (Edge et al. 2024), agentic multi-step reasoning over the graph (Yao et al. 2023), and hybrid retrieval combining vector search with graph traversal. The unifying commitment is that the model is constrained by a structured representation of the domain, updated continuously from operational data.
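The division of labor can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the architecture of Lewis et al. (2020) or of any cited system: the Fact type, the sample graph, and the substring-based entity lookup are hypothetical stand-ins, and the language model is represented only by a generate callable supplied by the caller rather than by any real model API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    obj: str
    source: str  # provenance: which record or transmission asserted this


# Hypothetical graph contents, for illustration only.
GRAPH = [
    Fact("Engine 247", "dispatched_to", "Incident 1042", "CAD event 88"),
    Fact("Incident 1042", "occurred_at", "123 North Main Street", "radio transmission 17"),
    Fact("Engine 12", "dispatched_to", "Incident 977", "CAD event 61"),
]


def retrieve(question: str, graph: List[Fact], hops: int = 1) -> List[Fact]:
    """Find facts naming an entity in the question, then expand by graph neighbors."""
    q = question.lower()
    # Substring lookup stands in for a real entity resolution step.
    frontier = {f for f in graph if f.subject.lower() in q or f.obj.lower() in q}
    for _ in range(hops):
        names = {f.subject for f in frontier} | {f.obj for f in frontier}
        frontier |= {f for f in graph if f.subject in names or f.obj in names}
    return [f for f in graph if f in frontier]  # keep graph order for determinism


def build_prompt(question: str, facts: List[Fact]) -> str:
    """Constrain the model to retrieved facts and carry their provenance along."""
    lines = [f"[{i}] {f.subject} {f.predicate} {f.obj} (source: {f.source})"
             for i, f in enumerate(facts, 1)]
    return ("Answer using only the numbered facts and cite fact numbers. "
            "If the facts are insufficient, say so.\n"
            + "\n".join(lines)
            + f"\nQuestion: {question}")


def answer(question: str, graph: List[Fact], generate: Callable[[str], str]) -> str:
    """The graph supplies ground truth; the model (generate) handles language."""
    return generate(build_prompt(question, retrieve(question, graph)))
```

For the question "Where was Engine 247 sent?", retrieval returns the dispatch fact and, one hop away, the location fact with its source, while the unrelated Engine 12 fact never reaches the model. The quality of the whole pipeline therefore rests on the lookup step, which is the entity resolution problem from Section III.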
For public safety the alignment is unusually strong. The two dominant operational artifacts (radio transmissions and incident reports) are exactly the unstructured natural-language data language models process well and schema-driven systems cannot accept. The two dominant requirements (reasoning across relationships among incidents, units, persons, and locations, and producing legally defensible records) are exactly the structured query and provenance requirements knowledge graphs satisfy. The reason this has not produced widespread deployment is not the absence of architecture; it is the absence of the underlying ontology and entity resolution work, and the absence of an operational substrate that captures the unstructured data in the first place.
VI. Synthesis: diagnosis, frontier, and recommendation
Diagnosis. Most contemporary public safety intelligence systems run on schema-driven architectures that prevent the cross-source reasoning the field's most consequential failures have identified as critical. The category error is well-characterized in the academic literature (Gruber 1993; Studer et al. 1998) and the enterprise literature (Palantir 2024; Hogan et al. 2021), but it has not propagated into procurement. NIEM addresses exchange, not representation. BFO's adoption at the DOD and intelligence community (ISO/IEC 21838-2:2021; Smith 2024) shows the argument has been won at the top tier, but the implementation gap to local public safety is wide and entrenched.
Frontier. The next generation of systems is being built at the intersection of three problems: constructing a public safety ontology of sufficient fidelity to support cross-source reasoning; building the entity resolution layer that converts heterogeneous data into canonical entities, with geocoding as the highest-impact special case and unit normalization second; and integrating the resulting knowledge graph with language model architectures so the system accepts unstructured operational inputs and returns structured, provenance-tracked outputs. None of the three is conceptually open; each is engineering-bound. The work of integrating them at production scale, at the operational tempo public safety requires, and at unit economics that permit deployment below the top tier, is where the field's progress now occurs.
Recommendation. The systems that will look architecturally obvious in retrospect (the public safety equivalents of what Google Knowledge Graph has been for search, what Palantir's ontology has been for defense, what BFO has been for biomedicine) are being built now, by teams that committed to ontology-first architecture and did the unglamorous entity resolution work. The opportunity is not incremental. The shift from schema-first to ontology-first is the most consequential infrastructural decision a public safety agency or vendor will make this decade. The literature has been clear about what is required for three decades. The work that remains is in the implementation.
VII. Applications: how the architecture instantiates at Multido
This section makes the abstract concrete, using Multido as the worked example, to show that the literature's prescriptions are buildable and have been built.
The radio entry point as ontology ingestion. The central decision is where operational data enters the system. Multido's entry point is the public safety radio transmission, captured at the edge, transcribed, and resolved into the ontology as a first-class entity. The radio transmission is the densest stream of unstructured operational language the domain produces, and it requires no procurement to access because it is broadcast in the clear. The transmission becomes an entity (AssociatedRadioTransmission) linked to the incident it concerns, the unit that produced it, and the location it references. No competing platform that begins from CAD or records management possesses this linkage, because the radio layer is the one stream those systems do not ingest. That linkage is a structural feature of the data model, not of the interface.
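As a purely illustrative Python sketch of what "transmission as first-class entity" could look like, consider the following. Only the entity name AssociatedRadioTransmission comes from the text; every field name, the identifier formats, and the two helper functions are assumptions made for the example and should not be read as Multido's actual data model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AssociatedRadioTransmission:
    id: str
    transcript: str
    captured_at: str  # assumed: ISO 8601 UTC timestamp, so strings sort by time
    # Links into the ontology; None means resolution has not yet succeeded.
    incident_id: Optional[str] = None
    unit_id: Optional[str] = None
    location_id: Optional[str] = None


def transmissions_for_incident(
    transmissions: List[AssociatedRadioTransmission], incident_id: str
) -> List[AssociatedRadioTransmission]:
    """Radio traffic linked to one incident, in chronological order."""
    linked = [t for t in transmissions if t.incident_id == incident_id]
    return sorted(linked, key=lambda t: t.captured_at)


def unresolved(
    transmissions: List[AssociatedRadioTransmission],
) -> List[AssociatedRadioTransmission]:
    """Transmissions still missing at least one of their three links."""
    return [t for t in transmissions
            if None in (t.incident_id, t.unit_id, t.location_id)]
```

Because the links are fields on the entity rather than joins reconstructed at display time, a query such as "all radio traffic for this incident" is answered from the data model itself, and the unresolved list measures how much entity resolution work remains.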
Entity resolution as the core engineering investment. Section III identifies entity resolution as the binding constraint, and the engineering investment reflects that diagnosis. Geocoding (resolving spoken and written address references to canonical location entities with coordinates) is the highest-impact first investment. Unit normalization (resolving "Engine 247," "Unit 247," "E247," and "two-four-seven" to one canonical apparatus entity) is the second. These are not incidental cleaning tasks; they are the substrate every downstream query depends on. The historical archive that accumulates as resolution improves compounds daily and cannot be backfilled by a later entrant, because resolution accuracy is a function of the volume and duration of operational data it has been calibrated against.
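The unit normalization example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than a description of any production resolver: the roster, the canonical identifier format, and the prefix table are hypothetical, and the sketch assumes apparatus numbers are unique within a jurisdiction.

```python
import re
from typing import Optional

DIGIT_WORDS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
               "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}
# Hypothetical prefixes and roster; a real system would load these per agency.
TYPE_PREFIXES = {"engine": "ENGINE", "e": "ENGINE", "ladder": "LADDER",
                 "l": "LADDER", "medic": "MEDIC", "m": "MEDIC"}
ROSTER = {"247": "ENGINE-247", "12": "LADDER-12"}


def normalize_unit(reference: str) -> Optional[str]:
    """Map a spoken or written unit reference to a canonical apparatus id."""
    # Split into runs of letters and runs of digits: "E247" -> ["e", "247"].
    tokens = re.findall(r"[a-z]+|\d+", reference.lower())
    # Collect written digits and spoken digit words into one number string.
    number = "".join(DIGIT_WORDS.get(t, t) for t in tokens
                     if t.isdigit() or t in DIGIT_WORDS)
    canonical = ROSTER.get(number)
    if canonical is None:
        return None
    # If the reference states an apparatus type, it must agree with the roster.
    stated = [TYPE_PREFIXES[t] for t in tokens if t in TYPE_PREFIXES]
    if stated and not canonical.startswith(stated[0] + "-"):
        return None
    return canonical
```

With this roster, "Engine 247," "Unit 247," "E247," and "two-four-seven" all map to the same hypothetical identifier, while "Ladder 247" is rejected because the stated type conflicts with the roster. Returning None rather than a guess keeps ambiguous references out of the graph until more context is available.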
The convergence architecture in the product line. Each product reads from and writes to the same underlying ontology, mapping onto the temporal arc of an incident. Roger serves community awareness, translating resolved transmissions into plain-English incident awareness for the public. Overture operates ahead of the call, surfacing anticipatory signals from operational patterns. Tempo operates at the moment of decision, supplying the live picture to a watch commander. Coda operates after resolution, producing the official record. These are not four databases; they are four interfaces onto one ontology, the property that distinguishes an ontology-first system from a suite of integrated point products. The language-model layer handles the natural-language surfaces; the knowledge graph supplies the constrained, provenance-tracked ground truth, precisely the division Lewis et al. (2020) formalized.
The addressable gap. The overwhelming majority of agencies operate without the substrate to make their own data legible, and the RTCC capability the largest departments have built is structurally inaccessible to them. The radio entry point makes the unit economics feasible by removing the procurement barrier that has gated access to integrated intelligence infrastructure. An agency does not need to replace its CAD system or sign a seven-figure platform contract to begin; it needs only the operational data already being broadcast in its jurisdiction to be captured, resolved, and reasoned over. The architecture the literature has prescribed for three decades becomes deployable to the agencies that have never been able to afford it.
References
Bechhofer, S., et al. (2004). OWL Web Ontology Language Reference. W3C Recommendation.
Berners-Lee, T., Hendler, J., & Lassila, O. (2001). The Semantic Web. Scientific American, 284(5), 34-43.
Bureau of Justice Statistics. (2022). Census of State and Local Law Enforcement Agencies, 2018 – Statistical Tables (NCJ 302187). U.S. DOJ, Office of Justice Programs.
Christen, P. (2012). Data Matching: Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection. Springer.
Cyganiak, R., Wood, D., & Lanthaler, M. (2014). RDF 1.1 Concepts and Abstract Syntax. W3C Recommendation.
Department of Homeland Security. (2019). 2019 National Network of Fusion Centers Final Report.
Edge, D., et al. (2024). From Local to Global: A Graph RAG Approach to Query-Focused Summarization. arXiv:2404.16130.
Fellegi, I. P., & Sunter, A. B. (1969). A Theory for Record Linkage. Journal of the American Statistical Association, 64(328), 1183-1210.
Gruber, T. R. (1993). A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition, 5(2), 199-220.
Hogan, A., et al. (2021). Knowledge Graphs. ACM Computing Surveys, 54(4), Article 71.
ISO/IEC. (2021). ISO/IEC 21838-2:2021 — Top-Level Ontologies — Part 2: Basic Formal Ontology (BFO).
Ji, Z., et al. (2023). Survey of Hallucination in Natural Language Generation. ACM Computing Surveys, 55(12), Article 248.
Köpcke, H., & Rahm, E. (2010). Frameworks for entity matching: A comparison. Data & Knowledge Engineering, 69(2), 197-210.
Lewis, P., et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS, 33, 9459-9474.
National Commission on Terrorist Attacks Upon the United States. (2004). The 9/11 Commission Report. U.S. GPO.
NIEM. (2005). Launched by the CIOs of the U.S. DOJ and DHS, April 2005.
NIEMOpen. (2024). NIEM transition to OASIS Open Project. OASIS.
Neuhaus, F. (2018). What is an Ontology? arXiv:1810.09171.
Palantir Technologies. (2024). Foundry Ontology Documentation.
Police Executive Research Forum. (2023). Real-Time Crime Centers: Lessons Learned from Major Cities.
Singhal, A. (2012). Introducing the Knowledge Graph: Things, Not Strings. Official Google Blog.
Smith, B. (2024). BFO adoption across the U.S. DOD and ODNI. National Center for Ontological Research, University at Buffalo.
Studer, R., Benjamins, V. R., & Fensel, D. (1998). Knowledge engineering: Principles and methods. Data & Knowledge Engineering, 25(1-2), 161-197.
Yao, S., et al. (2023). ReAct: Synergizing Reasoning and Acting in Language Models. ICLR.

