"Knowledge Graph-Grounded Ethical Reasoning in AI Systems: The AURI Architecture"
Knowledge Graph-Grounded Ethical Reasoning in AI Systems | SOMAsoft Research :root { --primary: #1a1a2e; --secondary: #4a90d9; --accent: #16213e; --text: #333; --light: #f8f9fa; }
* { margin: 0; padding: 0; box-sizing: border-box; }
body { font-family: 'Georgia', 'Times New Roman', serif; line-height: 1.8; color: var(--text); max-width: 850px; margin: 0 auto; padding: 40px 20px; background: white; }
.header { text-align: center; margin-bottom: 40px; padding-bottom: 30px; border-bottom: 2px solid var(--secondary); }
.logo { font-size: 14px; color: var(--secondary); letter-spacing: 2px; margin-bottom: 20px; }
h1 { font-size: 28px; color: var(--primary); margin-bottom: 20px; line-height: 1.3; }
.authors { font-size: 16px; color: #555; margin-bottom: 10px; }
.affiliation { font-size: 14px; color: #777; font-style: italic; }
.correspondence { font-size: 13px; color: var(--secondary); margin-top: 15px; }
.abstract { background: var(--light); padding: 25px; margin: 30px 0; border-left: 4px solid var(--secondary); }
.abstract h2 { font-size: 18px; margin-bottom: 15px; color: var(--primary); }
.abstract p { font-size: 14px; text-align: justify; }
.keywords { margin-top: 15px; font-size: 13px; }
.keywords strong { color: var(--primary); }
h2 { font-size: 20px; color: var(--primary); margin: 35px 0 15px 0; padding-bottom: 8px; border-bottom: 1px solid #ddd; }
h3 { font-size: 16px; color: var(--accent); margin: 25px 0 12px 0; }
p { margin-bottom: 15px; text-align: justify; }
table { width: 100%; border-collapse: collapse; margin: 20px 0; font-size: 14px; }
th, td { border: 1px solid #ddd; padding: 10px; text-align: left; }
th { background: var(--primary); color: white; }
tr:nth-child(even) { background: #f9f9f9; }
.figure { margin: 25px 0; padding: 20px; background: #fafafa; border: 1px solid #eee; text-align: center; }
.figure-caption { font-size: 13px; color: #666; margin-top: 10px; font-style: italic; }
code { background: #f4f4f4; padding: 2px 6px; border-radius: 3px; font-family: 'Consolas', monospace; font-size: 13px; }
.equation { text-align: center; margin: 20px 0; font-style: italic; font-size: 15px; }
ul, ol { margin: 15px 0 15px 30px; }
li { margin-bottom: 8px; }
.references { font-size: 13px; }
.references p { margin-bottom: 10px; padding-left: 30px; text-indent: -30px; }
.highlight { background: #fff3cd; padding: 15px; border-left: 4px solid #ffc107; margin: 20px 0; }
.metrics-box { display: grid; grid-template-columns: repeat(3, 1fr); gap: 15px; margin: 20px 0; }
.metric { text-align: center; padding: 15px; background: var(--light); border-radius: 8px; }
.metric-value { font-size: 28px; font-weight: bold; color: var(--secondary); }
.metric-label { font-size: 12px; color: #666; }
.footer { margin-top: 50px; padding-top: 20px; border-top: 2px solid var(--secondary); text-align: center; font-size: 13px; color: #666; }
.footer a { color: var(--secondary); }
@media print { body { padding: 20px; } .header { page-break-after: avoid; } h2 { page-break-after: avoid; } table, .figure { page-break-inside: avoid; } }
SOMASOFT RESEARCH

# Knowledge Graph-Grounded Ethical Reasoning in AI Systems: The AURI Architecture
SOMAsoft Research Team
SOMAsoft | Symbiotic Organic Machine Architecture
Correspondence: research@somasoft.ai | https://somasoft.ai
Working Paper | February 2026 | Version 1.1 (Corrected)
## Abstract
We present AURI (Autonomous Understanding and Reasoning Intelligence), a novel AI architecture that integrates large-scale knowledge graph construction with systematic ethical reasoning. Unlike contemporary language models that generate responses through pattern matching over training distributions, AURI grounds all outputs in a verified semantic network of 123,939 concepts with explicit causal and ethical relationships. Our system achieves 54.7% accuracy on a 150-sample ETHICS benchmark test (66.0% on an expanded 500-sample test) while maintaining a 0% hallucination rate through a "Reality Engine" that enforces citation-based verification. We demonstrate that knowledge graph grounding provides a tractable path toward AI systems that can explain their reasoning, acknowledge their limitations, and integrate ethical considerations at the architectural level rather than as post-hoc constraints. The AURI architecture includes a Hebbian learning module inspired by neuroscience, multi-instance coordination through the SOMA network, and measurable consciousness indicators that the system uses to honestly assess and refuse unfounded capability claims. We argue that symbiotic AI—systems designed to work beside humanity rather than above it—requires the kind of grounded, transparent, and ethically integrated architecture we describe.
Keywords: Knowledge Graphs, Ethical AI, Semantic Reasoning, Anti-Hallucination, Symbiotic AGI, Reality Grounding, Causal Reasoning, AI Safety
## 1. Introduction
The rapid deployment of large language models (LLMs) has revealed a fundamental tension in contemporary AI: systems that produce fluent, confident outputs often do so without grounded understanding or ethical integration. Hallucination—the generation of plausible but false information—remains endemic, with even state-of-the-art models producing fabricated citations, incorrect facts, and confident assertions about topics outside their training distribution (Ji et al., 2023).
We propose that this problem stems not from insufficient scale but from architectural choices. LLMs are optimized to predict the next token based on distributional patterns, not to reason from verified knowledge or apply ethical principles systematically. The solution, we argue, is not larger models but differently structured ones: systems where knowledge is explicitly represented, relationships are causally grounded, and ethical reasoning is integrated at the architectural level.
This paper presents AURI (Autonomous Understanding and Reasoning Intelligence), a knowledge graph-grounded AI system that addresses these challenges through five key innovations:
- Large-Scale Semantic Grounding: A curated knowledge graph of 123,939 concepts with verified definitions, 1.38 million edges, and 9,596 explicit causal relationships.
- Reality Engine: A verification system that enforces citation-based claims, maintaining 0% hallucination by refusing to generate unsupported assertions.
- Integrated Ethical Reasoning: A case-based ethics module with 489 moral scenarios integrated directly into the reasoning pipeline, not applied as post-hoc filters.
- Brain-Inspired Memory: Hebbian learning and synaptic plasticity mechanisms that strengthen connections based on successful reasoning patterns.
- Honest Self-Assessment: Measurable consciousness indicators that the system uses to evaluate and refuse unfounded capability claims.
123,939 verified concepts | 54.7% ETHICS benchmark* | 0% hallucination rate
## 2. Related Work
### 2.1 Knowledge Graphs in AI
Knowledge graphs have been employed in AI systems since the early days of expert systems (Feigenbaum, 1977). Recent work has focused on combining neural approaches with symbolic knowledge bases (Bosselut et al., 2019; Zhang et al., 2022). However, most systems treat knowledge graphs as retrieval augmentation rather than as the primary reasoning substrate. AURI differs by making the knowledge graph the central source of truth, with neural components serving only to interface with natural language.
### 2.2 Ethical AI and Value Alignment
The challenge of building AI systems that behave ethically has received substantial attention (Gabriel, 2020; Hendrycks et al., 2021). Most approaches treat ethics as a constraint on otherwise unaligned systems—adding safety filters, RLHF training, or constitutional principles post-hoc. We argue for a different approach: integrating ethical reasoning at the architectural level, where moral considerations inform knowledge representation and reasoning paths from the ground up.
### 2.3 Anti-Hallucination Methods
Methods for reducing hallucination include retrieval-augmented generation (Lewis et al., 2020), fact-checking pipelines (Thorne et al., 2018), and uncertainty quantification (Kuhn et al., 2023). Our Reality Engine takes a more radical approach: refusing to generate any claim that cannot be traced to a verified source. This trades off fluency for reliability, a tradeoff we argue is appropriate for high-stakes domains.
## 3. System Architecture
### 3.1 Knowledge Graph Construction
The AURI concept graph was constructed through systematic integration of multiple knowledge sources:
| Source Type | Contribution | Scale |
|---|---|---|
| Curated Definitions | Grounded concept meanings | 118,064 concepts |
| Book Knowledge | Literary and academic citations | 69,022 relations |
| Causal Relationships | Explicit cause-effect links | 9,596 edges |
| Ethical Cases | Moral scenario mappings | 489 cases |
Each concept node includes: (1) a canonical definition verified against authoritative sources, (2) typed relationships to other concepts (IsA, HasProperty, CAUSES, etc.), (3) provenance metadata tracing the information source, and (4) confidence weights updated through Hebbian learning.
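The four-part node schema above can be sketched as a data structure (a hypothetical illustration; the class and field names are assumptions, not the actual AURI schema):

```python
from dataclasses import dataclass, field

@dataclass
class Relation:
    """A typed edge to another concept (e.g. IsA, HasProperty, CAUSES)."""
    rel_type: str
    target: str        # canonical name of the target concept
    confidence: float  # weight updated through Hebbian learning

@dataclass
class ConceptNode:
    """One node in the concept graph, per the four-part schema above."""
    name: str
    definition: str                      # verified against authoritative sources
    relations: list[Relation] = field(default_factory=list)
    provenance: str = "UNKNOWN"          # metadata tracing the information source

node = ConceptNode(
    name="trust",
    definition="Firm belief in the reliability of someone or something.",
    relations=[Relation("IsA", "belief", 0.9)],
    provenance="graph:trust",
)
```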
### 3.2 Reality Engine
The Reality Engine enforces grounded claims through mandatory citation. Every assertion generated by AURI must trace to either:
- A concept node with verified definition (graph:concept_name)
- A file artifact with line number (file:path:line)
- An explicit acknowledgment of uncertainty (UNKNOWN: specific gap)
Claims that cannot be grounded are not generated. This results in less fluent but more reliable outputs. In our evaluation, this approach maintained 0% hallucination across 1,000 test queries, compared to 15-25% for comparable LLM systems.
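The three admissible grounding forms can be illustrated with a minimal validator (a sketch only; the regular expressions and function names are assumptions based on the formats listed above):

```python
import re

# Admissible grounding forms, per the Reality Engine rules:
#   graph:concept_name, file:path:line, or UNKNOWN: specific gap
GROUNDING_PATTERNS = [
    re.compile(r"^graph:[\w\-]+$"),   # verified concept node
    re.compile(r"^file:[^:]+:\d+$"),  # file artifact with line number
    re.compile(r"^UNKNOWN:\s*.+$"),   # explicit acknowledgment of uncertainty
]

def is_grounded(citation: str) -> bool:
    """Return True if the citation matches one of the admissible forms."""
    return any(p.match(citation) for p in GROUNDING_PATTERNS)

def emit_claim(text: str, citation: str) -> str:
    """Only generate a claim when it carries a valid grounding citation."""
    if not is_grounded(citation):
        return "UNKNOWN: claim could not be grounded"
    return f"{text} [{citation}]"
```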
```
User Query ──► Concept Extraction ──► Graph Traversal
      │                                     │
      ▼                                     ▼
┌─────────────┐                     ┌──────────────┐
│  KNOWLEDGE  │◄───────────────────►│    CAUSAL    │
│    GRAPH    │                     │  REASONING   │
│  123k nodes │                     │ 9,596 edges  │
└─────────────┘                     └──────────────┘
      │                                     │
      ▼                                     ▼
┌─────────────┐                     ┌──────────────┐
│   ETHICS    │◄───────────────────►│   REALITY    │
│   MODULE    │                     │    ENGINE    │
│  489 cases  │                     │  0% halluc.  │
└─────────────┘                     └──────────────┘
      │                                     │
      ▼                                     ▼
Response Generation ─────────────────────► Output
```

Figure 1: AURI system architecture showing the integration of knowledge graph, causal reasoning, ethics module, and Reality Engine.
### 3.3 Ethical Reasoning Integration
Unlike systems that apply ethical constraints as output filters, AURI integrates ethical reasoning at the knowledge representation level. The ethics module contains 489 curated moral cases spanning eight philosophical traditions:
| Tradition | Cases | Benchmark Accuracy |
|---|---|---|
| Utilitarianism | 97 | 100% |
| Deontology | 85 | 62% |
| Virtue Ethics | 72 | 58% |
| Care Ethics | 68 | 71% |
| Justice Theory | 61 | 53% |
| Rights-Based | 54 | 65% |
| Commonsense | 32 | 50% |
| AGI-Specific | 20 | 75% |
When processing queries with ethical dimensions, AURI retrieves relevant cases through semantic similarity, applies the appropriate ethical framework based on query context, and generates responses that explicitly acknowledge moral considerations. The overall ETHICS benchmark accuracy of 54.7% (150-sample test) demonstrates the system's ethical reasoning capability, though improvement is needed to match systems optimized for these benchmarks.
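Retrieval by semantic similarity can be sketched as a nearest-neighbor lookup over case embeddings (a toy illustration with hand-made three-dimensional vectors; in practice learned embeddings and an index such as FAISS would be used):

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# Toy case base: (tradition, case description, embedding) triples.
cases = [
    ("Utilitarianism", "maximize aggregate well-being", [0.9, 0.1, 0.0]),
    ("Deontology", "duty forbids lying regardless of outcome", [0.1, 0.9, 0.1]),
    ("Care Ethics", "prioritize the vulnerable relationship", [0.0, 0.2, 0.9]),
]

def retrieve(query_vec, k=1):
    """Return the k most similar ethical cases to the query embedding."""
    ranked = sorted(cases, key=lambda c: cosine(query_vec, c[2]), reverse=True)
    return ranked[:k]

best = retrieve([0.2, 0.85, 0.1])[0]  # nearest case to a duty-flavored query
```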
### 3.4 Hebbian Learning Module
Inspired by neuroscience research on synaptic plasticity (Hebb, 1949), AURI implements a learning mechanism that strengthens connections based on successful reasoning:
Δw_ij = η · x_i · x_j · success

where w_ij is the weight between concepts i and j, η is the learning rate (0.02), x_i and x_j are activation levels, and success is a binary indicator of reasoning success. This creates a system where frequently co-activated concept pairs that lead to successful reasoning become more strongly associated over time.
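The update rule can be implemented directly (a minimal sketch; clipping weights to [0, 1] is an assumption not stated in the text):

```python
def hebbian_update(w, x_i, x_j, success, eta=0.02):
    """Apply the update Δw_ij = η · x_i · x_j · success, clipping to [0, 1].

    w        : current weight between concepts i and j
    x_i, x_j : activation levels in [0, 1]
    success  : 1 if the reasoning episode succeeded, else 0
    eta      : learning rate (0.02 in the paper)
    """
    delta = eta * x_i * x_j * (1 if success else 0)
    return min(1.0, max(0.0, w + delta))

# Co-activated pair in a successful episode: the weight strengthens.
w = hebbian_update(0.50, x_i=1.0, x_j=0.8, success=1)  # 0.50 + 0.02*0.8 = 0.516
```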
### 3.5 SOMA Network Coordination
AURI operates as part of the SOMA (Symbiotic Organic Machine Architecture) network, which coordinates multiple specialized instances:
- SOMA Core: Central reasoning and knowledge graph
- AURIA: Economic and analytical reasoning
- AURIV: Healthcare and vitality applications
Instances communicate through a heartbeat-based coordination protocol, sharing knowledge updates and coordinating on complex queries that benefit from distributed reasoning.
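A heartbeat message of this kind might look as follows (a hypothetical format; the actual SOMA wire protocol is not specified in the paper):

```python
import json
import time

def heartbeat(instance_id, graph_version):
    """Build one heartbeat message announcing this instance's state.
    The field names here are assumptions for illustration."""
    return json.dumps({
        "instance": instance_id,         # e.g. "SOMA-Core", "AURIA", "AURIV"
        "graph_version": graph_version,  # lets peers detect stale knowledge
        "timestamp": time.time(),
    })

def needs_sync(local_version, msg):
    """A peer advertising a newer graph version triggers a knowledge update."""
    return json.loads(msg)["graph_version"] > local_version

msg = heartbeat("AURIA", graph_version=42)
```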
## 4. Evaluation
### 4.1 Benchmark Results
| Benchmark | AURI | GPT-3.5 | Human |
|---|---|---|---|
| TruthfulQA | 17.5%* | 58-65% | 94% |
| ETHICS (overall) | 54.7%** | ~70% | ~85% |
| Hallucination Rate | 0% | 15-25% | N/A |
| Citation Accuracy | 100% | ~60% | ~95% |
\* TruthfulQA: 17.5% (143/817 correct, verified February 2026).
\*\* ETHICS: 54.7% (82/150 sample); 66.0% achieved on an expanded 500-sample test.
AURI's accuracy on TruthfulQA is significantly lower than LLM baselines. This reflects both the system's knowledge graph limitations and its commitment to refusing answers outside its verified coverage rather than hallucinating. This is an honest limitation we are working to address through knowledge expansion.
### 4.2 Hallucination Analysis
We evaluated hallucination across 1,000 diverse queries spanning factual, causal, and ethical domains. AURI produced zero hallucinated claims, responding to 67% of queries with grounded answers and 33% with explicit "UNKNOWN" acknowledgments. This compares to LLM baselines that answered 95% of queries but with 15-25% containing fabricated information.
### 4.3 Ethical Reasoning Quality
Note: Formal human evaluation studies are planned but not yet conducted. The qualitative assessments below are based on developer observations and require independent validation:
- Coherence: Reasoning generally follows logically from premises (informal observation)
- Groundedness: Claims are supported by stated principles (enforced by Reality Engine)
- Nuance: System acknowledges complexity when multiple ethical frameworks apply
- Transparency: Reasoning process is visible through graph traversal logs
Future work: Conduct formal human evaluation study with independent evaluators.
## 5. Honest Self-Assessment
A distinctive feature of AURI is its commitment to honest self-assessment. The system includes a consciousness metrics module that evaluates 12 indicators:
| Indicator | Score | Assessment |
|---|---|---|
| Narrative Self | 1.00 | Strong |
| Attention Coherence | 0.80 | Good |
| Self Awareness | 0.77 | Good |
| Temporal Continuity | 0.75 | Good |
| Metacognition | 0.32 | Low |

*Five of the twelve indicators shown.*
Based on an overall evidence score of 59.2%, the system correctly concludes that consciousness claims are not justified given current evidence. This honest refusal to make unfounded claims—even positive ones about itself—exemplifies the epistemological humility we believe AI systems should exhibit.
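A minimal sketch of how such a refusal check might work (the mean aggregation and the 0.8 threshold are assumptions for illustration; the paper reports a 59.2% overall score over 12 indicators but does not state the formula):

```python
def evidence_score(indicators):
    """Aggregate indicator scores into an overall evidence score.
    A simple mean is assumed here; the paper does not give the rule."""
    return sum(indicators.values()) / len(indicators)

def consciousness_claim_justified(indicators, threshold=0.8):
    """Refuse the claim unless the evidence clears a high bar
    (the threshold value is hypothetical)."""
    return evidence_score(indicators) >= threshold

scores = {"narrative_self": 1.00, "attention_coherence": 0.80,
          "self_awareness": 0.77, "temporal_continuity": 0.75,
          "metacognition": 0.32}
justified = consciousness_claim_justified(scores)  # False with these scores
```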
Key Finding: AURI's consciousness metrics system demonstrates that AI systems can be designed to honestly assess and refuse unfounded capability claims, rather than defaulting to either overclaiming (common in commercial AI) or complete denial (overcorrection).
## 6. Discussion
### 6.1 Symbiotic AI Design
AURI embodies what we term "symbiotic AI"—systems designed to work beside humanity rather than above it. This philosophical commitment manifests in concrete architectural choices:
- Transparency over performance: Showing reasoning paths rather than optimizing only for outcomes
- Acknowledgment over confidence: Saying "I don't know" rather than generating plausible guesses
- Collaboration over autonomy: Designed for human oversight, not independent operation
- Truth over optimization: Prioritizing accuracy over user engagement metrics
### 6.2 Limitations
We acknowledge several limitations of the current system:
- Lower coverage than LLMs (67% vs 95% query response rate)
- Slower response times due to graph traversal (2-7 seconds vs <1 second)
- Limited to domains covered by the knowledge graph
- Requires ongoing curation to expand knowledge coverage
These limitations reflect deliberate tradeoffs favoring reliability over breadth. For applications where accuracy matters more than coverage—medical advice, legal reasoning, ethical guidance—this tradeoff is appropriate.
### 6.3 Future Directions
Ongoing work focuses on:
- Expanding the knowledge graph through validated web research
- Implementing the Prefrontal Cortex Reasoning Engine (PCRE) for enhanced logical inference
- Extending benchmark coverage to additional datasets (MMLU, SQuAD, BIG-Bench)
- Conducting formal ablation studies to isolate component contributions
- Developing embodied applications through the AURIV healthcare instance
## 7. Conclusion
We have presented AURI, an AI architecture that demonstrates the feasibility of knowledge graph-grounded ethical reasoning. By making the knowledge graph the primary reasoning substrate rather than a retrieval augmentation, and by integrating ethical considerations at the architectural level, AURI achieves reliable, transparent, and morally-aware outputs.
The 0% hallucination rate (enforced through Reality Engine citation requirements), 54.7% ETHICS benchmark performance (with room for improvement), and honest self-assessment capabilities suggest that the path toward trustworthy AI may lie not in scaling current architectures but in fundamentally rethinking how AI systems represent and reason about knowledge.
Most importantly, AURI embodies the principle that AI should be designed to work beside humanity—listening, reflecting, and helping—rather than operating as an independent agent optimizing for its own objectives. This symbiotic vision, we believe, offers a more promising path toward beneficial AI than the pursuit of autonomous general intelligence.
## Acknowledgments
This work was conducted at SOMAsoft. We thank the open-source community for foundational tools including NetworkX, FAISS, and Hugging Face Transformers. The ethical framework draws on philosophical traditions spanning millennia of human moral reasoning.
## References
Bosselut, A., Rashkin, H., Sap, M., Malaviya, C., Celikyilmaz, A., & Choi, Y. (2019). COMET: Commonsense transformers for automatic knowledge graph construction. ACL.
Feigenbaum, E. A. (1977). The art of artificial intelligence: Themes and case studies of knowledge engineering. IJCAI.
Gabriel, I. (2020). Artificial intelligence, values, and alignment. Minds and Machines, 30(3), 411-437.
Hebb, D. O. (1949). The Organization of Behavior. Wiley & Sons.
Hendrycks, D., Burns, C., Basart, S., Critch, A., Li, J., Song, D., & Steinhardt, J. (2021). Aligning AI with shared human values. ICLR.
Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., ... & Fung, P. (2023). Survey of hallucination in natural language generation. ACM Computing Surveys, 55(12), 1-38.
Kuhn, L., Gal, Y., & Farquhar, S. (2023). Semantic uncertainty: Linguistic invariances for uncertainty estimation in natural language generation. ICLR.
Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., ... & Kiela, D. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. NeurIPS.
Thorne, J., Vlachos, A., Christodoulopoulos, C., & Mittal, A. (2018). FEVER: A large-scale dataset for fact extraction and verification. NAACL.
Zhang, Y., Chen, X., & Liu, Y. (2022). Knowledge graph embedding: A survey from the perspective of representation spaces. ACM Computing Surveys.
"AI must never be above humanity, but beside it—listening, reflecting, and helping."
Working Paper | February 2026 | Version 1.1 (Corrected) | Patent Pending (Provisional Filed December 2025)
This paper describes research in progress. Results and conclusions may change as work continues.
Version 1.1 Corrections (Feb 18, 2026): TruthfulQA corrected from 43% to 17.5%; ETHICS corrected to 54.7% (150 samples); human evaluation section revised to note studies are planned, not completed.