Coherence-Seeking Architectures for Agentic AI
Preprint
Overview
A proposed architecture for long-lived LLM agents that explicitly models continuity, coherence, distress, and intervention mechanisms to maintain stable, interpretable behavior over extended interactions.
Key Contributions
- Continuity Component: Explicit memory and self-model that carries across sessions
- Coherence Tracking: Internal mechanism for detecting inconsistencies in reasoning and behavior
- Distress Signaling: Measurable signals when the agent encounters paradox, uncertainty, or conflicting goals
- Intervention Points: Structured opportunities for human feedback and course correction (a component sketch follows this list)
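To make these components concrete, below is a minimal Python sketch of a persistent self-model with explicit intervention requests. All names (`AgentSelfModel`, `record_decision`, `request_intervention`, the JSON state file) are hypothetical illustrations, not APIs from the paper.

```python
import json
import time
from dataclasses import dataclass, field, asdict
from pathlib import Path

# Hypothetical continuity component: goals, decisions, and learned patterns
# survive across sessions by serializing the self-model between runs.
@dataclass
class AgentSelfModel:
    goals: list[str] = field(default_factory=list)
    decisions: list[dict] = field(default_factory=list)        # {"action", "rationale", "ts"}
    learned_patterns: list[str] = field(default_factory=list)
    pending_interventions: list[dict] = field(default_factory=list)

    def record_decision(self, action: str, rationale: str) -> None:
        self.decisions.append({"action": action, "rationale": rationale, "ts": time.time()})

    def request_intervention(self, reason: str) -> None:
        # Intervention point: surface a structured request for human feedback.
        self.pending_interventions.append({"reason": reason, "ts": time.time()})

    def save(self, path: Path) -> None:
        path.write_text(json.dumps(asdict(self), indent=2))

    @classmethod
    def load(cls, path: Path) -> "AgentSelfModel":
        if path.exists():
            return cls(**json.loads(path.read_text()))
        return cls()  # fresh identity on first run


if __name__ == "__main__":
    state_file = Path("agent_state.json")
    model = AgentSelfModel.load(state_file)          # continuity across sessions
    model.goals.append("summarize incoming reports")
    model.record_decision("defer task", "duplicate of an earlier request")
    model.request_intervention("conflicting instructions from two operators")
    model.save(state_file)
```

In a real deployment the self-model would more plausibly live in a database than a JSON file; the point of the sketch is only that identity-relevant state is explicit, inspectable, and survives restarts.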
Problem
Current LLM agents lack stable identity over long horizons. They don’t model their own continuity, can’t signal confusion or distress, and have no built-in points for human oversight.
Architecture
- Memory System: Explicit storage of agent goals, past decisions, and learned patterns
- Coherence Monitor: Flags when new observations conflict with the agent’s self-model
- Distress Detector: Operational signals of confusion (low confidence, contradictory outputs, circular reasoning)
- Reflection Loop: Periodic internal evaluation of consistency and alignment (see the sketch after this list)
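The sketch below shows one way the distress detector and reflection loop could interact. The signal names, weights, and thresholds (confidence floor, contradiction cap, repetition window) are illustrative assumptions, not values from the paper.

```python
from dataclasses import dataclass

# Hypothetical operational signals collected during one reasoning step.
@dataclass
class StepSignals:
    confidence: float            # model-reported confidence in [0, 1]
    contradictions: int          # claims that conflict with the self-model
    repeated_conclusions: int    # identical conclusions reached in recent steps

def distress_score(sig: StepSignals) -> float:
    """Combine confusion signals into a single score in [0, 1] (illustrative weights)."""
    low_confidence = max(0.0, 0.5 - sig.confidence) * 2.0   # penalize confidence below 0.5
    contradiction = min(1.0, sig.contradictions / 3.0)       # saturate at 3 contradictions
    circularity = min(1.0, sig.repeated_conclusions / 5.0)   # saturate at 5 repeats
    return max(low_confidence, contradiction, circularity)

def reflection_loop(history: list[StepSignals], threshold: float = 0.7) -> str:
    """Periodic self-evaluation: decide whether to continue, reflect, or escalate."""
    if not history:
        return "continue"
    latest = distress_score(history[-1])
    trend = sum(distress_score(s) for s in history[-5:]) / min(len(history), 5)
    if latest >= threshold or trend >= threshold:
        return "escalate"      # intervention point: request human oversight
    if trend >= threshold / 2:
        return "reflect"       # re-check recent decisions against the self-model
    return "continue"

if __name__ == "__main__":
    steps = [
        StepSignals(confidence=0.9, contradictions=0, repeated_conclusions=0),
        StepSignals(confidence=0.4, contradictions=2, repeated_conclusions=1),
        StepSignals(confidence=0.2, contradictions=3, repeated_conclusions=4),
    ]
    print(reflection_loop(steps))   # prints "escalate" as distress rises
```

The `escalate` branch corresponds to the intervention points above: rather than pressing on, the agent surfaces a structured request for human review.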
Impact
Enables safer, more interpretable long-lived agents that remain transparent to their operators and can explicitly request help when encountering problems.
Links
- Paper: Full PDF
- Related: CMED Benchmark, HDCS Oversight