Architecture — Overview

You use Claude Code every day. Here’s what happens between your prompt and its response.

The 5-Level AI Maturity Model

Not all AI tools are the same. They differ in how much autonomous decision-making they perform. Claude Code deliberately operates at Level 4–5 depending on your permission settings.

Level	Name	Description	Example
1	Manual	Human does everything; no AI involvement	Typing code by hand
2	Tool	Single-shot API call; no state, no loops	ChatGPT chat, Copilot autocomplete
3	Assistant	Multi-turn context with memory across a session	Claude.ai with project memory
4	Copilot	AI proposes actions; human approves each step	Claude Code in default permission mode
5	Agent	AI loops autonomously: calls tools, self-checks, retries	Claude Code in auto mode

Claude Code starts at Level 4 (ask permission, human approves) and moves toward Level 5 (auto mode) as you configure trust rules. The architecture is built to support both.

The 6-Pipeline Architecture

Claude Code is not a single program. It is six nested layers, each with a distinct responsibility. Data flows inward on request and outward on response.

graph TD A["Terminal UI (renders output, captures keystrokes)"] B["Query Loop (the agent's while-true heartbeat)"] C["Tool Orchestration (parallel vs serial scheduling)"] D["Multi-Agent System (spawn and coordinate subagents)"] E["Context Management (token budget, compaction, memory)"] F["Permission & Security (6-layer classification pipeline)"] A --> B --> C --> D --> E --> F style A fill:#1e293b,color:#94a3b8,stroke:#334155 style B fill:#1e293b,color:#7dd3fc,stroke:#334155 style C fill:#1e293b,color:#86efac,stroke:#334155 style D fill:#1e293b,color:#fda4af,stroke:#334155 style E fill:#1e293b,color:#fcd34d,stroke:#334155 style F fill:#1e293b,color:#c4b5fd,stroke:#334155

Layer	Responsibility
Terminal UI	Renders streaming output, handles keyboard shortcuts, manages the display
Query Loop	The `while(true)` agent heartbeat — drives every turn of conversation
Tool Orchestration	Schedules tools into parallel batches or serial queues, starts streaming execution
Multi-Agent System	Spawns subagents with isolated contexts, routes permission bubbles upward
Context Management	Tracks token usage, triggers compaction, prefetches memory, manages system prompt
Permission & Security	Classifies every action through 6 security layers before execution

The Feedback Loop

This is the core of agency. The loop is what transforms a static language model into a dynamic agent.

sequenceDiagram participant U as You participant L as LLM participant T as Tool Orchestrator U->>L: prompt + context L->>T: tool_use blocks (in response stream) T->>T: execute tools (parallel where safe) T->>L: tool_result messages injected back L->>L: sees results, decides next action L-->>U: text response (if done) L->>T: more tool_use blocks (if not done)

The LLM never directly touches your filesystem or runs commands. Every action goes through Tool Orchestration, which enforces the Permission Pipeline before anything executes. The LLM only sees results — it decides what to do next based on what came back.

This loop IS the agency. A language model that cannot loop is just a text generator. The loop is what lets Claude read a file, discover an error, fix it, re-run the test, and verify — without you guiding each step.

By the Numbers

Component	Count
Built-in tools	43
Slash commands (approx.)	~88
Hook events	26
Bundled skills	16
Official plugins	32
Background task types	7
UI components	390

Dive Deeper

Each layer of the architecture has its own page:

The Agent Loop — the while(true) loop, recovery states, and the needsFollowUp decision
Tool Orchestration — how parallel batches work and streaming execution
Permission Pipeline — the 6-layer security classification system
Multi-Agent System — subagent spawning, context isolation, permission bubbling
Context & Memory — token budgets, compaction, memory prefetch
Skill Engine — how skills are discovered, activated, and composed
Plugin Engine — plugin lifecycle, manifest structure, scoped hooks
Pattern Catalog — 18 architectural patterns for production workflows

Why This Matters to You

Better prompts: Knowing the loop runs multiple iterations helps you write prompts that guide Claude through complex multi-step work rather than expecting a single-shot answer.
Permission intuition: Claude asks for permission when an action hits a gate in the Permission Pipeline — not arbitrarily. Understanding the layers tells you exactly which rule to add to stop the interruptions.
Debugging slow sessions: If Claude seems to “think forever,” it is most likely in a long tool execution loop. Knowing this helps you add /compact proactively or break the task into smaller pieces.
Trusting auto mode: The loop is not running unchecked. Every tool call passes through 6 security layers. Auto mode means the layers clear it automatically — not that they are skipped.