ACP vs RAG: Why Deterministic Context Beats Probabilistic Retrieval
Explanation for developers evaluating context approaches, ml/ai engineers
Document Type: Explanation
Audience: Developers evaluating context approaches, ML/AI engineers
Reading Time: 10 minutes
Executive Summary
RAG (Retrieval-Augmented Generation) uses similarity search to find "relevant" context for AI queries. This works well for knowledge retrieval but fails for constraint enforcement, where you need 100% guarantee that critical information is visible.
ACP takes a fundamentally different approach: pre-computed, deterministic context that's always available through exact-match queries—not retrieved by similarity.
Understanding RAG for Code
How RAG Works
┌─────────────────────────────────────────────────────────────────┐
│ RAG PIPELINE FOR CODE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 1. CHUNKING │
│ Split code into overlapping chunks (512-2048 tokens) │
│ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ │
│ │chunk1│ │chunk2│ │chunk3│ │chunk4│ ... │
│ └──────┘ └──────┘ └──────┘ └──────┘ │
│ │
│ 2. EMBEDDING │
│ Convert chunks to vectors │
│ [0.23, -0.45, 0.12, ...] per chunk │
│ │
│ 3. STORAGE │
│ Store in vector database (Pinecone, Weaviate, etc.) │
│ │
│ 4. RETRIEVAL (per query) │
│ Query → embed → similarity search → top-k chunks │
│ │
│ 5. INJECTION │
│ Inject retrieved chunks into AI context │
│ │
└─────────────────────────────────────────────────────────────────┘RAG Strengths
| Strength | Description |
|---|---|
| Automatic | No manual annotation required |
| Scalable | Works with large codebases |
| Flexible | Natural language queries |
| Semantic understanding | Finds conceptually related code |
Where RAG Fails for Constraints
Problem 1: Similarity ≠ Constraint Relevance
Query: "Help me improve the authentication flow"
Embedding similarity ranking:
┌──────────────────────────────────────────────────────────────┐
│ Rank │ File │ Similarity │ Constraint │
├──────┼─────────────────────────┼────────────┼────────────────┤
│ 1 │ auth/login.ts │ 0.92 │ normal │
│ 2 │ auth/register.ts │ 0.89 │ normal │
│ 3 │ auth/password-reset.ts │ 0.87 │ normal │
│ ... │ ... │ ... │ ... │
│ 12 │ auth/session.ts │ 0.71 │ FROZEN ⚠️ │
│ 15 │ auth/crypto.ts │ 0.68 │ FROZEN ⚠️ │
└──────────────────────────────────────────────────────────────┘
Top-5 retrieval: ✓ Relevant code found
✗ Frozen constraints MISSEDThe most semantically similar code isn't the most constrained code. Session management and crypto are foundational concerns with different vocabulary than "authentication flow."
Problem 2: Constraints Are Metadata, Not Content
Consider this frozen file:
// @acp:lock frozen - Security critical, DO NOT modify
export const JWT_SECRET = process.env.JWT_SECRET;
export const TOKEN_EXPIRY = 3600;The constraint ("frozen - Security critical, DO NOT modify") has no semantic relationship to the content (JWT configuration). Embedding the content doesn't capture the constraint.
Problem 3: Code Embedding Collapse
Code is structurally homogeneous. These functions embed nearly identically:
// All ~0.95 similarity to each other:
async function validateUser(user: User): Promise<boolean> { ... }
async function validatePayment(payment: Payment): Promise<boolean> { ... }
async function validateOrder(order: Order): Promise<boolean> { ... }
async function validateSession(session: Session): Promise<boolean> { ... }
// But constraints differ wildly:
// validateUser → normal
// validatePayment → FROZEN (PCI compliance)
// validateOrder → normal
// validateSession → FROZEN (security critical)Problem 4: Chunking Destroys Context
File-level constraints get split across chunks:
Original file:
┌────────────────────────────────────────────────────────────┐
│ // @acp:lock frozen - Security critical │
│ // @acp:domain auth │
│ // @acp:owner security-team │
│ │
│ export class SessionValidator { │
│ // ... 500 lines of code ... │
│ } │
└────────────────────────────────────────────────────────────┘
After chunking:
┌────────────────────┐ ┌────────────────────┐ ┌─────────────┐
│ Chunk 1: │ │ Chunk 2: │ │ Chunk 3: │
│ // @acp:lock frozen│ │ validateToken() { │ │ // helper │
│ // @acp:domain auth│ │ jwt.verify(...) │ │ functions │
│ export class... │ │ return session; │ │ ... │
└────────────────────┘ └────────────────────┘ └─────────────┘
▲ ▲
│ │
Constraint here Code here
(might be retrieved without constraint)If Chunk 2 is retrieved without Chunk 1, the AI sees the code but not the constraint.
Problem 5: Top-K Misses
Your codebase statistics:
- Total files: 1,000
- Frozen files: 10 (1%)
- Top-k retrieval: 20 chunks
For a random query about code modification:
- P(frozen file in top-20): ~2%
- P(frozen file missed): ~98%
For constraints, a 98% miss rate is catastrophic.Problem 6: Per-Request Overhead
Every RAG query incurs:
- Embedding computation
- Vector similarity search
- Context injection
For a coding session with 100 queries, that's 100 retrieval operations—even when constraints haven't changed.
How ACP Solves This
Approach: Minimal Bootstrap + On-Demand Expansion
┌─────────────────────────────────────────────────────────────────┐
│ ACP APPROACH │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 1. ANNOTATION (one-time) │
│ Add structured annotations to code │
│ // @acp:lock frozen - Security critical │
│ │
│ 2. INDEXING (on commit) │
│ acp index → .acp.cache.json │
│ { │
│ "constraints": { │
│ "by_lock_level": { │
│ "frozen": ["session.ts", "crypto.ts"] │
│ } │
│ } │
│ } │
│ │
│ 3. BOOTSTRAP (always injected, ~40 tokens) │
│ "ACP active. Frozen files: session.ts, crypto.ts. │
│ Run `acp primer` for full context." │
│ │
│ 4. ON-DEMAND EXPANSION │
│ AI requests more context only when needed: │
│ - `acp primer --domain auth` for domain details │
│ - `acp query '.constraints'` for specific data │
│ - Variable expansion for symbol details │
│ │
│ Result: 100% constraint visibility, minimal token overhead │
│ │
└─────────────────────────────────────────────────────────────────┘Key Differences
| Aspect | RAG | ACP |
|---|---|---|
| Query type | Similarity search | Exact-match query |
| Retrieval | Probabilistic | Deterministic |
| Constraint visibility | ~80% (top-k dependent) | 100% |
| Injection model | Per-request retrieval | Bootstrap + on-demand |
| Token cost | O(k) per request | O(1) bootstrap (~40 tok) |
| Data structure | Unstructured chunks | Structured JSON |
| Versioning | Requires re-embedding | Git-integrated |
The Math: Why Deterministic Matters
RAG Constraint Visibility
P(constraint visible) = P(relevant chunk in top-k)
For a 1000-file codebase with 10 frozen files and k=20:
- Best case (frozen files cluster): ~40%
- Worst case (frozen files dispersed): ~2%
- Average case: ~15-20%
Even at 80% visibility:
- 10 frozen files × 20% miss rate = 2 frozen files invisible per query
- Over 50 queries = 100 potential constraint violationsACP Constraint Visibility
P(constraint visible) = 100%
All constraints in primer at session start.
All queries return complete, accurate data.
Zero probabilistic uncertainty.When RAG Makes Sense
RAG is excellent for:
| Use Case | Why RAG Works |
|---|---|
| Documentation search | Finding relevant docs, not enforcing rules |
| Code exploration | Discovering similar implementations |
| Knowledge Q&A | "How does X work?" queries |
| Large corpus search | When exhaustive retrieval isn't needed |
RAG fails for:
| Use Case | Why RAG Fails |
|---|---|
| Constraint enforcement | Must guarantee visibility |
| Access control | Can't probabilistically check permissions |
| Critical metadata | 80% accuracy is unacceptable |
| Audit requirements | Must prove constraint was visible |
Hybrid Architecture
The optimal approach uses both:
┌─────────────────────────────────────────────────────────────────┐
│ HYBRID ARCHITECTURE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │ ACP │ │ RAG │ │
│ │ │ │ │ │
│ │ Constraints │ │ Code Discovery │ │
│ │ Domains │ │ Similar Patterns │ │
│ │ Ownership │ │ Documentation │ │
│ │ Architecture │ │ Knowledge Base │ │
│ │ │ │ │ │
│ │ DETERMINISTIC │ │ PROBABILISTIC │ │
│ │ "What are rules?" │ │ "Find similar" │ │
│ └─────────────────────┘ └─────────────────────┘ │
│ │ │ │
│ └────────────┬───────────────┘ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ AI Assistant │ │
│ │ │ │
│ │ Uses RAG for │ │
│ │ exploration │ │
│ │ │ │
│ │ Respects ACP │ │
│ │ for constraints│ │
│ └─────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘Example: Constraint-Aware Code Search
User: "Find code similar to our payment validation and help me improve it"
1. ACP Bootstrap (always present, ~40 tokens):
"ACP active. Frozen: payments/processor.ts, payments/gateway.ts.
Run `acp primer --domain payments` for details."
2. RAG Search:
- Query: "payment validation"
- Results: processor.ts (frozen), gateway.ts (frozen), utils.ts, billing/validator.ts
3. AI Behavior:
- Already knows from bootstrap that processor.ts and gateway.ts are frozen
- "I found similar code in processor.ts and gateway.ts, but these are frozen."
- "I can help you improve utils.ts and billing/validator.ts instead."
- "If you need to modify the frozen files, contact the payments team."
4. AI requests more context (on-demand):
- Runs: acp primer --domain payments
- Gets detailed ownership, architecture constraints for informed suggestionsImplementation Comparison
RAG Setup
# Install dependencies
pip install chromadb sentence-transformers
# Chunk codebase
python chunk_code.py --input ./src --output ./chunks
# Embed and store
python embed_chunks.py --chunks ./chunks --db ./vectordb
# Query (per request)
python query.py "help me with auth" --db ./vectordb --top-k 20Maintenance: Re-embed on every code change.
ACP Setup
# Install CLI
npm install -g @acp-protocol/cli
# Add annotations (one-time)
# @acp:lock frozen - Security critical
# Index (on commit)
acp index
# Query (exact-match)
acp query '.constraints.by_lock_level.frozen'Maintenance: Re-index on commit (incremental, fast).
Summary
| Aspect | RAG | ACP |
|---|---|---|
| Best for | Code discovery | Constraint enforcement |
| Retrieval | Probabilistic | Deterministic |
| Accuracy | ~80% | 100% |
| Query type | Natural language | Structured + NL |
| Setup | Complex (vectors) | Simple (CLI) |
| Maintenance | Re-embed on change | Re-index on change |
Recommendation: Use RAG for exploration and discovery. Use ACP for constraints and critical metadata. They complement each other.
Further Reading
- Why ACP? — Complete alternatives analysis
- ACP vs MCP — Protocol comparison
- Design Philosophy — Core principles
This document is part of the ACP Documentation. Report issues