AI coding assistants are powerful. But running multiple agents across different machines? That’s where things get interesting, and chaotic.

I’ve built a workflow that lets multiple AI agents work on the same codebase without stepping on each other’s toes. This post breaks down the complete system using open-source tools.

The Problem

Imagine this: You have Claude running in your IDE, another instance in your terminal, and maybe a third on your laptop. All working on the same repo. What could go wrong?

  • Race conditions: Two agents pick up the same task
  • Conflicting changes: Agent A refactors a function while Agent B adds to it
  • Build artifact conflicts: Agent A’s .venv gets corrupted by Agent B’s uv sync
  • Lost context: Agent crashes mid-task, work is abandoned
  • Configuration drift: Each agent follows different rules
  • No visibility: You have no idea what each agent is doing

The solution isn’t to use fewer agents; it’s to build coordination infrastructure.

Architecture Overview

┌──────────────────────────────────────────────────────────────┐
│                   Agent Configuration Layer                  │
│  ┌──────────────┐  ┌──────────────┐  ┌────────────────────┐  │
│  │ Ruler        │  │ Skillz       │  │ MCP Servers        │  │
│  │ (Rules)      │  │ (Validation) │  │ (Tools)            │  │
│  └──────────────┘  └──────────────┘  └────────────────────┘  │
├──────────────────────────────────────────────────────────────┤
│                    Issue Tracking Layer                      │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ Beads - Git-native issue tracking                      │  │
│  │ - JSONL format, one issue per line                     │  │
│  │ - Sync via git pull/push                               │  │
│  │ - CLI for agents, not web UI                           │  │
│  └────────────────────────────────────────────────────────┘  │
├──────────────────────────────────────────────────────────────┤
│                    Coordination Layer                        │
│  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐  │
│  │ Advisory Locks │  │ Git Worktrees  │  │ Claim Protocol │  │
│  │ (Who works)    │  │ (Isolation)    │  │ (Atomic ops)   │  │
│  └────────────────┘  └────────────────┘  └────────────────┘  │
├──────────────────────────────────────────────────────────────┤
│                    Context & Learning Layer                  │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ Agent Learning - Institutional knowledge               │  │
│  │ - PostgreSQL (project_dev) + Git JSONL sync            │  │
│  │ - Automatic extraction from commits                    │  │
│  │ - Context-aware retrieval with token budgets           │  │
│  │ - Staleness detection & confidence decay               │  │
│  └────────────────────────────────────────────────────────┘  │
├──────────────────────────────────────────────────────────────┤
│                    Git Integration Layer                     │
│  ┌────────────┐  ┌─────────────┐  ┌──────────────────────┐   │
│  │ pre-commit │  │ pre-push    │  │ post-merge           │   │
│  │ (lint)     │  │ (validate)  │  │ (auto-sync)          │   │
│  └────────────┘  └─────────────┘  └──────────────────────┘   │
└──────────────────────────────────────────────────────────────┘

Let’s build each layer.

Layer 1: Ruler - Unified Agent Configuration

Every agent needs the same rules. Without consistency, you get chaos. Ruler solves this.

The Problem with Multiple Agents

Different AI tools expect configuration in different places:

| Tool           | Config Location                  |
|----------------|----------------------------------|
| Claude Code    | CLAUDE.md                        |
| Cursor         | .cursor/rules/                   |
| GitHub Copilot | .github/copilot-instructions.md  |
| Windsurf       | .windsurfrules                   |

Maintaining 5 copies of the same rules? Nightmare.

Ruler: Single Source of Truth

Ruler generates agent-specific configs from a single source:

.ruler/
├── ruler.toml          # Which agents to generate for
├── AGENTS.md           # Main rules (the source of truth)
├── python-patterns.md  # Python-specific rules
├── testing-rules.md    # Testing requirements
└── security-rules.md   # Security constraints

ruler.toml:

[agents.claude]
output = "CLAUDE.md"
mcp_output = ".mcp.json"

[agents.cursor]
output = ".cursor/rules/AGENTS.md"
mcp_output = ".cursor/mcp.json"

[agents.copilot]
output = ".github/copilot-instructions.md"

AGENTS.md (your rules):

# Project Rules

## Critical Constraints

1. **Never decrease test coverage** - 90%+ required
2. **TODOs must reference issues** - Format: `# TODO(#123): description`
3. **No secrets in code** - Use environment variables

## Before Every Commit

pytest && pylint --recursive=y .

## Code Style

- Type hints on ALL functions
- Async/await everywhere
- 80 char lines

Generate all configs:

npx @intellectronica/ruler apply

Now Claude, Cursor, and Copilot all follow the same rules.

Modular Rules

Split rules into focused files:

<!-- .ruler/AGENTS.md -->
# Project Rules

{{include:python-patterns.md}}
{{include:testing-rules.md}}
{{include:security-rules.md}}

Each file handles one concern. Ruler concatenates them.
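
For example, python-patterns.md might contain only Python conventions (the contents below are illustrative, not taken from a real project):

<!-- .ruler/python-patterns.md -->
# Python Patterns

- Prefer `pathlib.Path` over `os.path`
- Use dataclasses for plain data containers
- Catch specific exceptions; never use a bare `except:`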

Layer 2: Beads - Git-Native Issue Tracking

GitHub Issues are great for humans. Terrible for agents. Why?

  • API rate limits
  • Network latency
  • No offline access
  • Can’t atomically claim work

Beads solves this with git-native issue tracking.

Installation

# macOS
brew install steveyegge/tap/bd

# Or install from source (see the Beads repository for instructions)

Initialize in Your Repo

cd your-project
bd init

This creates:

.beads/
├── config.yaml      # Configuration
├── issues.jsonl     # Issues (git-tracked)
└── metadata.json    # Local state

Basic Commands

# Create issues
bd create --title="Add user authentication" --type=feature --priority=2

# List open issues
bd list --status=open

# Show ready work (no blockers)
bd ready

# View issue details
bd show proj-abc1

# Claim work
bd update proj-abc1 --status=in_progress

# Complete work
bd close proj-abc1 --reason="Implemented in commit abc123"

# Sync with remote
bd sync

Why JSONL?

Issues are stored as JSON Lines, one issue per line:

{"id":"proj-001","title":"Add auth","status":"open","priority":2,"created_at":"2026-01-08T10:00:00Z"}
{"id":"proj-002","title":"Fix login","status":"in_progress","priority":1,"created_at":"2026-01-08T11:00:00Z"}
{"id":"proj-003","title":"Update docs","status":"closed","priority":3,"created_at":"2026-01-08T12:00:00Z"}

Benefits:

  • Git-native: Changes tracked like code
  • Merge-friendly: Conflicts are per-line, easy to resolve
  • Offline-capable: No network needed
  • Fast: Local SQLite for queries, JSONL for sync
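
Because the corpus is plain text, any script can read it directly. A minimal sketch, assuming the .beads/issues.jsonl layout shown above:

import json
from pathlib import Path

def load_issues(path: str = ".beads/issues.jsonl") -> list[dict]:
    """Parse one JSON object per line, skipping blanks."""
    issues = []
    for line in Path(path).read_text().splitlines():
        if line.strip():
            issues.append(json.loads(line))
    return issues

open_issues = [i for i in load_issues() if i["status"] == "open"]
print(f"{len(open_issues)} open issues")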

Dependencies

Beads tracks issue dependencies:

# Create related issues
bd create --title="Design auth API" --type=task
# Created: proj-xyz1

bd create --title="Implement auth API" --type=task
# Created: proj-xyz2

# Add dependency (implement depends on design)
bd dep add proj-xyz2 proj-xyz1

# See what's blocked
bd blocked

# proj-xyz2 won't show in `bd ready` until proj-xyz1 is closed

Conflict Resolution

When two agents edit issues.jsonl simultaneously:

<<<<<<< HEAD
{"id":"proj-001","status":"in_progress","assignee":"agent-a"}
=======
{"id":"proj-001","status":"in_progress","assignee":"agent-b"}
>>>>>>> origin/main

Resolution rules (built into bd sync):

  • closed beats in_progress beats open
  • Later timestamp wins for same status
  • Both versions kept for different issues

After resolving: bd sync rebuilds the local database.
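
As code, the precedence rules look something like this; it’s a sketch of the policy above, not Beads’ actual implementation (the updated_at field is an assumption):

STATUS_RANK = {"closed": 2, "in_progress": 1, "open": 0}

def resolve(ours: dict, theirs: dict) -> dict:
    """Pick the winning version of one issue after a merge conflict."""
    rank_ours = STATUS_RANK[ours["status"]]
    rank_theirs = STATUS_RANK[theirs["status"]]
    if rank_ours != rank_theirs:
        return ours if rank_ours > rank_theirs else theirs
    # Same status: later timestamp wins (ISO-8601 strings compare lexically)
    return ours if ours["updated_at"] >= theirs["updated_at"] else theirs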

Layer 3: PostgreSQL Advisory Locks - Atomic Claims

The beads claim protocol has a race window:

# Agent A                           # Agent B
bd sync                             bd sync
# Both see issue proj-abc1 as open
bd update proj-abc1 --status=in_progress
                                    bd update proj-abc1 --status=in_progress
# Both agents now working on same issue!

The solution: PostgreSQL advisory locks.

Why Advisory Locks?

| Feature       | Benefit                                          |
|---------------|--------------------------------------------------|
| Session-level | Auto-release on agent crash (no orphaned locks)  |
| Non-blocking  | Immediate failure if already claimed (no waiting)|
| Fast          | Pure in-memory, no table writes (~1ms overhead)  |
| Cluster-aware | Works across all PostgreSQL connections          |
| Built-in      | No external dependencies (Redis, ZooKeeper, etc.)|

Implementation

Create a simple lock module:

# db/advisory_locks.py
import hashlib
from contextlib import asynccontextmanager
from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncSession

def _issue_id_to_lock_key(issue_id: str) -> int:
    """Convert issue ID to PostgreSQL advisory lock key (int32)."""
    hash_digest = hashlib.md5(issue_id.encode()).hexdigest()
    return int(hash_digest[:8], 16) % (2**31 - 1)

async def acquire_issue_lock(session: AsyncSession, issue_id: str) -> bool:
    """Try to acquire advisory lock. Returns True if acquired."""
    lock_key = _issue_id_to_lock_key(issue_id)
    result = await session.execute(
        text("SELECT pg_try_advisory_lock(:key)"),
        {"key": lock_key}
    )
    return result.scalar()

async def release_issue_lock(session: AsyncSession, issue_id: str) -> bool:
    """Release advisory lock. Returns True if released."""
    lock_key = _issue_id_to_lock_key(issue_id)
    result = await session.execute(
        text("SELECT pg_advisory_unlock(:key)"),
        {"key": lock_key}
    )
    return result.scalar()

@asynccontextmanager
async def issue_lock(session: AsyncSession, issue_id: str):
    """Context manager for issue locks with auto-cleanup."""
    acquired = await acquire_issue_lock(session, issue_id)
    try:
        yield acquired
    finally:
        if acquired:
            await release_issue_lock(session, issue_id)

The Claim Script

Create a script that atomically claims issues:

# scripts/bd_claim.py
import asyncio
import subprocess
import sys
from db.advisory_locks import acquire_issue_lock
from db.client import get_session

async def claim_issue(issue_id: str) -> bool:
    """Atomically claim an issue via advisory lock + beads update."""
    async with get_session() as session:
        # Step 1: Try to acquire lock
        if not await acquire_issue_lock(session, issue_id):
            print(f"Issue {issue_id} already claimed by another agent")
            return False

        # Step 2: Update beads status (concurrent claimers are serialized
        # by the lock; a stricter version would also re-check that the
        # issue is still open before updating)
        result = subprocess.run(
            ["bd", "update", issue_id, "--status=in_progress"],
            capture_output=True, text=True
        )
        if result.returncode != 0:
            print(f"Failed to update beads: {result.stderr}")
            return False

        print(f"Successfully claimed {issue_id}")
        return True

if __name__ == "__main__":
    issue_id = sys.argv[1]
    success = asyncio.run(claim_issue(issue_id))
    sys.exit(0 if success else 1)

Shell wrapper for convenience:

#!/bin/bash
# scripts/bd_claim.sh
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR/.."
exec uv run python scripts/bd_claim.py "$@"

Updated Claim Protocol

# Agent A                           # Agent B
bd_claim.sh proj-abc1               bd_claim.sh proj-abc1
# ✓ Lock acquired                   # ✗ Already claimed
# ✓ Issue claimed                   # Agent B picks different task

No more race conditions.

Layer 4: Git Worktrees - Filesystem Isolation

Advisory locks solve coordination (who works on what). But agents still share the same working directory. This causes:

  • Build artifact conflicts: Agent A’s pytest corrupts Agent B’s __pycache__
  • Virtual environment races: Simultaneous uv sync commands
  • Uncommitted change conflicts: Agent A’s WIP blocks Agent B’s tests

The solution: git worktrees.

What Are Worktrees?

Git worktrees let you check out multiple branches simultaneously in different directories:

# Main repo
/project/              # main branch

# Worktrees (separate directories, shared .git)
/project-worktrees/
├── proj-abc1/         # work/proj-abc1 branch
├── proj-xyz2/         # work/proj-xyz2 branch
└── proj-def3/         # work/proj-def3 branch

Key insight: All worktrees share the same .git directory, saving disk space while providing complete filesystem isolation.

Worktree Management Script

Create a script that combines claiming with worktree creation:

#!/bin/bash
# scripts/agent_worktree.sh
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
WORKTREE_BASE="${WORKTREE_BASE:-$REPO_ROOT/../project-worktrees}"

create_worktree() {
    local issue_id="$1"
    local worktree_path="$WORKTREE_BASE/$issue_id"
    local branch_name="work/$issue_id"

    # Step 1: Claim issue via advisory lock (atomic)
    echo "Claiming issue $issue_id..."
    if ! "$SCRIPT_DIR/bd_claim.sh" "$issue_id"; then
        echo "Failed to claim (already taken?)"
        exit 1
    fi

    # Step 2: Create worktree directory
    mkdir -p "$WORKTREE_BASE"

    # Step 3: Create worktree on new branch
    echo "Creating worktree at $worktree_path..."
    cd "$REPO_ROOT"
    git worktree add "$worktree_path" -b "$branch_name"

    # Step 4: Copy environment files (not in git)
    if [ -f "$REPO_ROOT/.env" ]; then
        cp "$REPO_ROOT/.env" "$worktree_path/.env"
    fi

    # Step 5: Install dependencies
    echo "Installing dependencies..."
    cd "$worktree_path"
    uv sync --quiet

    echo "Worktree ready: $worktree_path"
    echo "Branch: $branch_name"
}

remove_worktree() {
    local issue_id="$1"
    local worktree_path="$WORKTREE_BASE/$issue_id"
    local branch_name="work/$issue_id"

    echo "Removing worktree..."
    cd "$REPO_ROOT"
    git worktree remove "$worktree_path" --force 2>/dev/null || rm -rf "$worktree_path"
    git worktree prune

    # Delete branch if merged
    git branch -d "$branch_name" 2>/dev/null || true

    echo "Cleanup complete"
}

list_worktrees() {
    echo "Active worktrees:"
    git worktree list

    if [ -d "$WORKTREE_BASE" ]; then
        echo ""
        echo "Worktree status:"
        for dir in "$WORKTREE_BASE"/*/; do
            if [ -d "$dir" ]; then
                local id=$(basename "$dir")
                local branch=$(cd "$dir" && git branch --show-current)
                local uncommitted=$(cd "$dir" && git status --porcelain | wc -l)
                echo "  $id (branch: $branch, uncommitted: $uncommitted files)"
            fi
        done
    fi
}

case "${1:-}" in
    create) create_worktree "$2" ;;
    remove) remove_worktree "$2" ;;
    list)   list_worktrees ;;
    *)      echo "Usage: $0 {create|remove|list} [issue-id]" ;;
esac

Worktree Workflow

# 1. Create isolated worktree (claims issue automatically)
scripts/agent_worktree.sh create proj-abc1

# 2. Work in isolated directory
cd ../project-worktrees/proj-abc1

# Make changes with complete isolation
uv run pytest          # Won't affect other agents
uv sync               # Safe, isolated .venv
git add . && git commit -m "Implement feature"
git push -u origin work/proj-abc1

# 3. Create PR, get review, merge to main

# 4. Cleanup
scripts/agent_worktree.sh remove proj-abc1
bd close proj-abc1 && bd sync

Worktree vs Advisory Locks

Both mechanisms solve different problems:

| Mechanism      | Purpose           | Scope        |
|----------------|-------------------|--------------|
| Advisory Locks | WHO works on WHAT | Coordination |
| Worktrees      | HOW they work     | Isolation    |

They work together:

  1. Lock prevents duplicate claims (coordination)
  2. Worktree prevents file conflicts (isolation)

Disk Space Considerations

Each worktree duplicates working files but shares .git:

| Component     | Main Repo | Per Worktree |
|---------------|-----------|--------------|
| .git/         | ~200MB    | 0 (shared)   |
| Source code   | ~50MB     | ~50MB        |
| .venv/        | ~300MB    | ~300MB       |
| node_modules/ | ~200MB    | ~200MB       |
| Total         | ~750MB    | ~550MB       |

For 5 concurrent agents: ~750MB + 5×550MB = ~3.5GB total

Worth it for complete isolation.

Layer 5: Skillz - Domain-Specific Validation

Generic linting catches syntax errors. Skills catch domain errors.

Skillz is an MCP server that provides domain-specific validation skills.

What’s a Skill?

A skill is a specialized instruction set with tool restrictions:

# .ruler/skills/api-validator/SKILL.md
---
name: api-validator
description: Validate API endpoints follow conventions. Triggers on "api check", "endpoint review".
allowed-tools: Read, Grep, Glob
---

# API Validation Skill

## Checks

1. All endpoints return `{data, meta}` structure
2. Error responses include `error_code` and `message`
3. No breaking changes to existing endpoints
4. Authentication required on non-public routes

## Validation Process

1. Find all router files: `glob("**/routers/**/*.py")`
2. Check response models include required fields
3. Verify error handlers follow pattern
4. Report violations with file:line references

The allowed-tools restriction is crucial. A validation skill shouldn’t edit files, only read them.

Setting Up Skillz

Add to your MCP configuration (.mcp.json):

{
  "mcpServers": {
    "skillz": {
      "command": "uvx",
      "args": ["skillz@latest", "/path/to/your/project/.ruler/skills"]
    }
  }
}

Ruler can generate this automatically:

# ruler.toml
[mcp.skillz]
command = "uvx"
args = ["skillz@latest", ".ruler/skills"]

Skill Categories

Create skills for different validation needs:

.ruler/skills/
├── api-validator/      # API contract compliance
├── security-check/     # OWASP patterns
├── architecture/       # Layer boundaries
├── test-quality/       # Test coverage, meaningful assertions
└── domain-rules/       # Business logic constraints

Example architecture skill:

# .ruler/skills/architecture/SKILL.md
---
name: architecture
description: Validate layer boundaries. Triggers on "architecture check", "layer validation".
allowed-tools: Read, Grep, Glob
---

# Architecture Validation

## Layer Rules

| Layer             | Can Import From     |
|-------------------|---------------------|
| 1. Core           | Nothing             |
| 2. Domain         | Layer 1             |
| 3. Application    | Layers 1, 2         |
| 4. Infrastructure | Layers 1, 2, 3      |
| 5. Presentation   | All layers          |

## Validation

1. Find all Python files
2. Extract imports
3. Check each import against layer rules
4. Report violations: "file.py:15 - Data layer cannot import from Presentation layer"
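
The skill describes the checks in prose; the agent performs them with its read-only tools. For intuition, the same layer check could be scripted like this (the directory-to-layer mapping is an assumption):

import ast
from pathlib import Path

# Hypothetical mapping from top-level package to layer number
LAYER_OF = {"core": 1, "domain": 2, "application": 3,
            "infrastructure": 4, "presentation": 5}

def check_file(path: Path) -> list[str]:
    """Report imports that reach into a higher layer."""
    violations = []
    my_layer = LAYER_OF.get(path.parts[0])
    if my_layer is None:
        return violations
    for node in ast.walk(ast.parse(path.read_text())):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            names = ([a.name for a in node.names]
                     if isinstance(node, ast.Import) else [node.module or ""])
            for name in names:
                target = LAYER_OF.get(name.split(".")[0])
                if target is not None and target > my_layer:
                    violations.append(f"{path}:{node.lineno} - layer {my_layer} "
                                      f"cannot import from layer {target}")
    return violations

for f in Path(".").rglob("*.py"):
    for violation in check_file(f):
        print(violation)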

Invoking Skills

In your agent instructions:

## When to Use Skills

- Before committing router changes: invoke `api-validator`
- Before any commit: invoke `architecture`
- When adding auth code: invoke `security-check`

Agents can invoke skills via MCP tool calls or slash commands (depending on the client).

Layer 6: Agent Learning - Context Passing & State Consistency

One of the biggest challenges in multi-agent workflows is context passing between agents and managing state consistency at scale.

The Agent Learning System solves this by transforming ephemeral task context into durable institutional knowledge that persists across agents and sessions.

The Learning Problem

When Agent A discovers that “Repository classes must use AsyncSession, not engine directly,” that insight is lost when the session ends. Agent B working on a similar task later has to rediscover this pattern, wasting time and potentially making inconsistent decisions.

Without learning:

  • Each agent rediscovers the same patterns
  • Inconsistent decisions across agents
  • No shared understanding of architectural choices
  • Context lost when sessions end

With learning:

  • Agents inherit prior discoveries
  • Consistent decisions across sessions
  • Shared institutional knowledge
  • Context persists across machines

Architecture

The system uses a two-tier storage architecture:

  1. PostgreSQL database (project_dev) - Fast queries, semantic search, staleness detection
  2. Git-synced JSONL (.learnings/corpus.jsonl) - Cross-machine sync, version control, offline access
┌─────────────────────┐                    ┌─────────────────────┐
│   Agent A (Mac 1)   │                    │   Agent B (Mac 2)   │
│                     │                    │                     │
│ ┌─────────────────┐ │                    │ ┌─────────────────┐ │
│ │ Local PostgreSQL│ │                    │ │ Local PostgreSQL│ │
│ │  (learnings DB) │ │                    │ │  (learnings DB) │ │
│ └────────┬────────┘ │                    │ └────────▲────────┘ │
│          │ export   │                    │   import │          │
│          ▼          │                    │          │          │
│ ┌─────────────────┐ │                    │ ┌─────────────────┐ │
│ │ .learnings/     │ │                    │ │ .learnings/     │ │
│ │   corpus.jsonl  │ │                    │ │   corpus.jsonl  │ │
│ └────────┬────────┘ │                    │ └────────▲────────┘ │
└──────────┼──────────┘                    └──────────┼──────────┘
           │                                          │
           │ git push                    git pull     │
           │                                          │
           └───────────► GitHub/GitLab ◄──────────────┘
                    (.learnings/corpus.jsonl)

Storage Schema

Each learning captures:

| Field           | Purpose                                                                                      |
|-----------------|----------------------------------------------------------------------------------------------|
| insight         | The learning (1-3 sentences, specific)                                                       |
| category        | Classification: architecture, domain, patterns, trade-offs, anti-patterns, edge-cases, tooling |
| evidence        | Supporting evidence: {"files": [...], "commits": [...], "docs": [...]}                       |
| confidence      | 0.0-1.0 (decays over time)                                                                   |
| relevance_scope | Where it applies: global, layer-1 through layer-5, file-specific                             |
| tags            | Semantic tags for retrieval                                                                  |
| embedding       | 384-dim vector for semantic search                                                           |
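
As a concrete sketch, that schema could map to an ORM model like the following (assuming SQLAlchemy 2.x and the pgvector Python package; all names are illustrative):

from pgvector.sqlalchemy import Vector
from sqlalchemy import Float, String, Text
from sqlalchemy.dialects.postgresql import ARRAY, JSONB
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):
    pass

class Learning(Base):
    __tablename__ = "learnings"

    id: Mapped[int] = mapped_column(primary_key=True)
    insight: Mapped[str] = mapped_column(Text)            # 1-3 sentences
    category: Mapped[str] = mapped_column(String(32))     # e.g. "architecture"
    evidence: Mapped[dict] = mapped_column(JSONB)         # files/commits/docs
    confidence: Mapped[float] = mapped_column(Float)      # 0.0-1.0, decays
    relevance_scope: Mapped[str] = mapped_column(String(32))
    tags: Mapped[list[str]] = mapped_column(ARRAY(String))
    embedding = mapped_column(Vector(384))                # semantic search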

Automatic Learning Extraction

Learnings are automatically extracted from completed tasks via git hooks:

# Agent implements feature
git commit -m "Add feature X (proj-abc1)"

# Post-commit hook automatically:
# 1. Analyzes commit message, files changed, code diff
# 2. Uses LLM to extract key learnings
# 3. Stores learnings in database
# 4. Exports to .learnings/corpus.jsonl
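
A post-commit hook along these lines might be sketched as follows; extract_learnings_via_llm is a placeholder for the LLM step, not a real API:

import subprocess

def extract_learnings_via_llm(message: str, diff: str) -> list[dict]:
    """Placeholder for the LLM extraction step described above."""
    raise NotImplementedError

def on_post_commit() -> None:
    msg = subprocess.run(["git", "log", "-1", "--pretty=%B"],
                         capture_output=True, text=True).stdout
    diff = subprocess.run(["git", "show", "--stat", "HEAD"],
                          capture_output=True, text=True).stdout
    if "(proj-" not in msg:          # only extract for issue-tagged commits
        return
    for learning in extract_learnings_via_llm(msg, diff):
        print("extracted:", learning["insight"])  # store + export in practice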

The extraction prompt prioritizes implementation-specific insights:

  • Patterns & decisions: How was the feature implemented?
  • Research-backed choices: Thresholds, formulas based on research
  • Architectural choices: Why reuse existing infrastructure?
  • Edge cases: Floating-point precision, boundary conditions
  • Trade-offs: What alternatives were considered?

Context-Aware Retrieval

When starting a new task, agents retrieve relevant learnings:

learnings = get_task_relevant_learnings(
    task_description="Implement feature X",
    files_being_modified=["backend/models/user.py"],
    tags=["authentication", "security"],
    max_tokens=2000  # Stay within LLM context limits
)

Hybrid retrieval combines three signals:

  1. File/tag matching (30%): Exact file paths and tag overlap
  2. Embedding similarity (30%): Semantic similarity using vector embeddings
  3. LLM semantic scoring (40%): LLM understands conceptual alignment

Results are merged, deduplicated, and ranked by combined relevance score.
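
A sketch of that blend, using the 30/30/40 weights above (component scores are assumed to be normalized to [0, 1]):

def combined_score(file_tag: float, embedding: float, llm: float) -> float:
    """Blend the three retrieval signals into one relevance score."""
    return 0.3 * file_tag + 0.3 * embedding + 0.4 * llm

def rank(candidates: list[dict]) -> list[dict]:
    """Merge, dedupe by id (keeping the best score), and rank."""
    best: dict[str, dict] = {}
    for c in candidates:
        score = combined_score(c["file_tag"], c["embedding"], c["llm"])
        if c["id"] not in best or score > best[c["id"]]["score"]:
            best[c["id"]] = {**c, "score": score}
    return sorted(best.values(), key=lambda c: c["score"], reverse=True)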

Token Budget Management

To address the context passing challenge, the system includes token budget management:

# Greedy packing algorithm stays within budget
learnings = get_task_relevant_learnings(
    task_description="...",
    max_tokens=2000
)

# Response includes budget metadata
# {
#   "count": 5,
#   "learnings": [...],
#   "budget": {
#     "total_tokens": 1847,
#     "truncated_count": 2,
#     "truncated": true,
#     "available_budget": 1900
#   }
# }

This prevents context overflow while maximizing relevant information.
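
Greedy best-first packing is enough here. A sketch, assuming a rough 4-characters-per-token estimate:

def pack_learnings(ranked: list[dict], max_tokens: int) -> dict:
    """Take ranked learnings in order until the token budget is spent."""
    picked, total, truncated = [], 0, 0
    for learning in ranked:
        cost = len(learning["insight"]) // 4   # crude token estimate
        if total + cost <= max_tokens:
            picked.append(learning)
            total += cost
        else:
            truncated += 1                     # counted, not included
    return {"count": len(picked), "learnings": picked,
            "budget": {"total_tokens": total, "truncated_count": truncated,
                       "truncated": truncated > 0,
                       "available_budget": max_tokens}}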

Staleness Detection

To maintain state consistency, learnings automatically decay when evidence files are modified:

# Pre-commit hook detects modified files
git commit -m "Refactor event repository"

# Output:
# ⚠️  Reduced confidence for 2 learnings due to modified evidence files
#    Learnings automatically updated to prevent staleness

Confidence decay:

  • 10% reduction every 30 days (time-based)
  • 20% reduction when evidence files modified (event-based)
  • Learnings with confidence < 0.1 filtered out during retrieval
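
These rules reduce to a few lines. A sketch (the real system persists the updated confidence in the database):

def decayed_confidence(confidence: float, days_old: int,
                       evidence_modified: bool) -> float:
    """Apply time-based and event-based decay to a learning's confidence."""
    confidence *= 0.9 ** (days_old // 30)   # 10% per full 30-day period
    if evidence_modified:
        confidence *= 0.8                    # 20% when evidence changes
    return confidence

def is_retrievable(confidence: float) -> bool:
    return confidence >= 0.1                 # below 0.1: filtered out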

Git-Based Sync

Learnings sync across machines via git, similar to Beads:

# Agent A: Store learning
store_learning(
    insight="Repository classes must use AsyncSession, not engine directly",
    category="architecture",
    evidence_files=["backend/repositories/user_repository.py"],
    ...
)

# Commit triggers export
git commit -m "Add feature"
# → Exports to .learnings/corpus.jsonl

# Agent B: Pull and import
git pull
# → Post-merge hook imports learnings to local database

# Agent B can now query
get_task_relevant_learnings(...)
# → Sees learning from Agent A!

Benefits:

  • Offline-capable: Work locally, sync when connected
  • Version controlled: Git history shows learning evolution
  • No infrastructure: Reuses git, no separate database server
  • Conflict resolution: Git merge handles conflicts (timestamp wins)
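
The post-merge import implied by this flow can be sketched as an upsert where the newer timestamp wins (the id and updated_at fields, and the in-memory store, are stand-ins for the real database):

import json
from pathlib import Path

def import_corpus(path: str = ".learnings/corpus.jsonl") -> dict[str, dict]:
    """Merge the git-synced corpus into a local store, newest wins."""
    local: dict[str, dict] = {}
    for line in Path(path).read_text().splitlines():
        if not line.strip():
            continue
        incoming = json.loads(line)
        current = local.get(incoming["id"])
        if current is None or incoming["updated_at"] > current["updated_at"]:
            local[incoming["id"]] = incoming   # timestamp wins
    return local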

Observability

To address the debugging story challenge, the system provides observability:

# Learning corpus health
stats = get_learning_statistics()
# {
#   "total_count": 150,
#   "active_count": 120,
#   "invalidated_count": 30,
#   "by_category": {"architecture": 45, "domain": 30, ...},
#   "average_confidence": 0.87
# }

# Red flags:
# - Average confidence < 0.5 (too much low-quality data)
# - Invalidation rate > 50% (learnings becoming stale too fast)
# - One category dominates (imbalanced coverage)

Integration with Workflow

Before starting work:

# Retrieve relevant learnings as context
learnings = get_task_relevant_learnings(
    task_description=issue.description,
    files_being_modified=issue.files,
    tags=issue.tags
)

# Agent reviews learnings alongside documentation
# Makes consistent decisions based on prior discoveries

After completing work:

# Automatic extraction from commit
# (handled by post-commit hook)

# Or manual storage for specific insights
store_learning(
    insight="Algorithm stability can be negative after very poor performance",
    category="edge-cases",
    evidence_files=["backend/utils/calculation_tools.py"],
    confidence=0.9,
    tags=["algorithm", "edge-cases", "numerical-stability"]
)

Design Decisions

Why separate database (project_dev)?

  • Production data separation: AgentLearning is development tooling, not core platform
  • Physical isolation: Production database never contains development metadata
  • Clean backups: Production backups exclude development tooling data
  • Performance: Development tooling queries don’t impact production database

Why PostgreSQL + JSONB?

  • Structured schema enables filtering by category, confidence, tags
  • JSONB evidence provides flexible file/commit/doc references
  • Partial indexes optimize active high-confidence learnings
  • Semantic search via pgvector embeddings

Why NOT SQLite-vector?

  • Write locking = multi-agent conflicts
  • No concurrent writes (we have multiple agents)

Why NOT ChromaDB?

  • 10x storage overhead (10GB vs 1GB for same data)
  • Overkill for <1000 learnings

Future Enhancements

  1. Learning validation skill: /learning-check validates quality, detects duplicates
  2. Cross-agent learning sync: Real-time sync via PostgreSQL (currently git-based)
  3. Tier 2 consolidation: LLM-summarized patterns when corpus exceeds 1000 learnings
  4. Learning diffs: bd learning diff to show changes between machines

Layer 7: Git Hooks

Hooks enforce the workflow automatically. No discipline required.

Beads Hooks

Beads installs its own git hooks:

bd hooks install

This creates:

  • pre-commit: Flushes pending issue changes
  • pre-push: Validates issues are synced
  • post-merge: Auto-syncs after pull

Pre-commit Framework

Combine with pre-commit for code quality:

# .pre-commit-config.yaml
repos:
  # Code formatting
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.0
    hooks:
      - id: ruff-check
        args: [--fix]
      - id: ruff-format

  # Beads sync
  - repo: local
    hooks:
      - id: bd-sync
        name: bd sync (flush)
        entry: bash -c 'bd sync --flush-only || true'
        language: system
        always_run: true
        pass_filenames: false

  # Type checking
  - repo: local
    hooks:
      - id: mypy
        name: mypy
        entry: mypy
        language: system
        types: [python]

  # Ruler apply (regenerate agent configs if rules changed)
  - repo: local
    hooks:
      - id: ruler-apply
        name: ruler apply
        entry: npx @intellectronica/ruler apply
        language: system
        files: ^\.ruler/

Custom Pre-push Validation

#!/bin/bash
# .git/hooks/pre-push (or via bd hooks)

# 1. No uncommitted changes
if ! git diff --quiet; then
    echo "Error: Uncommitted changes"
    exit 1
fi

# 2. Issues synced
bd sync --status || {
    echo "Error: Beads not synced. Run 'bd sync'"
    exit 1
}

# 3. Tests pass with coverage
pytest --cov=. --cov-fail-under=80 || exit 1

# 4. Warn about in-progress issues
in_progress=$(bd list --status=in_progress --count)
if [ "$in_progress" -gt 0 ]; then
    echo "Warning: You have $in_progress in-progress issues"
    bd list --status=in_progress
    # Hooks have no terminal on stdin; read the answer from the tty
    read -p "Continue push? [y/N] " confirm < /dev/tty
    [ "$confirm" = "y" ] || exit 1
fi

Layer 8: Session Protocols

The most overlooked piece: how agents start and end work.

Session Start Protocol

Add this to your AGENTS.md:

## Multi-Agent Coordination

**Session Start Protocol** (MANDATORY):

Option A - Shared directory (simple):

1. Get latest state: `bd sync`
2. Find available work: `bd ready`
3. Claim atomically: `scripts/bd_claim.sh <id>`
4. Push claim: `bd sync`
5. Now read and plan: `bd show <id>`

Option B - Isolated worktree (recommended for parallel work):

1. Get latest state: `bd sync`
2. Find available work: `bd ready`
3. Create worktree: `scripts/agent_worktree.sh create <id>`
   (This claims + creates isolated directory)
4. Work in: `cd ../project-worktrees/<id>`

**Stale Work Detection**:

Check for potentially abandoned work:
- `bd list --status=in_progress`
- If resuming abandoned work: `bd comment <id> "Resuming from previous session"`

Session Close Protocol

## Session Close Protocol

**CRITICAL**: Before ending any session, complete this checklist:

For shared directory:

1. `git status` - Check what changed
2. `git add <files>` - Stage code changes
3. `bd sync` - Sync beads changes
4. `git commit -m "..."` - Commit code
5. `bd sync` - Sync any new beads changes
6. `git push` - Push to remote

For worktree:

1. `git add . && git commit` - Commit in worktree
2. `git push -u origin work/<id>` - Push branch
3. Create PR if ready
4. `scripts/agent_worktree.sh remove <id>` - Cleanup worktree
5. `bd close <id> && bd sync` - Close issue

**Work is not done until pushed.**

Putting It All Together

Here’s the complete workflow with worktrees:

Agent A (Machine 1)                    Agent B (Machine 2)
───────────────────                    ───────────────────
bd sync                                bd sync

bd ready                               bd ready
→ proj-001: Add auth                   → proj-001: Add auth
→ proj-002: Fix bug                    → proj-002: Fix bug

agent_worktree.sh create proj-001      (Agent A claims first)
→ Lock acquired
→ Worktree: ../worktrees/proj-001      bd sync (sees claim)

                                       bd ready
                                       → proj-002: Fix bug

cd ../worktrees/proj-001               agent_worktree.sh create proj-002
                                       → Lock acquired
                                       → Worktree: ../worktrees/proj-002

# Complete isolation                   cd ../worktrees/proj-002
uv run pytest  # Agent A's tests
uv sync        # Agent A's venv        uv run pytest  # Agent B's tests
                                       uv sync        # Agent B's venv

# Retrieve relevant learnings         # Retrieve relevant learnings
get_task_relevant_learnings(...)      get_task_relevant_learnings(...)
# → Sees prior architectural          # → Sees prior patterns
#   patterns discovered               #   from Agent A

git commit && git push                 git commit && git push
# work/proj-001 branch                 # work/proj-002 branch
# → Post-commit hook extracts         # → Post-commit hook extracts
#   learnings automatically            #   learnings automatically

agent_worktree.sh remove proj-001      agent_worktree.sh remove proj-002
bd close proj-001 && bd sync           bd close proj-002 && bd sync
# → Learnings exported to             # → Learnings exported to
#   .learnings/corpus.jsonl            #   .learnings/corpus.jsonl

Key properties:

  • No coordinator needed: Git is the source of truth
  • Atomic claims: PostgreSQL advisory locks prevent races
  • Complete isolation: Each agent has its own directory
  • Context persistence: Agent learning system captures and shares insights
  • Offline-capable: Work locally, sync when connected
  • Self-healing: Session-level locks auto-release on crash

Complete Setup Checklist

1. Install Tools

# Beads
brew install steveyegge/tap/bd

# Ruler (via npx, no install needed)
npx @intellectronica/ruler --help

# Pre-commit
pip install pre-commit

2. Initialize Project

cd your-project

# Initialize beads
bd init

# Create ruler structure
mkdir -p .ruler/skills
touch .ruler/ruler.toml
touch .ruler/AGENTS.md

# Create scripts directory
mkdir -p scripts

3. Configure Ruler

# .ruler/ruler.toml
[agents.claude]
output = "CLAUDE.md"
mcp_output = ".mcp.json"

[agents.cursor]
output = ".cursor/rules/AGENTS.md"

[mcp.skillz]
command = "uvx"
args = ["skillz@latest", ".ruler/skills"]

4. Write Your Rules

# .ruler/AGENTS.md

## Critical Constraints

1. Never decrease test coverage
2. TODOs must reference issues
3. No secrets in code

## Multi-Agent Coordination

**Session Start**:
- Simple: `bd sync` → `bd ready` → `bd_claim.sh <id>` → `bd sync` → plan
- Isolated: `bd sync` → `bd ready` → `agent_worktree.sh create <id>` → work

**Session Close**:
- Simple: `git status` → `git add` → `bd sync` → `git commit` → `bd sync` → `git push`
- Isolated: commit → push branch → remove worktree → `bd close <id>` → `bd sync`

## Before Every Commit

pytest && pylint --recursive=y .

5. Create Scripts

# Create advisory lock module
mkdir -p db
cat > db/advisory_locks.py << 'EOF'
# ... (see implementation above)
EOF

# Create claim script
cat > scripts/bd_claim.py << 'EOF'
# ... (see implementation above)
EOF

# Create worktree script
cat > scripts/agent_worktree.sh << 'EOF'
# ... (see implementation above)
EOF
chmod +x scripts/*.sh

6. Set Up Agent Learning

# Create separate database for agent learnings
createdb project_dev

# Enable pgvector extension (for semantic search)
psql project_dev -c "CREATE EXTENSION IF NOT EXISTS vector;"

# Set environment variable
export AGENT_LEARNING_DB_URL=postgresql+asyncpg://localhost:5432/project_dev

# Create .learnings directory (git-tracked)
mkdir -p .learnings

# Do NOT add .learnings/ to .gitignore:
# .learnings/ should be committed to git for cross-machine sync

Git hooks (already configured if using pre-commit):

  • Pre-commit: Exports learnings to .learnings/corpus.jsonl
  • Post-commit: Extracts learnings from commit (if issue ID present)
  • Post-merge: Imports learnings from .learnings/corpus.jsonl

Usage:

# Before starting work: Retrieve relevant learnings
learnings = get_task_relevant_learnings(
    task_description="Implement feature X",
    files_being_modified=["backend/models/user.py"],
    tags=["authentication", "security"]
)

# After completing work: Automatic extraction via git hook
# Or manual storage:
store_learning(
    insight="Repository classes must use AsyncSession, not engine directly",
    category="architecture",
    evidence_files=["backend/repositories/user_repository.py"],
    session_id="full-cycle-2026-01-10-123",
    confidence=1.0,
    tags=["async-patterns", "database"]
)

7. Generate and Commit

# Generate agent configs
npx @intellectronica/ruler apply

# Install hooks
bd hooks install
pre-commit install

# Commit everything
git add .
git commit -m "Add multi-agent workflow infrastructure"
git push

Measuring Success

How do you know if your multi-agent workflow is working?

| Metric                           | Healthy    | Warning    |
|----------------------------------|------------|------------|
| Race conditions per week         | 0          | >2         |
| Stale in_progress issues         | 0          | >1         |
| Merge conflicts on issues.jsonl  | <1/week    | >3/week    |
| Build artifact conflicts         | 0          | >1/week    |
| Agent idle time                  | <5%        | >20%       |
| Rule violations caught by hooks  | Decreasing | Increasing |
| Learning corpus size             | Growing    | Stagnant   |
| Average learning confidence      | >0.7       | <0.5       |
| Learning invalidation rate       | <30%       | >50%       |

Use bd stats for project health:

bd stats
# Open: 12  In Progress: 2  Closed: 45  Blocked: 3

Check worktree status:

scripts/agent_worktree.sh list
# Active worktrees:
#   proj-abc1 (branch: work/proj-abc1, uncommitted: 0 files)
#   proj-xyz2 (branch: work/proj-xyz2, uncommitted: 3 files)

Check learning corpus health:

from dev_tools.agent_learning.tools import get_learning_statistics

stats = get_learning_statistics()
# {
#   "total_count": 150,
#   "active_count": 120,
#   "invalidated_count": 30,
#   "by_category": {"architecture": 45, "domain": 30, ...},
#   "average_confidence": 0.87
# }

Common Pitfalls

1. Reading Before Claiming

# WRONG
bd show proj-001                         # Read first
# ... 5 minutes planning ...
bd update proj-001 --status=in_progress  # Too late! Agent B claimed it

# RIGHT (with advisory locks)
scripts/bd_claim.sh proj-001             # Atomic claim
bd show proj-001                         # Now read safely

2. Forgetting to Push

# Agent A
git commit -m "Done with proj-001"
bd close proj-001
# Forgets to push, closes laptop

# Agent B (next day)
bd sync  # Sees proj-001 still in_progress locally
# Confusion ensues

Solution: Pre-push hooks warn about uncommitted work.

3. Not Using Worktrees for Parallel Work

# WRONG: Two agents in same directory
# Agent A                    Agent B
uv sync                      uv sync
# Race condition on .venv!

# RIGHT: Isolated worktrees
agent_worktree.sh create proj-001    agent_worktree.sh create proj-002
cd ../worktrees/proj-001             cd ../worktrees/proj-002
uv sync                              uv sync
# Each has its own .venv - no conflicts

4. Not Using Dependencies

# WRONG: Create independent issues
bd create --title="Design API"
bd create --title="Implement API"
bd create --title="Test API"
# Agent might start "Test API" before "Implement API" is done

# RIGHT: Create with dependencies
bd create --title="Design API"        # → proj-001
bd create --title="Implement API"     # → proj-002
bd create --title="Test API"          # → proj-003
bd dep add proj-002 proj-001          # Implement depends on Design
bd dep add proj-003 proj-002          # Test depends on Implement

# Now `bd ready` only shows unblocked work

Conclusion

Multi-agent development isn’t about having more agents; it’s about coordination infrastructure. As one Reddit commenter noted:

“multi-agent workflows are becoming pretty essential for serious dev work tbh. the tradeoffs between agent specialization and orchestration complexity are real though - most people swing too hard one way or the other.”

The solution is a layered architecture that addresses each coordination challenge:

  1. Ruler: Single source of truth for agent rules
  2. Beads: Git-native issue tracking with dependencies
  3. Advisory Locks: Atomic claim operations (no races)
  4. Worktrees: Complete filesystem isolation
  5. Skillz: Domain-specific validation skills
  6. Agent Learning: Context passing & state consistency
  7. Git Hooks: Automatic enforcement
  8. Protocols: Explicit session start/end procedures

Key Insights

Coordination vs. Isolation: These are different problems requiring different solutions.

  • Advisory locks solve coordination (who works on what)
  • Worktrees solve isolation (how they work without conflicts)
  • Agent learning solves context passing (what agents know)

Context Passing & State Consistency: The agent learning system addresses the critical challenge:

“what usually trips teams up: context passing between agents and managing state consistency at scale.”

By transforming ephemeral task context into durable institutional knowledge, agents inherit prior discoveries and make consistent decisions across sessions and machines.

Observability: As another commenter noted:

“also the debugging story is rough. single agent is easier to reason about. multiple agents means you need solid observability and tracing to figure out where something went sideways.”

The system provides observability through:

  • Learning corpus health metrics
  • Staleness detection (automatic confidence decay)
  • Git history for learning evolution
  • Structured evidence linking learnings to files/commits/docs

Getting Started

The tools are open source and composable. Start simple, add layers as needed:

  1. Start: Beads for issue tracking
  2. Add: Ruler when you have multiple agents
  3. Layer in: Advisory locks when you hit race conditions
  4. Add: Worktrees when build artifacts conflict
  5. Enable: Agent learning when context passing becomes a bottleneck

The result: agents that work together without stepping on each other, with shared institutional knowledge and consistent decision-making.


Resources: