From Manual Prompting to
Autonomous Agent Loops
A comprehensive guide for the BRIDGE MT engineering team. Covers the Agent Workspace, Loop Architecture, Skill Lifecycle, Automated Evaluations, and Enterprise Deployment patterns.
Why This Matters — Strategic Context
The transition from ephemeral, reactive prompting to structured, autonomous skills is a strategic imperative — not a luxury. We are building toward a "Proactive Loop" architecture where agents execute complex workflows autonomously.
The market dynamics tell a clear story: Anthropic's revenue run rate grew from $1B (Dec 2024) to $19B (Mar 2026). Claude Code alone achieved a $2.5B run rate shortly after reaching market readiness. In developer surveys (n=15,000), Claude leads for complex tasks at 28%, outpacing legacy models at 19%. This signals a flight to precision.
For BRIDGE MT, this means treating AI skills as living corporate assets — not throwaway prompts. By encoding institutional knowledge into structured skills, we prevent "shadow AI" proliferation and ensure that as underlying models evolve, our proprietary business logic remains intact and optimized.
1. Agent Workspace
Understanding the agent's execution environment is the foundation for everything that follows.
When Claude Code runs, it operates inside a sandboxed workspace — a dedicated directory where the agent can read files, execute scripts, and maintain state across turns. Think of it as the agent's "desk": everything it needs to work on a task lives here.
Workspace Structure
directory layout

```text
# Typical Claude Code workspace
~/project/                 # Your project root (mounted)
├── .claude/               # Claude Code config directory
│   ├── settings.json      # Project-level settings
│   └── commands/          # Custom slash commands
├── CLAUDE.md              # Project instructions (auto-loaded)
├── heartbeat.md           # Loop state file (more on this below)
├── src/                   # Your source code
└── ...
```
Key Files
Custom commands: each .md file in .claude/commands/ becomes a /command in the CLI. Use these to create team-specific workflows accessible via slash notation.
Tip: commit CLAUDE.md and .claude/ to your repo. This ensures every developer on your team gets the same agent behavior and project context automatically.
2. Loops & the Heartbeat Architecture
Loops transform Claude from a reactive chatbot into a proactive agent that monitors, acts, and reports on intervals. This is the foundation of our "Proactive Loop" architecture.
What is a Loop?
A loop is a recurring execution cycle where Claude Code runs a defined workflow at a set interval. Instead of waiting for you to type a command, the agent wakes up, checks for changes, performs actions, logs results, and goes back to sleep — automatically.
The /loop Command
The /loop command starts an interval-based trigger cycle. You provide a description of the workflow, and Claude executes it repeatedly.
terminal

```shell
# Basic syntax
/loop "Check Slack for new messages tagged @bridge-dev, summarize them,
and update heartbeat.md with findings"

# With explicit interval (every 30 minutes)
/loop --interval 30m "Monitor the #deployments channel for failed builds
and create a summary in heartbeat.md"

# Complex multi-step workflow
/loop "Every cycle:
1. Read heartbeat.md for last known state
2. Check GitHub PRs for reviews needing attention
3. Scan Slack #engineering for unresolved questions
4. Update heartbeat.md with findings and timestamps
5. If critical items found, draft a Slack message to #alerts"
```
How heartbeat.md Works
The heartbeat.md file is the agent's persistent memory between loop cycles. It functions as a structured
log and state tracker. Each cycle, Claude reads the file, processes updates, and writes its findings back.
heartbeat.md example

```markdown
# Heartbeat — Loop State

## Last Run
- Timestamp: 2026-03-18T14:30:00Z
- Status: OK
- Duration: 12s

## Findings
- 3 new PRs awaiting review (PR #142, #145, #147)
- Slack: 2 unresolved questions in #engineering
- No failed builds detected

## Next Actions
- [ ] Notify @sarah about PR #142 (assigned 2 days ago)
- [ ] Follow up on Slack thread about API rate limits

## History
| Cycle | Time  | Items Found | Actions Taken |
|-------|-------|-------------|---------------|
| 14    | 14:30 | 5           | 2 notified    |
| 13    | 14:00 | 2           | 0             |
| 12    | 13:30 | 3           | 1 notified    |
```
Interval-Based Trigger Gateway
The trigger gateway is the mechanism that wakes the agent at defined intervals. Under the hood, this uses a timer that re-invokes the Claude Code session at the specified frequency.
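Conceptually, the gateway behaves like the shell sketch below. This is illustrative only: run_workflow is a stand-in for the real claude -p invocation, and the interval and stub names are assumptions, not the actual implementation.

```shell
#!/bin/sh
# Illustrative sketch of an interval-based trigger gateway.
# Each tick: read state, do the work, write findings back, sleep.

HEARTBEAT="heartbeat.md"
INTERVAL=1            # seconds; a 30m loop would use 1800

run_workflow() {
  # Stand-in for: claude -p "Read $HEARTBEAT, run the workflow, log results"
  echo "- cycle completed at $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> "$HEARTBEAT"
}

echo "# Heartbeat log" > "$HEARTBEAT"
for cycle in 1 2 3; do     # a real gateway would run: while true
  run_workflow
  sleep "$INTERVAL"
done
```

The key design point is that all state lives in heartbeat.md, not in the process: if the session dies, the next cycle picks up exactly where the last one left off.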
Practical Loop Examples
Slack Monitoring Loop

```shell
/loop --interval 30m "
1. Read heartbeat.md for the last-seen message timestamp
2. Fetch new messages from #engineering since that timestamp
3. Identify messages where I'm tagged or that mention 'bridge-api'
4. Summarize each relevant thread in 1-2 sentences
5. Update heartbeat.md with the new timestamp and summaries
6. If any message is marked urgent, draft a response
"
```
PR Review Reminder Loop
```shell
/loop --interval 1h "
1. Check GitHub for open PRs in bridge-mt/core
2. Identify PRs with no review after 24 hours
3. For each stale PR, check the author and assigned reviewer
4. Draft a polite Slack DM to the reviewer
5. Log all actions in heartbeat.md
"
```
Build Health Monitor
```shell
/loop --interval 15m "
1. Run 'gh run list --limit 5' to check latest CI runs
2. If any run has status 'failure':
   a. Fetch the logs with 'gh run view {id} --log-failed'
   b. Identify the failing step and likely root cause
   c. Post a summary to #build-alerts
3. Update heartbeat.md with build health status
"
```
3. Commands Reference
A complete reference of Claude Code CLI commands, slash commands, and key shortcuts for daily developer use.
Core CLI Commands
| Command | Description | Example |
|---|---|---|
| claude | Start an interactive Claude Code session | claude |
| claude "prompt" | Run a one-shot command without interactive mode | claude "fix the failing test in auth.ts" |
| claude -p "prompt" | Print mode — output only, no interactivity | claude -p "explain this function" \| pbcopy |
| claude config | Edit global or project-level configuration | claude config set model claude-opus-4-6 |
| claude mcp | Manage MCP server connections | claude mcp add slack-server |
| claude update | Update Claude Code to the latest version | claude update |
Slash Commands (Inside a Session)
| Command | Description | Category |
|---|---|---|
| /loop | Start an interval-based automation loop | Automation |
| /help | Show available commands and usage | General |
| /clear | Clear conversation context | General |
| /compact | Compress context to save token budget | Context |
| /cost | Show token usage and estimated cost for this session | Monitoring |
| /model | Switch the active model mid-session | Config |
| /permissions | View and manage tool permissions | Security |
| /review | Request a code review of recent changes | Dev |
| /pr-comments | Fetch and address PR review comments | Dev |
| /init | Initialize CLAUDE.md for a new project | Setup |
Custom Commands
Create your own slash commands by placing .md files in .claude/commands/.
Each file becomes a callable command.
.claude/commands/deploy-check.md

```markdown
# Deploy Readiness Check

Run the following checks before deployment:

1. Execute the full test suite: npm test
2. Run the linter: npm run lint
3. Check for security vulnerabilities: npm audit
4. Verify build succeeds: npm run build
5. Summarize results and flag any blockers.

If all checks pass, output: "DEPLOY READY"
If any check fails, output: "DEPLOY BLOCKED" with details.
```
Now any team member can run /deploy-check in their Claude Code session to trigger this standardized workflow.
Tip: commit the .claude/commands/ directory to your repo. This creates a shared command library that every developer inherits automatically — a lightweight way to standardize workflows without heavy tooling.
4. Automation & Cron Patterns
Moving beyond reactive usage to proactive agent automation. Three tiers: session loops, scheduled tasks, and persistent CI/CD.
The Automation Spectrum
| Tier | Mechanism | Persistence | Best For |
|---|---|---|---|
| Tier 1 | /loop + heartbeat.md | While session is active | Monitoring, summaries, ad-hoc checks |
| Tier 2 | Scheduled Tasks (Desktop App) | While Desktop app is running | Daily reports, recurring workflows |
| Tier 3 | GitHub Actions / Custom Scripts | Server-side, always on | CI/CD, nightly audits, production monitoring |
Tier 1: Session Loops (Covered Above)
Use /loop within an active Claude Code session. The agent stays alive and re-executes your workflow at
intervals. State is maintained in heartbeat.md. Terminates when you end the session.
Tier 2: Scheduled Tasks (Desktop App / Cowork)
The Claude Desktop application and Cowork mode support scheduled tasks — persistent automations that survive session restarts. These are managed through the scheduling system.
scheduled task creation

```text
# In Cowork or Desktop: create a daily 9 AM summary
Create a scheduled task:
Name: "Daily Engineering Summary"
Schedule: "Every weekday at 09:00"
Workflow: "
1. Check all open PRs in bridge-mt org
2. Summarize overnight Slack activity in #engineering
3. List today's calendar events from shared team calendar
4. Compile into a morning briefing
5. Post to #daily-standup"
```
Managing Scheduled Tasks
| Action | How |
|---|---|
| List all tasks | list_scheduled_tasks — shows all active schedules with their IDs |
| Create new task | create_scheduled_task — define name, schedule (cron syntax), and workflow |
| Update existing | update_scheduled_task — modify schedule, workflow, or pause/resume |
| Delete task | Update with disabled status |
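For reference when writing schedules, standard cron expressions use five fields. The annotated examples below are illustrative values, not required ones:

```text
┌──── minute (0-59)
│ ┌──── hour (0-23)
│ │ ┌──── day of month (1-31)
│ │ │ ┌──── month (1-12)
│ │ │ │ ┌──── day of week (0-6, Sun=0)
* * * * *

'0 9 * * 1-5'   -> weekdays at 09:00
'30 8 * * 1'    -> Mondays at 08:30
'0 2 * * *'     -> every day at 02:00
```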
Tier 3: GitHub Actions & Custom Scripts
For automations that must run regardless of whether anyone's machine is on — nightly code audits, scheduled deployments, automated PR reviews — use GitHub Actions with Claude Code.
See the dedicated GitHub Actions section below for full details.
5. Writing Good Skills
Skills are the bridge between ad-hoc prompting and repeatable, enterprise-grade AI workflows. A well-written skill encodes institutional knowledge into a reusable, testable, versionable asset.
What is a Skill?
A skill is a structured instruction set — typically a SKILL.md file inside a folder — that tells Claude
exactly how to perform a specific task. Unlike a one-off prompt, a skill is persistent, shareable, and measurable.
skill folder structure

```text
skills/
└── nda-review/
    ├── SKILL.md           # Core instructions + trigger description
    ├── templates/         # Reference templates and examples
    │   └── nda-template.md
    ├── examples/          # Gold-standard input/output pairs
    │   ├── input-1.md
    │   └── output-1.md
    └── evals/             # Test prompts and benchmarks
        └── test-cases.yaml
```
Anatomy of a Good SKILL.md
Every production skill should include these components: a description that defines both trigger and non-trigger scope, explicit prerequisites, a step-by-step workflow, and a precise output specification — as in the example below.
Example: NDA Review Skill (Encoded Preference)
```markdown
# NDA Review Skill

description: "Review NDAs against BRIDGE MT legal standards.
Trigger for: 'review this NDA', 'check NDA terms', 'NDA compliance check'.
NOT for: general contract review, employment agreements, SLAs."

## Prerequisites
- PDF or DOCX file containing the NDA
- Access to BRIDGE MT legal template (templates/nda-standard.md)

## Workflow
1. Extract full text from the uploaded NDA document
2. Compare each clause against our standard NDA template
3. Flag deviations in these categories:
   - RED: Unacceptable terms (e.g., unlimited liability, IP assignment)
   - YELLOW: Non-standard but negotiable (e.g., extended term, broad definition)
   - GREEN: Matches our standard
4. Generate a structured review document with:
   - Executive summary (2-3 sentences)
   - Clause-by-clause analysis table
   - Recommended redlines with suggested alternative language
   - Risk assessment score (1-10)

## Output
- Format: DOCX file
- Naming: NDA_Review_{CounterpartyName}_{Date}.docx
- Must include BRIDGE MT letterhead
```
6. Skill Taxonomy: Capability Uplift vs. Encoded Preference
Effective governance begins with categorization. Understanding this distinction dictates how we invest in maintenance and anticipate model evolution.
| Dimension | Capability Uplift | Encoded Preference |
|---|---|---|
| Core Function | Gives the AI a technical capability it doesn't natively possess | Bundles existing abilities into a proprietary, rigid workflow |
| Examples | Native Frontend Design, Docx/PDF generation skills | Social Media Automation, NDA Review, Sales Meeting Summarization |
| Obsolescence Risk | High — base models eventually inherit these features | Low — tied to specific business logic that models cannot guess |
| Maintenance Strategy | Audit quarterly. Retire when native model capability matches. | Continuously refine. These are long-lived corporate assets. |
| Value Driver | Immediate utility — fills gaps in current model capabilities | Institutional knowledge — encodes proprietary business recipes |
Decision Matrix: When to Build Which Type
7. The Skill Creator Agent
The Skill Creator is a meta-skill — a skill that writes, evaluates, and optimizes other skills. It acts as a governance engine, ensuring every skill adheres to best practices.
What the Skill Creator Does
Using the Skill Creator
terminal

```shell
# Invoke the Skill Creator skill
/skill-creator

# Then describe what you need:
"Create a skill that converts YouTube video transcripts into LinkedIn posts.
It should:
- Extract key insights (max 5)
- Format for LinkedIn's algorithm (hook + value + CTA)
- Match our BRIDGE MT tone of voice
- Include 3-5 relevant hashtags
- Output as markdown"
```
The Skill Creator will then generate the full SKILL.md, create test prompts, run evaluations with sub-agents, measure the pass rate, and iterate until quality criteria are met.
8. The Reverse-Engineering Development Methodology
We don't engage in iterative "prompt guessing." Instead, we identify the perfect output and work backward to the technical instructions. This is outcome-oriented development.
The Six-Step Process
9. Evaluations & Benchmarking
Manual evaluation doesn't scale. We deploy isolated sub-agents to stress-test every skill, ensuring measurable quality and preventing personal bias from affecting results.
Evaluation Architecture
The evaluation system uses a three-tier "Judge" architecture. This ensures that the skill is tested objectively and that improvements are data-driven.
Standardized Benchmarks
| Metric | Description | Business Impact |
|---|---|---|
| Pass Rate | Percentage of criteria met (e.g., 17/17) | Primary metric — defines "Enterprise Readiness" |
| Execution Duration | Time taken to complete the full execution loop | Operational throughput and developer productivity |
| Token Consumption | Total tokens used during execution | Cost reduction and ROI maximization |
| Trigger Accuracy | Frequency of correct activation via YAML keywords | System reliability and user experience |
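Pass rate is simply criteria met divided by criteria total, expressed as a percentage. A quick shell sketch with illustrative numbers:

```shell
# Pass rate = criteria met / criteria total, as a percentage.
# 15 of 17 criteria met is an illustrative case, not a benchmark target.
met=15
total=17
pass_rate=$(awk -v m="$met" -v t="$total" 'BEGIN { printf "%.1f", m / t * 100 }')
echo "Pass rate: $met/$total ($pass_rate%)"   # prints: Pass rate: 15/17 (88.2%)
```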
Running Evaluations in Practice
terminal

```shell
# Using the Skill Creator to run evaluations
/skill-creator "Evaluate the 'nda-review' skill:
- Run 5 test cases from evals/test-cases.yaml
- Use 2 isolated sub-agents for each test
- Score against all 17 success criteria
- Report pass rate, execution time, and token usage
- Suggest top 3 improvements if pass rate < 100%"
```
Sample Evaluation Output
```text
Skill: nda-review v2.3
Test Cases: 5 / 5 executed
Sub-Agents: 2 per test (10 total runs)

Results:
  Pass Rate:        15/17 (88.2%)
  Avg Duration:     34.2s
  Avg Token Usage:  12,400 tokens
  Trigger Accuracy: 100% (5/5 correct activations)

Failed Criteria:
  #9  - Missing risk score in 2/5 outputs
  #14 - Letterhead not applied when input is PDF

Suggested Fixes:
  1. Add explicit "ALWAYS include risk score" instruction after clause analysis
  2. Add PDF-specific preprocessing step to extract text before template application
  3. Add validation checkpoint: "Before finalizing, verify: risk score present?
     Letterhead applied?"
```
10. Description Tuning & Trigger Optimization
In a multi-skill environment, trigger precision is everything. If the model cannot differentiate between two similar skills, the system fails.
Shot / Not Trigger Logic
Every skill description must define clear boundaries using "Shot" (should trigger) and "Not Trigger" (should not trigger) examples. This is how the system disambiguates between similar skills.
description optimizer example

```yaml
# Sales Meeting Summarization Skill
description: "Summarize sales call transcripts into structured briefs
  with action items and next steps, formatted per BRIDGE MT sales process."
triggers:
  shot:
    - "Summarize the transcript from today's call with the lead from Sandra"
    - "Process this sales meeting recording"
    - "Create a brief from the Acme Corp demo call"
  not_trigger:
    - "Create a strategy summary for the quarterly planning meeting"
    - "Summarize the all-hands meeting notes"
    - "Write meeting minutes for the board session"
```
This differentiation ensures that the sales skill is not wasted on general strategy tasks, preserving the context window for its intended purpose. Use the Skill Creator's Description Optimizer to refine these boundaries automatically.
Optimization Workflow
terminal

```shell
/skill-creator "Optimize the description for the 'sales-meeting-summary' skill:
- Test trigger accuracy against 20 sample prompts (10 should trigger, 10 should not)
- Report current accuracy
- Suggest improved description text
- Re-test with improved description
- Report improvement delta"
```
11. Obsolescence Management
As underlying LLMs improve, some skills become redundant. Proactive retirement reduces system overhead and ensures the agent uses the most efficient native pathways.
Quarterly Audit Protocol
Every quarter, the team should audit all Capability Uplift skills against the current model's native abilities. The process is straightforward:
Move retired skills to an archive/ directory with a note explaining why they were retired. They may become useful again if model behavior changes.
Example: if a future model gains native document generation, the docx Capability Uplift skill may become partially redundant. However, any branding-specific formatting instructions within it should be extracted and preserved as an Encoded Preference skill before retirement.
12. GitHub Actions & Custom Scripts
For durable, server-side automation that runs independently of any developer's machine. This is Tier 3 of the automation spectrum — always-on, event-driven or scheduled.
Claude Code in GitHub Actions
Claude Code can be integrated into GitHub Actions workflows to automate code reviews, documentation generation,
security audits, and more. The key is using claude -p (print mode) for non-interactive execution.
.github/workflows/claude-pr-review.yml

```yaml
name: Claude PR Review

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Install Claude Code
        run: npm install -g @anthropic-ai/claude-code

      - name: Run Code Review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          # Get the diff for this PR
          DIFF=$(git diff origin/main...HEAD)

          # Run Claude in print mode for non-interactive use
          claude -p "Review this PR diff. Focus on:
          1. Security vulnerabilities
          2. Performance issues
          3. Code style violations per our CLAUDE.md standards
          4. Missing error handling
          Output as a structured markdown review.
          Diff: $DIFF" > review.md

      - name: Post Review Comment
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const review = fs.readFileSync('review.md', 'utf8');
            github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: review
            });
```
Scheduled GitHub Actions (Cron)
.github/workflows/nightly-audit.yml

```yaml
name: Nightly Code Audit

on:
  schedule:
    - cron: '0 2 * * 1-5'   # Weekdays at 2 AM UTC

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install Claude Code
        run: npm install -g @anthropic-ai/claude-code

      - name: Run Nightly Audit
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          claude -p "Perform a nightly code audit:
          1. Check for TODO/FIXME comments added in the last 24h
          2. Identify any new dependencies added (check package.json changes)
          3. Flag any files over 500 lines that may need refactoring
          4. Check for hardcoded secrets or API keys
          5. Output a structured JSON report" > audit.json
```
Custom Scripts (Standalone)
For automation outside of GitHub — internal tools, local cron jobs, or CI/CD pipelines in other systems — use Claude Code in script mode:
scripts/daily-summary.sh

```shell
#!/bin/bash
# Daily engineering summary — run via crontab or systemd timer

export ANTHROPIC_API_KEY="$CLAUDE_API_KEY"

# Generate summary
SUMMARY=$(claude -p "Generate today's engineering summary:
- Parse git log for the last 24 hours
- Count merged PRs, open PRs, and closed issues
- Identify the top 3 most-changed files
- Format as a concise Slack message with emoji")

# Post to Slack via webhook
curl -X POST "$SLACK_WEBHOOK_URL" \
  -H 'Content-type: application/json' \
  -d "{\"text\": \"$SUMMARY\"}"
```
13. Scheduled Tasks (Desktop App)
The Claude Desktop App and Cowork mode provide a built-in scheduling system for recurring tasks — no GitHub Actions or cron setup required. Ideal for business-process automation.
Creating Scheduled Tasks
In Cowork or the Desktop App, you can create tasks that run on a schedule. These persist between sessions and execute automatically as long as the app is running.
example: creating a scheduled task

```text
# Ask Claude to create a scheduled task:
"Create a scheduled task called 'Weekly Skill Audit' that runs
every Monday at 9 AM. It should:
1. List all active skills in the skills/ directory
2. Check each Capability Uplift skill for obsolescence
3. Run a basic pass rate test on the top 5 most-used skills
4. Generate a brief audit report
5. Save the report as weekly-audit-{date}.md"
```
Task Management
To review what's running, use the list_scheduled_tasks tool to see every active schedule, its frequency, and last run status.
To add a schedule, call create_scheduled_task with proper cron syntax.
To modify, pause, or resume a schedule, call update_scheduled_task.
14. Enterprise Integration
For AI to drive true ROI, it must operate within a shared context — where a single skill can inform multiple platforms simultaneously.
Integration Pathways
MCP Servers (Model Context Protocol)
MCP servers are the primary integration mechanism. They allow Claude to interact with external services (Slack, GitHub, Google Drive, Figma, databases) through standardized tool interfaces.
adding an MCP server

```shell
# Add a Slack MCP server to your project
claude mcp add slack-server

# The server provides tools like:
# - slack_send_message
# - slack_search_channels
# - slack_read_thread
# These become available as tools in every Claude session for this project.
```
CLI-First Integration Philosophy
For Google Workspace, internal APIs, and other services, we prefer a CLI-based approach over heavy middleware. The reasoning: CLI is the most token-efficient abstraction layer. It avoids the massive overhead of complex API request/response cycles and keeps the context window focused on the actual task.
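As a sketch of this pattern, context is gathered with one compact CLI call and piped straight into print mode. Here summarize is a stub so the pipeline shape is runnable on its own; in practice that stage would be a claude -p call, and the gh invocation in the comment assumes the GitHub CLI is installed:

```shell
# CLI-first sketch: gather compact context with a cheap CLI call, then
# pipe it straight into print mode. `summarize` is a stand-in for e.g.:
#   gh pr list --limit 10 | claude -p "Summarize these PRs for standup"
summarize() { wc -l; }   # stub: "summary" = count of input lines

printf '%s\n' \
  "#142 Fix auth token refresh" \
  "#145 Add rate limiting to bridge-api" \
  | summarize
```

The point is token economy: a few lines of CLI output enter the context window, instead of a full API request/response cycle with headers, pagination metadata, and client boilerplate.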
HTML over Native Formats
For generated outputs like slide decks and reports, AI produces superior results in HTML. We use HTML for internal assets and convert to legacy formats (PowerPoint, PDF) only when necessary for external distribution. This approach leverages the model's strongest output modality.
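A minimal sketch of that flow, with a static heredoc standing in for the model's HTML output; the conversion step is left commented because it assumes a converter such as pandoc is available:

```shell
# HTML-first sketch: author the asset as HTML, convert to a legacy
# format only when needed for external distribution.
# The heredoc stands in for: claude -p "Build the Q1 report as HTML" > report.html
cat > report.html <<'EOF'
<!doctype html>
<html><body>
  <h1>Q1 Engineering Report</h1>
  <p>Authored HTML-first; converted to PDF or PPTX only on demand.</p>
</body></html>
EOF

# External distribution only (assumes a converter like pandoc is installed):
# pandoc report.html -o report.docx
```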
Shared Context Architecture
The goal is "analyze once, deploy everywhere." A single skill execution can feed data into Excel for analysis, PowerPoint for presentation, and Slack for notification — all within the same context window.
15. Compliance & Security
Enterprise deployment must adhere to strict safety protocols. Data sovereignty, environment isolation, and risk mitigation are non-negotiable foundations.
Security Principles
Permission Management
terminal

```text
# View current permissions
/permissions

# Project-level permissions in .claude/settings.json
{
  "permissions": {
    "allow": [
      "Read",
      "Edit",
      "Bash(npm test)",
      "Bash(npm run build)",
      "mcp__slack__*"
    ],
    "deny": [
      "Bash(rm -rf *)",
      "Bash(git push --force)"
    ]
  }
}
```
Commit this configuration to your repo so all team members inherit the same guardrails. Use allow-lists (not deny-lists) as the primary security model: explicitly permit what's needed, deny everything else by default.
Quick Reference Card
Daily Commands
| Action | Command |
|---|---|
| Start a session | claude |
| One-shot command | claude "your prompt here" |
| Script/pipe mode | claude -p "prompt" \| output |
| Start a loop | /loop "workflow description" |
| Check token cost | /cost |
| Compress context | /compact |
| Code review | /review |
| Create/test a skill | /skill-creator |
Skill Lifecycle Checklist
| Phase | Action | Tool |
|---|---|---|
| 1. Design | Define gold standard output | Manual |
| 2. Build | Create SKILL.md via reverse-engineering | Skill Creator |
| 3. Test | Run evaluations with sub-agents | Skill Creator evals |
| 4. Tune | Optimize description & triggers | Description Optimizer |
| 5. Deploy | Commit to repo, share with team | Git |
| 6. Monitor | Track pass rate, token usage, trigger accuracy | Benchmarks |
| 7. Audit | Quarterly obsolescence check | Native comparison test |
Automation Promotion Path
BRIDGE MT · Claude Code Developer Tutorial v1.0 · March 2026
For internal use. Questions? Reach out in #engineering on Slack.
Based on the Operational Excellence Framework for Enterprise AI Skill Lifecycle.