EDGE by Blue Orange Digital

Orchestrated Systems: Done Without You.

Tier 4: Orchestrated Systems

L4 is multi-agent. Specialized agents (planner, researcher, writer, reviewer, executor) coordinate via shared state, handoffs, and a control plane. The system runs durably across hours or days, with checkpointing, parallelism, retries, and audit. The "agent fleet" becomes a real org chart, and "AI ops" becomes a function.

L5 · Autonomous Operations · Agents orchestrate agents
L4 · Orchestrated Systems · Agent runs, human edge-gated
L3 · Workflow Agents (BOD entry) · AI runs a step, human reviews
L2 · Connected Intelligence · Human drives every step
L1 · Chat & Copilot · People, prompting

Adoption: ~2%
Who orchestrates: The agent (bounded)
Implementation: $1.5-3.5M
Time to tier: 12-18 months
EBITDA lift (cum.): 750-1,850 bps

Definition

A graph of agents, with shared state, evals, and a kill switch.

Technically: a graph of agents (LangGraph, CrewAI, AutoGen/AG2) or a managed platform (Bedrock AgentCore, Vertex Agent Builder, Mosaic Agent Bricks, Cortex Agents), with shared memory, evals as the heartbeat, observability, and edge-gated escalation.
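As a rough illustration of the definition above, here is a minimal, framework-free sketch of a graph of agents with shared state, an audit trace, and a kill switch. Every name in it (`State`, `run_graph`, the three agent functions) is hypothetical; it is not the API of LangGraph, CrewAI, or any managed platform, and the reviewer's check is only a stand-in for a real eval.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class State:
    """Shared state that every agent reads from and writes to (hypothetical shape)."""
    task: str
    notes: List[str] = field(default_factory=list)
    draft: str = ""
    approved: bool = False
    trace: List[str] = field(default_factory=list)  # audit trail of visited nodes


# An agent mutates the shared state and returns the next node name, or None to stop.
Agent = Callable[[State], Optional[str]]


def researcher(state: State) -> Optional[str]:
    state.notes.append(f"facts about {state.task}")
    return "writer"


def writer(state: State) -> Optional[str]:
    state.draft = f"Report on {state.task}: " + "; ".join(state.notes)
    return "reviewer"


def reviewer(state: State) -> Optional[str]:
    # Stand-in eval: a real system would score the draft against a test set.
    state.approved = len(state.draft) > 0
    return None if state.approved else "writer"


def run_graph(nodes: Dict[str, Agent], start: str, state: State,
              kill_switch: Callable[[State], bool], max_steps: int = 20) -> State:
    """Walk the graph until an agent stops, the kill switch fires, or steps run out."""
    current: Optional[str] = start
    for _ in range(max_steps):
        if current is None or kill_switch(state):
            break
        state.trace.append(current)
        current = nodes[current](state)
    return state


NODES: Dict[str, Agent] = {"researcher": researcher, "writer": writer, "reviewer": reviewer}

# Normal run: the kill switch never fires.
result = run_graph(NODES, "researcher", State(task="Q3 churn"), kill_switch=lambda s: False)

# Halted run: the kill switch fires after the first node has executed.
halted = run_graph(NODES, "researcher", State(task="Q3 churn"),
                   kill_switch=lambda s: len(s.trace) >= 1)
```

The kill switch is checked before every node, so an operator can stop a run between steps without relying on any agent to cooperate.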

Not human-in-the-loop as default. HITL is the exception, not the rule: if the evals can't tell whether the output is right, fixing the evals is the priority, not stapling a reviewer to the loop.
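One way to picture edge gating is as a function over eval results: output ships automatically when every eval passes, and a human sees it only when one fails. The sketch below assumes evals that return a score and a threshold; the two example evals, their names, and the thresholds are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class EvalResult:
    name: str
    score: float      # 0.0 to 1.0
    threshold: float  # minimum score to pass without a human


Eval = Callable[[str], EvalResult]


def mentions_invoice_id(output: str) -> EvalResult:
    # Hypothetical eval: the output must reference an invoice ID.
    return EvalResult("mentions_invoice_id", 1.0 if "INV-" in output else 0.0, 0.5)


def no_policy_edge(output: str) -> EvalResult:
    # Hypothetical eval: the word "override" marks a policy edge.
    return EvalResult("no_policy_edge", 0.0 if "override" in output.lower() else 1.0, 0.5)


def gate(output: str, evals: List[Eval]) -> Tuple[str, List[str]]:
    """Return ("auto", []) when every eval passes, else ("escalate", failed eval names)."""
    failed = [r.name for r in (e(output) for e in evals) if r.score < r.threshold]
    return ("escalate", failed) if failed else ("auto", [])


EVALS: List[Eval] = [mentions_invoice_id, no_policy_edge]

routine = gate("Matched INV-1042 to PO-77; approved.", EVALS)
edge_case = gate("Matched INV-1043; manual override requested.", EVALS)
```

The list of failed eval names travels with the escalation, so the reviewer sees why the case reached them, and a recurring failure points at the eval that needs fixing.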

L4 is also where lock-in becomes a real exit consideration. Bedrock, Vertex, Agentforce, and Mosaic Agent Bricks create durable platform dependencies. Edge Scale keeps the orchestration control plane portable across these runtimes, so a buyer doesn't inherit a vendor.

Framework vs. runtime: the distinction that orders the stack

Framework · the harness
Chosen for control & inspectability: LangGraph, CrewAI, Claude Agent SDK.

Decoupled

Runtime · where it executes
Chosen for governance & data proximity: Bedrock AgentCore, Mosaic AI, Azure AI Foundry.

Keeping the harness decoupled from the runtime lets us change one without re-platforming the other.
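A sketch of what that decoupling can look like in code, assuming the harness talks to the runtime only through a narrow interface. The `Runtime` protocol, both runtime classes, and `run_pipeline` are hypothetical; the managed runtime is a stub and makes no real SDK calls.

```python
from typing import Dict, List, Protocol


class Runtime(Protocol):
    """Hypothetical interface: all the harness knows about where it executes."""
    name: str

    def invoke(self, agent: str, payload: Dict[str, str]) -> Dict[str, str]: ...


class LocalRuntime:
    name = "local"

    def invoke(self, agent: str, payload: Dict[str, str]) -> Dict[str, str]:
        return {**payload, "agent": agent, "ran_on": self.name}


class ManagedRuntimeStub:
    """Placeholder for an adapter to a managed platform; no vendor SDK is called."""
    name = "managed-stub"

    def invoke(self, agent: str, payload: Dict[str, str]) -> Dict[str, str]:
        return {**payload, "agent": agent, "ran_on": self.name}


def run_pipeline(runtime: Runtime, steps: List[str],
                 payload: Dict[str, str]) -> List[Dict[str, str]]:
    """The harness: step order lives here, independent of the runtime."""
    return [runtime.invoke(step, payload) for step in steps]


STEPS = ["planner", "researcher", "writer"]
local_run = run_pipeline(LocalRuntime(), STEPS, {"task": "renewal brief"})
managed_run = run_pipeline(ManagedRuntimeStub(), STEPS, {"task": "renewal brief"})
```

Swapping runtimes changes one constructor call; the step list and the pipeline function stay untouched.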

What L4 actually looks like in a portfolio company

Function	In practice	Signal
Support	Agent-run tier 1 and tier 2; humans escalate only on edge cases	Outcome-priced
Sales	An SDR agent pool: research → send → reply classification → meeting booked	Fleet scale
Finance	AP agent: invoice intake → matching → approval routing → exception escalation	Approve exceptions only
HR	Recruiting agent: inbound screening → scheduling → first-round scorecard	Recruiter sees finalists
Engineering	Multi-Devin or Claude Agent SDK fleet working a backlog you can't hire against	Parallel SWE agents

The named picks at L4.

Multi-agent framework

LangGraph + LangSmith

BOD L4 default. Checkpointing, time-travel debug, graph viz. 110K+ stars, 35% of Fortune 500. · Lock-in: Low

Multi-agent (alt)

CrewAI

Role/crew metaphor lands with stakeholders. Insight-backed. Managed Enterprise. · Lock-in: Low

MS-aligned

AutoGen / AG2

Strong conversation patterns. Microsoft-adjacent procurement. · Lock-in: Low

Anthropic-native

Claude Agent SDK

The cleanest path when Claude is primary. · Lock-in: Low

AWS-managed

Bedrock AgentCore

Fully managed. Memory + KBs + action groups. AWS-only. · Lock-in: High (AWS)

GCP-managed

Vertex AI Agent Builder

ADK + Agent Studio + 200+ models including Claude and Gemini. · Lock-in: High (GCP)

Databricks-managed

Mosaic AI + Agent Bricks

Auto-tunes against benchmarks. Presupposes Lakehouse. · Lock-in: High (Dbrx)

Snowflake-managed

Cortex Agents

Snow-native. Easiest path if Snow is system of record. · Lock-in: High (Snow)

MS-managed

Copilot Studio + Azure AI Foundry

Most-adopted enterprise platform (38.6% per JetBrains 2026 survey). · Lock-in: Very high

Vertical CX

Decagon

$4.5B val. Notion, Duolingo, Substack logos. Proven multi-agent CX. · Lock-in: Medium-high

Vertical SWE

Cognition Multi-Devin

Parallel autonomous SWE. Pricey; needs supervision. · Lock-in: Medium

Evals heartbeat

Braintrust + OTel

Quality as engineering. OpenTelemetry as substrate. · Lock-in: Low

BOD positioning

Edge Scale is BOD's Agent Ops control plane, orchestration-agnostic by design: it runs on top of LangGraph, CrewAI, Bedrock, Vertex, Cortex, or Mosaic, and stays cloud-portable. It provides governance, audit, cost attribution, RBAC, SSO, a connector catalog, deployment blueprints, and a managed tier targeting a 99.9% SLA. "The architecture is the asset."

L4 in production today.

Support

Agent-run tier 1 + 2

Sierra (outcome-priced), Decagon (multi-agent enterprise), or LangGraph-built with Edge Scale on top. Human escalation only at policy edges.

Sales

SDR agent pool

A fleet of research → outreach → reply classification → meeting agents, with per-agent budgets and a shared eval harness. Pipeline scales without headcount.
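Per-agent budgets can be sketched as a cap that pauses an agent rather than letting it overspend. The class names, the cent-denominated costs, and the pause-on-overrun policy below are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


class BudgetExceeded(Exception):
    """Raised when a charge would push an agent past its cap."""


@dataclass
class AgentBudget:
    limit_cents: int
    spent_cents: int = 0

    def charge(self, cost_cents: int) -> None:
        if self.spent_cents + cost_cents > self.limit_cents:
            raise BudgetExceeded(
                f"cap {self.limit_cents}c, spent {self.spent_cents}c, requested {cost_cents}c"
            )
        self.spent_cents += cost_cents


@dataclass
class Fleet:
    budgets: Dict[str, AgentBudget]
    paused: List[str] = field(default_factory=list)

    def run_step(self, agent: str, cost_cents: int) -> bool:
        """Charge one step to an agent; pause the agent instead of overspending."""
        if agent in self.paused:
            return False
        try:
            self.budgets[agent].charge(cost_cents)
            return True
        except BudgetExceeded:
            self.paused.append(agent)
            return False


fleet = Fleet({"research": AgentBudget(100), "outreach": AgentBudget(50)})
outcomes = [fleet.run_step("research", 40) for _ in range(3)]
outreach_ok = fleet.run_step("outreach", 50)
```

Costs are tracked in integer cents to avoid floating-point drift, and the third research step is refused and the agent paused while the outreach agent keeps running under its own cap.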

Finance ops

AP agent fleet

Invoice intake → matching → approval routing → exception escalation. Finance approves exceptions, not invoices.
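The AP flow above can be sketched as a matching step followed by routing, with only the non-matching invoices reaching a person. The purchase-order ledger, the 1% tolerance, and the status labels are hypothetical; a production matcher would also check receipts, vendors, and duplicates.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Invoice:
    invoice_id: str
    po_number: str
    amount: float


# Hypothetical purchase-order ledger: PO number -> approved amount.
PURCHASE_ORDERS: Dict[str, float] = {"PO-1": 500.0, "PO-2": 1200.0}
TOLERANCE = 0.01  # assumed matching tolerance, as a fraction of the PO amount


def match(invoice: Invoice) -> str:
    """Compare an invoice against its purchase order and return a status label."""
    po_amount = PURCHASE_ORDERS.get(invoice.po_number)
    if po_amount is None:
        return "no_po"
    if abs(invoice.amount - po_amount) <= TOLERANCE * po_amount:
        return "matched"
    return "amount_mismatch"


def route(invoices: List[Invoice]) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Auto-approve matched invoices; escalate everything else with its reason."""
    approved: List[str] = []
    exceptions: List[Tuple[str, str]] = []
    for inv in invoices:
        status = match(inv)
        if status == "matched":
            approved.append(inv.invoice_id)
        else:
            exceptions.append((inv.invoice_id, status))
    return approved, exceptions


batch = [
    Invoice("INV-1", "PO-1", 500.0),   # exact match
    Invoice("INV-2", "PO-2", 1500.0),  # amount differs from the PO
    Invoice("INV-3", "PO-9", 80.0),    # unknown PO
]
approved, exceptions = route(batch)
```

Finance receives the `exceptions` list with a reason attached to each entry, and never sees the invoices in `approved`.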

HR

Recruiting fleet

Inbound screening → scheduling → first-round scorecard. The recruiter takes only shortlisted candidates into final rounds.

Engineering

Multi-Devin or Agent SDK fleet

Parallel autonomous SWE agents working the backlog the team can't hire against. Supervised by senior engineers; not autonomous in the L5 sense.

Product-embedded

The agent becomes the flagship value

Onboarding agent runs sign-up → first value. CS agent watches usage, flags risk, drafts save motions. The pricing model starts shifting from seat-based to outcome-based.

How L4 stalls.

Agents talking to agents, no observability. Failures uninterpretable, costs spike, trust dies.
Multi-agent as theater. "Specialists" are the same model with different prompts, just more tokens.
Bedrock Agents as a shortcut. Managed runtime doesn't replace evals, observability, or workflow ownership.
No memory governance. Persistent memory inherits no permissions from its sources, so a CFO's query surfaces an HR document 90 days later.
Skipping the eval harness. "We'll measure once it's live" means measuring via your first incident.
HITL as default. "Edge gating, not human-in-the-loop. HITL as a default is a design decision that says we don't trust our evals. Fix the evals."
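The memory-governance failure in the list above has a simple structural fix: each memory item carries the access roles of the document it came from, and recall filters on the caller's role. The sketch below is a toy in-memory store with hypothetical names and substring search standing in for retrieval.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class MemoryItem:
    text: str
    allowed_roles: FrozenSet[str]  # copied from the source document at write time


class GovernedMemory:
    """Toy persistent memory whose items inherit permissions from their source."""

    def __init__(self) -> None:
        self._items: List[MemoryItem] = []

    def write(self, text: str, source_roles: Iterable[str]) -> None:
        self._items.append(MemoryItem(text, frozenset(source_roles)))

    def recall(self, query: str, role: str) -> List[str]:
        """Return matching items the caller's role is allowed to see."""
        q = query.lower()
        return [
            item.text
            for item in self._items
            if role in item.allowed_roles and q in item.text.lower()
        ]


memory = GovernedMemory()
memory.write("Salary bands for 2025", source_roles={"hr"})
memory.write("Salary expense forecast", source_roles={"cfo", "finance"})

cfo_view = memory.recall("salary", role="cfo")
hr_view = memory.recall("salary", role="hr")
```

Because the permission check happens at recall time, it still applies however long the item has sat in memory.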

What L5 looks like.

The system initiates work unprompted: schedule, signal, or memory trigger.
It learns from its own outputs (not just "we update the prompt monthly").
Durable, governed memory across weeks and months.
It can refuse work: it knows what is out of scope or needs escalation.
KPIs are business outcomes (revenue, retention, cycle time), not "tasks completed."

Edge Scale is the L4 control plane.

Governance, audit, cost attribution, RBAC, SSO, a connector catalog, deployment blueprints, and a managed tier targeting a 99.9% SLA, all orchestration-agnostic by design. Cloud-portable across Databricks Mosaic AI, Snowflake Cortex, and AWS Bedrock AgentCore.
