Technology Apr 27, 2026 · 15 min read

Deep Dive into Open Agent SDK (Part 4): Multi-Agent Collaboration

DEV Community
by NEE
A single Agent, no matter how powerful, is just one executor. Real development tasks are often multi-step and multi-role: someone explores the codebase, someone designs a plan, then someone writes code and runs tests. A single Agent working alone easily bloats its context and loses efficiency.

Open Agent SDK addresses this at three levels:

  1. Sub-Agents — The main Agent dynamically spawns sub-agents during execution, delegating specialized tasks
  2. Task System — Tracks progress and results of multi-step work
  3. Team + Messaging — Multiple Agents form a team, communicating via a mailbox system

This article analyzes each level's implementation, then examines how they combine for task orchestration.

1. Sub-Agents: SubAgentSpawner Protocol and AgentTool

SubAgentSpawner Protocol

AgentTool does not create the new Agent directly; a protocol layer sits in between. SubAgentSpawner is defined in Types/AgentTypes.swift:

public protocol SubAgentSpawner: Sendable {
    func spawn(
        prompt: String,
        model: String?,
        systemPrompt: String?,
        allowedTools: [String]?,
        maxTurns: Int?
    ) async -> SubAgentResult

    func spawn(
        prompt: String,
        model: String?,
        systemPrompt: String?,
        allowedTools: [String]?,
        maxTurns: Int?,
        disallowedTools: [String]?,
        mcpServers: [AgentMcpServerSpec]?,
        skills: [String]?,
        runInBackground: Bool?,
        isolation: String?,
        name: String?,
        teamName: String?,
        mode: PermissionMode?,
        resume: String?
    ) async -> SubAgentResult
}

Two methods: a basic version (5 parameters) and an enhanced version (14 parameters: the same five plus nine more options). The protocol also provides a default implementation in which the enhanced version calls the basic one, so existing implementations don't need changes to stay compatible.
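The default-implementation trick can be sketched with a protocol extension. This is a simplified illustration rather than the SDK's actual source: the parameter lists are shortened, SubAgentResult is reduced to two fields, and EchoSpawner is a hypothetical conformer.

```swift
// Simplified stand-ins for the SDK types; only the shape matters here.
struct SubAgentResult: Sendable {
    let text: String
    let isError: Bool
}

protocol SubAgentSpawner: Sendable {
    // Basic entry point that every conformer must implement.
    func spawn(prompt: String, model: String?, maxTurns: Int?) async -> SubAgentResult

    // Enhanced entry point (extra options shortened to two for this sketch).
    func spawn(
        prompt: String, model: String?, maxTurns: Int?,
        disallowedTools: [String]?, name: String?
    ) async -> SubAgentResult
}

extension SubAgentSpawner {
    // Default: ignore the extra options and fall back to the basic method.
    func spawn(
        prompt: String, model: String?, maxTurns: Int?,
        disallowedTools: [String]?, name: String?
    ) async -> SubAgentResult {
        await spawn(prompt: prompt, model: model, maxTurns: maxTurns)
    }
}

// Hypothetical conformer written against the basic method only.
struct EchoSpawner: SubAgentSpawner {
    func spawn(prompt: String, model: String?, maxTurns: Int?) async -> SubAgentResult {
        SubAgentResult(text: "echo: \(prompt)", isError: false)
    }
}

// Calling the enhanced method still works through the default implementation.
let echoed = await EchoSpawner().spawn(
    prompt: "hi", model: nil, maxTurns: nil,
    disallowedTools: ["Bash"], name: "helper"
)
print(echoed.text)  // echo: hi
```

A conformer that wants the extra options simply implements the enhanced method itself, which takes precedence over the extension.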

Why is the spawner defined in Types/ instead of Core/? Because Tools/Advanced/AgentTool.swift needs it, but Tools/ shouldn't import Core/. The protocol is defined in Types/, with concrete implementation in Core/, injected via ToolContext.agentSpawner — a common dependency inversion pattern in the SDK.

DefaultSubAgentSpawner Implementation

DefaultSubAgentSpawner in Core/DefaultSubAgentSpawner.swift does the following:

final class DefaultSubAgentSpawner: SubAgentSpawner, @unchecked Sendable {
    private let apiKey: String
    private let baseURL: String?
    private let parentModel: String
    private let parentTools: [ToolProtocol]
    private let provider: LLMProvider
    private let client: (any LLMClient)?

    func spawn(...) async -> SubAgentResult {
        // 1. Filter out AgentTool to prevent infinite recursion
        var subTools = parentTools.filter { $0.name != "Agent" }

        // 2. If allowedTools specified, filter further
        if let allowed = allowedTools, !allowed.isEmpty {
            let allowedSet = Set(allowed)
            subTools = subTools.filter { allowedSet.contains($0.name) }
        }

        // 3. disallowedTools filters again (higher priority than allowedTools)
        if let disallowed = disallowedTools, !disallowed.isEmpty {
            let disallowedSet = Set(disallowed)
            subTools = subTools.filter { !disallowedSet.contains($0.name) }
        }

        // 4. Create sub-agent and execute
        let options = AgentOptions(
            apiKey: apiKey,
            model: model ?? parentModel,
            systemPrompt: systemPrompt,
            maxTurns: maxTurns ?? 10,
            tools: subTools
        )
        let agent = Agent(options: options)
        let result = await agent.prompt(prompt)

        return SubAgentResult(
            text: result.text.isEmpty
                ? "(Subagent completed with no text output)"
                : result.text,
            toolCalls: [],
            isError: result.status != .success
        )
    }
}

Key points:

  • Recursion prevention: Sub-agents never receive AgentTool, preventing Agent-in-Agent-in-Agent scenarios
  • Tool inheritance: Sub-agents inherit all parent tools (except AgentTool) by default, but can be restricted via allowedTools/disallowedTools
  • Blocking execution: The parent Agent awaits after calling spawn(), waiting for the sub-agent to finish before continuing
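The three filtering steps can be isolated into a pure function over tool names, which makes the precedence easy to check. This is a sketch: subAgentToolNames is a hypothetical helper, and it works on plain strings instead of ToolProtocol values.

```swift
/// Applies the three filtering steps to a list of tool names.
func subAgentToolNames(
    parent: [String],
    allowed: [String]? = nil,
    disallowed: [String]? = nil
) -> [String] {
    // 1. Never hand the Agent tool to a sub-agent.
    var names = parent.filter { $0 != "Agent" }

    // 2. A missing or empty allow-list means "inherit everything".
    if let allowed = allowed, !allowed.isEmpty {
        let allowedSet = Set(allowed)
        names = names.filter { allowedSet.contains($0) }
    }

    // 3. The deny-list runs last, so it wins over the allow-list.
    if let disallowed = disallowed, !disallowed.isEmpty {
        let deniedSet = Set(disallowed)
        names = names.filter { !deniedSet.contains($0) }
    }
    return names
}

let parentTools = ["Read", "Glob", "Grep", "Bash", "Agent"]

let inherited = subAgentToolNames(parent: parentTools)
let restricted = subAgentToolNames(
    parent: parentTools,
    allowed: ["Read", "Bash", "Agent"],  // "Agent" is dropped regardless
    disallowed: ["Bash"]
)

print(inherited)   // ["Read", "Glob", "Grep", "Bash"]
print(restricted)  // ["Read"]
```

Listing "Agent" in the allow-list has no effect, because the recursion guard runs before the allow-list is consulted.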

AgentTool: The Sub-Agent Tool as Seen by the LLM

AgentTool is the tool exposed to the LLM. When the LLM calls the Agent tool, it passes a prompt and parameters. AgentTool handles calling the spawner to generate a sub-agent.

It has two built-in sub-agent types:

private let BUILTIN_AGENTS: [String: AgentDefinition] = [
    "Explore": AgentDefinition(
        name: "Explore",
        description: "Fast agent specialized for exploring codebases...",
        systemPrompt: "You are a codebase exploration agent...",
        tools: ["Read", "Glob", "Grep", "Bash"],
        maxTurns: 10
    ),
    "Plan": AgentDefinition(
        name: "Plan",
        description: "Software architect agent for designing implementation plans...",
        systemPrompt: "You are a software architect. Design implementation plans...",
        tools: ["Read", "Glob", "Grep", "Bash"],
        maxTurns: 10
    ),
]

  • Explore: Codebase exploration, using Glob to find files, Grep to search content, Read to read files
  • Plan: Software architect, understanding the codebase then outputting implementation plans

When the LLM calls AgentTool, it specifies the type via the subagent_type field:

{
  "prompt": "Explore the project structure and find all Swift source files",
  "description": "Explore codebase",
  "subagent_type": "Explore"
}

AgentTool also supports optional parameters: model, maxTurns, run_in_background, isolation, team_name, mode. These are passed through to the spawner.
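One way to picture the input side is a Decodable struct that maps the snake_case JSON keys onto Swift properties. The AgentToolInput type below is hypothetical (the SDK's real input type is not shown in this article) and covers only some of the fields named above.

```swift
import Foundation

// Hypothetical input type; property names are assumptions for illustration.
struct AgentToolInput: Decodable {
    let prompt: String
    let description: String?
    let subagentType: String?
    let model: String?
    let maxTurns: Int?
    let runInBackground: Bool?

    enum CodingKeys: String, CodingKey {
        case prompt, description, model, maxTurns
        case subagentType = "subagent_type"
        case runInBackground = "run_in_background"
    }
}

let json = """
{
  "prompt": "Explore the project structure and find all Swift source files",
  "description": "Explore codebase",
  "subagent_type": "Explore"
}
"""

// Optional fields that are absent from the JSON decode to nil.
let toolInput = try JSONDecoder().decode(AgentToolInput.self, from: Data(json.utf8))
print(toolInput.subagentType ?? "none")  // Explore
print(toolInput.maxTurns == nil)         // true
```

Because every optional field decodes to nil when missing, the spawner can apply its own fallbacks, such as the maxTurns ?? 10 seen in DefaultSubAgentSpawner.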

A Complete Example

The SDK includes a SubagentExample demonstrating the full flow of a coordinator main Agent delegating to an Explore sub-agent:

// Main agent system prompt
let systemPrompt = """
You are a coordinator agent. When given a task, you should delegate it to a sub-agent \
using the Agent tool. The Agent tool will spawn a specialized agent (e.g., "Explore" type) \
that can use Read, Glob, Grep, and Bash tools to investigate the codebase. \
After the sub-agent returns its findings, summarize the results for the user.
"""

// Register tools: core tools + AgentTool
let agent = createAgent(options: AgentOptions(
    apiKey: apiKey,
    model: defaultModel,
    systemPrompt: systemPrompt,
    maxTurns: 10,
    tools: getAllBaseTools(tier: .core) + [createAgentTool()]
))

// Send task — main Agent will call AgentTool to delegate to Explore sub-agent
for await message in agent.stream("""
    Explore the current project directory. Find all Swift source files, \
    examine the project structure, and provide a summary. \
    Use the Agent tool to delegate this task to an Explore sub-agent.
""") {
    switch message {
    case .toolUse(let data):
        if data.toolName == "Agent" {
            print("[Sub-agent Delegation: \(data.toolName)]")
        }
    case .toolResult(let data):
        print("[Result: \(data.content.prefix(200))]")
    case .result(let data):
        print("Turns: \(data.numTurns), Cost: $\(data.totalCostUsd)")
    default:
        break
    }
}

Execution flow: user sends prompt → main Agent decides it needs to explore the codebase → calls AgentTool → AgentTool spawns Explore sub-agent via spawner → sub-agent uses Glob/Grep/Read → results returned to main Agent → main Agent summarizes and responds to user.

2. Task System: Task Tracking and State Machine

Sub-agents solve the "who does the work" problem. The Task system solves "how much work is done, who's doing it, and what are the results."

TaskStore: Thread-Safe Actor

TaskStore is a Swift Actor, ensuring concurrency safety:

public actor TaskStore {
    private var tasks: [String: Task] = [:]
    private var taskCounter: Int = 0

    public func create(
        subject: String,
        description: String? = nil,
        owner: String? = nil,
        status: TaskStatus = .pending
    ) -> Task {
        taskCounter += 1
        let id = "task_\(taskCounter)"
        let now = dateFormatter.string(from: Date())
        let task = Task(
            id: id, subject: subject, description: description,
            status: status, owner: owner,
            createdAt: now, updatedAt: now
        )
        tasks[id] = task
        return task
    }
}

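Using an Actor instead of a regular class means access to its mutable state is serialized by the Swift runtime, with no manual locking needed. Multiple Agents creating tasks simultaneously won't cause data races, and because create() contains no suspension points, each call receives a unique ID.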
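The serialization guarantee can be observed with a small experiment. The sketch below uses a hypothetical IdAllocator actor that mirrors only the counter logic of TaskStore, then calls it from 100 concurrent child tasks.

```swift
// Minimal counter actor mirroring how TaskStore hands out IDs.
actor IdAllocator {
    private var counter = 0

    func next() -> String {
        counter += 1
        return "task_\(counter)"
    }
}

/// Requests `count` IDs from concurrent child tasks and collects them.
func allocateConcurrently(count: Int) async -> [String] {
    let allocator = IdAllocator()
    return await withTaskGroup(of: String.self, returning: [String].self) { group in
        for _ in 0..<count {
            group.addTask { await allocator.next() }
        }
        var collected: [String] = []
        for await id in group {
            collected.append(id)
        }
        return collected
    }
}

let ids = await allocateConcurrently(count: 100)
print(Set(ids).count)  // 100
```

The order in which IDs arrive is not deterministic, but no ID is ever handed out twice, which is the property the task store relies on.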

Task State Machine

Tasks have 5 states with clear transition rules:

public enum TaskStatus: String, Sendable, Equatable, Codable {
    case pending      // Waiting to start
    case inProgress   // In progress
    case completed    // Completed
    case failed       // Failed
    case cancelled    // Cancelled
}

State transitions have constraints: pending and inProgress can transition to any state, but completed, failed, and cancelled are terminal states that cannot change:

private func isValidTransition(from: TaskStatus, to: TaskStatus) -> Bool {
    switch from {
    case .pending, .inProgress:
        return true
    case .completed, .failed, .cancelled:
        return false  // Terminal state, cannot transition
    }
}

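As a state diagram of the typical paths (the rule above also permits other moves out of pending and inProgress, such as pending → completed):

pending ──→ inProgress ──→ completed
   │            │
   │            ├──→ failed
   │            │
   └────────────┴──→ cancelled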
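To see everything the rule actually permits, the transition function can be run over every pair of states. This is a self-contained sketch that re-declares the enum with CaseIterable added for enumeration (an addition for this example only).

```swift
enum TaskStatus: String, CaseIterable {
    case pending, inProgress, completed, failed, cancelled
}

// Same rule as isValidTransition above.
func isValidTransition(from: TaskStatus, to: TaskStatus) -> Bool {
    switch from {
    case .pending, .inProgress:
        return true
    case .completed, .failed, .cancelled:
        return false
    }
}

// Enumerate every (from, to) pair and keep the allowed ones.
let allowedTransitions: [String] = TaskStatus.allCases.flatMap { from in
    TaskStatus.allCases
        .filter { isValidTransition(from: from, to: $0) }
        .map { "\(from.rawValue) -> \($0.rawValue)" }
}

print(allowedTransitions.count)                                // 10
print(allowedTransitions.contains("pending -> completed"))     // true
print(allowedTransitions.contains("inProgress -> inProgress")) // true
print(allowedTransitions.contains("completed -> inProgress"))  // false
```

Ten transitions are allowed in total: two non-terminal source states times five targets, including self-transitions and inProgress back to pending.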

TaskStatus also has a convenient parse() method supporting both camelCase (inProgress) and snake_case (in_progress), since LLM JSON formats aren't always consistent:

public static func parse(_ string: String) -> TaskStatus? {
    if let direct = TaskStatus(rawValue: string) { return direct }
    // snake_case → camelCase
    let camel = string
        .split(separator: "_")
        .enumerated()
        .map { $0.offset == 0 ? String($0.element) : String($0.element).capitalized }
        .joined()
    return TaskStatus(rawValue: camel)
}
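A few calls show what parse() accepts and where it gives up. The block below repeats the method inside a trimmed copy of the enum so that it runs on its own; Foundation is imported because capitalized comes from there.

```swift
import Foundation  // provides String.capitalized

enum TaskStatus: String {
    case pending, inProgress, completed, failed, cancelled

    static func parse(_ string: String) -> TaskStatus? {
        if let direct = TaskStatus(rawValue: string) { return direct }
        // snake_case → camelCase: keep the first word, capitalize the rest.
        let camel = string
            .split(separator: "_")
            .enumerated()
            .map { $0.offset == 0 ? String($0.element) : String($0.element).capitalized }
            .joined()
        return TaskStatus(rawValue: camel)
    }
}

let fromSnake = TaskStatus.parse("in_progress")
let fromCamel = TaskStatus.parse("inProgress")
let fromUpper = TaskStatus.parse("IN_PROGRESS")  // first word stays "IN"
let unknown = TaskStatus.parse("done")

print(fromSnake == .inProgress)  // true
print(fromCamel == .inProgress)  // true
print(fromUpper == nil)          // true
print(unknown == nil)            // true
```

Upper-case input is not normalized, since only the words after the first are re-cased; callers that expect such input would need to lowercase it first.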

Task Struct

A Task instance includes dependency relationships and metadata beyond basic status tracking:

public struct Task: Sendable, Equatable, Codable {
    public let id: String
    public var subject: String
    public var description: String?
    public var status: TaskStatus
    public var owner: String?        // Who's working on it
    public let createdAt: String
    public var updatedAt: String
    public var output: String?       // Result
    public var blockedBy: [String]?  // Blocked by which tasks
    public var blocks: [String]?     // Which tasks this blocks
    public var metadata: [String: String]?
}

The blockedBy and blocks fields show the Task system has built-in task dependency support — Task A can declare "I need Tasks B and C to complete before I can start."
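A readiness check over blockedBy might look like the sketch below. The isReady helper is hypothetical and not part of the SDK, and the struct is a trimmed stand-in named WorkItem to avoid clashing with Swift's own Task type.

```swift
enum Status {
    case pending, inProgress, completed, failed, cancelled
}

// Trimmed stand-in for the SDK's Task struct.
struct WorkItem {
    let id: String
    var status: Status
    var blockedBy: [String]? = nil
}

/// A task is ready when it is pending and every blocker is completed.
/// Unknown blocker IDs count as unfinished.
func isReady(_ item: WorkItem, in all: [String: WorkItem]) -> Bool {
    guard item.status == .pending else { return false }
    return (item.blockedBy ?? []).allSatisfy { all[$0]?.status == .completed }
}

let items: [String: WorkItem] = [
    "task_1": WorkItem(id: "task_1", status: .completed),
    "task_2": WorkItem(id: "task_2", status: .inProgress),
    "task_3": WorkItem(id: "task_3", status: .pending, blockedBy: ["task_1"]),
    "task_4": WorkItem(id: "task_4", status: .pending, blockedBy: ["task_1", "task_2"]),
]

let readyIds = items.values
    .filter { isReady($0, in: items) }
    .map(\.id)
    .sorted()

print(readyIds)  // ["task_3"]
```

task_4 stays blocked because task_2 is still in progress; once task_2 completes, the same check would report it as ready.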

Three Task Tools

The SDK provides three tools for the LLM to operate the Task system:

TaskCreate — Create a task:

public func createTaskCreateTool() -> ToolProtocol {
    return defineTool(
        name: "TaskCreate",
        description: "Create a new task for tracking work progress.",
        inputSchema: taskCreateSchema,
        isReadOnly: false
    ) { (input: TaskCreateInput, context: ToolContext) in
        guard let taskStore = context.taskStore else {
            return ToolExecuteResult(content: "Error: TaskStore not available.", isError: true)
        }
        let initialStatus: TaskStatus = input.status.flatMap { TaskStatus.parse($0) } ?? .pending
        let task = await taskStore.create(
            subject: input.subject,
            description: input.description,
            owner: input.owner,
            status: initialStatus
        )
        return ToolExecuteResult(
            content: "Task created: \(task.id) - \"\(task.subject)\" (\(task.status.rawValue))",
            isError: false
        )
    }
}

TaskList — List tasks (supports filtering by status and owner):

// LLM can query "list all pending tasks" or "list tasks assigned to agent-1"
let tasks = await taskStore.list(status: status, owner: input.owner)

TaskUpdate — Update a task (status, description, owner, output):

do {
    let task = try await taskStore.update(
        id: input.id,
        status: status,
        description: input.description,
        owner: input.owner,
        output: input.output
    )
    return ToolExecuteResult(
        content: "Task updated: \(task.id) - \(task.status.rawValue) - \"\(task.subject)\"",
        isError: false
    )
} catch let error as TaskStoreError {
    return ToolExecuteResult(content: "Error: \(error.localizedDescription)", isError: true)
}

Note that TaskStore.update() throws an invalidStatusTransition error when asked to make an illegal change, e.g., moving a completed task back to inProgress. TaskUpdate catches it and returns an error result, so the LLM receives the message and can adjust its strategy.

3. Team System: Team Formation and Management

The Task system tracks "what to do." The Team system answers "who works with whom."

TeamStore

Like TaskStore, TeamStore is an Actor:

public actor TeamStore {
    private var teams: [String: Team] = [:]
    private var teamCounter: Int = 0

    public func create(
        name: String,
        members: [TeamMember] = [],
        leaderId: String = "self"
    ) -> Team {
        teamCounter += 1
        let id = "team_\(teamCounter)"
        let team = Team(
            id: id, name: name, members: members,
            leaderId: leaderId,
            createdAt: dateFormatter.string(from: Date()),
            status: .active
        )
        teams[id] = team
        return team
    }
}

Teams have two states: active and disbanded. Deleting a Team doesn't actually delete it — the status changes to disbanded. Disbanded Teams cannot have members added or removed.
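The soft-delete behavior can be sketched with a small actor. MiniTeamStore, its method names, and TeamError are assumptions made for this example and do not reproduce the SDK's TeamStore API.

```swift
enum TeamStatus { case active, disbanded }
enum TeamError: Error { case notFound, teamDisbanded }

struct Team {
    let id: String
    var members: [String] = []
    var status: TeamStatus = .active
}

actor MiniTeamStore {
    private var teams: [String: Team] = [:]

    func create(id: String) {
        teams[id] = Team(id: id)
    }

    // "Delete" keeps the record and only flips the status.
    func delete(id: String) throws {
        guard teams[id] != nil else { throw TeamError.notFound }
        teams[id]?.status = .disbanded
    }

    // Membership changes are rejected once the team is disbanded.
    func addMember(_ name: String, to id: String) throws {
        guard let team = teams[id] else { throw TeamError.notFound }
        guard team.status == .active else { throw TeamError.teamDisbanded }
        teams[id]?.members.append(name)
    }

    func get(id: String) -> Team? { teams[id] }
}

let store = MiniTeamStore()
await store.create(id: "team_1")
try await store.addMember("agent-coder", to: "team_1")
try await store.delete(id: "team_1")

var rejected = false
do {
    try await store.addMember("agent-planner", to: "team_1")
} catch {
    rejected = true
}

let kept = await store.get(id: "team_1")
print(rejected)             // true
print(kept?.members ?? [])  // ["agent-coder"]
```

Keeping the disbanded record around preserves the team's history, at the cost of every mutating method having to check the status first.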

TeamMember and Roles

public enum TeamRole: String, Sendable, Equatable, Codable {
    case leader   // Team leader
    case member   // Regular member
}

public struct TeamMember: Sendable, Equatable, Codable {
    public let name: String
    public let role: TeamRole
}

When TeamCreateTool creates a Team, all members default to member role, and leaderId defaults to "self" (the creator):

let members: [TeamMember] = input.members?.map { TeamMember(name: $0) } ?? []
let team = await teamStore.create(
    name: input.name,
    members: members,
    leaderId: "self"
)

TeamStore also supports dynamic member management:

// Add member
try await teamStore.addMember(teamId: "team_1", member: TeamMember(name: "agent-coder"))

// Remove member (actor methods must be awaited from outside the actor)
try await teamStore.removeMember(teamId: "team_1", agentName: "agent-coder")

// Find which team an Agent belongs to
let team = await teamStore.getTeamForAgent(agentName: "agent-coder")

getTeamForAgent is important for messaging — when sending a message, you need to know which Team the sender belongs to in order to verify the recipient is a teammate.

AgentRegistry: Agent Registration

Besides TeamStore, there's an AgentRegistry tracking all active Agents:

public actor AgentRegistry {
    private var agents: [String: AgentRegistryEntry] = [:]
    private var nameIndex: [String: String] = [:]  // name -> agentId

    public func register(agentId: String, name: String, agentType: String) throws -> AgentRegistryEntry {
        if nameIndex[name] != nil {
            throw AgentRegistryError.duplicateAgentName(name: name)
        }
        let entry = AgentRegistryEntry(...)
        agents[agentId] = entry
        nameIndex[name] = agentId
        return entry
    }

    public func getByName(name: String) -> AgentRegistryEntry? {
        guard let agentId = nameIndex[name] else { return nil }
        return agents[agentId]
    }
}

Names are unique within an AgentRegistry: registering a second Agent under an existing name throws duplicateAgentName. nameIndex maps names to agent IDs, so lookups by name take O(1) time on average.

4. Messaging: MailboxStore and SendMessage

With Teams in place, Agents need to communicate. The SDK uses a Mailbox pattern — messages aren't pushed directly to the recipient but placed in their mailbox for them to pick up.

MailboxStore

public actor MailboxStore {
    private var mailboxes: [String: [AgentMessage]] = [:]

    // Point-to-point send
    public func send(from: String, to: String, content: String, type: AgentMessageType = .text) {
        let message = AgentMessage(from: from, to: to, content: content,
                                   timestamp: dateFormatter.string(from: Date()), type: type)
        if mailboxes[to] == nil { mailboxes[to] = [] }
        mailboxes[to]?.append(message)
    }

    // Broadcast — to all Agents with mailboxes
    public func broadcast(from: String, content: String, type: AgentMessageType = .text) {
        let timestamp = dateFormatter.string(from: Date())
        for (agentName, _) in mailboxes {
            let message = AgentMessage(from: from, to: agentName, content: content,
                                       timestamp: timestamp, type: type)
            mailboxes[agentName]?.append(message)
        }
    }

    // Read and clear mailbox
    public func read(agentName: String) -> [AgentMessage] {
        guard let messages = mailboxes[agentName] else { return [] }
        mailboxes[agentName] = []  // Clear after reading
        return messages
    }
}

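Three core operations: send delivers to one recipient, broadcast delivers to every existing mailbox, and read returns the pending messages. read is destructive: it clears the mailbox after returning its contents. broadcast only reaches Agents that already have a mailbox and does not create new ones; as written, the loop also does not skip the sender's own mailbox.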
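These semantics are easy to verify with a cut-down mailbox. The MiniMailbox actor below is a sketch that drops timestamps and message types; it is not the SDK's MailboxStore.

```swift
struct Message: Sendable {
    let from: String
    let to: String
    let content: String
}

actor MiniMailbox {
    private var boxes: [String: [Message]] = [:]

    // Point-to-point: creates the recipient's mailbox on first use.
    func send(from: String, to: String, content: String) {
        boxes[to, default: []].append(Message(from: from, to: to, content: content))
    }

    // Delivers only to agents that already have a mailbox entry.
    func broadcast(from: String, content: String) {
        for name in Array(boxes.keys) {
            boxes[name]?.append(Message(from: from, to: name, content: content))
        }
    }

    // Destructive read: returns the messages and empties the box.
    func read(agentName: String) -> [Message] {
        guard let messages = boxes[agentName] else { return [] }
        boxes[agentName] = []
        return messages
    }
}

let mailbox = MiniMailbox()
await mailbox.send(from: "coordinator", to: "explorer", content: "Start exploring")
await mailbox.broadcast(from: "coordinator", content: "Phase 1 complete")

let first = await mailbox.read(agentName: "explorer")
let second = await mailbox.read(agentName: "explorer")
let planner = await mailbox.read(agentName: "planner")

print(first.count, second.count, planner.count)  // 2 0 0
```

The planner never had a mailbox when the broadcast went out, so it receives nothing; an agent that must hear broadcasts needs a mailbox created beforehand, for instance by an initial direct message.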

Message types beyond plain text (.text) include .shutdownRequest, .shutdownResponse, .planApprovalResponse — special types for team management coordination.

SendMessage Tool

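SendMessageTool performs five checks before delivering a message: three on the availability of context (the stores and the sender name), then two on team membership: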

// 1. Must have MailboxStore
guard let mailboxStore = context.mailboxStore else { ... }
// 2. Must have TeamStore
guard let teamStore = context.teamStore else { ... }
// 3. Must know who the sender is
guard let senderName = context.senderName else { ... }

// 4. Sender must be in a Team
guard let team = await teamStore.getTeamForAgent(agentName: senderName) else { ... }

// 5. Recipient must be a teammate
let isMember = team.members.contains { $0.name == input.to }
guard isMember else { ... }

Broadcast uses "*" as recipient:

{ "to": "*", "message": "Phase 1 complete, starting Phase 2." }

Point-to-point uses a specific name:

{ "to": "agent-coder", "message": "Here's the spec for module A." }

Failed validations return error messages. The LLM can see which members are available and adjust the target.
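The membership checks can be sketched as a pure function. Everything here is illustrative: validateSend, SendCheck, and the error wording are hypothetical, and treating "*" as exempt from the teammate lookup is an assumption about how broadcast is handled.

```swift
struct Team {
    let name: String
    let members: [String]
}

enum SendCheck: Equatable {
    case ok
    case rejected(String)
}

/// Mirrors checks 3 to 5: known sender, sender in a team, recipient is a teammate.
func validateSend(sender: String?, to recipient: String, teams: [Team]) -> SendCheck {
    guard let sender = sender else {
        return .rejected("Sender is unknown.")
    }
    guard let team = teams.first(where: { $0.members.contains(sender) }) else {
        return .rejected("\(sender) is not in a team.")
    }
    // Assumption: "*" is the broadcast address and needs no teammate lookup.
    if recipient == "*" { return .ok }
    guard team.members.contains(recipient) else {
        // Listing the roster lets the caller pick a valid target next time.
        let roster = team.members.joined(separator: ", ")
        return .rejected("\(recipient) is not in \(team.name). Members: \(roster)")
    }
    return .ok
}

let teams = [Team(name: "refactor-team", members: ["coordinator", "explorer", "planner"])]

let toTeammate = validateSend(sender: "coordinator", to: "planner", teams: teams)
let toStranger = validateSend(sender: "coordinator", to: "agent-coder", teams: teams)
let fromOutsider = validateSend(sender: "lurker", to: "planner", teams: teams)

print(toTeammate == .ok)    // true
print(toStranger == .ok)    // false
print(fromOutsider == .ok)  // false
```

Returning the reason as data rather than throwing matches how the tool surfaces failures: the message goes back to the LLM as an error result it can act on.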

5. Orchestration Patterns: Combining These Capabilities

Individual Agent, Task, Team, and Mailbox capabilities are clear. How do they combine in practice?

Pattern 1: Main Agent + Parallel Sub-Agents

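The simplest pattern. The main Agent receives a complex task and delegates portions of it to multiple sub-agents. Each spawn() call blocks until its sub-agent finishes, so the sequence below handles one sub-agent at a time; whether delegations can overlap presumably depends on how the tool calls are scheduled (for example via run_in_background):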

let agent = createAgent(options: AgentOptions(
    apiKey: apiKey,
    model: "claude-sonnet-4-6",
    systemPrompt: """
    You are a coordinator. Break complex tasks into subtasks, \
    delegate each to an Explore sub-agent, then synthesize results.
    """,
    maxTurns: 20,
    tools: getAllBaseTools(tier: .core) + [
        createAgentTool(),
        createTaskCreateTool(),
        createTaskUpdateTool(),
        createTaskListTool()
    ],
    taskStore: TaskStore()
))

The LLM might orchestrate like this:

  1. TaskCreate("Analyze module A") — Create task
  2. Agent(prompt: "Analyze module A", subagent_type: "Explore") — Delegate to sub-agent
  3. TaskUpdate(id: "task_1", status: "completed", output: result) — Mark complete
  4. Repeat steps 1-3 for other modules
  5. Synthesize all results

Pattern 2: Team Collaboration + Messaging

When multiple Agents need to collaborate long-term, use Team + Mailbox:

let mailboxStore = MailboxStore()
let teamStore = TeamStore()

let agent = createAgent(options: AgentOptions(
    apiKey: apiKey,
    model: "claude-sonnet-4-6",
    agentName: "coordinator",
    mailboxStore: mailboxStore,
    teamStore: teamStore,
    tools: getAllBaseTools(tier: .core) + [
        createAgentTool(),
        createTeamCreateTool(),
        createTeamDeleteTool(),
        createSendMessageTool(),
        createTaskCreateTool(),
        createTaskListTool(),
        createTaskUpdateTool()
    ]
))

The LLM's orchestration might look like:

  1. TeamCreate(name: "refactor-team", members: ["explorer", "planner", "coder"]) — Form team
  2. TaskCreate("Explore codebase", owner: "explorer") — Create task
  3. Agent(prompt: "...", name: "explorer", subagent_type: "Explore") — Launch explore agent
  4. SendMessage(to: "planner", message: "Exploration done, here's the summary...") — Notify planner
  5. TaskCreate("Write implementation plan", owner: "planner") — Next task
  6. Continue progressing...

Pattern 3: Work Queue

Use the Task system as a work queue. The main Agent creates a batch of tasks, and sub-agents claim and execute them one by one:

Main Agent:
  TaskCreate("Fix bug #1")     → task_1 (pending)
  TaskCreate("Fix bug #2")     → task_2 (pending)
  TaskCreate("Add feature X")  → task_3 (pending)

Sub-Agent A:
  TaskList(status: "pending")       → [task_1, task_2, task_3]
  TaskUpdate(task_1, status: "in_progress", owner: "agent-a")
  ... do work ...
  TaskUpdate(task_1, status: "completed", output: "Fixed by ...")

Sub-Agent B:
  TaskList(status: "pending")       → [task_2, task_3]
  TaskUpdate(task_2, status: "in_progress", owner: "agent-b")
  ... do work ...

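TaskStore is an Actor, so concurrent updates to the same task are applied one at a time without data races. Judging by the transition rules shown earlier, though, inProgress → inProgress counts as a valid transition, so a second claim on an already claimed task would be accepted and overwrite the owner unless the agent checks the current status first. There's also no automatic assignment: the LLM coordinates who claims which task.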
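A dedicated claim operation would make first-come-first-served explicit. The sketch below is hypothetical (ClaimableQueue and claim are not SDK APIs) and shows the check and the write happening inside a single actor method.

```swift
enum Status { case pending, inProgress, completed }

actor ClaimableQueue {
    private var status: [String: Status] = [:]
    private var owner: [String: String] = [:]

    func add(_ id: String) {
        status[id] = .pending
    }

    /// Claims the task only if it is still pending. The check and the write
    /// run inside one actor method, so no other claim can interleave.
    func claim(_ id: String, by agent: String) -> Bool {
        guard status[id] == .pending else { return false }
        status[id] = .inProgress
        owner[id] = agent
        return true
    }

    func ownerOf(_ id: String) -> String? { owner[id] }
}

let queue = ClaimableQueue()
await queue.add("task_1")

let claimedByA = await queue.claim("task_1", by: "agent-a")
let claimedByB = await queue.claim("task_1", by: "agent-b")
let finalOwner = await queue.ownerOf("task_1")

print(claimedByA, claimedByB, finalOwner ?? "nobody")  // true false agent-a
```

With separate TaskList and TaskUpdate calls, the check and the write are two tool invocations, so another agent can slip in between them; folding both into one method closes that gap.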

Design Trade-offs

This multi-agent collaboration mechanism involves several deliberate design choices:

Why can't sub-agents spawn their own sub-agents? DefaultSubAgentSpawner filters out AgentTool when creating sub-agents. This is an intentional limit — without it, an Agent spawning an Agent spawning an Agent leads to uncontrollable recursion depth and exponential token consumption.

Why is messaging pull-based instead of push-based? MailboxStore.read() is destructive reading — Agents must actively call to receive messages. This is much simpler than push mode — no callbacks to maintain, no handling for offline Agents. The trade-off is reduced real-time responsiveness, but at the frequency of tool calls in the Agent Loop (tools can be called every turn), pull latency is acceptable.

Why doesn't the Task state machine auto-transition? The blockedBy field declares dependency relationships, but TaskStore.update() doesn't automatically check whether prerequisite tasks are complete. This means "wait for Task A before doing Task B" logic must be implemented by the LLM — calling TaskList to check status, then deciding next steps. This is a pragmatic trade-off: automatic dependency resolution could be added, but for the LLM, explicit checking is more controllable.

Summary

Open Agent SDK's multi-agent collaboration consists of three layers:

  • Sub-Agents: Via SubAgentSpawner protocol and AgentTool, the main Agent dynamically spawns sub-agents at runtime for task delegation, with built-in Explore and Plan types
  • Task System: Task tracking based on TaskStore Actor with a clear state machine (pending → inProgress → completed/failed/cancelled), where terminal states are irreversible
  • Team + Mailbox: TeamStore manages teams and members, MailboxStore implements mailbox-style messaging, supporting point-to-point and broadcast

All three layers can be used independently or combined — use Tasks to track progress, Teams to organize members, Mailbox for coordination, and sub-agents to execute the actual work.

The next article covers the SDK's session persistence: how Agent conversation history is stored, restored, and how to continue previous work after a restart.

GitHub: terryso/open-agent-sdk-swift

This article was originally published by DEV Community and written by NEE.
