Dreadnode Workers: The Connective Tissue for Flexible Agent Integration and Orchestration
Vincent Abruzzo · May 07, 2026
Through research, real-world ops, and constant iteration, we’ve learned a lot about what AI infrastructure for the security stack actually requires. We’ve found that flexibility is the hardest problem to navigate as security teams build agentic systems.
Today’s options for building and deploying agents are incredibly rigid, forcing engineers and operators to build custom integrations every time they need to connect agents to existing tools, Slack alerts, webhook callbacks, cron jobs, etc. The workflow you need today may look nothing like the workflow you’ll need next month when the threat surface shifts or a new agent orchestration pattern drops. To keep up, your agent platform has to move as fast as you do and easily integrate with your existing tech stack. We built a primitive for exactly this: a component that connects to the tools you already run, from C2 frameworks and recon feeds to exploit pipelines, webhooks, internal APIs, and custom scripts. Stand it up in minutes, reshape it as your work evolves.
Beyond the agent lifecycle
The first wave of agentic engineering gave us a chat loop. You sent a message, the model thought, the model called tools in a loop, the model answered. Done.
We’ve pushed past that with agent lifecycle hooks. Lifecycle hooks let you react to what happens during an agent session: intercept tool calls, handle errors differently depending on context, inject context between turns, and shape how the agent behaves while it’s running. That covers a lot of use cases, and for some workflows it’s all you need. But more complex workflows don’t fit inside a single session or agent lifecycle at all.
Someone doing source-code review might want an attack-surface map first, then five specialists running in parallel against that map, then maybe a final reviewer that reconciles all of it and a validator for each high- or critical-severity finding. Each in its own session, each labeled, each visible in the trace UI.
Someone running a SOC bridge doesn’t want a chat session at all. They want a process that subscribes to a webhook, opens a session when an alert comes in, and closes when the investigation finishes.
Someone running periodic re-evaluation against a model that ships weekly doesn’t want a human pressing a button. They want cron.
These aren’t situations that fit inside a chat session. They’re background processes. They need a place to live alongside the agent runtime, with their own state and a way to communicate back to the system.
That place is a worker.
What is a worker?
A worker is a long-running background component that lives inside a capability and runs alongside your agents, tools, and prompts. If you’re not familiar with capabilities, they’re our unit of deployment: a single package that bundles everything an agentic system needs to do its job, from tools and skills to hooks and configuration. Workers extend that boundary with background processes that the runtime loads when the capability starts, dispatches events to, runs on schedules, supervises, and tears down on reload.
The design was broadly inspired by our experimentation with the actor model. Each worker is an independent unit with its own state that communicates via message passing rather than shared memory. But we didn’t want the overhead that comes with a full actor framework, so we built workers as simple Python functions. You write functions, decorate them, and the runtime handles the rest. An example to illustrate:
Define an event handler:
```python
# workers/notifier.py
from dreadnode.capabilities.worker import Worker, EventEnvelope, RuntimeClient

worker = Worker(name="notifier")

@worker.on_event("session.created")
async def announce(event: EventEnvelope, client: RuntimeClient) -> None:
    await client.notify(title=f"Session started: {event.session_id[:8]}")

if __name__ == "__main__":
    worker.run()
```
Wire it into the capability manifest:
```yaml
# capability.yaml
name: notifier
version: 0.1.0
workers:
  notifier:
    path: workers/notifier.py
```
That’s it. The runtime now pushes a notification every time a session opens.
The five worker decorators cover most of what background code wants to do:
| Decorator | When it fires |
|---|---|
| `@worker.on_startup` | Once, before any handlers |
| `@worker.on_shutdown` | Once, on capability reload or runtime stop |
| `@worker.on_event("kind")` | Every time an event of that kind hits the message bus |
| `@worker.every(seconds=…)` / `@worker.every(cron=…)` | On a schedule |
| `@worker.task` | Supervised long-running coroutine, restart-on-crash |
State lives on worker.state, a plain dict shared across handlers. Use an asyncio.Lock if you mutate it from concurrent handlers. That’s the whole mental model.
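A minimal sketch of that model, assuming `import asyncio` and the `worker = Worker(...)` setup from the earlier example (the event kinds here are illustrative, and the exact startup-handler signature is an assumption):

```python
@worker.on_startup
async def init(client: RuntimeClient) -> None:
    worker.state["lock"] = asyncio.Lock()
    worker.state["alerts_seen"] = 0

@worker.on_event("alert.received")  # illustrative event kind
async def count(event: EventEnvelope, client: RuntimeClient) -> None:
    async with worker.state["lock"]:
        worker.state["alerts_seen"] += 1

@worker.every(seconds=60)
async def report(client: RuntimeClient) -> None:
    async with worker.state["lock"]:
        seen = worker.state["alerts_seen"]
    await client.publish("alerts.stats", {"seen": seen})  # illustrative event kind
```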
In practice: Workers for source code analysis
A worker has the full runtime client. It can open as many sessions as it likes, run them concurrently, gather results, and stitch them together. That means you can build things that are impossible from inside a single session.
To make this concrete, we built a source code analysis capability that is being released alongside this post as a worked example. A single worker coordinates multiple agents across four stages to run a security review against any GitHub repository. One event comes in with a URL. The worker clones the repo, runs an attack-surface mapper, fans out five specialists in parallel, hands their reports to a final reviewer, then spawns a validator for each high-severity finding. One event goes out with the assembled report. The whole pipeline is ~500 lines of Python in a single file, and everything in between the input and output is plain asyncio.
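A rough skeleton of that flow, not the released implementation, might read like this, with `clone`, `run_agent`, `high_severity`, `assemble`, and `SPECIALISTS` standing in as hypothetical helpers:

```python
@worker.on_event("analysis.requested")  # illustrative event kind
async def run_pipeline(event: EventEnvelope, client: RuntimeClient) -> None:
    repo = await clone(event.payload["url"])
    surface = await run_agent(client, "mapper", repo)      # stage 1: attack-surface map
    reports = await asyncio.gather(                        # stage 2: parallel specialists
        *(run_agent(client, name, surface) for name in SPECIALISTS)
    )
    review = await run_agent(client, "reviewer", reports)  # stage 3: reconcile
    validations = await asyncio.gather(                    # stage 4: one validator per finding
        *(run_agent(client, "validator", f) for f in high_severity(review))
    )
    await client.publish("analysis.completed", assemble(review, validations))
```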
If you want to jump right in, explore the source code analysis example built on workers, or use the capability itself.
Why workers work
Fan-out is just asyncio. You don’t need a special orchestration layer. Bound your concurrency with a semaphore, gather your results, and move on. Each agent runs in its own session, and the reports publish onto the message bus the moment they’re ready. Users don’t wait for the slowest one.
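A minimal sketch of that fan-out, with `run_specialist` as a hypothetical helper that opens a session and returns its report:

```python
async def fan_out(client: RuntimeClient, surface: dict) -> list[dict]:
    sem = asyncio.Semaphore(3)  # bound: at most three specialist sessions at once

    async def bounded(name: str) -> dict:
        async with sem:
            report = await run_specialist(client, name, surface)
            # Publish each report the moment it's ready; nobody waits for the slowest.
            await client.publish("specialist.report", report)
            return report

    return await asyncio.gather(*(bounded(name) for name in SPECIALISTS))
```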
Progress streams onto the message bus. While a pipeline runs, the worker publishes progress events. A launcher script, a UI, another worker, or a Slack bridge can subscribe and render the events live. None of this requires a persistent client connection. Events land on the bus whether or not anyone is listening, and you can reconnect later to pick up where you left off.
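On the consuming side, rendering those events is just another handler; the event kind and payload fields here are illustrative:

```python
@worker.on_event("pipeline.progress")  # illustrative event kind
async def render(event: EventEnvelope, client: RuntimeClient) -> None:
    p = event.payload
    print(f"[{p['stage']}] {p['completed']}/{p['total']}")
```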
What ties all of this together is how little ceremony is involved. You don’t stand up a new service. You don’t wire up a new deployment. You write a Python file, add a few lines to the manifest, and push the capability. And because workers live inside the capability boundary, they inherit everything the runtime already gives you: auth, event routing, session management, tracing. You’re not integrating from scratch every time. You’re plugging into what’s already there.
Use cases beyond multi-agent orchestration
The multi-agent orchestration pipeline is one shape workers can take. The primitive supports several more.
External bridges. Forward events to Slack, PagerDuty, or a webhook. Consume external callbacks and create matching sessions:
@worker.on_event("turn.completed")
async def to_slack(event: EventEnvelope, client: RuntimeClient) -> None:
await httpx.post(SLACK_URL, json={"text": _summarize(event.payload)})
@worker.on_event("capability.bridge.callback_received")
async def from_slack(event: EventEnvelope, client: RuntimeClient) -> None:
session = await client.create_session(
capability="bridge",
agent="triage",
session_id=f"callback-{event.payload['callback_id']}",
)
async for _ in client.stream_chat(
session_id=session.session_id,
message=f"Investigate callback: {event.payload}",
):
pass
Cron sweeps. Re-evaluate an evaluation set when a new model ships. Sweep stale runs. Rotate caches:
```python
@worker.every(cron="0 9 * * 1")  # 9am every Monday
async def reeval_against_latest(client: RuntimeClient) -> None:
    await client.publish("eval.requested", {"model": "anthropic/claude-opus-4-7"})
```
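The other half of that loop is just another event handler. Any worker subscribed to the same kind picks the request up; the capability and agent names here are illustrative:

```python
@worker.on_event("eval.requested")
async def run_eval(event: EventEnvelope, client: RuntimeClient) -> None:
    session = await client.create_session(capability="evals", agent="evaluator")
    async for _ in client.stream_chat(
        session_id=session.session_id,
        message=f"Re-run the evaluation set against {event.payload['model']}",
    ):
        pass
```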
Stateful loops. Tail an external queue. Drive a long-running process. Supervise an MCP transport. The @worker.task decorator gives you a supervised coroutine with restart-on-crash backoff:
```python
@worker.task
async def reader(client: RuntimeClient) -> None:
    # worker.state["ws"] is an async-iterable connection established at startup.
    async for message in worker.state["ws"]:
        await process(message, client)
```
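The connection in `worker.state["ws"]` would be opened once at startup and closed on shutdown; a sketch assuming the third-party `websockets` package and a placeholder `QUEUE_URL`:

```python
import websockets  # assumed dependency; any async-iterable transport works

@worker.on_startup
async def connect(client: RuntimeClient) -> None:
    worker.state["ws"] = await websockets.connect(QUEUE_URL)  # QUEUE_URL is a placeholder

@worker.on_shutdown
async def disconnect(client: RuntimeClient) -> None:
    await worker.state["ws"].close()
```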
Same five decorators every time. The shape changes; the primitive doesn’t.
Why this matters for cyber + AI
There’s a reason we built this primitive into the capability boundary instead of standing it up as a separate “operations” service.
Security teams are surrounded by signals. Alerts firing, C2 webhooks arriving, callbacks landing from tools you integrated last quarter. The question isn’t whether an agent could do something useful with those signals. It’s whether the platform makes it easy enough to wire them up that you actually do it. Workers are that wiring. A Slack message comes in and an agent starts triaging. A new model drops and your evaluation suite runs automatically. A webhook fires and a multi-stage pipeline kicks off without anyone pressing a button. Each of those is a few lines of Python and a decorator.
And because workers live inside the capability boundary, you’re not building a new integration from scratch every time. You’re plugging into what the runtime already gives you. The gap between ‘I want to connect X to my agents’ and ‘X is triggering agent work’ should be measured in hours, not weeks. When spinning up a new worker is an afternoon of effort instead of a week-long project, you stop hesitating. You build the one you need today, another one tomorrow, and keep pace as the work changes.
Our philosophy is simple: the platform should be flexible enough to anticipate and react quickly to the threat landscape and the rapid pace of AI. Workers are one of the ways we make that possible.
Think about what signals are sitting around your environment right now that could be kicking off agent work automatically. Try writing a worker for one of them. The whole reference for the primitive is two pages long; the example in this repo is ~500 lines including comments. If you get stuck, the small toy example above is a good place to start. If you'd rather start with something ready to run, the source code analysis capability is fully functional out of the box, and we'll be releasing more flows and worker templates in the coming weeks.
We’re shipping more capability primitives like this because we believe rigidity is the enemy of progress. The shape is always the same: small surface, fast iteration, no special platform team between you and the thing you want to build.
— The Dreadnode Team