Case study · 2026 · Product design

Atlas: Control room for AI agents

As AI agent fleets grow from one to dozens, teams lose visibility. Atlas is a full-stack observability and operations platform — morning briefs, trace explorer, eval pipelines, billing oversight, and multi-tenant collaboration — wrapped in an editorial interface that makes complexity feel manageable.

Type: SaaS Product
Scope: 0 → 1 Design
Screens: 20 desktop · 10 mobile
Stack: Figma · Prototype
Year: 2026

01 · Context

Teams deploying AI agents have no control room

You can deploy an agent in an afternoon. But once you have ten — across different workflows, prompts, and cost centers — visibility collapses. Logs are scattered, evals live in spreadsheets, billing is a surprise at month end.

Problem 01

No unified fleet overview

Teams manage agents across multiple tools. There's no single place to see all agents, their status, spend, and eval health at once.

Problem 02

Traces buried in logs

Debugging a failed run means trawling raw JSON logs. There's no structured trace view that presents tool calls, LLM outputs, and branching decisions as a readable timeline.

Problem 03

Evals happen too late

Quality checks run manually, or not at all. By the time a regression is caught, it's already in production. There's no continuous eval pipeline with pass/fail trends.

01 · Dashboard

The morning brief. Fleet at a glance.

Every team lead opens Atlas to the same view: agent fleet health, today's spend vs. budget, run volume, and eval pass rate — all above the fold. The card-based layout gives each agent a name, status badge, and burn rate.

Agent naming (Wren, Tarn, Lev, Sole…) was intentional: names are easier to remember and talk about than UUIDs in incident response.

Fleet overview · Status badges · Spend tracking · Eval pass rate

02 · Agent

Deep dive on any single agent

Click into an agent and get the full picture: live status, model version, system prompt, run history, cost breakdown, and recent eval scores — all in one scrollable view.

The prompt panel is editable inline: teams can draft and push prompt changes without leaving Atlas, and each version is logged for diffing.

Agent config · Prompt versioning · Run history · Cost per run

03 · Trace explorer

Every run, fully observable

The trace view renders a complete execution timeline: each tool call, LLM inference, retrieved context chunk, and branch decision — laid out chronologically with latency and token counts.

Clicking any step shows the raw input/output payload. Engineers can pinpoint exactly where a run diverged from expected behavior without leaving the UI.
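As a rough illustration of the data such a timeline might sit on, here is a minimal sketch. Atlas is a design prototype, so everything below is an assumption: the `TraceStep` fields, the step kinds, and the sample values are hypothetical and only mirror what the trace view is described as showing (step type, latency, token counts, chronological order).

```python
from dataclasses import dataclass


@dataclass
class TraceStep:
    """One row in a hypothetical execution timeline."""
    kind: str        # e.g. "tool_call", "llm_inference", "retrieval", "branch"
    name: str
    start_ms: int    # offset from the start of the run
    latency_ms: int
    tokens: int = 0  # zero for steps that consume no tokens


def build_timeline(steps):
    """Order steps chronologically by their start offset."""
    return sorted(steps, key=lambda s: s.start_ms)


def summarize(steps):
    """Aggregate the figures a timeline header might display."""
    return {
        "total_tokens": sum(s.tokens for s in steps),
        "slowest_step": max(steps, key=lambda s: s.latency_ms).name,
    }


# Illustrative data only; steps arrive out of order on purpose.
steps = [
    TraceStep("llm_inference", "plan", 0, 820, tokens=412),
    TraceStep("branch", "needs_lookup", 1740, 2),
    TraceStep("tool_call", "search_orders", 830, 640),
    TraceStep("retrieval", "order_docs", 1480, 250, tokens=96),
]

timeline = build_timeline(steps)
summary = summarize(timeline)
```

Sorting by start offset rather than arrival order is what lets a reader see where a run diverged, since logs from parallel or retried steps rarely arrive in sequence.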

Execution timeline · Tool call inspector · Token counts · Latency breakdown

04 · Evaluations

Continuous evals. Catch regressions before prod.

Every agent has an attached eval suite. On each deploy or prompt change, Atlas runs the suite automatically and plots the pass/fail trend over time. Threshold alerts fire if the pass rate drops below the team's SLA.

The eval grid shows individual test cases with expected vs. actual output — designed for non-engineers to review quality without reading raw logs.
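The threshold logic described above can be sketched in a few lines. This is a hypothetical model, not Atlas's implementation (the case study covers design only): the `EvalRun` shape, the trigger labels, and the 0.90 threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class EvalRun:
    """Result of one hypothetical eval suite execution."""
    trigger: str  # assumed labels: "deploy" or "prompt_change"
    passed: int
    total: int

    @property
    def pass_rate(self) -> float:
        # Guard against an empty suite rather than dividing by zero.
        return self.passed / self.total if self.total else 0.0


def breaches_threshold(run: EvalRun, threshold: float) -> bool:
    """True when the run's pass rate falls below the team's threshold."""
    return run.pass_rate < threshold


# Illustrative history; the 0.90 threshold is an assumed SLA value.
history = [
    EvalRun("deploy", 48, 50),
    EvalRun("prompt_change", 47, 50),
    EvalRun("deploy", 41, 50),
]
alerts = [run for run in history if breaches_threshold(run, 0.90)]
```

Here only the third run (41 of 50 passing) would raise an alert, which is the point at which a trend chart would show the regression.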

Auto-run on deploy · Pass rate trends · Threshold alerts · Test case explorer

05 · Tasks

Run management across the whole fleet

The Tasks view is the ops center: filter runs by agent, status, date range, or cost. Bulk-cancel stuck runs. Inspect any task inline. Export a filtered dataset for analysis.

The task detail slide-over shows full metadata without leaving the list — a pattern that keeps the overall spatial model stable when drilling into specifics.
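To make the filter and bulk-cancel behaviour concrete, here is a small sketch under stated assumptions. The `Run` fields, the status strings, and the helper names `filter_runs` and `bulk_cancel` are hypothetical; the agent names are borrowed from the case study for readability.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Run:
    """A hypothetical task run as it might appear in the list."""
    id: str
    agent: str
    status: str  # assumed values: "done", "stuck", "cancelled"
    day: date
    cost_usd: float


def filter_runs(runs, agent=None, status=None, since=None, until=None, min_cost=None):
    """Keep runs matching every filter that was supplied."""
    result = []
    for r in runs:
        if agent is not None and r.agent != agent:
            continue
        if status is not None and r.status != status:
            continue
        if since is not None and r.day < since:
            continue
        if until is not None and r.day > until:
            continue
        if min_cost is not None and r.cost_usd < min_cost:
            continue
        result.append(r)
    return result


def bulk_cancel(runs):
    """Mark stuck runs as cancelled and return the ids that changed."""
    cancelled = []
    for r in runs:
        if r.status == "stuck":
            r.status = "cancelled"
            cancelled.append(r.id)
    return cancelled


# Illustrative data only.
runs = [
    Run("r1", "Wren", "stuck", date(2026, 3, 1), 0.42),
    Run("r2", "Wren", "done", date(2026, 3, 1), 0.10),
    Run("r3", "Tarn", "stuck", date(2026, 3, 2), 1.75),
    Run("r4", "Wren", "stuck", date(2026, 3, 3), 0.05),
]
stuck_wren = filter_runs(runs, agent="Wren", status="stuck")
cancelled_ids = bulk_cancel(stuck_wren)
```

Applying the bulk action to the filtered selection, rather than the whole list, matches the flow described: narrow the view first, then act on what remains visible.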

Run filtering · Bulk actions · Task detail panel · Export

06 · All screens

20 desktop screens

Complete product coverage: onboarding, fleet management, agent config, trace, tasks, evals, billing, audit log, knowledge base, triggers, prompts, notifications, settings, and collaboration.

Home — morning brief

Agent list

Agent detail

Trace explorer

Tasks

Task detail

Evaluations

Billing

Audit log

Knowledge base

Triggers

Prompt editor

Notifications

Settings

Share — collaboration

Empty state

First agent live

Rate limit warning

Timeout error

07 · Mobile

10 phone screens

The mobile experience focuses on monitoring and approvals — check fleet status, review task alerts, and approve or cancel runs from anywhere.

08 · Design decisions

Why editorial, not dashboard-grey

Most ops tools default to cold blue-grey SaaS chrome. Atlas chose the opposite — warm paper tones, serif typography, rust accents — to make intensive monitoring sessions feel less fatiguing.

Decision 01

Fraunces for data, not just headlines

A variable optical-size serif reads equally well at 124px display headings and 13px table cells. The editorial feel signals "this tool has opinions" — it's not a generic dashboard.

Decision 02

Named agents, not UUIDs

When something goes wrong at 2am, you want to say "Wren is stuck," not "agent-7f3a2b is stuck." Short, memorable names reduce cognitive load in incident response and daily standups.

Decision 03

Morning brief as the entry point

The home screen is a daily digest, not a live feed. Key metrics are readable in 10 seconds. Full detail lives one click away rather than on the first screen, which avoids dashboard anxiety.

09 · Scope

What was designed

Desktop screens — full product coverage from onboarding to billing

→ Figma, production-ready

Mobile screens — monitoring and approval flows on the go

→ iOS-native feel, 390px

Cohesive design language — warm editorial, Fraunces + Inter Tight

→ Tokens, dark mode, responsive

0→1

Complete product designed from concept to full prototype, no prior art

→ Strategy + UX + visual

