Beyond the Chat Box: Building AccelByte Gaming Services for Every Surface Developers Work On

Written by Phil Tossell | Jun 11, 2026 3:30:00 PM

Most of the conversation about AI in developer tooling has so far focused on a single shape: the chat box. Type a sentence, get a result. More refined versions use skills and plans but still resolve down to a text conversation. While there have been steps towards integrating more conventional UI through things like MCP UI, on the whole there has not been a conclusive direction for incorporating different forms of human interaction in a natural way.

Developers don't only want to chat. They work in terminals, in CI, in IDEs, in native apps, in AI hosts like Claude Code, and yes, sometimes in a literal chat window. Each surface has its own conventions for how a human gives input and how a machine asks for confirmation. Chat is one of those conventions but it is not always the best one.

For the last year we've been quietly building AGS, AccelByte's backend for live games, so that it can be addressed seamlessly from any of those surfaces, providing the right interaction pattern for each one. Two ideas have shaped that work. The first is that AI is good at some things and unreliable at others, and the work that AI cannot do reliably needs a deterministic layer underneath it. The second is that "human input" is not synonymous with "chat"; different surfaces and use cases suit different forms.

This piece walks through how those two ideas show up in AGS: in the CLI, in the workflow format, in the Client Runtime, and in the AI Marketplace plugin. We'll point to each of these pieces, and to what's still in progress, at the bottom of the article.

The Deterministic Layer

What AI Is Good at, and What It Isn't

AI is good at intent. It is good at taking a fuzzy sentence and figuring out the semantics of it. It is good at filling in parameters from context, at resolving ambiguity, at summarizing results back into a sentence a human can read. These are real capabilities and they are getting better quickly.

Where AI is much less reliable is at multi-step state mutation against a production system. Sometimes it skips a step, sometimes it passes the wrong identifier, sometimes it retries an idempotent-looking call that wasn't idempotent; anyone who has watched a model "set up matchmaking" three times in a row and got three different shapes of broken state will recognize the problem.

The industry response so far has largely been to give the model more tools and better prompts, which helps at the margin but doesn't really fix the underlying problem; the primitive itself is wrong. Reliable multi-step work needs a typed contract underneath, not a better prompt on top of an unconstrained one.

The Picture Today

The diagram below shows how we are thinking about this problem at AccelByte.

[Diagram: three layers. INTERACTION SURFACES: AGS App, Claude/Codex, Claude Code, IDE, Terminal, plus the AccelByte AI Marketplace plugin (AI glue). INTERFACES: AGS MCP, AGS CLI, AGS Client Runtime. PLATFORM SERVICES: OS credential store, AGS backend services.]

Solid borders are shipped today. Dashed borders are coming soon. See the bottom of the article for the full breakdown.

At the top are the surfaces a developer chooses to work on: the AGS App, Claude or Codex apps, Claude Code, an IDE, the terminal. AI hosts reach AGS through the AccelByte AI Marketplace plugin, a small glue layer we'll come back to in a moment; the plugin in turn drives whichever of the two interfaces, the MCP server or the CLI, is the right tool for the job at hand. The terminal goes straight to the CLI, and the AGS App talks to the Runtime directly. All of those paths terminate in the same place: the AGS Client Runtime, which sits behind both interfaces and is the piece that other clients can link or talk to directly. Beneath that sit the OS credential store, where the Runtime keeps authenticated sessions, and the AGS backend services themselves.

The OS credential store is worth a sentence on its own; we wanted authentication to be part of the foundation rather than a problem each AI integration solves repeatedly on its own. A developer logs in once with ags auth login and every surface wired into the Runtime inherits the session.

Commands: One per Endpoint

The Client Runtime exposes AGS as a flat set of commands, one per backend endpoint. The CLI is the most visible projection of that surface and the easiest place to see what it feels like:

Shell

ags iam users get --namespace my-game --user-id abc-123

One AGS endpoint, one command. Every public endpoint is reachable this way, with the same shape, from any surface the Runtime is wired into; there is no per-client translation layer to maintain and no SDK-specific argument shuffling. A script in CI and an LLM in Claude Code call the same operation with the same arguments, so the surface can be tested once, as a whole; if a command works, it works everywhere.

This is a bigger change than it might at first appear, because it also collapses a split that has been part of using AGS from the start. Historically, working with AGS meant two different surfaces: the admin portal for configuration and setup, and SDKs for in-game integration. A flat command layer over every endpoint quietly unifies these two, so the same surface serves the engineer setting up a store and the engineer wiring a client to read from it. That's a bigger topic than this paragraph can do justice to, so we'll come back to it in its own piece.

Commands on their own are not enough though. Because they map 1:1 to endpoints, each one does very little by itself, and even with good documentation and discovery it's hard for AI to chain them into sequences that reach a specific result in a consistent and predictable way.

Workflows: Deterministic Composition

Most useful work in AGS is not a single endpoint call; it's a sequence. To set up competitive matchmaking, for example, you need to create a stat, then a ruleset, then a session template, then a match pool, then an AMS fleet, and finally wire the session template to the fleet. That's six operations spread across four services, and getting any one of them wrong leaves the rest broken.

This is precisely the kind of work AI is unreliable at, so we don't ask AI to do it directly. Instead we ask AI to pick the right workflow and fill in its inputs, while the workflow itself is just data: declarative YAML built on top of OpenAPI 2.0, with typed inputs, ordered steps, explicit dependencies, and a published spec.

Below is an excerpt of the workflow for setting up competitive multiplayer.

schemaVersion: '1.0'

workflows:
  competitive-multiplayer:
    name: 'Set up competitive multiplayer'
    intent: 'matchmaking ranked competitive dedicated servers AMS match pool'
    description: 'Stand up competitive matchmaking with dedicated servers.'

    inputs:
      namespace:
        description: "Your game's AccelByte namespace."
        required: true
        schema: { type: string }
      playersPerTeam:
        schema: { type: integer, minimum: 1 }
        default: 4
      teamCount:
        schema: { type: integer, minimum: 2 }
        default: 2

    steps:
      - id: create-ruleset
        operationId: createRuleSet
        inputs:
          namespace:      { from: workflow/namespace }
          minTeams:       { from: workflow/teamCount }
          maxTeams:       { from: workflow/teamCount }
          playersPerTeam: { from: workflow/playersPerTeam }

      - id: create-match-pool
        operationId: createMatchPool
        dependencies: [ step/create-ruleset ]
        inputs:
          namespace:   { from: workflow/namespace }
          ruleSetName: { from: step/create-ruleset, output: name }

This example is abbreviated for the article; the full definition chains six operations across iam, matchmaking, session, and ams. The shape is the part that matters. The intent line is what an LLM reads when it routes a user request to this workflow, the inputs block is the typed contract the LLM fills, and everything below is the deterministic sequence the Runtime executes once those inputs are bound.
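To make "ordered steps, explicit dependencies" concrete, here is a minimal Python sketch of how an executor could bind inputs and run the two steps from the excerpt above. It is not the Client Runtime's implementation: the in-memory dictionary layout, the `run` and `bind_inputs` helpers, and the stub operations are all assumptions made for illustration, and real operations would call AGS endpoints rather than return canned values.

```python
from graphlib import TopologicalSorter

# Hypothetical in-memory form of the excerpt (only the fields this sketch needs).
# Steps are listed out of order on purpose: ordering comes from dependencies.
WORKFLOW = {
    "inputs": {
        "namespace": {"required": True},
        "playersPerTeam": {"default": 4},
        "teamCount": {"default": 2},
    },
    "steps": [
        {
            "id": "create-match-pool",
            "operationId": "createMatchPool",
            "dependencies": ["step/create-ruleset"],
            "inputs": {
                "namespace": {"from": "workflow/namespace"},
                "ruleSetName": {"from": "step/create-ruleset", "output": "name"},
            },
        },
        {
            "id": "create-ruleset",
            "operationId": "createRuleSet",
            "inputs": {
                "namespace": {"from": "workflow/namespace"},
                "minTeams": {"from": "workflow/teamCount"},
                "maxTeams": {"from": "workflow/teamCount"},
                "playersPerTeam": {"from": "workflow/playersPerTeam"},
            },
        },
    ],
}


def bind_inputs(workflow, supplied):
    """Apply defaults and reject a run that lacks a required input."""
    bound = {}
    for name, spec in workflow["inputs"].items():
        if name in supplied:
            bound[name] = supplied[name]
        elif "default" in spec:
            bound[name] = spec["default"]
        elif spec.get("required"):
            raise ValueError(f"missing required input: {name}")
    return bound


def resolve(ref, bound, outputs):
    """Resolve 'workflow/<input>' or 'step/<id>' plus an output field."""
    scope, _, key = ref["from"].partition("/")
    if scope == "workflow":
        return bound[key]
    return outputs[key][ref["output"]]


def run(workflow, supplied, operations):
    bound = bind_inputs(workflow, supplied)
    steps = {s["id"]: s for s in workflow["steps"]}
    # Map each step to the steps it depends on, then run in dependency order.
    graph = {
        sid: [d.split("/", 1)[1] for d in s.get("dependencies", [])]
        for sid, s in steps.items()
    }
    order, outputs = [], {}
    for sid in TopologicalSorter(graph).static_order():
        step = steps[sid]
        args = {k: resolve(r, bound, outputs) for k, r in step["inputs"].items()}
        outputs[sid] = operations[step["operationId"]](**args)
        order.append(sid)
    return order, outputs


# Stub operations standing in for real endpoint calls.
def fake_create_rule_set(namespace, minTeams, maxTeams, playersPerTeam):
    return {"name": f"{namespace}-ruleset"}


def fake_create_match_pool(namespace, ruleSetName):
    return {"name": f"{namespace}-pool", "ruleSetName": ruleSetName}


OPERATIONS = {
    "createRuleSet": fake_create_rule_set,
    "createMatchPool": fake_create_match_pool,
}

order, outputs = run(WORKFLOW, {"namespace": "my-game"}, OPERATIONS)
print(order)
```

The caller only supplies `namespace`; the order of execution and the value passed as `ruleSetName` both follow from the definition, which is the property the article relies on.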

From the CLI, a workflow like this is run by id, with each declared input exposed as a flag:

Shell

ags workflow run competitive-multiplayer \
  --namespace my-game \
  --players-per-team 5 \
  --team-count 2

The same shape works whether a developer is running this themselves at a terminal, a CI job is invoking it as part of an environment setup, or an LLM is filling the inputs from a chat conversation and calling it through the plugin. In each case, the caller (whether human or model) only picks the workflow and fills the inputs; the Runtime executes it, and execution itself is deterministic. Because the workflow is just data, the same definition can be driven by very different interfaces, and we'll come back to what that unlocks for different surfaces in a moment.
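As an illustration of "each declared input exposed as a flag", the sketch below derives kebab-case flags from the workflow's inputs block with Python's argparse. How the AGS CLI actually generates its flags is not described here, so treat the `flag_name`, `build_parser`, and `parse` helpers as hypothetical; only the input names, types, minimums, and defaults come from the excerpt.

```python
import argparse
import re

# The inputs block from the excerpt, as a Python dictionary.
INPUTS = {
    "namespace": {"required": True, "schema": {"type": "string"}},
    "playersPerTeam": {"schema": {"type": "integer", "minimum": 1}, "default": 4},
    "teamCount": {"schema": {"type": "integer", "minimum": 2}, "default": 2},
}

CASTS = {"string": str, "integer": int}


def flag_name(input_name):
    """camelCase input name to kebab-case flag: playersPerTeam -> --players-per-team."""
    return "--" + re.sub(r"(?<!^)(?=[A-Z])", "-", input_name).lower()


def build_parser(inputs):
    parser = argparse.ArgumentParser(prog="workflow-run-sketch")
    for name, spec in inputs.items():
        parser.add_argument(
            flag_name(name),
            dest=name,
            type=CASTS[spec["schema"]["type"]],
            required=spec.get("required", False),
            default=spec.get("default"),
        )
    return parser


def parse(inputs, argv):
    """Parse flags, then enforce the schema's minimum where one is declared."""
    values = vars(build_parser(inputs).parse_args(argv))
    for name, spec in inputs.items():
        minimum = spec["schema"].get("minimum")
        if minimum is not None and values[name] is not None and values[name] < minimum:
            raise ValueError(f"{flag_name(name)} must be >= {minimum}")
    return values


values = parse(
    INPUTS,
    ["--namespace", "my-game", "--players-per-team", "5", "--team-count", "2"],
)
print(values)
```

Because the flags are derived from the definition rather than written by hand, adding an input to a workflow would surface a new flag without any CLI change, at least in this sketch.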

Treating workflows as data has a couple of other useful consequences:

Rollback and undo. Because the steps are described rather than executed imperatively, the Runtime can unwind a partial run cleanly when something fails partway through.
An open, extendable library. Because the format is published rather than baked into the Runtime, the workflow library isn't only for us to grow. Developers using AGS can write their own workflows for their own operational runbooks, share them between teams, or extend ours, all using the same format we use ourselves.
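The rollback point can be sketched with the common compensating-action pattern: every step carries an undo, and a failure unwinds the completed steps in reverse order. The article does not say how the Runtime implements rollback, so this is one plausible shape under that assumption; `run_with_rollback` and the step tuples are hypothetical.

```python
def run_with_rollback(steps):
    """Run (step_id, do, undo) tuples in order; on failure, undo completed steps in reverse."""
    done = []
    log = []
    try:
        for step_id, do, undo in steps:
            do()
            log.append(f"done {step_id}")
            done.append((step_id, undo))
    except Exception as exc:
        log.append(f"failed {step_id}: {exc}")
        for done_id, undo in reversed(done):
            undo()
            log.append(f"undone {done_id}")
        return False, log
    return True, log


# A set stands in for backend state so the effect of the rollback is visible.
created = set()


def fail():
    raise RuntimeError("quota exceeded")


steps = [
    ("create-ruleset", lambda: created.add("ruleset"), lambda: created.discard("ruleset")),
    ("create-match-pool", lambda: created.add("pool"), lambda: created.discard("pool")),
    ("create-fleet", fail, lambda: None),
]

ok, log = run_with_rollback(steps)
print(ok, created)
for line in log:
    print(line)
```

The third step fails, so the two earlier steps are undone newest first and the stand-in state ends up empty again.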

"Why Not Just Use Skills?"

A fair question, since skills, agents, and prompt chains are the obvious tool to reach for when adding AI to a system. They're the right tool in cases where the work itself is genuinely ambiguous: turning a vague brief into a draft, extracting an answer from a long document, or deciding which of five operations a user probably meant. They handle language well.

What they handle less well is state. Two runs of the same skill will often produce two different shapes, which is tolerable for prose but not for "set up matchmaking"; multi-step state mutation needs a typed contract rather than a probability distribution. Workflows are where state lives, and skills sit on top of them doing what they're actually good at.

Human in the Loop, in the Right Place

The Forgotten User

The deterministic layer we've described so far is one half of the picture. The other half is the one the chat-first wave seems to have largely skipped: the human is still there, the human still needs to give input, and that input has a shape that depends on where the human is sitting.

Think about where a developer actually works. In CI there is no human in the loop at all; the input is a config file that was checked in beforehand. In the terminal it's flags and arguments and the occasional interactive prompt. In an IDE it's the command palette and a side panel. In a native app it's a form, a picker, or a wizard. In Claude Code or Codex it's a structured prompt and a confirm message. In a chat window it's a sentence.

That's six different surfaces with six different right answers to "how should the human give input here", and designing for only one of them feels like a step backwards.

One Workflow, Many Surfaces

The same competitive-multiplayer workflow we showed earlier runs in any of these places, with the same definition, the same inputs, and the same dependencies. What changes between them is how a human is invited into the loop.

| SURFACE | WHO'S THERE | ACCELBYTE GAMING SERVICES |
| --- | --- | --- |
| Terminal | An engineer at a prompt | The CLI's fullscreen TUI presents a step-walk review form. Each input the workflow marked visible is asked in turn; defaults are pre-filled; the engineer accepts or edits, then watches the steps execute. |
| AGS App | The same engineer, now in a GUI | A wizard generated from the same workflow definition, with the same inputs, the same defaults, and the same visibility, all rendered as form fields, dropdowns, and a confirm screen. |
| Claude Code/Codex CLI | An LLM in a terminal host | The plugin routes by intent, fills inputs from the conversation, and surfaces any confirm step as a structured text approval before execution proceeds. |
| Claude Code/Codex apps | An LLM in a desktop GUI | Same routing as the terminal host, but the app can render confirms as native UI elements (modal cards, buttons, inline previews) rather than as plain text. |
| IDE | A developer with an AI assistant inside their editor | Inputs surface through the command palette and quick-pick; confirms appear as inline editor prompts or a side panel review, alongside the code being edited. |
| CI | No human at all | --no-input; required inputs read from environment or a config file; defaults accepted; non-zero exit on validation failure. |

What makes this work is a clean split of responsibility: the workflow author decides what the human should see and approve, while the surface decides how to render it. The workflow declares which inputs are visible, which steps require confirmation, and which fields have descriptions worth surfacing; it deliberately doesn't say whether that confirmation should appear as a [y/N] prompt, a modal dialog, a structured approval message, or a "passed by config" log line. The surface itself picks the right form for its context.
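The split described above can be sketched in a few lines of Python: the workflow side produces a surface-neutral confirmation, and each surface owns a renderer for it. The `Confirm` type, the renderer names, and the shape of the structured approval are hypothetical; they only illustrate that the definition never mentions a prompt, a dialog, or a log line.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confirm:
    """What the workflow author declares: which step needs approval and why."""
    step_id: str
    summary: str


# How each surface chooses to present the same declaration.
def render_terminal(confirm):
    return f"{confirm.summary} [y/N] "


def render_ai_host(confirm):
    return {"type": "approval_request", "step": confirm.step_id, "text": confirm.summary}


def render_ci(confirm):
    return f"confirm {confirm.step_id}: passed by config"


SURFACES = {
    "terminal": render_terminal,
    "ai-host": render_ai_host,
    "ci": render_ci,
}


def render(surface, confirm):
    return SURFACES[surface](confirm)


confirm = Confirm(step_id="create-fleet", summary="Create AMS fleet in my-game?")
for surface in SURFACES:
    print(surface, "->", render(surface, confirm))
```

Supporting a new surface in this sketch means adding one renderer; the `Confirm` value, and the workflow that produced it, stay untouched.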

This is the part of the design we think hasn't been talked about much yet. Most of the chat-first work so far has optimized for a single surface and treated that as the strategy; the more interesting problem is the abstraction that lets one operation render as a CI no-op, a CLI step-walk, a GUI wizard, a desktop app card, an IDE side panel, or an LLM confirm without becoming a different definition for each. That abstraction has to live in the underlying layer rather than in the AI host, because otherwise every host ends up having to reinvent it.

The AI Glue Layer

The piece that sits between the AI host and the layer underneath is the AccelByte AI Marketplace plugin, the box marked "AI glue" in the diagram above. It's how Claude Code, Codex, and any other MCP-capable host actually reach AGS.

Its job is the glue work: teaching an AI host how to discover the workflows available in the user's namespace, routing intents to the right workflow, surfacing confirm steps in a form the host's interface understands, and respecting each surface's input conventions. That's deliberately the whole list.
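To show what "routing intents to the right workflow" means mechanically, here is a deliberately naive Python sketch that scores a request against each workflow's intent line by keyword overlap. In practice the article has an LLM reading the intent line, so this scoring is a stand-in rather than the plugin's method; the `route` helper and the second workflow entry (`player-inventory`) are made up for the example.

```python
import re


def tokens(text):
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def route(request, workflows):
    """Return the id of the workflow whose intent shares the most words with the request."""
    wanted = tokens(request)
    scored = [
        (len(wanted & tokens(wf["intent"])), wf_id)
        for wf_id, wf in workflows.items()
    ]
    if not scored:
        return None
    best_score, best_id = max(scored)
    return best_id if best_score > 0 else None


WORKFLOWS = {
    # Intent line taken from the excerpt earlier in the article.
    "competitive-multiplayer": {
        "intent": "matchmaking ranked competitive dedicated servers AMS match pool",
    },
    # Hypothetical second workflow, only here to give the router a choice.
    "player-inventory": {
        "intent": "store items entitlements inventory",
    },
}

choice = route("Set up ranked matchmaking with dedicated servers", WORKFLOWS)
print(choice)
```

Whatever does the routing, its output is just a workflow id and a set of inputs; nothing about execution depends on how that choice was made.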

The plugin is intentionally thin because we didn't want determinism living in the AI-facing layer; the AI-facing layer is the part most likely to be rewritten as models and protocols continue to change. The underlying components do the heavy lifting, which lets the glue stay small, replaceable, and honest about what it is.

What This Piece Leaves Out

We've kept this fairly short on purpose. There is a lot more to say about the MCP server itself, about observability across surfaces, about multi-tenant auth, and about offline modes for AI hosts that can't reach the backend; each of those deserves its own piece, and we'll get to them in time.

What You Can Use Today

This piece coincides with the public release of the AGS CLI, which is the first time the layer we've been describing is available for anyone to install and use. Together with the MCP server and the AI Marketplace plugin, both of which are already shipping, that's enough to address AGS from a terminal, from CI, and from any MCP-capable AI host today.

AGS CLI: the binary and the command surface. The 1:1 commands for all services have already shipped in the latest release; you can start using them right away. The workflow runner is due to ship in the coming weeks, alongside rollback support and a workflow-creator skill to help you author your own.
AGS MCP server: the same Client Runtime, exposed to AI hosts that speak MCP. Get the AGS API MCP server and the AGS Extend SDK MCP server. For more on how they work: AccelByte MCP servers give AI assistants real backend context.
AGS AI Marketplace plugin: the glue layer that teaches a host how to discover workflows, route intents, and surface confirms. Get the AGS AI Marketplace plugin.
Workflow definition spec (v1.0): the published format the workflows above are written in. Read the spec.

Still in flight: separating the Client Runtime out into a standalone daemon so other clients can link or talk to it directly, the AGS App, and the broader workflow library, which is small today and will grow as we and our customers add to it. We'll write about each of those when they ship.

Building for Where Developers Are

The chat box is one surface and a useful one, but we're building for the rest of them too. The commitment we're making to AGS customers is fairly straightforward. AGS will be addressable from every surface our developers actually use (terminal, CI, IDE, native app, AI host), and the work that LLMs cannot do reliably will run on a deterministic layer underneath. The way a human is invited to give input will fit the surface they're sitting in, rather than being squeezed into the lowest common denominator of "type a sentence and hope". We'll keep building, and writing about it as the pieces ship.

Try AGS From Whatever Surface You Actually Work In

Terminal, CI, IDE, or Claude Code: the CLI, MCP server, and AI Marketplace plugin all hit the same deterministic layer. AGS public cloud is completely free until you hit 30 concurrent users, plus a 90-day free trial if your game is already live. If you'd rather have a walkthrough first, let's talk!

Get Started for Free
Talk to Us
