Using Unity Code MCP Server to Playtest Unity Games with AI Agents
A practical look at using play_unity_game to automate gameplay checks in real Unity projects, with Pong and top-down driving examples.
Most gameplay problems are hard to validate from code alone. A movement change, UI flow, collision rule, or input binding can look correct in scripts and still fail the moment the game is running and a player starts pressing buttons.
Unity Code MCP Server turns that gap into a runtime loop: search the running game, execute an action, verify what changed, and repeat. The agent is not limited to static code review or one-shot test execution. It can enter a real scene, understand the current state, act like a player, read the result, and adjust the next move.
That creates a practical search / execute / verify workflow for real gameplay. The agent can discover how a specific project works, try the same inputs a player would use, inspect the exact runtime outcome, and keep going until the scenario is proven or the failure is clear.
- Search: Inspect scene, components, UI, physics, scores, timers, and input.
- Execute: Run the game for a controlled duration while sending real input actions.
- Verify: Read runtime state, logs, screenshots, and game-specific success signals.
- Repeat: Adapt the next move from what actually happened in Play Mode.
Runtime Access
Unity Code MCP Server gives an AI agent a way to step inside a running Unity game, not just inspect files around it. Once the Editor is in Play Mode, the agent can use execute_csharp_script_in_unity_editor to read live runtime state: scene objects, component fields, physics values, UI hierarchy, scores, timers, animation flags, input configuration, and whatever project-specific data your game exposes.
Player and UI Input
play_unity_game gives the agent hands. It can emulate player input through Unity Input System actions: hold right, tap jump, steer a car, press a menu button, submit a dialog, or drive a UI flow. This applies to gameplay objects and interface flows, including projects using old Unity UI, newer Unity UI systems, and custom controller logic, as long as the behavior is reachable through the game’s input path.
What the Tool Actually Does
play_unity_game advances the game for a requested duration while optionally sending configured Unity Input System actions.
In practice, the loop looks like this:
- Call enter_play_mode once. The editor enters Play Mode and pauses time.
- Use execute_csharp_script_in_unity_editor to sense the current game state: positions, velocities, scores, UI state, controller values, or any other runtime data the project exposes.
- Compute the next action from that fresh state.
- Call play_unity_game with a duration and input actions.
- Read the returned logs, re-sense, and repeat until the scenario passes or fails.
- Call exit_play_mode when finished.
The important detail is that the agent should not blindly press buttons. It should observe, calculate, act, and observe again. That makes the workflow useful for real projects where physics, timers, animation state, and game rules change every frame.
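The observe, calculate, act, observe cycle can be sketched as a small driver loop. This is an illustration only, not code from Unity Code MCP Server: `run_scenario`, `StepResult`, and the `sense`, `decide`, `act`, and `is_done` callables are hypothetical names, and a fake in-memory game stands in for the paused Unity Editor so the sketch runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StepResult:
    """One loop iteration: the state that was read and the action chosen from it."""
    state: dict
    action: Optional[dict]  # None only for the final, passing observation


def run_scenario(
    sense: Callable[[], dict],
    decide: Callable[[dict], dict],
    act: Callable[[dict], None],
    is_done: Callable[[dict], bool],
    max_steps: int = 50,
) -> tuple[bool, list[StepResult]]:
    """Sense, decide, act, and repeat until is_done(state) holds or max_steps runs out."""
    history: list[StepResult] = []
    for _ in range(max_steps):
        state = sense()                      # always read fresh state first
        if is_done(state):
            history.append(StepResult(state, None))
            return True, history
        action = decide(state)               # computed from the state just read
        history.append(StepResult(state, action))
        act(action)                          # in a real session: one play_unity_game call
    return False, history                    # stop condition never reached


# Fake game (assumption for the demo): a player moving at 2 units/s must reach x >= 3.
fake_game = {"x": 0.0}


def sense() -> dict:
    return dict(fake_game)


def decide(state: dict) -> dict:
    return {"duration": 500, "input": [{"action": "MoveRight", "type": "hold"}]}


def act(payload: dict) -> None:
    fake_game["x"] += 2.0 * payload["duration"] / 1000.0


passed, history = run_scenario(sense, decide, act, lambda s: s["x"] >= 3.0)
```

In a real session the agent plays the role of this loop itself: `sense` corresponds to an execute_csharp_script_in_unity_editor call and `act` to a play_unity_game call. The `max_steps` bound is a design choice that keeps a stuck scenario from running forever.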
The tool accepts simple input like this:
play_unity_game:
  duration: 350
  input:
    - action: "MoveRight"
      type: hold
For a one-frame action:
play_unity_game:
  duration: 200
  input:
    - action: "Jump"
      type: press
Or to let the game run while the agent watches for logs or state changes:
play_unity_game:
  duration: 500
Under the hood, the tool temporarily unpauses time, sends Input System state through the configured action asset, captures console output during the run, then pauses again. That pause between calls is what gives the agent a stable point to inspect before deciding what to do next.
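If you script these calls from your own tooling, small builder functions keep the three input shapes consistent. The sketch below is hedged: the field names (`duration`, `input`, `action`, `type`) and the `hold` and `press` values come from the examples above, but the helper names and the assumption that the tool accepts exactly this dictionary shape are mine, so check the tool's published schema before relying on it.

```python
def _check_duration(duration_ms: int) -> int:
    """Reject durations that cannot advance the game (validation rule is an assumption)."""
    if not isinstance(duration_ms, int) or duration_ms <= 0:
        raise ValueError("duration must be a positive integer number of milliseconds")
    return duration_ms


def hold(action: str, duration_ms: int) -> dict:
    """Hold an input action for the whole run."""
    return {
        "duration": _check_duration(duration_ms),
        "input": [{"action": action, "type": "hold"}],
    }


def press(action: str, duration_ms: int) -> dict:
    """Press an input action once, then let the game run for duration_ms."""
    return {
        "duration": _check_duration(duration_ms),
        "input": [{"action": action, "type": "press"}],
    }


def idle(duration_ms: int) -> dict:
    """Advance the game with no input, e.g. to watch logs or state changes."""
    return {"duration": _check_duration(duration_ms)}


move_right = hold("MoveRight", 350)
jump = press("Jump", 200)
wait = idle(500)
```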
Example 1: Pong as a Repeatable Gameplay Check
Pong is a good first test because the rules are simple but the timing is unforgiving. The agent has to track the ball, predict where it will cross the paddle line, move the paddle using input actions, and repeat after every bounce.
The prompt for the sample project is intentionally concrete:
Play Pong.
Bounce the ball with the paddle 5 times in a row.
Use enter_play_mode once, then use play_unity_game for gameplay.
Always re-sense after every action.
Never move transforms directly.
The useful part is not that the agent can “play Pong” as a party trick. The useful part is that this becomes a regression check:
- Does the scene still load?
- Is the Input Action asset still configured?
- Does paddle movement still respond to the expected actions?
- Is the ball velocity readable and physically plausible?
- Do bounce and miss logs still fire?
- Can the game survive a short, repeated Play Mode session without throwing runtime errors?
A typical decision cycle looks like this:
SENSE: ball=(1.84,0.71) vel=(4.2,-2.1) paddleY=0.10
COMPUTE: targetY=-0.43 action=Player1Down moveMs=132 safeIdleMs=0 delta=-0.53
The next call is precise, not guessed:
play_unity_game:
  duration: 132
  input:
    - action: "Player1Down"
      type: hold
After the tool returns, the agent reads the captured logs and immediately queries the new runtime state. If the paddle is slightly off, it applies a small correction. If the ball bounced, it updates the trajectory. If the game reset, it re-discovers the relevant state instead of continuing from stale assumptions.
This style works well for gameplay smoke tests because it exercises the same path a player uses: Input System actions, runtime scripts, physics, collision events, scoring, and scene state.
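The COMPUTE step in the decision cycle above can be sketched as two small functions: one predicts where the ball crosses the paddle line, the other turns the gap into an action and a hold time. This is not the sample project's code. The paddle line at x = 4.12, the walls at y = ±3, the paddle speed of 4 units per second, and the action name `Player1Up` are assumptions chosen for illustration; only `Player1Down` and the SENSE values appear in the example above.

```python
from typing import Optional


def predict_intercept_y(
    ball: tuple[float, float],
    vel: tuple[float, float],
    paddle_x: float,
    wall_bottom: float,
    wall_top: float,
) -> Optional[tuple[float, float]]:
    """Return (y, seconds) where the ball reaches paddle_x, or None if it is moving away.

    Bounces off the top and bottom walls are handled by folding the straight-line
    y position back into the [wall_bottom, wall_top] range (a triangle wave).
    """
    bx, by = ball
    vx, vy = vel
    if vx == 0:
        return None
    t = (paddle_x - bx) / vx
    if t < 0:
        return None
    height = wall_top - wall_bottom
    offset = (by + vy * t - wall_bottom) % (2 * height)
    if offset > height:
        offset = 2 * height - offset
    return wall_bottom + offset, t


def plan_paddle_move(
    paddle_y: float, target_y: float, paddle_speed: float, tolerance: float = 0.05
) -> tuple[Optional[str], int]:
    """Pick an input action and a hold time in ms; (None, 0) means already in place."""
    delta = target_y - paddle_y
    if abs(delta) <= tolerance:
        return None, 0
    action = "Player1Up" if delta > 0 else "Player1Down"
    # Truncate so the paddle slightly undershoots instead of overshooting.
    move_ms = int(abs(delta) / paddle_speed * 1000)
    return action, move_ms


# Values from the SENSE line above, plus the assumed geometry and paddle speed.
target_y, seconds = predict_intercept_y(
    (1.84, 0.71), (4.2, -2.1), paddle_x=4.12, wall_bottom=-3.0, wall_top=3.0
)
action, move_ms = plan_paddle_move(0.10, target_y, paddle_speed=4.0)
```

Under those assumptions the sketch lands on the same plan as the COMPUTE line: a target near -0.43, `Player1Down`, and a 132 ms hold. A real agent would read the paddle line, wall positions, and paddle speed from the running scene instead of hard-coding them.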
Example 2: Steering a Top-Down Driving Game
The driving example is deliberately different from Pong. There is no clean intercept point. The agent is testing whether the car responds to steering input in a running scene.
The prompt can be as small as:
Play the game - move car right then left.
The agent still follows the same structure. First it inspects the project and scene enough to find the car, the movement component, and the relevant input actions. Then it enters Play Mode and applies controlled steering:
play_unity_game:
  duration: 600
  input:
    - action: "MoveRight"
      type: hold
Then:
play_unity_game:
  duration: 600
  input:
    - action: "MoveLeft"
      type: hold
For a real project, you can make that check stricter. The agent can record the car’s position, heading, speed, and angular velocity before and after each run. The pass condition might be:
- after holding right, the car’s heading changed clockwise,
- after holding left, the heading changed back,
- speed stayed within an expected range,
- no runtime errors appeared in the console,
- the car stayed inside the playable area.
That gives you a lightweight Play Mode scenario without writing a full bespoke test harness first. If the scenario becomes important, you can later promote the checks into formal Play Mode tests or CI.
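The stricter pass condition listed above could be evaluated with a check like the following. It is a sketch under stated assumptions: `CarSnapshot`, `check_steering`, and the threshold values are hypothetical, and the heading is assumed to be in degrees that increase clockwise, which you would need to confirm against how your car's rotation is actually reported.

```python
from dataclasses import dataclass


@dataclass
class CarSnapshot:
    """State recorded before or after a run (field names are illustrative)."""
    heading_deg: float  # assumed convention: degrees, increasing clockwise
    speed: float
    x: float
    y: float


def signed_heading_delta(before_deg: float, after_deg: float) -> float:
    """Smallest signed rotation from before to after, in (-180, 180]."""
    delta = (after_deg - before_deg) % 360.0
    return delta - 360.0 if delta > 180.0 else delta


def check_steering(
    start: CarSnapshot,
    after_right: CarSnapshot,
    after_left: CarSnapshot,
    speed_range: tuple[float, float],
    bounds: tuple[float, float, float, float],  # x_min, x_max, y_min, y_max
    error_logs: list[str],
    min_turn_deg: float = 1.0,
) -> list[str]:
    """Return a list of failure messages; an empty list means the scenario passed."""
    failures = []
    if signed_heading_delta(start.heading_deg, after_right.heading_deg) < min_turn_deg:
        failures.append("heading did not turn clockwise after holding right")
    if signed_heading_delta(after_right.heading_deg, after_left.heading_deg) > -min_turn_deg:
        failures.append("heading did not turn back after holding left")
    low, high = speed_range
    x_min, x_max, y_min, y_max = bounds
    for label, snap in (("right", after_right), ("left", after_left)):
        if not low <= snap.speed <= high:
            failures.append(f"speed out of range after holding {label}")
        if not (x_min <= snap.x <= x_max and y_min <= snap.y <= y_max):
            failures.append(f"car left the playable area after holding {label}")
    if error_logs:
        failures.append(f"{len(error_logs)} runtime error(s) in the console")
    return failures


start = CarSnapshot(heading_deg=350.0, speed=5.0, x=0.0, y=0.0)
after_right = CarSnapshot(heading_deg=20.0, speed=5.5, x=1.0, y=3.0)
after_left = CarSnapshot(heading_deg=355.0, speed=5.2, x=1.5, y=6.0)
failures = check_steering(
    start, after_right, after_left,
    speed_range=(1.0, 10.0), bounds=(-20.0, 20.0, -20.0, 20.0), error_logs=[],
)
```

Wrapping the difference into (-180, 180] matters because a heading that goes from 350 to 20 degrees is a 30 degree clockwise turn, not a 330 degree counter-clockwise one.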
Scenarios Worth Automating
The strongest use cases are not broad “test the whole game” prompts. They are short, observable scenarios with a clear pass condition.
Good candidates:
- Input mapping checks: press jump, fire, interact, steer, pause, or menu actions and verify the expected state changes.
- Controller regression checks: move a character for 300 ms, jump once, dash once, and confirm the controller still responds after refactors.
- Physics and collision checks: run a ball, projectile, vehicle, or pickup interaction long enough to confirm collisions and triggers still fire.
- Tutorial and UI flow checks: press through a menu, start a level, open pause, resume, and verify no blocking runtime errors appear.
- AI and simulation checks: let a small encounter run for a few seconds and verify health, score, spawn counts, or win/loss state remain plausible.
- Level smoke tests: load a scene, move the player through a known opening path, and check for missing references, broken cameras, or immediate soft locks.
This complements Unity’s built-in testing stack rather than replacing it. Use Edit Mode tests for deterministic logic, Play Mode tests for code-level runtime assertions, build automation for CI coverage, and play_unity_game for fast agent-driven gameplay checks while you are actively developing.
Unity’s Build Automation unit test support can run Edit Mode and Play Mode tests as part of a build pipeline. play_unity_game is more local and interactive: it is for the moment when you just changed movement, collisions, a scene, or an input asset and want feedback before you switch mental context.
Why This Matters for Game Developers
The practical benefit is a shorter feedback loop. A developer can ask for a change, let the agent implement it, and then ask the same agent to run a focused gameplay check in the Editor.
That changes the conversation from:
“I changed the controller. Please try it manually.”
to:
“I changed the controller, entered Play Mode, held MoveRight for 600 ms, held MoveLeft for 600 ms, checked the car heading, and found no runtime errors.”
It also makes routine verification more repeatable. Manual playtesting is still essential for feel, balance, readability, and fun. But humans are poor at repeating the same tiny verification step dozens of times a day. Agents are well suited to that work, especially when they can read exact scene state and use the same input actions a player uses.
The best results come from treating the agent like a disciplined test driver:
- give it a small objective,
- require it to discover input names and scene state dynamically,
- make it use input actions instead of direct transform edits,
- make it re-sense after every action,
- define a concrete stop condition.
For Pong, the stop condition was “five reflected balls.” For the driving game, it was “move right, then left, and verify the car responds.” For your project, it might be “open inventory and equip the first item” or “survive the first enemy attack without a null reference.”
Getting Started
Start with the Unity Code MCP Server repository and the Unity Code MCP Server product overview.
The short setup path is:
- Install UniTask in your Unity project.
- Install Unity Code MCP Server from the GitHub package URL.
- Configure your MCP client to run the bundled unity-code-mcp-stdio bridge.
- Open Tools > UnityCodeMcpServer > Show or Create Settings and confirm the Input Actions Asset used by play_unity_game.
- Make sure the generated agent skills are installed. The important ones for this workflow are unity-game-player and executing-csharp-scripts-in-unity-editor.
Useful references:
- Unity Code MCP Server on GitHub
- Model Context Protocol documentation
- Unity Input System Actions documentation
- Unity Test Framework Play Mode tests
- Unity testing and QA tips
- Unity Build Automation unit tests
A Good First Prompt
Once the server is connected and your scene is open, try a small scenario:
Use the Unity game-playing skill.
Enter Play Mode.
Inspect the loaded scene and discover the player object, input actions, and movement component.
Use play_unity_game to move the player right for 500 ms, then left for 500 ms.
After each action, re-sense the scene and report whether the player moved as expected.
Exit Play Mode when finished.
Keep the first run boring. Do not ask the agent to beat a level immediately. Ask it to prove that it can read the game, use the right inputs, and verify one behavior. Once that loop is reliable, you can build more ambitious checks on top of it.
That is where play_unity_game becomes valuable: not as a replacement for QA, but as a fast, repeatable way to put your Unity project under real runtime pressure while the work is still fresh.