I have been using AI agents to build Power Automate flows. OpenClaw, specifically. Flows are JSON. Agents are good at JSON. So I expected it to just work — build and debug, end to end.
Building worked. Debugging was where it all fell apart.
And honestly? I think most people who try this will hit the exact same wall.
I started with the standard approach:
- Azure app registration
- Grant application permissions for the Power Platform APIs
- Grant a Power Platform system user role
- Let the agent call the management APIs
That part worked fine. For basic tasks, it felt like vibe coding actually worked.
The first flow the agent built for me was simple: *"When an item is created in a SharePoint list, send a Teams message to Catherine."*
It did it in one attempt. That gave me confidence.
Next I asked the agent to build a custom connector. It generated a Swagger definition and saved me time. But the connector still had errors — and each fix-deploy-test cycle cost tokens.
That was my first signal: agents can generate quickly, but debugging is where things get expensive.
The real test was using the custom connector to pull real HR data from an external system. We needed both the Employee endpoint and the Employee History endpoint. There are thousands of employees, so the flow has to paginate — loop until there is no next page.
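In flow-definition JSON, that pagination pattern is a Do until loop. A minimal sketch — the action names, the `nextLink` variable, and the generic HTTP action stand in for the real custom-connector calls, and the `nextPage` field is an assumption about the external API's response shape:

```json
{
  "Until_no_next_page": {
    "type": "Until",
    "expression": "@empty(variables('nextLink'))",
    "limit": { "count": 100, "timeout": "PT1H" },
    "actions": {
      "Get_employee_page": {
        "type": "Http",
        "inputs": { "method": "GET", "uri": "@variables('nextLink')" }
      },
      "Set_next_link": {
        "type": "SetVariable",
        "runAfter": { "Get_employee_page": [ "Succeeded" ] },
        "inputs": {
          "name": "nextLink",
          "value": "@{coalesce(body('Get_employee_page')?['nextPage'], '')}"
        }
      }
    }
  }
}
```

The loop exits once the API stops returning a next-page link and `nextLink` becomes empty.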
This is where the agent started burning through tokens. Not because it couldn't write the logic, but because it couldn't see what was actually failing.
Three things, specifically:
1. The agent couldn't catch the true error. It kept saying the loop failed due to "connection" issues. But the real error was inside a nested loop — entities were not referenced correctly. A scoping problem, not a connection problem. The agent couldn't see deep enough to know.
2. It mixed in Logic Apps concepts that don't exist in Power Automate. Agents borrow patterns from whatever training data they have. Mine kept trying `map()`, `filter()`, `select()`, or building "compose" style shapes that look right in JSON but fail at runtime. Power Automate is not Logic Apps. Close, but not the same.
3. Power Automate keeps the useful debug detail behind the UI. In the portal, you expand the failed action and see the real payload, the real error, the actual values. Through the API? The agent often couldn't get that detail. So it guessed, patched, redeployed, failed, and repeated.
That's where the token burn comes from. We went through 10–15 cycles on one flow. Easily $15–20 in LLM costs just to debug a single moderately complex workflow.
We tried the try-catch pattern my husband John wrote about back in 2018. Wrap actions in Scopes. On failure, run an error Scope. Use `result()` to capture what happened, then write it somewhere — we used a SharePoint list.
It helped. At least the agent had something to read.
But you need to implement it everywhere. Inside loops. Inside nested loops. Inside the parts that actually fail. Every debugging cycle meant modifying the flow itself just to capture error info, then modifying it again to remove the scaffolding. A lot of back and forth. And every round-trip burns tokens.
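The pattern itself is small — the cost is having to replicate it everywhere. A hedged sketch (action names are illustrative; the `Scope` type, `runAfter` statuses, and `result()` are standard flow-definition syntax):

```json
{
  "Try": {
    "type": "Scope",
    "actions": { }
  },
  "Catch": {
    "type": "Scope",
    "runAfter": { "Try": [ "Failed", "TimedOut" ] },
    "actions": {
      "Capture_error": {
        "type": "Compose",
        "inputs": "@result('Try')"
      }
    }
  }
}
```

`result('Try')` returns the status, inputs, outputs, and error for every action inside the Try scope — the Catch scope then writes that array somewhere readable, like our SharePoint list.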
John runs a SaaS called FlowStudio. He already had the actions I needed — the ones that expose real run details, per-action error info, action inputs and outputs, loop iterations. The stuff you see in the Power Automate portal UI, but available as APIs.
After watching me lose yet another round against `InvalidTemplate`, he said: *"I already have all those APIs. I'll just wrap them in an MCP server."*
And that was the game changer.
With the MCP server, the agent can finally see what actually broke. Here's what debugging looks like now:
1. Agent calls `get_live_flow_runs` - finds the failed run
2. Agent calls `get_live_flow_run_error` - gets structured, per-action error details. Not just "Failed" — the actual error message, the failing expression, the HTTP response body.
3. Agent calls `get_live_flow_run_action_outputs` - reads action inputs/outputs
4. Agent calls `update_live_flow` - deploys the fix
Four API calls. One round-trip. No try-catch scaffolding. No SharePoint. No guessing.
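Under the hood, each of those tool calls is a standard Model Context Protocol `tools/call` request (JSON-RPC 2.0). A sketch of step 2 — the argument names here are hypothetical placeholders, since FlowStudio's exact tool schema isn't shown in this post:

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "get_live_flow_run_error",
    "arguments": {
      "flowName": "<flow name>",
      "runName": "<run id>"
    }
  }
}
```

Because it's plain MCP, nothing here is agent-specific — any MCP client can issue the same four calls.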
When the nested loop has a scoping problem? It's obvious now — the agent can see the actual action outputs and references.
That HR data flow that took $15–20 and 45 minutes to debug? Fixed in under 2 minutes. Pennies in token costs.
After it worked for me, I packaged everything into three GitHub Copilot agent skills:
- `power-automate-mcp` - Connect to and operate flows (list, read, trigger, resubmit, cancel)
- `power-automate-debug` - Step-by-step diagnostic workflow for failing flows
- `power-automate-build` - Build and deploy flow definitions from scratch
They work with any MCP-capable agent — GitHub Copilot, OpenClaw, Claude, or anything that speaks the Model Context Protocol.
The skills are free and open source. They need a FlowStudio MCP server to talk to. We're offering a free Starter plan (100 MCP calls, no credit card required) so you can try the full experience without committing to anything.
Get started at mcp.flowstudio.app