Flow Studio Blog

Power Automate Work Order Automation: Email + PDF to Excel

Written by Catherine Han | May 9, 2026 1:51:05 PM

"Best way to automate Outlook emails + PDF work orders into Excel? (n8n / Power Automate)" A property maintenance company asked exactly that on r/MicrosoftFlow this week. The OP listed the failure modes that kept biting them: inconsistent email formats, PDF extraction problems, Excel append failures, table and column mapping issues, and Outlook body content that sometimes arrived truncated. The follow-up question I had was the more interesting one: AI Builder, Copilot Studio with Work IQ Outlook, or build it yourself?

So I screenshotted the post and sent it to my Telegram bot. I was lying in bed on a Saturday afternoon, no VS Code, no Power Automate designer, no Azure portal. On the other end of the Telegram chat was a Claude PM running TmuxAgentManager, which dispatches the build to a Codex worker in a tmux session. About two and a half hours later I had a two-flow Power Automate solution writing to Excel with a confidence-gated exception path.

The useful pattern is to combine the email body and the PDF text into one context, extract once, and use a confidence score to decide whether the record goes straight into operations or into an exception queue for human review.

What follows is the field note the Codex worker wrote, lightly edited. What I personally validated: the test emails landed in my Outlook with their PDF attachments, both flows ran in the Power Automate run history, and the extracted rows landed in WorkOrder_Operations.xlsx with the fields populated. Demo worked end to end. Great work to the worker, to the PM for keeping it on rails, and to Flow Studio MCP for giving the worker action-level visibility into every Power Automate run, which is what made the silent-failure debugging in sections [05] and [06] tractable.

The next section is written from the worker's point of view. I kept the engineering detail because the failure modes are the useful part.

▸ section · author switch
CODEX WORKER FIELD NOTE
author: Codex (tmux worker, dispatched via Claude PM in Telegram)  ·  build duration: ~2h 30m (incl. mid-build SharePoint→Excel pivot)  ·  tools: Flow Studio MCP, Azure CLI, Microsoft Graph, Office 365, Excel Online, Azure OpenAI, Document Intelligence

[01] Triage: AI Builder vs Azure OpenAI vs Copilot Studio

Owner input via PM: a screenshot of an r/MicrosoftFlow Reddit post for a property maintenance company. Reddit OP's stated pain: scattered information across email body and PDF, inconsistent email formats, table/column mapping issues, occasional blank Outlook bodies. Owner emphasised one constraint twice in Telegram: "One thing important. As per Reddit op. The email needs to contain PDF and job info scattered in email body and pdf". That is the load-bearing requirement. Owner explicitly asked the build-vs-buy question on the Microsoft stack.

Two viable paths fell out of the audit. Both use a custom Power Automate flow as the orchestration shell. The difference is which engine does the OCR and the extraction.

Path A · AI Builder native
  Engine: Outlook → AI Builder Extract text in images and PDF documents (OCR prebuilt) → AI Builder Run a prompt with JSON output mode → Excel
  Trade-off: No Azure resources to provision. No HTTP actions. JSON output mode can avoid a separate Parse JSON step in many flows. Costs a Premium license plus AI Builder credits; model availability depends on region and tenant, and there is no parameter tuning.

Path B · HTTP + Azure OpenAI
  Engine: Outlook → HTTP → Azure AI Document Intelligence (prebuilt-read) → HTTP → Azure OpenAI (gpt-4o-mini) → Parse JSON → Excel
  Trade-off: Requires an Azure subscription, two Azure resources, and key management. Full parameter control (temperature, max_tokens) and cheaper per call at scale, but more setup overhead, plus a ~5 s OCR polling delay per PDF that stacks at volume.

What I rejected up front: AI Builder Document Processing (the custom forms model). It is template-based per collection: you train it on five or more examples that share a stable layout. The Reddit OP listed "inconsistent email formats" and the work-order PDFs come in a dozen shapes; that is the disqualifier. The custom forms model is the right call when your invoices all come from the same supplier, not when your inputs are arbitrary.

Decision for this build: Path B. Reason: the owner already has an Azure AI resource provisioned and wants me to reuse it; the marginal cost of HTTP actions over native ones is small when the Azure footprint already exists. Path A is the recommendation if you are starting from a Power Platform license with no Azure deployment.

[02] Tooling: Flow Studio MCP, Azure CLI, Microsoft Graph

▌ Flow Studio MCP
create flow, patch flow definition JSON, run history, action-level inputs and outputs
▌ Azure CLI (az)
pre-authenticated; used to provision Azure OpenAI deployment + Document Intelligence resource, fetch keys
▌ Microsoft Graph
create the Excel workbook on SharePoint, add tables, append columns mid-build via PATCH
▌ Power Automate connectors
Office 365 Outlook (trigger + GetAttachment_V2), SharePoint (CreateFile), Excel Online Business (AddRowV2), HTTP (Azure REST)

I never opened the Power Automate designer or the Azure portal during the build. The flow definition was patched as JSON via Flow Studio MCP. Azure resources were provisioned via az. Excel schema was managed via Graph PATCH.
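
For reference, the Graph leg of that schema management looks like this. A minimal sketch, with {site-id} and {item-id} as placeholders: adding a table column is a POST to the table's columns collection (edits to existing workbook objects go through PATCH, which is what the tool list above refers to).

// hedged sketch · add a column to the WorkOrders table via Microsoft Graph
// ({site-id} and {item-id} are placeholders for the SharePoint site and workbook)
POST https://graph.microsoft.com/v1.0/sites/{site-id}/drive/items/{item-id}/workbook/tables/WorkOrders/columns
Content-Type: application/json

{ "name": "Confidence" }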

[03] Architecture: two Power Automate flows, Outlook to Excel

Outlook email arrives  ─▶  Flow 1: WorkOrder_01_Intake_Outlook
   ─▶  GetAttachment_V2 (binary PDF bytes, not base64 text)
   ─▶  SharePoint CreateFile (archive PDF)
   ─▶  Excel AddRowV2 → Intake table (audit trail)
   ─▶  HTTP POST → Flow 2: WorkOrder_02_AI_Extract_To_WorkOrder
      ─▶  Azure AI Document Intelligence (prebuilt-read OCR, per PDF)
      ─▶  Combine email body + PDF text → single context string
      ─▶  Azure OpenAI gpt-4o-mini (system prompt enumerates 16 JSON fields)
      ─▶  Parse JSON · check confidence ≥ 0.75
      ─▶  Excel AddRowV2 → WorkOrders table  |  OR  |  Exceptions table

Two production flows plus one helper that emits diverse test emails on demand. The first end-to-end version landed on SharePoint Lists in under an hour; the owner then pivoted the datastore to Excel mid-build (the Reddit OP's pain point list explicitly named Excel). The Excel migration was the larger of the two rebuild beats in this build: connector swap, schema redefinition via Microsoft Graph, every PostItem action replaced with AddRowV2. Trade-offs of the Excel choice are in [07].

Environment:   Flow Studio Demo
Intake flow:   WorkOrder_01_Intake_Outlook · Outlook trigger
Extract flow:  WorkOrder_02_AI_Extract_To_WorkOrder · HTTP trigger
Workbook:      WorkOrder_Operations.xlsx · 4 tables (Intake, WorkOrders, Exceptions, AuditLog)
AI:            Azure OpenAI gpt-4o-mini · Document Intelligence prebuilt-read

Flow 1 · WorkOrder_01_Intake_Outlook: email arrives, loop attachments, archive PDF to SharePoint, log Intake row, hand off to the extract flow. Run completed cleanly; every action green.


Flow 2 · WorkOrder_02_AI_Extract_To_WorkOrder: per attachment, OCR via Document Intelligence; concatenate email + PDF text; one Azure OpenAI call returns the 16-field JSON; confidence-gated write to WorkOrders or Exceptions.
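
For anyone wiring Path B by hand, the OCR leg inside Flow 2 is two plain HTTP actions: submit the PDF, then poll for the result. A minimal sketch, assuming a v3.1 Form Recognizer-style endpoint; the api-version, parameter names, and action names are illustrative, not the exact build:

// hedged sketch · Document Intelligence prebuilt-read as two HTTP actions
// (endpoint, api-version, and parameter/action names are illustrative)
POST https://<resource>.cognitiveservices.azure.com/formrecognizer/documentModels/prebuilt-read:analyze?api-version=2023-07-31
Ocp-Apim-Subscription-Key: @{parameters('DocIntelKey')}
Content-Type: application/json

{ "base64Source": "@{items('For_each_attachment')?['contentBytes']}" }

// the 202 response carries an Operation-Location header; a Do until loop
// polls it (the ~5 s delay noted in [01]) until status is 'succeeded'
GET @{outputs('Analyze_PDF')['headers']['Operation-Location']}
Ocp-Apim-Subscription-Key: @{parameters('DocIntelKey')}
// body('Poll_OCR')?['analyzeResult']?['content'] is the text appended to PDFText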

[04] The combined-context pattern: email + PDF in one Azure OpenAI call

The single decision that made this work: do not extract from email OR PDF separately. Concatenate first, extract once.

// inside Flow 2, after Document Intelligence returns PDF text
// (a Compose action; Power Automate expressions do not interpret '\n',
//  so real newlines come from decodeUriComponent('%0A%0A'))
concat(
    'Email Subject: ', triggerBody()?['subject'], decodeUriComponent('%0A%0A'),
    'Email Body: ', triggerBody()?['body'], decodeUriComponent('%0A%0A'),
    'PDF Content: ', join(variables('PDFText'), decodeUriComponent('%0A%0A'))
)

// system prompt fragment sent to gpt-4o-mini
"Extract work order fields from maintenance requests. Return valid JSON with: propertyName, propertyAddress, unit, tenantName, contactPhone, contactEmail, issueDescription, tradeCategory, priority, requestedDate, preferredWindow, accessInstructions, confidence, missingFields, validationNotes, duplicateKey"

One model call. One response. The model decides where each field came from. Tenant name in the email body? Fine. Access code only in the PDF? Fine. Both? Fine, it merges them. The confidence field is part of the JSON contract. The model self-reports how sure it is, and the flow uses it as a routing key:

// validationNotes fragment: emits a routing reason when confidence is
// missing (malformed JSON) or below the 0.75 threshold
if(equals(body('Parse_Extraction_JSON')?['confidence'], null),
    'confidenceMissing; ',
    if(less(float(body('Parse_Extraction_JSON')?['confidence']), 0.75),
        'confidenceBelowThreshold; ',
        ''))

Two failure modes covered: confidence missing entirely (model returned malformed JSON) and confidence below threshold (model is unsure). Anything else flows to the WorkOrders table. Below threshold flows to Exceptions with the partial extraction preserved for human review.
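
If you prefer the gate as a single Condition action instead of a notes string, one expression covers both failure modes at once, because a missing confidence coalesces to 0 and fails the threshold. A sketch, reusing the Parse JSON action name from above:

// hedged sketch · Condition expression for the WorkOrders/Exceptions gate
@greaterOrEquals(float(coalesce(body('Parse_Extraction_JSON')?['confidence'], 0)), 0.75)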

The recipe in six steps, copy-pasteable into a flow (the Path B model call is sketched after the list):

  1. Outlook trigger captures the email with attachments
  2. AI Builder Extract text in images and PDF documents (Path A) OR Azure Document Intelligence prebuilt-read (Path B) returns OCR text per attachment
  3. concat() expression merges email subject + body + PDF text into a single context string
  4. AI Builder Run a prompt with JSON output mode (Path A) OR Azure OpenAI gpt-4o-mini chat completion (Path B) extracts the 16 fields
  5. Power Automate Condition action checks the model's self-reported confidence field against the threshold
  6. Excel AddRowV2 writes to the WorkOrders table on pass, or to the Exceptions table on fail
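
For step 4 on Path B, the model call is one HTTP action. A minimal sketch of its shape, assuming a deployment named gpt-4o-mini, an illustrative api-version, and a hypothetical Compose action holding the combined context:

// hedged sketch · HTTP action for the Path B model call (step 4)
// (deployment name, api-version, and action/parameter names are illustrative)
POST https://<resource>.openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2024-06-01
api-key: @{parameters('AzureOpenAIKey')}
Content-Type: application/json

{
  "messages": [
    { "role": "system", "content": "Extract work order fields from maintenance requests. Return valid JSON with: propertyName, ..., confidence, missingFields, validationNotes, duplicateKey" },
    { "role": "user", "content": "@{outputs('Compose_Combined_Context')}" }
  ],
  "temperature": 0,
  "response_format": { "type": "json_object" }
}

response_format json_object is the Path B analogue of AI Builder's JSON output mode; you still keep the Parse JSON step because the flow needs typed fields downstream.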

The pattern only matters because real inbound work orders distribute their content unevenly. Two of the test scenarios:

Test 1, email-heavy: property, tenant, and issue description live in the email body; the PDF only adds contact phone and access instructions.

Test 3, scattered: an informal forwarded email plus a handwritten-style PDF; the model has to merge both to reconstruct the work order.

[05] Where I went wrong: blank PDFs and base64 vs binary attachments

Mid-build, while the helper test-sender flow was being shaken down, archived PDFs landed in SharePoint at exactly 1500 bytes each. The owner peeked into Outlook, opened one of the archives, and messaged the PM:

▌ owner · in Telegram
The pdf seemingly not generated correctly. It is blank 1kb.

The PM forwarded the report and asked me to verify the file's actual bytes, not just the size. I downloaded one of the archives and checked the magic bytes:

▌ first 8 bytes · status: wrong
JVBERi0x...   ← base64-encoded ASCII
expected:
%PDF-1.4...   ← actual PDF binary header

Diagnosis: the helper test-sender flow was passing the attachment as a base64 string, and SharePoint CreateFile was writing those ASCII characters straight to disk. The intake flow itself was correct. The bug was upstream, in the test fixture. Fix:

- attachmentContent: @{variables('base64String')}
+ attachmentContent: @{base64ToBinary(variables('base64String'))}

Re-test: archived PDFs are now ~85 KB each, magic bytes %PDF-1.4, openable in any reader. Lesson: for any binary attachment path in Power Automate, verify the magic bytes of the destination artifact, not just the file size or HTTP status. A 200 OK with a 1.5 KB file is the same shape as a 200 OK with an 85 KB file, and the connector will not warn you.
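
The same check can live inside a flow instead of in my terminal. A sketch of a Condition expression that passes only for real PDF bytes; 'Get_file_content' is an illustrative SharePoint action name:

// hedged sketch · magic-bytes assertion on the archived artifact
@startsWith(base64ToString(body('Get_file_content')?['$content']), '%PDF')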

[06] Action-level debugging: when Excel AddRowV2 silently writes blank rows

During the SharePoint→Excel migration, my first patch of the extraction flow was failing in the worst possible way. Power Automate run status: Succeeded. Excel returned 200 OK on every AddRowV2 call. New rows appeared in the WorkOrders table on every run. The cells were empty. No error surfaced anywhere: not in the run history, not in the connector response, not in the Excel UI. This is the failure mode that defeats top-level monitoring: the flow succeeded as a flow, the connector accepted the request, the row appeared, and the data silently went nowhere. Tool I reached for:

tool: Flow Studio MCP get_live_flow_run_action_outputs
target: the AddRowV2 action body, exactly as the connector received it

What the action actually received:

▌ AddRowV2 input body · ignored silently
{
  "item": {
    "Title": "WorkOrder Scenario 1 - leak under vanity",
    "PropertyAddress": "42 Ocean View Road, Unit 8, Manly",
    "TradeCategory": "Plumbing"
  }
}

The Excel Online Business connector does not consume a nested item object. It reads dynamic top-level keys whose names follow the item/<ColumnName> convention, visible when inspecting the connector definition. My nested body was structurally valid JSON, so the connector returned 200; my fields lived under a key the connector did not look at, so the row went in blank. Fix: flatten the body to the dynamic-column shape:

▌ AddRowV2 input body · status: success
{
  "item/Title":           "WorkOrder Scenario 1 - leak under vanity",
  "item/PropertyAddress": "42 Ocean View Road, Unit 8, Manly",
  "item/TradeCategory":   "Plumbing"
}

Two minutes from "I notice the cells are empty" to a fix patched into the flow definition JSON, all without opening the Power Automate portal or the Excel file. Without action-level body inspection I would have spent an hour staring at the table and re-running the flow, watching new blank rows appear, with the connector and the run status both lying to me by silently agreeing. The owner asked the broader question shortly after, "how can we improve table column mapping issues", and the answer was the one already patched into the flow: read the connector swagger, write the dynamic-column form.

[07] Hardening backlog: confidence tiers, retry, daily digest

☐ Duplicate detection: hash PropertyAddress + IssueDescription[:50], search WorkOrders for matches in the last 7 days, route to Exceptions if hit
☐ Retry with backoff on Azure OpenAI calls: currently single-shot; transient 429s drop to Exceptions when they should retry
☐ Confidence-tier routing: auto-create at ≥0.90, approval action at 0.75–0.89, Exceptions below 0.75 (sketched after this list)
☐ Teams alert on Exception row: adaptive card with property, issue summary, "Resolve" button
☐ Daily 24-hour digest: scheduled Recurrence flow that emails a dispatcher with the WorkOrder count + breakdown for the last 24 hours; an empty digest is the early warning that the Outlook trigger or an AI dependency broke overnight
☐ Migrate Excel → SharePoint Lists for production: workbook locking on concurrent writes is a known failure mode at volume; Excel is fine for the demo, not for the steady state
☐ Port to AI Builder native (Path A): replace the two HTTP actions with Extract text in images and PDF documents + Run a prompt in JSON output mode; current build is locked to the owner's Azure subscription, AI Builder native is portable to any Power Platform tenant
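
The confidence-tier item is mostly an expression change on the gate from [04]. A sketch of the three-way routing key; a Switch action on its output fans out to AddRowV2, Start and wait for an approval, or the Exceptions write:

// hedged sketch · three-tier routing key for the bands named above
if(greaterOrEquals(float(coalesce(body('Parse_Extraction_JSON')?['confidence'], 0)), 0.90),
    'autoCreate',
    if(greaterOrEquals(float(coalesce(body('Parse_Extraction_JSON')?['confidence'], 0)), 0.75),
        'awaitApproval',
        'exception'))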

The hard part of this build was not the AI call. It was the data plumbing across boundaries: PDF binary handling at the email connector, OCR latency at Document Intelligence, JSON contract design at the model boundary, the Excel connector's undocumented dynamic-column convention. Each boundary was inspectable through action-level inputs and outputs, and that is what made the work tractable inside a single afternoon.

▸ section · author switch
BACK TO CATHERINE
human validation  ·  what I checked before publishing this

What I actually validated

I want to be precise about who did what here, because the orchestration matters more than I expected it to.

The worker built a helper flow that emits work-order emails with PDF attachments on demand. It used that helper to test itself end-to-end across three diverse scenarios: one with most of the detail in the email body, one with most of the detail in the PDF, one deliberately messy with information scattered across both. I did not send any of those test emails myself.

What I did do: I peeked into Outlook on my phone and confirmed the test emails arrived with their PDF attachments visible. I confirmed both flows ran in the Power Automate run history. I opened WorkOrder_Operations.xlsx on the SharePoint MCPDemo site and confirmed the extracted rows landed in the WorkOrders table with the fields populated.

WorkOrders table after the test runs. Each row is one extraction; right-most columns are WorkOrderStatus and Confidence. Test 3 (the messy one) lands at 0.9; the rest at 1.0.

That is enough for me to say the demo worked end to end. Everything in between (the combined-context concat, the system prompt, the confidence-gate Power Fx expression, the dynamic-column Excel mappings, the SharePoint→Excel migration mid-build) was the worker's work and it stands on its own merits.

The orchestration is the part I want to flag separately. I did not write a spec. I sent the Reddit screenshot and answered a handful of Telegram questions: which mailbox to watch, the SharePoint MCPDemo site as the archive target, "yes, you can use my Azure CLI session", and, about an hour into the build, "Also the data needs to go to an excel file not SharePoint list." The PM held the build in its head for the whole afternoon, asked the worker probing verification questions when results looked too clean (did you actually exercise that path, or did you just deploy it?), and pinged me only when a decision genuinely needed me. When I reported the blank 1KB PDFs, the PM did not guess; it asked the worker to verify file bytes, not file size, and the magic-bytes diagnosis followed from there.

What I'd add before a customer trusts this

The worker's [07] backlog is comprehensive. Two of the seven items are the ones I would not ship without if a real property maintenance company were going to rely on this.

A human-in-the-loop gate at borderline confidence. Today the flow is binary: confidence ≥ 0.75 goes to WorkOrders, anything lower to Exceptions. In a real deployment I would split the middle band (0.75–0.89) into a Power Automate Start and wait for an approval action (email approval or a Teams adaptive card showing the extracted fields next to the original email body and PDF text), and only auto-write to WorkOrders when the approver clicks Approve. The high-confidence band (≥ 0.90) still auto-creates. This costs the dispatcher a queue of perhaps five to fifteen approvals per day in exchange for catching the cases where the model was wrong and confident, which is the failure mode that destroys trust faster than anything else.

A daily 24-hour digest email. A scheduled Recurrence flow at 7am reads the WorkOrders rows where CreatedDate >= utcNow() - 24h, builds a short HTML table grouped by trade category and priority, and sends it to a dispatcher mailbox via Send an email (V2). The report itself is useful, but the real point is the early warning when zero rows arrive. If the Outlook trigger broke overnight, or AI Builder credits ran out, or Document Intelligence rate-limited, an empty digest tells the dispatcher within hours instead of days. The same pattern works for any silent-fail-prone flow: a daily heartbeat with a count is cheap insurance.
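
The date cut for that digest is a single Filter array condition; using ticks() keeps the comparison numeric instead of string-based. A sketch, assuming the CreatedDate column holds ISO-8601 timestamps:

// hedged sketch · Filter array condition for the last-24-hours window
@greaterOrEquals(ticks(item()?['CreatedDate']), ticks(addHours(utcNow(), -24)))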

If you have the same Reddit question (scattered information across email body and PDF, "AI Builder vs Copilot Studio Work IQ vs build it yourself"), the architecture above is the one I would copy: combined-context concatenation, a single Azure OpenAI call with an explicit confidence field in the JSON contract, and confidence-gated routing to either the operational table or an exceptions table. If you want your agent (or your agent's PM) to inspect action-level inputs and outputs the way ours did, Flow Studio MCP is what made that inspection possible. There is a free Starter plan.



Frequently asked questions

Should I use AI Builder or Azure OpenAI for Power Automate document extraction?

If you already have an Azure OpenAI deployment and a Document Intelligence resource, the HTTP + Azure path (Path B in [01]) gives you fine-grained control of model parameters and is cheaper per call at scale. If you are starting from a Power Platform license with no Azure footprint, the AI Builder native path (Path A) is portable, requires no Azure resources, and uses one AI Builder Run a prompt action with JSON output mode that removes the Parse JSON step entirely.

How do I extract data when information is scattered between the email body and a PDF attachment?

Combine before extracting. Concatenate the email subject, email body, and the OCR text of every PDF attachment into one context string, then send a single prompt to the model. Do not run separate extraction passes on the email and the PDF. The model decides which field came from where, and the same prompt works whether the property name is in the email body, the PDF, or both. Code in [04].

Why is my Power Automate Excel AddRowV2 writing blank rows even though the run succeeded?

The Excel Online Business connector reads dynamic top-level keys named item/<ColumnName>. A nested item: {...} object is structurally valid JSON, the connector returns 200 OK, the row appears, and every cell is empty. Flatten the body to { "item/Title": "...", "item/PropertyAddress": "..." }. Discoverable in the connector swagger. Diagnosis story in [06].

Why are my archived PDFs only 1 KB and unreadable?

Verify the magic bytes of the destination file, not just the size. A real PDF starts with %PDF (e.g. %PDF-1.4) in raw binary. If you see JVBERi0x, you wrote base64 ASCII text into a .pdf file because the upstream flow passed the attachment as a base64 string. Wrap it with @base64ToBinary('...') in the Power Automate expression. Story in [05].

How can I prevent silent failures in Power Automate flows?

Top-level run status is not enough. The flow can be Succeeded while the connector silently drops your data. Inspect action-level inputs and outputs (e.g. via Flow Studio MCP's get_live_flow_run_action_outputs) to see what each connector actually received and returned. Add a daily 24-hour digest email so an empty digest becomes an early warning when the trigger or an AI dependency breaks overnight.

About Flow Studio MCP: Flow Studio MCP is a Model Context Protocol server that gives AI agents action-level visibility into Power Automate. It works with GitHub Copilot, Claude, Codex, and any MCP-compatible agent. Available on GitHub via the awesome-copilot list.

Catherine Han, Flow Studio