Prepare for RSAT Deprecation: Shift to AI in D365 Finance

  1. The May 2027 Deadline
  2. Why RSAT Is Reaching Its Limits
  3. The Alternative: AI + MCP for D365 Finance & SCM
    1. What an MCP brings to the table
    2. What changes versus RSAT
  4. Reporting and Traceability: Email + Report Files
  5. Topics Every Organization Must Address Before Switching
    1. Volumes
    2. Performance
    3. AI Costs
    4. Governance
    5. Monitoring
  6. Closing Thought

Microsoft has confirmed — via Viva Engage and direct communications — that the Regression Suite Automation Tool (RSAT) will be deprecated on 15 May 2027. After that date, no updates and no support will be provided.

Notably, Microsoft has not announced a like-for-like replacement. Instead, the message points toward modern testing approaches and broader end-to-end automation. For customers, partners, and ISVs who have built years of investment around Task Recorder and RSAT pipelines, this is more than a tooling swap: it is an invitation to rethink the entire regression testing strategy.

The underlying question is sharp:

Is the ERP industry moving away from scripted testing toward AI-driven testing — tools capable of generating, adapting, and maintaining test scenarios on their own?

The answer, increasingly, is yes. And the emergence of Model Context Protocol (MCP) servers around Dynamics 365 Finance & Operations makes this shift concrete rather than theoretical.

RSAT served us well, but its constraints are well known:

  • Brittle to UI changes: every form update risks breaking the recordings.
  • Heavy maintenance: each release wave (twice a year) typically triggers a cycle of repair work.
  • Linear scripting: recorded steps don’t understand business intent; they replay clicks.
  • Limited reporting: results land in Azure DevOps test plans, but business-readable evidence (test reports, executive summaries) still has to be built manually.
  • No self-healing: when a label changes or a control moves, the script fails rather than adapts.

In a world where D365 Finance & SCM evolves continuously, brittle scripts become a tax on the upgrade cadence rather than a safety net.

The combination of Large Language Models and MCP servers opens a fundamentally different way of running regression test cycles.

An MCP server is a standardized bridge between an AI agent and a business system. For D365 Finance & SCM specifically, an MCP server can expose:

  • Data services: read/write entities (customers, vendors, sales orders, journals, postings).
  • Process orchestration: trigger batch jobs, period-end routines, MRP runs.
  • Metadata introspection: forms, fields, security roles, workflows.
  • Reporting hooks: fetch financial reports, inventory snapshots, audit trails.

An AI agent connected to these MCP servers can take a test described in natural language, execute it via the API surface (not the UI), validate outcomes against expected values, and adapt when the underlying form or label changes.
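
As a rough illustration of that loop, the sketch below pairs a natural-language intent with one tool call and a set of expected values, then reports any mismatch. Everything named here is an assumption made for the example: the `Scenario` structure, the `create_sales_order` tool, its arguments, and the `fake_call_tool` stand-in that replaces a real MCP client. It shows only the "execute and validate" part, not how an agent would derive the call from the intent.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Signature of whatever function forwards a call to an MCP server (assumed shape).
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


@dataclass
class Scenario:
    name: str
    intent: str                 # natural-language description of the test
    tool: str                   # hypothetical MCP tool name
    arguments: dict[str, Any]   # arguments passed to the tool
    expected: dict[str, Any]    # values the result must contain


def fake_call_tool(tool: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for a real MCP client; returns a canned sales-order result."""
    if tool == "create_sales_order":
        total = arguments["quantity"] * arguments["unit_price"]
        return {"status": "Open", "total": total, "currency": "EUR"}
    raise ValueError(f"unknown tool: {tool}")


def run_scenario(scenario: Scenario, call_tool: ToolCaller) -> dict[str, Any]:
    """Execute one scenario and compare actual values with expected ones."""
    actual = call_tool(scenario.tool, scenario.arguments)
    mismatches = {
        key: {"expected": value, "actual": actual.get(key)}
        for key, value in scenario.expected.items()
        if actual.get(key) != value
    }
    return {"scenario": scenario.name, "passed": not mismatches, "mismatches": mismatches}


scenario = Scenario(
    name="SO-001",
    intent="Create a sales order for 3 units at 10.00 and check the total.",
    tool="create_sales_order",
    arguments={"quantity": 3, "unit_price": 10.0},
    expected={"status": "Open", "total": 30.0},
)
result = run_scenario(scenario, fake_call_tool)
```

Because the comparison works on returned values rather than on screen controls, a renamed label or a moved field does not affect it, which is the resilience argument made above.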

Dimension        | RSAT (today)                     | AI + MCP (tomorrow)
Test definition  | Recorded clicks (.axtr)          | Natural-language scenarios
Resilience       | Breaks on UI changes             | Self-adapts via APIs / metadata
Coverage         | One scenario = one recording     | One prompt = N variants generated
Maintenance      | Manual re-recording              | AI proposes the fix
Evidence         | Pass/fail logs                   | Auto-generated reports (Word/Excel/PDF)
Skill profile    | Functional consultant + scripter | Functional consultant + prompt engineer

A credible replacement strategy is not only about running the tests — it must produce evidence that auditors, sponsors, and operations teams accept.

A well-designed AI + MCP pipeline should automatically:

  • Generate an executive email at the end of each cycle: pass/fail counts, top regressions, risk areas, next steps. Sent to project sponsors, key users, and the QA lead.
  • Produce a Word report per functional area (Finance, Procurement, Inventory, Production) with screenshots, expected vs. actual values, and the AI’s commentary on root causes.
  • Produce an Excel test matrix (one row per scenario, with status, duration, environment, build number, and defect link), usable directly for steering committees.
  • Archive everything in SharePoint / Teams with consistent naming, so the audit trail is queryable months later.
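
A minimal sketch of the matrix and email steps, using only the Python standard library, might look like this. It is an illustration under stated assumptions, not a reference pipeline: the column names, the sample rows, and the addresses are invented, and the matrix is written as CSV (which Excel opens) rather than as a native .xlsx file. The message is only built, not sent.

```python
import csv
import io
from email.message import EmailMessage

# Assumed column layout for the test matrix: one row per scenario.
FIELDS = ["scenario", "area", "status", "duration_s", "environment", "build", "defect_link"]


def build_matrix_csv(results: list[dict]) -> str:
    """Render the results as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(results)
    return buffer.getvalue()


def build_summary_email(results: list[dict], sender: str, recipients: list[str]) -> EmailMessage:
    """Build the end-of-cycle email with counts, failed scenarios, and the matrix attached."""
    passed = sum(1 for r in results if r["status"] == "pass")
    failed = [r for r in results if r["status"] != "pass"]
    message = EmailMessage()
    message["Subject"] = f"Regression cycle: {passed} passed, {len(failed)} failed"
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    lines = [f"Scenarios run: {len(results)}", f"Passed: {passed}", f"Failed: {len(failed)}", ""]
    lines += [f"- {r['scenario']} ({r['area']}): {r['defect_link']}" for r in failed]
    message.set_content("\n".join(lines))
    message.add_attachment(build_matrix_csv(results), subtype="csv", filename="test_matrix.csv")
    return message


# Placeholder data for illustration only.
results = [
    {"scenario": "SO-001", "area": "Finance", "status": "pass", "duration_s": 42,
     "environment": "UAT", "build": "build-A", "defect_link": ""},
    {"scenario": "PO-007", "area": "Procurement", "status": "fail", "duration_s": 65,
     "environment": "UAT", "build": "build-A", "defect_link": "DEFECT-123"},
]
email_message = build_summary_email(results, "qa@example.com", ["sponsor@example.com"])
```

The same `results` list can feed the Word report and the SharePoint archive, so every artifact of a cycle derives from one source.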

The AI does not just test — it documents in the language of the business.

Topics Every Organization Must Address Before Switching

Volumes

  • How many scenarios run per cycle today (smoke, regression, full pack)?
  • How many environments (DEV, UAT, GOLD, PROD-copy)?
  • What is the expected execution window: overnight, weekend, or on-demand?
  • What is the data volume profile of the test datasets: small synthetic sets or a full PROD copy? AI agents reasoning over large transaction tables behave differently from those handling sample data.

Performance

  • Target duration per scenario and per full pack.
  • Parallelism: can scenarios run concurrently against isolated environments?
  • API throughput limits of D365 Finance & SCM (OData, custom services, DMF): AI agents can saturate them faster than human testers.
  • Latency budget for the AI inference itself, especially when chaining tool calls.
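
One common way to keep an agent from saturating an API is a client-side token bucket, sketched below. The rate and capacity values are placeholders, not actual D365 limits, and the class name is invented for this example. The clock is injectable so the behavior can be checked without waiting.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow at most `capacity` calls in a burst, refilled at `rate_per_s` calls per second."""

    def __init__(self, rate_per_s: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate_per_s
        self.capacity = capacity
        self.tokens = float(capacity)   # start with a full bucket
        self.clock = clock
        self.last = clock()

    def try_acquire(self) -> bool:
        """Return True if a call may proceed now, False if the caller should wait."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Placeholder limit: 2 calls per second with a burst of 2.
bucket = TokenBucket(rate_per_s=2, capacity=2)
```

An agent runner would call `try_acquire()` before each tool call and back off when it returns False, which keeps parallel scenarios within a shared budget.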

AI Costs

  • Token consumption per scenario: a complex end-to-end flow (quote → order → pick → ship → invoice → payment) can consume tens of thousands of tokens.
  • Model selection: premium models for orchestration, smaller models for routine validation. The routing strategy drives the cost curve.
  • Caching: prompt caching for repeated scenario templates, embeddings for scenario libraries.
  • Cost per cycle vs. cost per defect found: the real ROI metric, not raw token spend.
  • Budgeting model: chargeback by functional stream, monthly cap, alerting at 70/90/100% of that cap.
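
The arithmetic behind these cost points is simple enough to sketch. The prices below are placeholders rather than any vendor's actual rates, and the function names are invented for this example. It computes the cost of one scenario from its token counts, the cost per defect found, and which of the 70/90/100% thresholds a given spend has crossed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    input_per_mtok: float    # price per million input tokens (placeholder value)
    output_per_mtok: float   # price per million output tokens (placeholder value)


def scenario_cost(input_tokens: int, output_tokens: int, price: ModelPrice) -> float:
    """Cost of one scenario run, in the same currency as the price."""
    return (input_tokens * price.input_per_mtok
            + output_tokens * price.output_per_mtok) / 1_000_000


def cost_per_defect(cycle_cost: float, defects_found: int) -> float:
    """Cycle cost divided by defects found; infinite if the cycle found none."""
    return cycle_cost / defects_found if defects_found else float("inf")


def budget_alerts(spend: float, monthly_cap: float,
                  thresholds: tuple[int, ...] = (70, 90, 100)) -> list[int]:
    """Return every percentage threshold that the current spend has reached."""
    percent_used = 100 * spend / monthly_cap
    return [t for t in thresholds if percent_used >= t]


premium = ModelPrice(input_per_mtok=3.0, output_per_mtok=15.0)
cost = scenario_cost(input_tokens=40_000, output_tokens=5_000, price=premium)
alerts = budget_alerts(spend=450.0, monthly_cap=500.0)
```

With these placeholder prices, a scenario using 40,000 input tokens and 5,000 output tokens costs 0.195 currency units, and a spend of 450 against a cap of 500 triggers the 70% and 90% alerts.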

Governance

  • Who writes the prompts? Functional consultants, QA leads, both? A prompt library with peer review must exist.
  • Data sensitivity: production-like data flowing through external AI services requires DPA review and field-level masking.
  • Determinism vs. creativity: regression testing calls for determinism. Temperature settings, seed control, and version pinning of models are part of governance.
  • Change management: when the AI proposes a fix to a broken scenario, who approves it before it enters the next baseline?
  • Auditability: every AI decision (which scenario ran, which tools it called, what it concluded) must be logged and replayable.
  • Segregation of duties: the AI account in D365 Finance & SCM must follow least privilege, with a “TestAutomation” role narrower than SysAdmin.
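
One possible way to make the audit log tamper-evident is to chain records by hash, as sketched below. This is a design choice made for the example, not something the tooling prescribes, and the record fields (pinned model version, temperature, tools called, conclusion) are assumed names.

```python
import hashlib
import json

GENESIS = "0" * 64  # previous-hash value used for the first entry


def _entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the record."""
    payload = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_audit_record(log: list[dict], record: dict) -> dict:
    """Append a record to the log, linking it to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"record": record, "prev_hash": prev_hash, "hash": _entry_hash(prev_hash, record)}
    log.append(entry)
    return entry


def verify_chain(log: list[dict]) -> bool:
    """Return True only if no entry was altered, removed, or reordered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash or entry["hash"] != _entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log: list[dict] = []
append_audit_record(audit_log, {
    "scenario": "SO-001",
    "model": "pinned-model-version",   # placeholder for a pinned model identifier
    "temperature": 0,
    "tools_called": ["create_sales_order"],
    "conclusion": "pass",
})
```

Recording the model version and temperature alongside each decision also supports the determinism point above: a replay can use the same settings as the original run.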

Monitoring

  • Operational KPIs: success rate, mean time to detect a regression, mean time to repair a scenario.
  • AI-specific KPIs: tool-call failure rate, retry rate, hallucination rate (assertions made without an MCP call backing them).
  • Cost KPIs: tokens per scenario, cost per cycle, cost per environment.
  • Drift detection: when scenarios start failing in patterns, surface it before the next release wave.
  • Dashboards: Power BI on top of the Excel/SharePoint outputs, refreshed daily and visible to the steering committee, not buried in DevOps.
  • Alerting: Teams / email notifications on red builds, with the AI’s root-cause hypothesis attached.
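
The rate-style KPIs listed above reduce to simple ratios once each run records a few counters. The sketch below assumes a hypothetical `RunStats` record per scenario run; the field names are invented, and the hallucination rate follows the definition given above (assertions made without an MCP call backing them, divided by all assertions).

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    scenario: str
    passed: bool
    tool_calls: int
    tool_call_failures: int
    retries: int
    assertions: int
    unbacked_assertions: int   # assertions made without an MCP call backing them


def _ratio(numerator: int, denominator: int) -> float:
    """Safe division: an empty denominator yields 0.0 instead of an error."""
    return numerator / denominator if denominator else 0.0


def cycle_kpis(runs: list[RunStats]) -> dict[str, float]:
    """Aggregate per-run counters into cycle-level rates."""
    total_calls = sum(r.tool_calls for r in runs)
    total_assertions = sum(r.assertions for r in runs)
    return {
        "success_rate": _ratio(sum(1 for r in runs if r.passed), len(runs)),
        "tool_call_failure_rate": _ratio(sum(r.tool_call_failures for r in runs), total_calls),
        "retry_rate": _ratio(sum(r.retries for r in runs), total_calls),
        "hallucination_rate": _ratio(sum(r.unbacked_assertions for r in runs), total_assertions),
    }


# Placeholder counters for two runs.
runs = [
    RunStats("SO-001", True, tool_calls=10, tool_call_failures=1, retries=2,
             assertions=8, unbacked_assertions=0),
    RunStats("PO-007", False, tool_calls=10, tool_call_failures=1, retries=0,
             assertions=2, unbacked_assertions=1),
]
kpis = cycle_kpis(runs)
```

Storing these values per cycle gives the dashboard a time series, which is also what drift detection needs.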

May 2027 is not just a deprecation date — it is the moment ERP testing leaves the era of recorded clicks and enters the era of reasoning agents. The organizations that prepare now will not only replace RSAT; they will gain a testing capability that is continuously aligned with business intent, self-documenting, and auditable by design.

The technical question — what tool replaces RSAT? — matters less than the organizational question:

Are we ready to manage testing as an AI-augmented practice, with the volumes, performance, costs, governance, and monitoring discipline that this implies?

That is the conversation to start today.

