eCommerce Revenue Observability Systems

How diagnostic engineering transformed ad spend intelligence—detecting ROAS drops in hours, not weeks.

Published Jan 02, 2026 · 5 min read

When return on ad spend (ROAS) drops 60% overnight, eCommerce teams scramble. Marketing blames the landing page. Engineering blames the creatives. Analytics pulls reports that arrive days later. At a leading D2C brand technology company, we built agentic AI systems that answer the question in hours: "Was it the creative or the landing page?"

The Problem: Attribution Chaos

Modern eCommerce operates in a fog. After iOS 14.5 introduced App Tracking Transparency, 40-60% of conversions went untracked. Platforms also over-claim credit: Meta claims 80% of conversions while Google claims 75% of the same ones. Manual diagnosis takes days to weeks, so teams optimize on incomplete data and burn budget on blind ad spend.
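
The overlap implied by those platform-reported shares can be made concrete with inclusion-exclusion. This is a minimal sketch that assumes both percentages are shares of the same total set of conversions; the function name is illustrative and not part of the system described here.

```python
def min_double_claimed_share(share_a: float, share_b: float) -> float:
    """Lower bound on the share of conversions claimed by both platforms.

    Inclusion-exclusion: |A and B| >= |A| + |B| - |total|.
    Assumes both shares are fractions of the same conversion total.
    """
    return max(0.0, share_a + share_b - 1.0)


# Figures from the text: Meta claims 80%, Google claims 75%.
overlap = min_double_claimed_share(0.80, 0.75)
print(f"At least {overlap:.0%} of conversions are claimed by both platforms")
```

Under that assumption, more than half of all conversions are credited twice, which is why platform dashboards alone cannot settle a budget question.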

A single broken pixel or slow-loading widget can cost $180K+ per month. The challenge: detect problems in hours, not days or weeks.

The question isn't "what happened?" It's "why did it happen, and was it the creative or the landing page?"

The Agentic Approach

We deployed autonomous AI agents using the ReAct pattern (Reason + Act) to continuously monitor revenue health and diagnose issues. The system works by monitoring metrics, detecting anomalies, reasoning about what changed, pulling creative and landing page data, observing patterns, and alerting with a diagnosis.
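
The loop described above can be sketched as a bounded Reason, Act, Observe cycle. This is a simplified illustration, not the production agent: a rule-based policy stands in for the LLM reasoning step, and the tool names (`creative_stats`, `landing_page_stats`) and the `degraded` flag are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Observation = Dict[str, bool]
Trace = List[Tuple[str, Observation]]


def rule_based_policy(trace: Trace) -> Tuple[Optional[str], str]:
    """Stand-in for the LLM reasoning step (an assumption for this sketch).

    Returns (next_tool, thought) while more evidence is needed,
    or (None, diagnosis) once a conclusion can be drawn.
    """
    seen = dict(trace)
    if "creative_stats" not in seen:
        return "creative_stats", "Did a new creative start underperforming?"
    if "landing_page_stats" not in seen:
        return "landing_page_stats", "Did the landing page start converting worse?"
    creative_bad = seen["creative_stats"]["degraded"]
    page_bad = seen["landing_page_stats"]["degraded"]
    if creative_bad and page_bad:
        return None, "both"
    if creative_bad:
        return None, "creative"
    if page_bad:
        return None, "landing_page"
    return None, "inconclusive"


def react_diagnose(
    tools: Dict[str, Callable[[], Observation]],
    policy: Callable[[Trace], Tuple[Optional[str], str]] = rule_based_policy,
    max_steps: int = 5,
) -> str:
    """Run a bounded Reason -> Act -> Observe loop and return a diagnosis."""
    trace: Trace = []
    for _ in range(max_steps):
        next_tool, message = policy(trace)      # Reason
        if next_tool is None:
            return message                      # Final answer
        observation = tools[next_tool]()        # Act
        trace.append((next_tool, observation))  # Observe
    return "inconclusive"                       # Step budget exhausted


# Hypothetical tools; real ones would call ad-platform and storefront APIs.
tools = {
    "creative_stats": lambda: {"degraded": True},
    "landing_page_stats": lambda: {"degraded": False},
}
diagnosis = react_diagnose(tools)
print(diagnosis)
```

The `max_steps` bound is a design choice worth keeping in any variant: it guarantees the agent returns an answer (even "inconclusive") rather than looping on tool calls.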

Key Capabilities:

  • First-Party Attribution — Bypass platform fog with server-side event capture
  • Creative-Level Tracking — Performance by ad creative, not just campaign aggregate
  • Landing Page Health Scoring — Widget load times, CLS, button accessibility
  • Natural Language Queries — "Why did ROAS drop yesterday?" via Model Context Protocol
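
As an illustration of landing page health scoring, the sketch below combines the three signals named in the list into one 0-100 score. The CLS cut-offs (0.1 good, 0.25 poor) follow the published Web Vitals guidance and 48 CSS px is a commonly recommended touch target size; the load-time budget, the weights, and all names are assumptions made for this example, not the scoring used in the system described here.

```python
from dataclasses import dataclass


@dataclass
class PageSample:
    widget_load_s: float      # slowest widget load time, in seconds
    cls: float                # Cumulative Layout Shift (unitless)
    min_touch_target_px: int  # smallest side of the primary button, in CSS px


def falloff(value: float, good: float, bad: float) -> float:
    """1.0 at or below `good`, 0.0 at or above `bad`, linear in between."""
    if value <= good:
        return 1.0
    if value >= bad:
        return 0.0
    return (bad - value) / (bad - good)


def health_score(sample: PageSample) -> float:
    """Return a 0-100 health score (weights are illustrative assumptions)."""
    load = falloff(sample.widget_load_s, good=1.0, bad=3.0)  # assumed budget
    layout = falloff(sample.cls, good=0.1, bad=0.25)         # Web Vitals cut-offs
    touch = min(1.0, sample.min_touch_target_px / 48)        # 48 px guideline
    return round(100 * (0.4 * load + 0.3 * layout + 0.3 * touch), 1)


healthy = health_score(PageSample(widget_load_s=0.8, cls=0.05, min_touch_target_px=48))
degraded = health_score(PageSample(widget_load_s=2.2, cls=0.05, min_touch_target_px=32))
print(healthy, degraded)
```

A single score like this is mainly useful as a trend line per landing page; the component values are what point to the actual fix.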

Model Context Protocol (MCP) in Action

The agents were built to answer questions in plain English, with no dashboards and no waiting for analysts. When a user asked "Why did my ROAS drop yesterday?", the agent pulled 48 hours of performance data across Meta, Google, and TikTok. It identified a new creative set with a 0.2% click-through rate (CTR) versus the 2.1% average, and it verified that landing page conversion rate (CVR) was actually up 15%. Diagnosis: a creative issue, not a landing page issue. Once the creatives were reverted, ROAS recovered within 2 hours.
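
The reasoning in this example reduces to two comparisons, sketched below with the figures from the text. The function name and both thresholds (a CTR below half of baseline, a CVR drop of more than 10%) are assumptions chosen for illustration; the source does not state the cut-offs the agent actually used.

```python
def diagnose(
    creative_ctr: float,
    baseline_ctr: float,
    landing_cvr_change: float,
    ctr_floor_ratio: float = 0.5,   # assumed: flag CTR below 50% of baseline
    cvr_drop_limit: float = -0.10,  # assumed: flag CVR drops beyond 10%
) -> str:
    """Classify a ROAS drop as a creative issue, a landing page issue, or both."""
    creative_suspect = creative_ctr < ctr_floor_ratio * baseline_ctr
    page_suspect = landing_cvr_change < cvr_drop_limit
    if creative_suspect and page_suspect:
        return "creative and landing page"
    if creative_suspect:
        return "creative"
    if page_suspect:
        return "landing page"
    return "inconclusive"


# Figures from the text: 0.2% CTR vs. 2.1% average, landing page CVR up 15%.
result = diagnose(creative_ctr=0.002, baseline_ctr=0.021, landing_cvr_change=0.15)
print(result)
```

With these inputs the new creative sits at roughly a tenth of the baseline CTR while the landing page is converting better than before, so only the creative is flagged.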

The Modern Data Stack Powering It

Agentic systems need fast, reliable data. We chose tools designed for speed and developer experience over legacy ETL complexity:

  • Rill — BI-as-code with metrics-first analytics. Sub-second queries without expensive warehouse overhead. Agents need fast answers.
  • Bauplan — Git-for-Data lakehouse. Branch data like code, rollback bad pipelines instantly. Built for agents and automation.

Rill replaces dashboard sprawl with a metrics-first approach. Define metrics once in code, version them in Git, and agents query them directly. No more "which dashboard has the right number?" chaos.
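
The "define once, query everywhere" idea can be illustrated in plain Python. This is an analogy only: Rill defines metrics in its own project files, and nothing below is Rill's API. The registry, decorator, and metric names are hypothetical.

```python
from typing import Callable, Dict, Mapping

Row = Mapping[str, float]
METRICS: Dict[str, Callable[[Row], float]] = {}


def metric(name: str) -> Callable[[Callable[[Row], float]], Callable[[Row], float]]:
    """Register a metric under a unique name so there is one definition to trust."""
    def register(fn: Callable[[Row], float]) -> Callable[[Row], float]:
        if name in METRICS:
            raise ValueError(f"metric {name!r} is already defined")
        METRICS[name] = fn
        return fn
    return register


@metric("roas")
def roas(row: Row) -> float:
    # Return on ad spend: attributed revenue divided by ad spend.
    return row["revenue"] / row["spend"] if row["spend"] else 0.0


@metric("ctr")
def ctr(row: Row) -> float:
    # Click-through rate: clicks divided by impressions.
    return row["clicks"] / row["impressions"] if row["impressions"] else 0.0


def query(name: str, row: Row) -> float:
    """Single entry point that both dashboards and agents would call."""
    return METRICS[name](row)


value = query("roas", {"revenue": 4200.0, "spend": 1000.0})
print(value)
```

Because the registry rejects duplicate names, two teams cannot quietly ship competing definitions of the same metric.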

Bauplan treats data pipelines like software. Data can be branched instantly to experiment without breaking production; there is no infrastructure to manage (write Python, Bauplan handles scaling); and runs are deterministic, with every pipeline run tied to a commit and fully traceable.

The combination of MCP agents + Rill + Bauplan creates a system where non-technical users ask questions, agents reason through the data, and pipelines update automatically—all version-controlled like code.

Real Impact

Our system delivered measurable results:

  • Issue Detection Time: From 1-2 weeks down to under 2 hours
  • Monthly Wasted Ad Spend: From $180K to near zero
  • ROAS Improvement: +40% over baseline
  • Mobile Conversion: 2.8x lift from a simple touch target fix
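
For scale, the detection-time figures in the list above can be converted into a relative reduction. The sketch assumes the worst case after the change (exactly 2 hours); the helper name is illustrative.

```python
HOURS_PER_WEEK = 7 * 24


def relative_reduction(before_hours: float, after_hours: float) -> float:
    """Fraction of the original detection time that was eliminated."""
    return 1 - after_hours / before_hours


for weeks in (1, 2):
    cut = relative_reduction(weeks * HOURS_PER_WEEK, after_hours=2)
    print(f"{weeks} week(s) -> 2 hours: {cut:.1%} reduction")
# 1 week(s) -> 2 hours: 98.8% reduction
# 2 week(s) -> 2 hours: 99.4% reduction
```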

One client recovered $2.1M in lost revenue by fixing a Shopify Plus analytics script that added 1.2s to page loads during Black Friday/Cyber Monday (BFCM).

Architectural Lessons

Building production agentic systems taught us several key lessons. First, tools are everything—the agent's diagnostic power came from well-designed integrations with ad platforms, Shopify, and analytics APIs. Second, memory matters—agents that remember baseline performance can detect anomalies faster than threshold-based alerts. Third, guardrails prevent chaos—autonomous diagnosis is powerful, but autonomous action requires human-in-the-loop for high-stakes budget shifts. Finally, MCP enables accessibility—non-technical marketers could finally ask "why" directly, without going through a data team ticket queue.
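
The "memory matters" lesson can be illustrated with a baseline-aware check. This is a minimal sketch using a z-score against recent history, which is one common technique and not necessarily the one used in the system described here; the sample ROAS values, the cut-off of 3 standard deviations, and the static threshold of 3.0 are all assumptions.

```python
from statistics import mean, stdev
from typing import Sequence


def is_anomalous(history: Sequence[float], today: float, z_cutoff: float = 3.0) -> bool:
    """Flag `today` if it deviates from the remembered baseline by more than z_cutoff sigmas."""
    if len(history) < 2:
        raise ValueError("need at least two historical points for a baseline")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > z_cutoff


# Hypothetical daily ROAS for the past week, then a sudden dip.
past_week = [4.1, 3.9, 4.0, 4.2, 3.8, 4.0, 4.1]
today = 3.2

static_alert = today < 3.0                       # fixed threshold: stays silent
baseline_alert = is_anomalous(past_week, today)  # baseline-aware: fires
print(static_alert, baseline_alert)
```

Here the dip to 3.2 is about six standard deviations below the weekly baseline, so the baseline-aware check fires while the fixed threshold would wait for a much deeper drop.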

The Bottom Line

eCommerce observability isn't about more dashboards. It's about agents that reason through problems like your best analyst would, except they work 24/7 and respond in seconds. The shift from reactive reporting to agentic diagnosis cut detection time by more than 95% and fundamentally changed how D2C brands protect their ad spend.