Real-Time Analytics
Getting business answers in minutes instead of waiting for weekly reports
280
Stores on live data
< 90 sec
Point-of-sale to dashboard
31%
Reduction in stockout incidents
A national retailer operating 280 stores across Canada and the United States had a data problem that looked like a reporting problem. Weekly sales reports covering the previous week's data were produced every Monday morning. By the time a regional manager received information about an underperforming store or a supply shortage, it was already ten to fourteen days old. The decisions that depended on that information (reordering stock, adjusting staffing, responding to competitor pricing) had been delayed by the same margin.
The outcome
We rebuilt the data pipeline from the point of sale all the way through to role-specific dashboards. The Monday report was replaced with a live view that updates continuously. Decisions that took a week to trigger now happen the same day.
The batch data problem hiding in plain sight
The retailer's data infrastructure had been built incrementally over fifteen years. Point-of-sale systems in each store wrote transactions to a local database throughout the day. Every night at 2:00 a.m., an ETL process collected the previous day's transactions from every store and loaded them into a central data warehouse. A second ETL process ran on Sunday nights to aggregate the week's data and populate the reporting tables that the Monday morning reports were generated from. The architecture had made sense in 2009, when the alternative was more expensive. By the time we began working with this retailer, the batch model was creating business problems that were clearly attributable to data latency: buyers placing reorders based on inventory counts that were forty-eight hours old, regional managers making staffing decisions based on sales patterns that were nine days old, and a marketing team running promotions without any visibility into how individual stores were responding until the following Monday.
Redesigning the pipeline from event to insight
We replaced the nightly ETL batch process with a streaming pipeline. Every transaction at every point-of-sale terminal now produces an event that is published to a message broker in real time. The message broker holds the event for downstream consumers to process. The first consumer updates the inventory management system: when a product is sold, its inventory count in the central system decreases immediately, not at 2:00 a.m. the following morning. The second consumer updates the analytics platform: transaction data is available for querying within ninety seconds of the sale occurring. We chose a streaming architecture rather than a more frequent batch architecture — running ETL every fifteen minutes rather than every night — because streaming and frequent-batch have very different failure characteristics. A streaming pipeline that develops a processing backlog produces data that is slightly delayed but correct. A frequent-batch pipeline that fails halfway through a run produces data that is missing for a period, with no straightforward way to detect the gap from the consumer side. For a retailer where inventory accuracy has direct revenue implications, the failure mode of the streaming architecture was substantially safer.
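The case study does not name the broker or the event schema, but the shape of the first consumer is simple. The sketch below assumes a Kafka-compatible broker and the kafka-python client; the topic name, message fields, and the update_inventory helper are illustrative placeholders, not the retailer's actual implementation.

```python
# Minimal sketch of the inventory consumer, assuming a Kafka-style broker.
# Topic name, field names, and update_inventory are illustrative only.
import json
from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "pos-transactions",                        # hypothetical topic name
    bootstrap_servers=["broker:9092"],
    group_id="inventory-updater",
    enable_auto_commit=False,                  # commit only after the update succeeds
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

def update_inventory(store_id: str, sku: str, quantity: int) -> None:
    """Placeholder for the write to the central inventory system."""
    ...  # e.g. an UPDATE against the inventory database

for message in consumer:
    event = message.value
    # Decrement the central on-hand count as soon as the sale event arrives,
    # instead of waiting for the 2:00 a.m. batch load.
    update_inventory(event["store_id"], event["sku"], event["quantity"])
    consumer.commit()
```

The second consumer follows the same pattern but writes to the analytics platform instead of the inventory system, which is how transaction data becomes queryable within ninety seconds of the sale.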
Role-based dashboards that match how decisions are made
The technology change would only create value if the people making decisions could act on the information. Before building any dashboards, we worked with the retailer's operations, buying, and marketing teams to understand specifically what questions each role needed to answer, how frequently they made decisions that required those answers, and what the consequence of a delayed answer was. Store managers needed to see their current day's sales against target, their current inventory levels for fast-moving products, and a flag when any product category fell below its reorder threshold. Regional managers needed a view across their stores showing which stores were running ahead of or behind target that day, which stores had potential stock issues, and staffing coverage for the current shift. The buying team needed product-level sell-through rates by region so they could respond to demand signals with reorder decisions the same day rather than the following week. We built each of these views as a distinct dashboard with a distinct data model, rather than building a single flexible analytics environment and expecting each team to construct its own views. Flexible self-service analytics is valuable for exploration, but for recurring operational decisions it produces worse outcomes than purpose-built views.
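As a purely illustrative example of what "a distinct dashboard with a distinct data model" means in practice, the store manager's reorder flag can be computed directly from the live inventory feed. The field names and threshold logic below are assumptions for the sketch, not the retailer's schema.

```python
# Illustrative sketch of the store-manager reorder flag. The case study
# describes the behaviour (flag any category that falls below its reorder
# threshold) but not the schema; these names are assumptions.
from dataclasses import dataclass

@dataclass
class LiveInventory:
    sku: str
    category: str
    on_hand: int            # updated within ~90 seconds of each sale
    reorder_threshold: int  # set per product by the buying team

def reorder_flags(rows: list[LiveInventory]) -> dict[str, bool]:
    """Return, per category, whether any product has fallen below its threshold."""
    flags: dict[str, bool] = {}
    for row in rows:
        below = row.on_hand < row.reorder_threshold
        flags[row.category] = flags.get(row.category, False) or below
    return flags
```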
Handling peak load: Black Friday as the real stress test
The streaming pipeline's throughput requirement on a normal day was approximately 400,000 transactions across all 280 stores. On Black Friday, transaction volume across the retailer's highest-traffic stores runs at twelve to fifteen times normal volume during peak hours. We designed the pipeline to scale horizontally to handle peak load — adding processing capacity automatically as transaction volume increases, and releasing it as volume returns to normal. The retailer's previous batch architecture had handled peak periods by accepting that the data would be slower than usual; the ETL process that normally took three hours to run would sometimes take eight hours during high-volume periods, meaning that Monday's report after Black Friday weekend might be based on data that was two days old in some stores. The streaming pipeline's first Black Friday ran without incident. Peak transaction volume reached 6,200 transactions per minute across the network. The pipeline processed the volume with end-to-end latency under 75 seconds throughout — faster than normal day performance, because the horizontal scaling had been over-provisioned to provide a safety margin.
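The scaling rule itself can be stated very simply. The sketch below is a hedged illustration of backlog-based scale-out and scale-in; the thresholds, instance limits, and the choice of backlog as the metric are assumptions, since the case study says only that capacity is added as volume rises and released as it returns to normal.

```python
# Hedged sketch of lag-based horizontal scaling. Thresholds and limits are
# illustrative, not the production values.
SCALE_OUT_BACKLOG = 50_000   # queued events before adding a consumer instance
SCALE_IN_BACKLOG = 5_000     # backlog below which an instance can be released
MIN_INSTANCES, MAX_INSTANCES = 4, 60

def desired_instances(current: int, backlog: int) -> int:
    """Return the target consumer count for the current event backlog."""
    if backlog > SCALE_OUT_BACKLOG:
        return min(current + 1, MAX_INSTANCES)
    if backlog < SCALE_IN_BACKLOG:
        return max(current - 1, MIN_INSTANCES)
    return current
```

Keeping the minimum instance count above what a normal day requires is one way to provide the safety margin described above, and it is consistent with why Black Friday latency stayed below the normal-day figure.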
What changed in the business
The most significant business outcome was not the technology change but the behavioral change it enabled. In the first six months on live data, the buying team made same-day reorder decisions on forty-three occasions where a product category was tracking toward a stockout in a high-demand store. Under the batch model, thirty-seven of those forty-three situations would not have been visible until the following Monday, by which point the stockout would already have occurred. Stockout incidents across the network dropped by 31% in the first year. The marketing team began running store-specific promotions that could be evaluated and adjusted within hours; previously, promotions had been designed for broad regional rollout because the feedback loop was too slow for anything more targeted. Regional managers reported spending significantly less time in their weekly review meetings because the questions those meetings had been designed to answer (how did last week go, which stores had problems) were now answered continuously during the week by the dashboard rather than retrospectively in a meeting.
Facing a similar infrastructure challenge?
We're happy to have a technical conversation about your specific environment — no commitment required.