MetricSight
Data
Analytics
Project Details
Owner
TIMORIA
Release Date
Mar 20, 2023
Services
Data Architecture, Time Series Analysis
Duration
60 Days
Budget
$9,500
Overview
MetricFlow is a real-time analytics platform developed as a proprietary solution for a large industrial client. It processes and visualizes millions of internal time-series data points per second from IoT sensors and infrastructure telemetry. The client queries and explores this data through the platform's query language and visualization capabilities, and the system scales horizontally within their private infrastructure to handle growing data volumes.
Objective
The client commissioned a specialized time-series database and analytics engine capable of ingesting millions of internal data points per second and returning sub-second query results across billions of historical records. Support for complex aggregations, anomaly detection, and forecasting was essential. The solution also needed to be resource-efficient, operate entirely within the client's private network, and offer flexible data retention policies.
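To make the anomaly detection requirement concrete, here is a minimal sketch of one common approach, a rolling z-score over a trailing window. It is illustrative only: the function name `rolling_zscore_anomalies`, the window size, and the threshold are assumptions, and the sketch is written in Python for brevity rather than in the Go and Rust used for the platform itself. It does not describe the detection method actually deployed.

```python
from collections import deque
from math import sqrt
from typing import Deque, List, Sequence, Tuple


def rolling_zscore_anomalies(
    values: Sequence[float],
    window: int = 5,        # hypothetical trailing window size
    threshold: float = 3.0  # hypothetical cutoff in standard deviations
) -> List[Tuple[int, float]]:
    """Return (index, value) pairs that deviate from the trailing-window
    mean by more than `threshold` population standard deviations."""
    history: Deque[float] = deque(maxlen=window)
    anomalies: List[Tuple[int, float]] = []
    for i, v in enumerate(values):
        # Only score a point once a full window of history is available.
        if len(history) == window:
            mean = sum(history) / window
            variance = sum((x - mean) ** 2 for x in history) / window
            std = sqrt(variance)
            # A perfectly flat window has std == 0; skip it to avoid
            # flagging every tiny change as an anomaly.
            if std > 0 and abs(v - mean) > threshold * std:
                anomalies.append((i, v))
        # The point joins the history whether or not it was flagged.
        history.append(v)
    return anomalies


readings = [10.0, 10.5, 9.5, 10.2, 9.8, 10.1, 25.0, 10.0]
print(rolling_zscore_anomalies(readings))
```

Because flagged points still enter the window, a single spike temporarily widens the standard deviation for the points that follow. A production detector would likely need a more robust baseline, such as a median-based one.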
Process
We implemented MetricFlow using Go for the ingestion pipeline and Rust for the query engine, deployed within the client's environment. TimescaleDB provided the storage foundation, integrated with their existing PostgreSQL clusters. The real-time processing pipeline used Kafka for internal data streams. OpenTelemetry and Grafana were integrated into their existing monitoring stack. A significant technical challenge was optimizing the compression algorithm to minimize storage footprint while ensuring high query performance on their private infrastructure.
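The compression work is not described in detail here, but the storage-versus-query-speed trade-off can be illustrated with delta-of-delta encoding, a technique widely used for time-series timestamps. The sketch below is an assumption-laden example, not the algorithm used in MetricFlow. The function names are hypothetical, and it is written in Python for brevity. Regularly spaced timestamps collapse into runs of zeros, which a later bit-packing or run-length stage can store very compactly.

```python
from typing import List


def delta_of_delta_encode(timestamps: List[int]) -> List[int]:
    """Encode timestamps as [first value, first delta, delta-of-deltas...]."""
    if not timestamps:
        return []
    encoded = [timestamps[0]]
    prev_delta = 0
    for prev, cur in zip(timestamps, timestamps[1:]):
        delta = cur - prev
        # Store how much the gap changed, not the gap itself.
        encoded.append(delta - prev_delta)
        prev_delta = delta
    return encoded


def delta_of_delta_decode(encoded: List[int]) -> List[int]:
    """Invert delta_of_delta_encode."""
    if not encoded:
        return []
    timestamps = [encoded[0]]
    delta = 0
    for dod in encoded[1:]:
        delta += dod  # rebuild the running gap
        timestamps.append(timestamps[-1] + delta)
    return timestamps


samples = [1000, 1010, 1020, 1030, 1041]
packed = delta_of_delta_encode(samples)
print(packed)                                    # [1000, 10, 0, 0, 1]
print(delta_of_delta_decode(packed) == samples)  # True
```

Decoding is sequential, so a query engine typically compresses data in bounded chunks. That way a range query only has to decode the chunks it touches, which is one way to keep the storage footprint small without giving up read performance.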
Impact
MetricFlow now serves as the central internal monitoring solution for the client's data-intensive operations, including tracking over 50,000 proprietary IoT devices. It processes over 5 million internal metrics per second with query latencies under 500 ms. Its anomaly detection has helped the client identify and resolve operational issues before they escalate, improving system reliability and reducing costs. The platform remains an internal tool and is not publicly available.