PingThings was a startup looking to build a real-time platform that applies machine learning to physical systems on the electric utility grid and to high-value industrial assets such as GSU (generator step-up) transformers and step-down transformers. They wanted an analytics platform to track sensor data, with a focus on storing and manipulating time-series data and modeling complex relationships between synchrophasors' high-resolution signals.
Blue Orange helped build the first production prototype of PingThings’ PredictiveGrid. The PredictiveGrid is an Advanced Sensor Analytics Platform (ASAP) architected to ingest, store, access, visualize, and analyze sensor data that measures the grid with nanosecond temporal resolution, and to train machine learning and deep learning algorithms on that data. The data throughput was very high and required a novel approach to implementing data models at scale.
To meet these requirements, we implemented a framework that enables the rapid development of scalable analytics pipelines with strict guarantees on result integrity despite asynchronous data changes. This framework consisted of two separate components:
At the heart of each function is a smaller kernel that contains two functions: (1) the precompute function, which lets the user specify the data needed, and (2) the compute function, which operates on that data and returns the computed values and their associated time ranges. Each service can emit one or more new time series that are fed back into the Berkeley Tree Database.
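To make the precompute/compute split concrete, here is a minimal Python sketch. It is not the PredictiveGrid API: the names `precompute`, `compute`, `run_kernel`, the `WINDOW_NS` constant, and the in-memory `store` dictionary are all assumptions made for illustration. The kernel computes a trailing mean, so precompute widens the changed time range to include the lookback the calculation needs.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, float]      # (timestamp_ns, value)
TimeRange = Tuple[int, int]    # half-open range [start_ns, end_ns)

WINDOW_NS = 3  # hypothetical trailing-window length, in nanoseconds


def precompute(changed: TimeRange) -> TimeRange:
    """Declare which input data compute() needs for a changed range."""
    start, end = changed
    # The trailing mean at `start` looks back WINDOW_NS - 1 nanoseconds.
    return (start - (WINDOW_NS - 1), end)


def compute(points: List[Point], changed: TimeRange) -> Tuple[List[Point], TimeRange]:
    """Return the trailing mean for each point in `changed`, plus the range covered."""
    start, end = changed
    out: List[Point] = []
    for t, _ in points:
        if start <= t < end:
            window = [v for (u, v) in points if t - WINDOW_NS < u <= t]
            out.append((t, sum(window) / len(window)))
    return out, changed


def run_kernel(store: Dict[int, float], changed: TimeRange) -> Tuple[List[Point], TimeRange]:
    """Fetch what precompute() asks for, then hand it to compute()."""
    need_start, need_end = precompute(changed)
    inputs = [(t, v) for t, v in sorted(store.items()) if need_start <= t < need_end]
    return compute(inputs, changed)
```

In a real deployment the returned values and time range would be written back as a new time series; here they are simply returned to the caller.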
This architecture focuses on the efficient and reliable calculation and storage of all models in advance of queries, rather than just-in-time materialization. The advantage is that many months or years of analytical results can be queried in milliseconds.
Moreover, everything is versioned: the data, the distillers, and the intermediate streams. As a change occurs, the framework determines what needs to be recomputed to produce consistent results with precise provenance and schedules the processing required to propagate the change through associated streams. Additional details are explained here: https://www.pingthings.io/platform.html
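The change propagation described above can be sketched as a walk over a dependency graph of streams. This is only an illustration under stated assumptions: the stream names, the `DOWNSTREAM` graph, and the `affected_streams` and `propagate` helpers are hypothetical and not taken from the PingThings platform. The sketch finds every stream derived from a changed stream, orders them so each is recomputed after its inputs, and bumps a version counter for each.

```python
from collections import deque
from typing import Dict, List

# Hypothetical graph: each stream maps to the streams derived from it.
DOWNSTREAM: Dict[str, List[str]] = {
    "raw_voltage": ["cleaned_voltage"],
    "cleaned_voltage": ["frequency", "rolling_mean"],
    "frequency": ["anomaly_score"],
    "rolling_mean": ["anomaly_score"],
    "anomaly_score": [],
}


def affected_streams(graph: Dict[str, List[str]], changed: str) -> List[str]:
    """List streams downstream of `changed`, each after all of its affected inputs."""
    # Step 1: collect everything reachable from the changed stream.
    reachable = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in reachable:
                reachable.add(child)
                queue.append(child)

    # Step 2: Kahn's algorithm restricted to the reachable streams.
    indegree = {s: 0 for s in reachable}
    for s in reachable:
        for child in graph.get(s, []):
            indegree[child] += 1
    ready = deque(sorted(s for s, d in indegree.items() if d == 0))
    order: List[str] = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in graph.get(node, []):
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    return order


def propagate(graph: Dict[str, List[str]], versions: Dict[str, int], changed: str) -> List[str]:
    """Bump the version of the changed stream and of every stream that depends on it."""
    versions[changed] = versions.get(changed, 0) + 1
    order = affected_streams(graph, changed)
    for stream in order:
        # A real system would schedule the recomputation here.
        versions[stream] = versions.get(stream, 0) + 1
    return order
```

Recomputing in dependency order is what keeps results consistent: a stream with two inputs is only recomputed once both inputs reflect the change.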
Initial predictive problems addressed:
Schedule 15 minutes with Blue Orange Digital to save money on digital transformation or explore how your data could be making you money.
Josh Miramant is the CEO and founder of Blue Orange Digital, a data science and machine learning agency with offices in New York City and Washington DC.
Miramant is a popular speaker, futurist, and strategic business and technology advisor to enterprise companies and startups. He has been featured by IBM ThinkLeaders, Dell Technologies, Global Banking & Finance Review, and the IoT Council of Europe, among others. He can be reached at email@example.com.
Blue Orange Digital is recognized as a “Top AI Development and Consultant Agency” by Clutch and Yahoo Finance for innovations in predictive analytics, automation, and optimization with machine learning in NYC.
They help organizations optimize and automate their businesses, implement data-driven analytic techniques, and understand the implications of new technologies such as artificial intelligence, big data, and the Internet of Things.