MultiNav: Navigating Multivariate Data

Tags: : General, R, Statistics

In statistical process control (SPC), a.k.a. anomaly-detection, one compares some statistic to its “in control” distribution. Many statistical and BI software suits (e.g. Tableau, PowerBI) can do SPC. Almost all of these, however, focus on univariate processes, which are easier to visualize and discuss than multivariate processes. The purpose of our Multinav package is to ingest multivariate data streams, compute multivariate summaries, and allow us to query the data interactively when an anomaly is detected.

In the next example we look into an agricultural IoT use-case, and try to detect sensors with anomalous readings. The full details are available in the official documentation.

The data consists of hourly measurements, during 5 days, of the girth of 100 avocado trees. For each such tree, an anomaly score is computed using four variation of Hotelling’s T2 statistic. Multinav thus returns the following dashboard:

Crucially for our purposes, this dashboard is interactive. One may hover over a particular score, in order to inspect the raw evolution of the measurement in time, and compare it to the group via a functional boxplot.

There is a lot more that MultiNav can do. The details can be found in the inline-help, and the official documentation. For the purpose of this blog, we merely summarize the reasons we find Multinav very useful:

  1. Interactive visualizations that link anomaly score with raw data.
  2. Functional boxplot for querying the source of the anomalies.
  3. Out-of-the-box multivariate scoring functions, with the possibility of extension by the user.
  4. The interactive component are embeddable in Shiny apps, for a full blown interactive dashboard.
Written on April 28, 2020