Networked Interpretive Sensors And The IoT Paradigm

Sensors Insights by Tom Freund

Intro

Sensors have always been the key tool for extracting raw, measurable data from a process or environment. Now, sensors have also become a cornerstone of the Internet of Things (IoT). When creating a network of sensors, the current standard approach for exploiting sensor data is Cloud + Big Data. However, this approach introduces overhead that can affect network performance when implementing an IoT system, particularly when constrained to operating in a legacy network environment.

Because of the improved price/performance and added capabilities of current and future microcontrollers, a tightly coupled microcontroller-sensor configuration can be quite compact yet capable of running deeper algorithms that not only extract data but interpret it as well. As a result, the network overhead of raw data traffic is replaced with a smaller volume of discrete messages highlighting key patterns. The Big Data side can then optionally be used to integrate and more deeply analyze this stream of patterns from the various sensors in the network. This is the approach of networked interpretive sensors.

The IoT Paradigm Today

The general direction of a typical IoT implementation is a star topology network of sensors (or actuators) that feed volumes of signal data to, or receive commands from, a cloud (see figure 1). The cloud, in turn, digests and analyzes the data using statistical and machine learning algorithms. The results can be made available through apps or dashboards on web browsers.

Fig. 1: Conventional IoT configuration

But taking a more holistic look at the message volume created by this configuration reveals some issues. As background, current global Internet usage can be summarized as follows:

  1. Total current Internet message traffic volume is at the level of 1 exabyte (1,000,000,000,000,000,000 bytes) per day.
  2. In this traffic, Facebook exchanges alone represent 500 terabytes (500,000,000,000,000 bytes) per day.
  3. By contrast, about 6,000 Twitter messages, or at most 840,000 bytes, are generated per second, which amounts to 73 gigabytes (73,000,000,000 bytes) per day.

By the year 2025, Internet-accessible devices are projected to grow to 200 billion globally. Messages from such devices range in size, but taking an average of 40 bytes per message at an average rate of 100 messages per device per second, total global message traffic adds up to 800 terabytes per second. This equates to:

  • 40 times the size of the Library of Congress, every second,
  • accumulated over a day, roughly 70 times the current total daily message traffic on the Internet.
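
For concreteness, the arithmetic behind these figures can be checked with a few lines of C (a sketch using only the averages assumed above):

    /* Back-of-the-envelope check of the projected 2025 device traffic,
       using the averages assumed above. */
    #include <stdio.h>

    int main(void) {
        const double devices       = 200e9;   /* projected devices by 2025 */
        const double bytes_per_msg = 40.0;    /* average message size */
        const double msgs_per_sec  = 100.0;   /* per device */

        double per_second = devices * bytes_per_msg * msgs_per_sec;
        double per_day    = per_second * 86400.0;

        printf("per second: %.0f terabytes\n", per_second / 1e12);  /* 800 */
        printf("per day:    %.0f exabytes\n",  per_day / 1e18);     /* ~69 */
        return 0;
    }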

All of this is on top of message exchanges between people, which are also growing rapidly. The impact on our communication infrastructure is, to say the least, staggering. The criticality of many of these IoT devices will demand a much higher level of network reliability and quality of service (QoS), and the need for rapid response and minimal message latency will place significant demands on network operational speed.

These demands on network infrastructure require an approach that rethinks how data is handled at the very edge of the IoT in order to reduce the operational load on the Internet to a more manageable level.


Edge Processing

In the literature about IoT, much has been written about edge processing. What exactly is it? Edge processing refers to components in a network of sensors that collect and summarize recorded data locally in order to streamline the operation of data analytics in the cloud.

Another way of putting it is that "the edge of IoT will often not be an OS-based device, but the data sourced from those points is critical to the larger context of your architecture." That is why edge computing is a critical element for the advancement of the IoT. In particular:

  1. It serves as a "muffler" for data "exhaust". Much of the data extracted, up to 99% in some cases, is "exhaust": messages that ultimately have no real value and can be ignored.

  2. Time can matter. Being effective at leveraging historical data means achieving the best possible message response; this is particularly true when reaction time is critical and minimizing latency is extremely important.

  3. Cost. Handling and transmitting data "exhaust" across a network adds to operational costs. The less "exhaust" there is, the more cost-effective a sensor network becomes.

Interpretive Processing

Lost in the discussion of edge computing is one point that, oddly enough, is never explicitly mentioned: to achieve all of the above advantages, an edge computing device must be able to serve as an interpreter of the sensor data stream passing through it.

To be an effective interpreter, it must:

  1. Filter the collected data for significant values.
  2. Integrate the pieces of data that pass the filter.
  3. Categorize the resulting collection based on past history or pre-defined rules.

This filter-integrate-categorize approach compresses a large data stream into succinct messages that focus on the detection of either unusual or expected behaviors (see figure 2). The key objective of an IoT, particularly in time-constrained applications, is to provide the most timely information about the environment where sensors are embedded. And the key to timeliness is making an efficient filter-integrate-categorize process an integral part of the sensor operation. To achieve this, the "edge computer" must, in effect, be absorbed within the networked sensors themselves.

Fig. 2: Filter-Integrate-Categorize (FIC) configuration
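
To make this concrete, the following is a minimal sketch, in C, of what a filter-integrate-categorize loop might look like on a microcontroller. The driver call read_sensor(), the network call send_message(), the window size, the thresholds, and the category labels are all assumptions made for the example, not part of any specific product or API:

    /* Minimal filter-integrate-categorize (FIC) loop sketch. */
    #define WINDOW 20

    extern double read_sensor(void);           /* hypothetical driver call */
    extern void   send_message(const char *);  /* hypothetical network call */

    void fic_loop(void) {
        double window[WINDOW];
        int n = 0;

        for (;;) {
            double v = read_sensor();

            /* Filter: keep only readings in the significant range. */
            if (v < 0.0 || v > 1000.0)
                continue;

            /* Integrate: accumulate the filtered readings. */
            window[n++] = v;
            if (n < WINDOW)
                continue;

            /* Categorize: reduce the window to one labeled behavior. */
            double sum = 0.0;
            for (int i = 0; i < WINDOW; i++)
                sum += window[i];

            send_message(sum / WINDOW > 100.0 ? "behavior: high"
                                              : "behavior: normal");
            n = 0;  /* start a new window */
        }
    }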

Take, for example, a water management system based on a network of soil moisture sensors and weather stations. Such a system can be used to control water distribution devices across several properties or for a farming operation. A weather station can be a collection of compact, on-site physical instruments or a network source of weather data for a particular site. In the case of an on-site collection, this can be a single instrument or a local network of instruments.

Each soil moisture sensor monitors local soil water content in a particular section of a large parcel of land. Each sensor is equipped with a filter-integrate-categorize capability that enables it to make decisions on water needs for its section. If a weather "station" is linked to the soil moisture sensor network, it can provide look-ahead capability on precipitation.

Thus, a soil moisture sensor detecting a lower-than-normal moisture level in its section can engage directly in a discrete message dialogue with a weather station to obtain precipitation predictions for its section. If the predictions prove negative, the soil moisture sensor can engage directly in another "dialogue" with the controller for a watering system (e.g., water distribution valves, a mobile irrigation unit, sprinkler units), requesting a selected volume of water for its section. This exchange of discrete messages can be recorded via a remote cloud for background analysis and future planning (see figure 3).

Fig. 3: Cloud and FIC configurations
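
A rough sketch of this dialogue in C follows; the send_to() and recv_from() primitives, the node names, and the text message format are all invented for illustration:

    /* Sketch of the moisture/weather/valve dialogue. */
    #include <stdio.h>
    #include <string.h>

    extern void send_to(const char *node, const char *msg);    /* hypothetical */
    extern int  recv_from(const char *node, char *buf, int len); /* hypothetical */

    void on_low_moisture(const char *section, double liters) {
        char msg[80], reply[80];

        /* Ask the weather station for a precipitation forecast. */
        snprintf(msg, sizeof msg, "forecast? section=%s", section);
        send_to("weather_station", msg);
        recv_from("weather_station", reply, sizeof reply);

        /* No rain expected: request a measured volume of water. */
        if (strstr(reply, "rain=none") != NULL) {
            snprintf(msg, sizeof msg, "water! section=%s liters=%.0f",
                     section, liters);
            send_to("valve_controller", msg);
        }
    }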

Thus, the cloud plays a more passive yet key role, focused on planning rather than operational use. If we aggregate the volume of a global collection of such networked filter-integrate-categorize devices, message traffic becomes comparable in volume to the above-mentioned daily volume of Tweets, on the order of 70 gigabytes per day; a far more manageable networking situation.


Networking Interpretive Processors

Enabling the filter-integrate-categorize approach in networked devices requires an alternate architecture. On the hardware side, it requires tight coupling of a sensor to a high-end microcontroller. To that end, off-the-shelf microcontrollers are available with megabytes of flash storage and up to 1 megabyte of random access memory (RAM). In addition, processing speed has increased significantly, along with embedded or co-processor support for encryption to enable secure message exchange. Finally, these microcontrollers provide native support for Ethernet to enable networking, as well as extensive I/O support for sensor access.

On the software side, part of this approach relies on existing standard and open source components. More specifically, support for sensor and network communication, as well as task management, are readily available as open source software. On the other hand, the novelty comes in the architecture of the filter-integrate-categorize component.

At the heart of this component is a set of algorithms known collectively as the signal interpreter core, or simply the core. The core interacts with a separate component known as the signals model, or simply the model (see figure 4). The model, in effect, embodies the FIC approach.

Fig. 4: Signal interpreter and model interaction within the microcontroller and NIS network

The model is a catalog of groupings of signal patterns known as behaviors (see figure 5). The behavior itself is an ordered list consisting of any combination of the following (one possible encoding is sketched after the list):

  1. an ordered list of indexes to other behaviors,
  2. a sequence of specific data values,
  3. a sequence of rates of change between specific data values,
  4. a sequence of value ranges over a specific number of sensor readings.
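
One possible encoding of such a behavior in C might look as follows; the type and field names are invented for illustration, not a published format:

    /* One possible encoding of a behavior: an ordered list whose items
       are any of the four pattern kinds above. All names are invented. */
    typedef enum {
        PAT_BEHAVIOR_REF,  /* 1: index of another behavior in the model */
        PAT_VALUE,         /* 2: a specific data value */
        PAT_RATE,          /* 3: a rate of change between values */
        PAT_RANGE          /* 4: a value range over a number of readings */
    } pattern_kind;

    typedef struct {
        pattern_kind kind;
        union {
            int    behavior_index;                         /* PAT_BEHAVIOR_REF */
            double value;                                  /* PAT_VALUE */
            double rate;                                   /* PAT_RATE */
            struct { double lo, hi; int readings; } range; /* PAT_RANGE */
        } u;
    } pattern;

    typedef struct {
        const char *name;     /* e.g. "low_moisture" */
        pattern    *items;    /* the ordered pattern list */
        int         n_items;
    } behavior;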

Fig. 5: Signal model structure

Each behavior has an associated list of actions. Each action can be:

  1. a link to another behavior,
  2. a request message, expressed as a template, associated with either the sensor or the network,
  3. temporary storage, or "memorizing", of a set of one or more behaviors,
  4. a computation on one or more "memorized" behaviors.

Temporary storage of a behavior simply records the occurrence of that behavior, and its associated data, in an indexed list of behaviors. The "zero index" item is the latest behavior detected. The kinds of computations that can take place are (see the sketch after this list):

  1. standard mathematical computations,
  2. a search-and-match operation of a sensor data sequence against behaviors in the model,
  3. a reset, or clearing, of the indexed list of accumulated, sensed behaviors.
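
Continuing the same hypothetical encoding, the actions and computations above might be represented as follows; again, all names are illustrative:

    /* The action side of a behavior, mirroring the two lists above.
       Index 0 of the memorized list is always the latest detection. */
    typedef enum {
        ACT_LINK,      /* link to another behavior */
        ACT_MSG,       /* request message built from a template */
        ACT_MEMORIZE,  /* store one or more behaviors */
        ACT_COMPUTE    /* computation on memorized behaviors */
    } action_kind;

    typedef enum {
        COMP_MATH,   /* standard mathematical computation */
        COMP_MATCH,  /* search-and-match of sensor data against the model */
        COMP_RESET   /* clear the indexed list of sensed behaviors */
    } compute_kind;

    typedef struct {
        action_kind  kind;
        int          target_behavior;  /* ACT_LINK / ACT_MEMORIZE */
        const char  *msg_template;     /* ACT_MSG */
        compute_kind computation;      /* ACT_COMPUTE */
    } action;

    #define MEMO_MAX 32
    static int memo[MEMO_MAX];  /* behavior indexes, newest first */
    static int memo_count = 0;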

Request message templates are used to build brief, Tweet-like messages that either describe a request to another node, or nodes, in the network or announce the presence of a behavior.

The core operates based on the following sequence:

  1. Determine if a model is present.
  2. If not, request a LOAD of the model from an archive in the network to which this sensor is attached.
  3. Locate the filter behavior in the model.
  4. Perform the list of actions associated with the filter behavior against a fresh reading of sensor data.
  5. Perform the list of actions associated with the "zero index" behavior and any linked behaviors.
  6. Return to step 4 above.

The filter behavior is used to bootstrap the signal data acquisition process from the sensor. Steps 4 through 6 form an endless loop that carries out the filter-integrate-categorize process. Building a model can be performed either through direct specification and encoding or by using data-driven algorithms to automatically generate a signal model.
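
As a sketch, the sequence might reduce to the following main loop in C, reusing the hypothetical behavior type from the earlier sketch; the helper functions merely stand in for the numbered steps and are not a published API:

    /* The core's sequence as a main loop. */
    extern int       model_present(void);
    extern void      load_model_from_network(void);  /* step 2: LOAD */
    extern behavior *find_filter_behavior(void);     /* step 3 */
    extern behavior *latest_behavior(void);          /* the "zero index" item */
    extern void      run_actions(behavior *b);

    void core_run(void) {
        if (!model_present())              /* step 1 */
            load_model_from_network();     /* step 2 */

        behavior *filter = find_filter_behavior();

        for (;;) {                          /* steps 4-6: endless loop */
            run_actions(filter);            /* step 4: fresh sensor reading */
            run_actions(latest_behavior()); /* step 5: zero-index behavior */
        }
    }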

This model-core configuration defines a networked interpretive sensor (NIS). Rather than being a data vacuum cleaner, the added computational capability of this configuration allows the sensor package to screen data and issue a message over the network only when behaviors of interest, as cataloged in the model, are discovered. Thus, network traffic is significantly reduced and, at the cloud level, large-scale data mining is transformed into medium-scale text message mining; lowering the computational burden on the cloud and making the results of the analysis more meaningful.

As an aside, networked interpretive sensors are compatible with the approach of the Open Fog Consortium. This industry-driven group aims to develop an architectural framework and supporting ecosystem that minimizes network latency, supports end point mobility, enables persistent connectivity, ensures predictable bandwidth, and provides distributed coordination of systems.


Challenges

Key, though, to the deployment of networked interpretive sensors is finding ways to build a signals model. What is the best way to structure the individual behaviors in a model? How should behaviors be organized within a model? What methodologies are suitable for building a model? In terms of structuring individual behaviors, two potential candidates are JSON and SensorML.

JSON, or JavaScript Object Notation, is a text representation that is relatively simple to scan and interpret. The computational resources required to handle JSON are well within the capabilities of high-end microcontrollers. And it provides a simple, straightforward way of describing behaviors based on signal values and/or characteristics.

An example JSON representation describes a simple behavior from the prior example: sustained low moisture, requiring additional watering, within the section covered by a soil moisture sensor:

 

	{
		"low_moisture":
		{
			"measure": "ohm-m",
			"sequence": ["[50,100]", ">100"],
			"actions": ["calc, ...", "msg, ..."]
		}
	}

Basically, a low moisture condition is measured in ohm-meters. If a sequence of signal data values from the sensor starts at a moderate soil corrosion level (soil resistivity between 50 and 100 ohm-meters) and transitions to a low soil corrosion level (soil resistivity over 100 ohm-meters), a discrete request (message) is issued to a water management system to release a calculated volume of water to the section covered by the sensor.
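
A minimal matcher for this particular sequence might look like the following C sketch; the two-phase interpretation of the sequence field is an assumption based on the description above:

    /* Matcher for the low_moisture sequence: a run of readings in
       [50,100] ohm-m followed only by readings above 100 ohm-m. */
    #include <stdbool.h>

    bool matches_low_moisture(const double *r, int n) {
        int i = 0;
        while (i < n && r[i] >= 50.0 && r[i] <= 100.0)  /* moderate phase */
            i++;
        if (i == 0 || i == n)
            return false;   /* both phases must be present */
        for (; i < n; i++)  /* low-corrosion phase: all above 100 */
            if (r[i] <= 100.0)
                return false;
        return true;
    }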

SensorML is a sensor-specific structure derived from XML. It is used to describe the structure and operation of a sensor in detail. Being an XML derivative, the computational capacity required to scan and interpret it is at or beyond the bounds of many, if not most, currently available microcontrollers. SensorML is more suitable as a document describing a configuration of sensors at a physical site for cloud-level applications. Examples of SensorML representation are available at the SensorML support site.

Once a behavior representation has been chosen, there is the issue of building a database of these representations that can fit in the compact flash storage available on microcontrollers. Embedded in-memory and flash-based storage techniques geared to microcontrollers are beginning to appear.

There is also the option to store an entire model as a single large JSON object, combining all behavior descriptions within that object. This option, however, requires thorough analysis of flash and memory use within the target microcontroller for the particular sensor implementation.


For building a model, there are two options that can be pursued: interactive graphical tools for creating and accumulating behaviors, or automated tools based on a machine learning approach.

Interactive graphical tools, similar to LabVIEW, can be used to build a process that takes user actions to create individual behaviors, which are then stored in an off-line database. To create a model, the tool can retrieve selected stored behaviors and assemble a model through an automated process enabled by a single user command.

Automated tools can take specific cases of signal sequences, coupled with relevant specific actions, and create individual behaviors expressed as rules using machine learning techniques such as C5.0. A single user command can then translate the generated rules into a catalog of behaviors ready for review and subsequent implementation.

Managing model provisioning across multiple interpretive sensor networks becomes an issue of scalability. As tools for IoT software configuration management become available, they can be readily adapted to model configuration management and distribution.

Finally, realizing dialogs between networked interpretive sensors, as mentioned above, requires a standard for an application-level protocol that:

  1. Enables "discovery" of a sensor's calibration capabilities and measurements or an actuator's functions,
  2. Supports requests for a specific sensor measurement or actuator function across a network,
  3. Minimizes latency in message transfer in order to provide real-time response.

These issues were also encountered in building automation, and the answer that emerged from that effort was the BACnet protocol. BACnet was developed in response to the plethora of proprietary protocols adopted by building infrastructure and automation vendors, which led to confusion among their customers.

As a result, BACnet was adopted as an application level protocol focused on devices in the infrastructure of a building. More specifically, device "discovery" is accomplished through the Who-Is, I-Am, Who-Has, and I-Have protocol services. Read-Property and Write-Property protocol services are used to access or enable device functions described as BACnet objects. Some of the BACnet objects are Analog Input, Analog Output, Analog Value, Binary Input, Binary Output, Binary Value, Multi-State Input, Multi-State Output, Calendar, Loop, Program, Schedule, and Command. Instances of these objects are associated with the native sensor measurement or actuator capabilities in a building automation network.

In effect, BACnet is an application-level mechanism to enable message exchange between devices. These messages can be transferred via existing communication protocols such as IP operating over Ethernet media.

To enable message exchange between networked interpretive sensors, there is a need to adopt a protocol supporting objects and services similar to those found in BACnet.
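
To suggest the shape such a protocol might take, the following C sketch models a BACnet-style discovery-and-read exchange. It is not the BACnet wire format or API; only the service and property names (Who-Is, I-Am, Read-Property, Present_Value) are drawn from BACnet, and the structs and helpers are invented for illustration:

    /* Shape of a BACnet-style exchange for interpretive sensors. */
    typedef enum { WHO_IS, I_AM, WHO_HAS, I_HAVE,
                   READ_PROPERTY, WRITE_PROPERTY } service;

    typedef struct {
        service     svc;
        int         device_id;   /* responding or target device */
        const char *object;      /* e.g. "Analog Input" for a soil probe */
        const char *property;    /* e.g. "Present_Value" */
        double      value;       /* payload for replies and writes */
    } message;

    extern void broadcast(const message *m);            /* hypothetical */
    extern int  await(service expected, message *out);  /* hypothetical */

    /* Discover devices, then read a measurement from the first responder. */
    void discover_and_read(void) {
        message who = { .svc = WHO_IS };
        broadcast(&who);

        message peer;
        if (await(I_AM, &peer)) {
            message rd = { .svc = READ_PROPERTY, .device_id = peer.device_id,
                           .object = "Analog Input",
                           .property = "Present_Value" };
            broadcast(&rd);
        }
    }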

Benefits

Despite the issues of model creation and provisioning, the benefits accrued by deploying networked interpretive sensors in IoT implementations add up to a more manageable network and a better understanding of the environment being monitored. The impact is particularly significant in Internet traffic volume and in analysis at the cloud level.

With a network of interpretive sensors, the flood of data snippets is reduced to a stream of behavior messages. It is not inconceivable that sensing one behavior will consume the equivalent of 20 sensor data readings; at the earlier rate of 100 readings per device per second, this translates to five messages per second per device.

Coupling this with the projected 2025 traffic figures mentioned before (200 billion devices, 40 bytes per message on average), the result is device traffic of nearly 3.5 exabytes per day, taking total Internet traffic from about 1 exabyte per day today to roughly four times the current volume by 2025.
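
A quick check of that arithmetic under the stated assumptions:

    /* 200 billion devices x 40 bytes x 5 messages/second, accumulated
       over a day: ~3.46 exabytes of device traffic. */
    #include <stdio.h>

    int main(void) {
        double per_day = 200e9 * 40.0 * 5.0 * 86400.0;
        printf("NIS device traffic: %.2f exabytes/day\n", per_day / 1e18);
        return 0;
    }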

Contrast this with the alternative of simply streaming raw data to the cloud, under which Internet traffic volume is projected to grow roughly 70-fold over the same 10-year period. A fourfold growth over ten years is a much more manageable rate vis-a-vis infrastructure management of the Internet.

Given the lower message traffic rate per device per second, cloud-level analytics can focus on directly analyzing behavior trends rather than first extracting patterns, then trying to extract behaviors from patterns, and finally extracting behavior trends. This frees up more computational bandwidth to accommodate any surge in the number of message sources and provides a more succinct, yet meaningful and timely, description of the context being monitored.

References & Extra Reading

Processing Data From the Edge: A Platform for the Internet of Your Things
Seven reasons edge computing is critical to IoT
The Vital Role of Edge Computing in the Internet of Things
JSON support site
SensorML support site
Extensible Markup Language (XML)
SensorML 2.0 Examples
picoDB™ - a NoSQL database for eLua
picoDB™ tutorial and source
Data Mining Tools See5 and C5.0
BACnet - Official Website of ASHRAE SSPC 135
Open Fog Consortium

About the Author

Tom Freund is currently the owner of Dig.y.SoL, a firm building and licensing oversight systems for unmanned platforms and urban infrastructure. He has over 20 years of experience in intelligent (AI-based) industrial automation and remote equipment management systems, and has provided leadership in pioneering projects dealing with:

  • intelligent process control for composite materials,
  • resource scheduling systems,
  • intelligent software generation for test equipment,
  • next-generation infrastructure equipment monitoring systems.

Mr. Freund also taught courses in Artificial Intelligence and Software Engineering. He can be reached at [email protected].
