Drowning in Data

By Melanie Martella Sep 21, 2007 1:00am

Two weeks ago, in "Will Sensors Outnumber Insects?", I talked about the growing number of sensors in our environment and argued that we should think long and hard about how we use the data they collect. Reader Jay Nemeth-Johannes disagreed that my examples crossed the line, although he agreed that privacy concerns deserve attention. His second comment, however, pointed out another, more pressing problem.

Prime Processing Locations
I'll quote his second comment in its entirety:
"Also, although privacy is a concern that needs to be constantly monitored, I believe the larger problem is managing the data being generated by these billions of sensors. Current government RFPs for sensor networks seem to be aimed toward transferring massive amounts of data to large central servers where processing occurs. I strongly believe that the only way we are going to manage this massive data flow is to push the analysis down closer to the sensors themselves and to be shipping only filtered information across public networks." To which I reply, "Heck, yeah!"

As we add more sensors to our environment and create more and more sensor networks, particularly sensing and control networks, we need to process at the sensor nodes to get a sufficiently speedy response. Upstream processing to do more sophisticated analysis and tie multiple data streams together? Fine. But localized processing to do basic filtering or operate with some basic logic seems a better model for seriously large-scale networks.
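To make "basic filtering" at the node a little more concrete, here is a minimal sketch of one common approach, a deadband (report-by-exception) filter, in which a node forwards a reading only when it has moved far enough from the last value it sent. The class name, the `threshold` parameter, and the sample readings are all illustrative assumptions, not taken from any particular sensor platform.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeadbandFilter:
    """Forward a reading only when it differs enough from the last one sent."""
    threshold: float                    # hypothetical tuning parameter
    last_sent: Optional[float] = None   # nothing has been sent yet

    def accept(self, reading: float) -> bool:
        # Always send the first reading; afterwards send only significant changes.
        if self.last_sent is None or abs(reading - self.last_sent) >= self.threshold:
            self.last_sent = reading
            return True
        return False


def filter_readings(readings, threshold):
    """Return the subset of readings a node would actually transmit."""
    deadband = DeadbandFilter(threshold)
    return [r for r in readings if deadband.accept(r)]


# Made-up temperature samples: six are measured, three are transmitted.
readings = [20.0, 20.1, 20.2, 21.5, 21.6, 19.9]
print(filter_readings(readings, threshold=1.0))
```

With these made-up numbers the node transmits half of what it measures. How much a real network would save depends entirely on how noisy the signal is and how the threshold is chosen.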

As Jay so rightly points out, the more sensors you have, the more data you have, and the burden of shipping it all to some central processing and storage location, let alone managing it once it's there, will become immense. And if you're trying to enable actuation at the sensor location, you really don't want to wait until the data has gone from the sensor to the central processor and a command has come back out to the actuator. What you want is knowledge: what the data means, and a way to tie that meaning to appropriate real-time action.

Some wireless sensor networks already use this approach, and you can expect more to adopt it. The mesh network doesn't exist just to transport bits from point A to point B; it exists to accumulate and analyze data in situ. The higher-level knowledge amassed by local processing of sensor data is then available upstream or can be used to initiate control actions.
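As a rough illustration of accumulating and analyzing data in situ, the sketch below has a node reduce a window of raw samples to a small summary and decide on a control action locally, so that only the summary needs to travel upstream. The function names, the `high_limit` parameter, and the `"open_vent"` action are hypothetical placeholders, not part of any real mesh-networking API.

```python
from statistics import mean


def summarize_window(samples):
    """Reduce a window of raw samples to a compact summary for upstream use."""
    return {
        "count": len(samples),
        "min": min(samples),
        "mean": mean(samples),
        "max": max(samples),
    }


def local_action(summary, high_limit):
    """Decide on the node itself whether to actuate, with no round trip upstream."""
    # "open_vent" is a made-up action name used purely for illustration.
    return "open_vent" if summary["max"] > high_limit else "none"


window = [1.0, 2.0, 3.0]            # raw samples stay on the node
summary = summarize_window(window)  # only this small dict is sent upstream
print(summary, local_action(summary, high_limit=2.5))
```

The point of the design is that the raw samples never leave the node: the upstream system sees four numbers per window, and the actuation decision does not wait on the network.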

Unless we want to drown in data, we need to think hard about shifting some of the heavy lifting down to the sensor level. Pervasive computing will become a reality only with in-situ sensing and response. Do you agree? Disagree? Scroll down to post a comment and join the debate.