Enterprise IT Watch Blog

Jan 13 2014   1:55PM GMT

Log data is pivotal to analytics on Internet of Things

Profile: Michael Tidmarsh


By James Kobielus (@jameskobielus)

Logs of all sorts (web logs, application logs, database logs, system logs, and so on) are fundamental to the promise of the Internet of Things (IoT).

Without continuous logging of relevant events, the IoT can’t fulfill its core role as the real-time event-notification bus of the online world. Machine-readable event logging is fundamental to all the core applications of IoT, including real-time sensor grids, remote telemetry, self-healing network computing, medical monitoring, traffic management, emergency response, and security incident and event monitoring. Ubiquitous IoT will depend on the ability to support continuous real-time ingest, analysis, correlation, handling, and any-to-any routing of machine-generated information.
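To make "machine-readable event logging" concrete, here is a minimal Python sketch of what a single IoT event record might look like when emitted as one JSON line for downstream ingest, correlation, and routing. The field names and the thermostat example are illustrative assumptions, not anything prescribed in this post.

```python
import json
import time
import uuid

def make_sensor_event(device_id, metric, value, unit):
    """Build a machine-readable event record for downstream ingest.

    These field names are illustrative; a real deployment would follow
    whatever schema its logging infrastructure defines.
    """
    return {
        "event_id": str(uuid.uuid4()),   # unique ID for correlation and deduplication
        "timestamp": time.time(),        # epoch seconds; many systems use ISO 8601 instead
        "device_id": device_id,
        "metric": metric,
        "value": value,
        "unit": unit,
    }

# Emit one event per reading as a single JSON line -- about the simplest
# format that supports continuous ingest and any-to-any routing.
event = make_sensor_event("thermostat-42", "temperature", 21.7, "celsius")
print(json.dumps(event))
```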

IoT’s development depends on the implementation of a ubiquitous, general-purpose event-logging infrastructure. This global logging infrastructure must be able to handle disparate relational and non-relational logged data types; execute advanced analytics against myriad logged data objects; operate in both batch and streaming environments; and scale to support growing volumes of in-flight log-data replication without choking or slowing down. Individual event logs need not be peta-scale; in fact, most IoT devices will keep local logs constrained by their tight storage budgets and disparate form factors.

I recently came across a great article on the untapped potential for general-purpose logging infrastructure in the IoT age. Though the author, LinkedIn software engineer Jay Kreps, doesn’t specifically connect his discussion to IoT, the affinity is obvious. The two trends that he highlights as increasing the need for distributed data logging ("event data firehose" and "explosion of specialized data systems") are at the very heart of the IoT revolution.

Kreps lays out a real-time pub-sub architecture reminiscent of the time-honored concept of an “enterprise service bus” (ESB). “[M]any of the things we were building,” he says, “had a very simple concept at their heart: the log. Sometimes called write-ahead logs or commit logs or transaction logs, logs have been around almost as long as computers and are at the heart of many distributed data systems and real-time application architectures.”
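Kreps’ abstraction is easy to picture in code. The toy, in-memory sketch below captures the essence of a commit log: an append-only sequence of records, each addressed by a monotonically increasing offset that readers can replay from at their own pace. It illustrates the concept only; it is not LinkedIn’s or Kafka’s actual implementation.

```python
class CommitLog:
    """Toy append-only log: records are addressed by monotonically increasing offsets."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Append a record and return its offset -- its permanent position in the log."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Read forward from a given offset; each reader advances at its own pace."""
        return self._records[offset:offset + max_records]

log = CommitLog()
log.append({"sensor": "pump-7", "state": "ok"})
log.append({"sensor": "pump-7", "state": "overheating"})

# Two independent consumers replay the same history from different offsets.
print(log.read(0))   # everything, e.g. a batch analytics job
print(log.read(1))   # only the latest event, e.g. a real-time alerting service
```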

To the extent that we intend for IoT to evolve into an ESB-like infrastructure for big data applications, we must grapple with the central role of distributed logs and with the protocols that support distributed log-data consistency, replication, and concurrency.

It seems to me that this loosely coupled, ESB-like approach to data integration is the best foundation for truly flexible, increasingly heterogeneous big data, IoT, and cloud infrastructures. The log will be the common-denominator abstraction for data storage and integration.
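One way to see why the log works as a common-denominator integration abstraction: producers only append, and each downstream system tracks its own read offset, so heterogeneous consumers stay decoupled in both space and time. The sketch below (with hypothetical consumer names) illustrates that idea; it is a thought experiment, not a production design.

```python
class SharedLog:
    """Minimal shared log that heterogeneous systems integrate through."""

    def __init__(self):
        self._records = []
        self._offsets = {}   # consumer name -> next offset to read

    def publish(self, record):
        self._records.append(record)

    def poll(self, consumer):
        """Each consumer reads from its own committed position, at its own pace."""
        start = self._offsets.get(consumer, 0)
        new_records = self._records[start:]
        self._offsets[consumer] = len(self._records)
        return new_records

bus = SharedLog()
bus.publish({"device": "meter-9", "kwh": 3.2})
bus.publish({"device": "meter-9", "kwh": 3.4})

# A streaming dashboard and a nightly batch loader consume the same data
# without the producers knowing or caring that either exists.
print(bus.poll("realtime-dashboard"))    # both records so far
bus.publish({"device": "meter-9", "kwh": 3.1})
print(bus.poll("realtime-dashboard"))    # only the new record
print(bus.poll("batch-warehouse-load"))  # all three, caught up in one pass
```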

1 Comment on this Post
  • JohnWalker
    I agree that a common infrastructure must exist for IoT, and that it must support both streaming and batch file processing. But this is very complicated, because the data being tracked is not one-dimensional and will vary greatly between devices. It could go the way of EDI in business communications, with many different formats for different purposes that are designed to be standardized but in practice become non-standard because they are rigid and devices of different types want to expose different, new, or improved information. Or it could go the way of XML, where the metadata is included in the file, which produces much larger streams because the metadata travels with every record. What is needed is a file definition system that keeps the metadata out of the data being sent but ties the metadata to each record through some sort of transaction-type/company database exposed to the public (a rough sketch of that idea follows below).
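JohnWalker’s suggestion of keeping the metadata out of the stream, while tying each record back to a shared, publicly exposed definition, resembles what is often implemented as a schema registry. The sketch below is one speculative way to express it; the schema IDs, field names, and registry structure are all hypothetical.

```python
# A shared, publicly exposed registry maps a schema ID to field definitions,
# so the records themselves carry no repeated metadata.
SCHEMA_REGISTRY = {
    "temp-reading-v1": ["device_id", "timestamp", "celsius"],
    "door-event-v2":   ["device_id", "timestamp", "open", "badge_id"],
}

def encode(schema_id, values):
    """Send only the schema ID and positional values over the wire."""
    return [schema_id] + values

def decode(message):
    """Rejoin the compact record with its field names via the registry."""
    schema_id, values = message[0], message[1:]
    return dict(zip(SCHEMA_REGISTRY[schema_id], values))

wire_message = encode("temp-reading-v1", ["thermostat-42", 1389621300, 21.7])
print(wire_message)          # compact record: schema ID plus values only
print(decode(wire_message))  # full record reconstructed on the receiving side
```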
