IoT Back to Basics, chapter 2: In the era of the Internet of Things (IoT), it is becoming increasingly important to process, filter and analyse data close to where it is created, so that it can be acted on locally, rather than having to be sent back to a data centre or the cloud for filtering and analysis.
The other reason to implement analytics at the edge of the network is that IoT use cases continue to grow, and in many situations the volume of data generated at the edge demands bandwidth levels – as well as computing power – that overwhelm the available resources. Streams of data from smart devices, sensors and the like could easily swamp data centres designed for more traditional enterprise-scale needs.
For example, a temperature reading from a wind turbine motor’s sensor that falls within the normal range shouldn’t necessarily be stored every second, as the data volume soon adds up. Rather, it is the readings that fall outside the normal range, or that signify a trend – perhaps pointing towards the imminent failure of a component – that should create an alert, and possibly be stored centrally only after that first anomaly, for subsequent analysis.
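To make that concrete, here is a minimal sketch of this kind of edge-side filtering. The normal range, the reading format and the function names are all hypothetical, chosen purely for illustration:

```python
# Edge-side filtering sketch: only readings outside a hypothetical normal
# band are forwarded upstream; in-range readings are discarded locally.
NORMAL_RANGE = (20.0, 80.0)  # illustrative safe band, degrees C

def filter_readings(readings, normal_range=NORMAL_RANGE):
    """Yield only the (timestamp, value) readings worth sending centrally."""
    lo, hi = normal_range
    for timestamp, value in readings:
        if not (lo <= value <= hi):
            yield (timestamp, value)  # anomaly: raise an alert upstream

readings = [(0, 45.2), (1, 46.1), (2, 93.7), (3, 44.9)]
alerts = list(filter_readings(readings))
# only the out-of-range reading at t=2 survives the filter
```

In practice the filter would run continuously on the edge node and could be extended with trend detection, but even this simple band check shows how dramatically the volume sent upstream can shrink.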
There are too many vendors in this space to produce an exhaustive list here. But it’s perhaps notable that last year, a company formerly known as JustOne Database performed a root-and-branch rebranding exercise, renaming not only its products but the company itself, which is now Edge Intelligence. It told me it was seeing such good traction for its database – which can run on relatively compact servers at the edge of the network, in a data centre or in the cloud – that it changed its name after over six years in the business.
So what are some of the characteristics of edge analytics that you might want to consider if you are trying to push at least some analytics to the edge?
Standards and protocol translation
Although there is likely to be a shakeout of some of the standards in this space, opting for technologies that support standards is likely to make future integrations easier. Again, there is a vast array of standards and APIs in this area. These include POSIX and HDFS APIs for file access, SQL for querying, a Kafka API for event streams, and HBase and perhaps an OJAI (Open JSON Application Interface) API to help with compatibility with NoSQL databases. There’s also the need to support older, proprietary telemetry protocols so that legacy equipment (which often has lifetimes measured in decades) can be connected to more modern IoT frameworks. This is especially true in the industrial space, where IoT is of particular value for the likes of predictive maintenance.
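To illustrate what protocol translation for legacy equipment can involve, here is a sketch that unpacks a hypothetical binary telemetry frame into JSON for a modern pipeline. The field layout (device id, sensor code, reading) is an assumption made up for this example, not a real industrial protocol:

```python
import json
import struct

# Hypothetical legacy frame: device id (uint16), sensor code (uint8) and a
# reading (float32), big-endian. The layout is illustrative only.
FRAME_FORMAT = ">HBf"

def frame_to_json(frame: bytes) -> str:
    """Translate a binary legacy frame into JSON for a modern IoT framework."""
    device_id, sensor_code, value = struct.unpack(FRAME_FORMAT, frame)
    return json.dumps({"device": device_id,
                       "sensor": sensor_code,
                       "value": round(value, 2)})

# Simulate a frame arriving from a legacy device and translate it
frame = struct.pack(FRAME_FORMAT, 42, 7, 21.5)
print(frame_to_json(frame))  # {"device": 42, "sensor": 7, "value": 21.5}
```

Real gateways do much the same job at scale, mapping dozens of proprietary formats onto common, standards-based representations that the rest of the stack can consume.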
Distributed data aggregation
This is to some extent the bread and butter of edge analytics, providing high-speed local processing, which is especially useful for location-restricted or sensitive data such as personally identifiable information (PII), and can also be used to consolidate IoT data from edge sites.
Bandwidth awareness
This refers to technologies that adjust throughput between the edge and the cloud and/or data centre, even with occasionally-connected sensors or devices.
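One common pattern behind such technologies is store-and-forward: readings accumulate in a local buffer while the uplink is down and are drained when connectivity returns. The sketch below assumes a bounded buffer that drops the oldest readings when full; the class and method names are illustrative, not any vendor's API:

```python
from collections import deque

class EdgeBuffer:
    """Store-and-forward buffer for an occasionally-connected edge device."""

    def __init__(self, capacity=1000):
        # A bounded deque silently drops the oldest item when full,
        # trading history for a fixed memory footprint at the edge.
        self.queue = deque(maxlen=capacity)

    def record(self, reading):
        self.queue.append(reading)

    def flush(self, send):
        """Drain buffered readings, oldest first, once the uplink is up."""
        sent = 0
        while self.queue:
            send(self.queue.popleft())
            sent += 1
        return sent

buf = EdgeBuffer(capacity=3)
for reading in [1, 2, 3, 4]:   # capacity 3: the oldest reading is dropped
    buf.record(reading)
delivered = []
buf.flush(delivered.append)
# delivered == [2, 3, 4]
```

The design choice worth noting is the bounded capacity: an unbounded buffer on a long-disconnected device would eventually exhaust local storage, so something has to give, and dropping the oldest data is one defensible policy.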
Converged analytics
This combines operational decision-making with real-time analysis of data at the edge.
Security and identity management
End-to-end IoT security provides authentication, authorization, and access control from the edge to the central clusters. In certain circumstances it will be desirable to offer secure encryption on the wire for data communicated between the edge and the main data centre. Identity management is also a thorny issue: it’s necessary to be able to manage the ’things’ in terms of their authentication, authorization and privileges within or across system and enterprise boundaries.
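As a small illustration of authentication between an edge node and the centre, the sketch below signs each payload with a shared secret and verifies the tag before trusting the data. Key handling is deliberately simplified – the key, message format and function names are assumptions for this example, and a real deployment would pair this with proper key management and encryption on the wire:

```python
import hashlib
import hmac

SECRET = b"shared-edge-key"  # illustrative only; never hard-code real keys

def sign(payload: bytes, key: bytes = SECRET) -> str:
    """Edge side: compute an HMAC-SHA256 tag over the payload."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def verify(payload: bytes, tag: str, key: bytes = SECRET) -> bool:
    """Central side: constant-time check that the tag matches the payload."""
    return hmac.compare_digest(sign(payload, key), tag)

msg = b'{"sensor": 7, "value": 21.5}'
tag = sign(msg)
assert verify(msg, tag)                                  # genuine message
assert not verify(b'{"sensor": 7, "value": 99.9}', tag)  # tampered payload
```

Note the use of a constant-time comparison for verification, which avoids leaking information about the expected tag through timing differences.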
High availability
This delivers a reliable computing environment that can handle the multiple hardware failures that can occur in remote, isolated deployments.
Integration with the cloud
Even if not needed now, there may be a future requirement for good integration between an edge analytics node and the cloud, so that alert data and even ‘baseline’ data points can be stored in the cloud rather than in your own data centre. In this regard, integration with your cloud provider of choice – if you have one – would be a wise idea. If you don’t already do much data processing and storage in the cloud, likely execution venues in your future could include Amazon Web Services, Google Cloud Platform or Microsoft Azure, and it wouldn’t do any harm to know there is support for the open source OpenStack infrastructure as a service (IaaS).
Edge analytics has come on in leaps and bounds in the past several years as IoT use cases have shaken out. At the very least it might be worth asking if edge computing has a role to play in any IoT projects that you may be thinking of embarking on.
Most of us already recognise that technology has the potential to wipe out our privacy, if checks and balances are not in place – or at least I hope we do! What’s scary then in the recent hoo-hah about fitness trackers revealing secret locations is that it shows how bad we are – both as users and as technology developers – at spotting those privacy risks ahead of time.
Soldiers and other security staff have been warned for years against revealing their location via social networks. The risks are obvious: in 2007, Iraqi insurgents used geotagged photos to locate and destroy four US attack helicopters, for instance. More recently, geotagged selfies contradicted official Russian claims by revealing Russian soldiers in Ukraine, fighting alongside pro-Russian separatists.
Yet here we are, with people acting all surprised that, when the Strava fitness tracking app openly publishes its users’ location and movement data, it reveals where soldiers exercise, as well as civilians.
You have to wonder what on earth those military users thought they were doing, leaving a tracker wirelessly connected when they’ve been warned for years about geotagged photos, Facebook Places, Foursquare and all the rest. Did they fail to spot the privacy options on their Strava settings page? (It’s easily done – they are buried a few layers down.) Or did they, as so many of us do, assume that it’s just ephemeral data, of no interest to anyone else?
The tracking scare should remind everyone, not just the world’s militaries, that even a direct order is sometimes not enough. And if it’s an indirect order or mere advice, you’re lucky these days if the recipient scans the first paragraph before muttering “Whatever” and clicking Accept. There must be training too, plus active checks on compliance and probably some form of pen-testing or white-hat hacking.
Beyond that, it also shows why – as the GDPR will require – you need to get a user to actively opt-in to data processing, and why it must be informed consent. Simply providing an opt-out, without a clear explanation of the risks, is nowhere near enough.
To be fair, Strava does recognise that some individuals want anonymity. In a statement it said, “Our global heatmap represents an aggregated and anonymized view of over a billion activities uploaded to our platform. It excludes activities that have been marked as private and user-defined privacy zones.”
Real anonymity is hard
The problem is that this concept of anonymity looks too much like, “Oh, that could be just anyone out there, jogging around Area 51 or that Syrian airbase!” If any more proof were needed that some people in technology have no idea what anonymisation really means, this is it.
There’s a whole bunch of lessons in here, both for Strava and the rest of us. I’ve already mentioned a couple – that privacy needs to be the default, not an opt-out extra, and that anonymisation doesn’t just mean taking the names out. Another is that there is nothing intrinsically good in big data, it’s all in how it’s used – and in who’s using it.
And perhaps it’s also to beware vanity, although that can be a tough challenge for the Instagram generation. Whether it’s soldiers keen to be top of the exercise leaderboard or app developers trumpeting how many million users they have, they’re showing off. Wanting to do your best is one thing, but as the saying goes, pride comes before a fall.
Some assumptions have been held by IT pros for so long that they have almost become articles of faith. One of these is the idea that content management, particularly for files, semi-structured and unstructured content, is so difficult that only the foolhardy attempt to tackle it for anything other than information that regulators say has to be ‘actively managed’.
It’s fair to say that, until very recently, this assumption may even have underestimated the challenges involved in getting an effective content management system in place, even for relatively small sets of data and files. But things are changing.
An important development has been recent work to make some of the core elements of content management simpler and more effective. These tasks all begin with data discovery: “What do I have in my storage systems?” Even data protection suppliers such as Veritas, Arcserve and Commvault, amongst others, have started to produce tools that make data discovery something that can be contemplated without fear.
However, data discovery is just step one. Moving towards managing content and information across the board, not just confining it to those files you are legally forced to look after, requires technology to automate the classification of files in line with the organisation’s business needs. Traditionally, users have had to rely on where files live in the file system and folder structure in order to search for and surface them. And users often “misplace” or move files around, making finding them later something of a challenge.
An era of genuinely-usable data discovery is dawning
But this too is now being addressed, as vendors like Veritas and M-Files bring tools to market that, while not perfect by any means, can at least pass the 80:20 rule of dealing with the majority of files. We are at the start of an era when finding data, and using human insight to turn it into valuable information on demand, should become routine.
Of course, technology developments alone are unlikely to trigger an avalanche of user-adoption without business triggers to fire that process. That said, many organisations today have visible challenges bearing down upon them.
Some have been around for a long time, such as pressure to use storage cost-effectively or ensure data is protected appropriately, but have been placed in the ‘too hard to look at now’ folder. Others, such as various regulatory drivers around data privacy, are charging forwards at high speed with GDPR a major consideration in the boardroom.
I hope that drivers such as GDPR, combined with better technology solutions, will see organisations look more deeply at managing information, and especially at following often-valuable user-generated content throughout its lengthening, but now bounded, lifespan.
There is an additional upside if you do information management well for all the files in the organisation: you can generate new business value by exploiting data that was previously hard to locate when needed. And with tools like M-Files and Veritas making it possible to do so without having to move everything into yet another silo, the age of enterprise-wide information management may finally be dawning.
Europeans will in future be able to bring US-style class actions for (alleged) privacy violations, instead of having to sue individually and expensively. It’s thanks to a little-known clause of the EU’s GDPR, which comes into force in May.
Rich and arrogant organisations have long relied on delaying tactics to evade certain of their responsibilities to individuals and small businesses. Who among us has the time and money needed to seek redress at law, when our opponent has a full-time legal staff with nothing better to do than dispute and obstruct? Especially if our reward might only be a few hundred pounds or euro.
A solution used (and yes, some would say abused) in the US is the class action. This allows a single party to lodge a claim on behalf of a group, such as all the shareholders or customers of a company. Add the ability of lawyers to work on a contingency basis, meaning they get nothing if they lose but a percentage of the total – which can be considerable, for a large group – if they win, and infringing organisations can no longer afford to be quite so arrogant.
True, the GDPR does not use the words ‘class’ or ‘group’. But it’s a logical extension of Article 80, which includes the following:
Representation of data subjects
The data subject shall have the right to mandate a not-for-profit body, organisation or association … to lodge the complaint on his or her behalf
I say it’s a logical extension because several European countries already allow representative or collective actions in a range of cases. Typically these have been restricted to the area of consumer protection, but they demonstrate that the potential advantages to the judicial process – e.g. cost, clarity, equal treatment for claimants – are already understood.
My privacy – none of your business?
One of the first to take up the challenge, if not the first, is Max Schrems, the Austrian lawyer and privacy campaigner whose case against Facebook has been winding its way through the Austrian and European courts for almost four years (a final decision is expected soon). Schrems claims that Facebook Ireland (the company’s EU arm) has spent considerable time and legal effort simply trying to get the case thrown out on procedural grounds, such as the validity of class actions.
So he and others have formed just such an Article 80 body, called None Of Your Business, to take on class action privacy cases in the future. As well as empowering individuals to defend their GDPR rights, NOYB says it wants to support businesses that seek to comply with the law, for example by publishing guidelines and best practices, and by making it harder for cheats to gain competitive advantage.
It’s just one more incentive, if any were needed, for organisations to come to terms with the GDPR and with privacy more generally. Get it right, and you could see profitable spin-offs in areas such as data governance and customer trust; get it wrong, and you could be in the legal – and financial – firing line.
IoT Back to Basics, chapter 1: The Internet of Things (IoT) is placing unprecedented demands on data storage, networking, processing and analytics. For end users, vendors and investors, it represents a challenge as well as a huge opportunity. But which five data processing and analytics technologies really matter for IoT?
While massively hyped, the IoT concept has matured in the last few years. There has been a growing focus on the importance of security and data governance, analytics at the edge and other technologies and platforms that are necessary to make projects successful.
It’s easy to concentrate on the wrong elements when it comes to thinking about, or even implementing, IoT. That’s because the term, first coined by Kevin Ashton – a British technology pioneer who co-founded the Auto-ID Center at MIT – focuses on two aspects of IoT: the Internet and the ‘things’ themselves.
The ‘things’ refer to sensors and other smart devices with the ability to monitor an object’s state, or even control it using actuators. Ashton envisaged that when such sensors and smart devices were on a ubiquitous network – the Internet – they would have far more value, and he was quite right.
Data, not things
What’s missing from the phrase ‘Internet of Things’ is perhaps the most important piece of the puzzle – the data itself.
If the sensors and smart devices do not capture data, they are clearly not as ‘smart’ as you might think. But it’s worth remembering that some sensors are actually relatively ‘dumb’ – doing little more than taking occasional temperature or pressure readings, for example.
But whether capturing data from sensors or smarter devices, if organisations are not in a position to somehow ingest, process and analyse that data, then it becomes worthless, and the IoT project will be considered a failure.
In fact, I’d argue that an IoT project that lacks effective data analytics is not an IoT project at all.
With the latest data ingestion, processing and analysis needs of IoT placing so much pressure on older, more traditional data platforms, we have identified a number of technologies that we believe are becoming even more important in the era of IoT.
IoT: a broad church
While it’s not an exhaustive list – IoT is a very broad church and can draw on almost any technology in one shape or another – we certainly think that five categories of technology in particular are seeing an uptick in adoption as more and more companies establish how IoT might work for them, their partners and their customers. It’s worth bearing in mind the two-way nature of IoT though: it isn’t exclusively about ingesting data – there is also the importance of management tools and platforms, for example to take care of initial provisioning, updates and upgrades, re-configuration, diagnostics and remediation.
Nevertheless, I believe that these technologies on the data ingestion and analytics side of the house are becoming increasingly important:
- Security and data governance
- Infrastructure, in particular edge analytics
- Data processing, including in-memory technologies, NoSQL and Hadoop
- Advanced analytics
- Data integration and messaging
In the next chapter of this blog mini-series, I’ll look in more detail at just why these five categories in particular are so important to IoT, and explain further why it’s a mistake to put too much emphasis on the ‘things’ in the Internet of Things, despite the fact that Kevin Ashton’s term doesn’t even include the word ’data’.
Apple probably has more cash than it knows what to do with right now, and this “problem” looks set to intensify if the company decides to repatriate the $250 billion it’s holding in overseas accounts.
Imagination and vision
Not knowing how to spend your money sounds like a nice problem to have, but in the tech world this can reflect poorly on those leading the company, as it suggests lack of imagination and blinkered vision.
Stock buy-back is likely to feature in Apple’s financial plans this year, but this won’t spontaneously call into existence that all-important ‘next big thing’. Apple has increased R&D spending by 43% in the last three years, reporting a figure of $11.6 billion in 2017, but the company’s latest product – the iPhone X – represents a category evolution, not a revolution. There are some things that money can’t buy, and certainty of success is one of them.
Content is still king
Apple is clearly attracted by the idea of getting into the content business and then piping this – for a fee – to its customers, so perhaps Tim Cook will spend some of Apple’s cash pile on acquisitions in the media and communications sectors, competing head-on with Google, Amazon, and Facebook as it does so.
Apple is a lifestyle brand, so its market relevance is implicit from the perspective of brand-aware consumers and tech enthusiasts. The company no longer reports its advertising costs ($1.8 billion in 2015), but it believes that marketing and advertising is critical to its business strategy.
Those people unaware of iPhone, iPad, or iTunes are now a minority, so I expect the greatest marketing company on earth will be looking to take this to the next level during 2018.
The Disneyfication of consumer tech
At the individual customer level, continued investment in high-quality buying experiences and knowledgeable sales staff will sustain the company’s market relevance in the immediate future, and its “Disneyfication” of all things tech will surely lead to ample market growth.
But every company has an Achilles’ heel, even Apple, and it’s only a matter of time before a competitor, or the market, finds it. Maybe it’ll turn out to be the company’s culture and belief system (there’s a fine line between a seamless, liberating experience and the perception of stifling control and proprietary lock-in); we’ll have to wait and see.
2017 was pretty much business as usual for Apple, albeit with a couple of wobbles towards the end of the year. Apple’s stock price was pretty resilient in the wake of the macOS root password security vulnerability and Batterygate saga, but these were self-inflicted wounds, not external market events or competitor actions – these tests have yet to materialise.
This article is part of a series on the challenges facing major technology firms in 2018. For more, please see the main Write Side Up blog page.