In this end-of-the-year edition of the Talking Data podcast, Senior Executive Editor Ed Scannell joined me to speak with Mike Matchett, founder and principal analyst of the Small World Big Data consultancy, as we rambled through some of the signal events of big data in 2018.
Mergers and acquisitions, naturally, tend to be the stepping stones when you look back at the path just traveled. Cloudera and Hortonworks, IBM and Red Hat – these deals set the tone for our end-of-year big data ruminations.
But what rises in importance in our podcasters’ ponderings is not the mergers in their dollar terms but the mergers as they reveal the underlying currents and eddies of telling trends. What surfaces?
*Hadoop-centric big data analytics is morphing into machine learning and deep learning analytics.
*It is not that the shortcomings of Hadoop data processing have been solved, however.
*Rather, the vendors have declared victory, and moved on to the next world to conquer – the more mysterious one of AI, machine learning and statistics safely beyond the layperson’s ken. It’s happened before.
*AI is what you do with big data. The Web and cloud have become irresistible honeypots for said data, and the result is that the balance of power – for data, IT and business – is moving to the cloud.
*A long view would say that it’s taken more than 10 years for cloud computing to become an overnight success, and that assorted aftereffects will play out for some time to come.
In the podcast we talk about the pendulum effect, of which 1990s client/server computing is a ready example. In that case there was a swing away from central IT, which was called “the glass house.” The era saw independent departments within businesses beginning to set their own technology courses.
We see that with cloud today. A pendulum swing has put more technology decision making in the hands of developers within lines of business. They can use credit cards to start projects, and they can get very high-end systems via top cloud providers.
We may look back one day and see things shifting back toward central IT. If so, it will no more resemble today’s central IT than today’s resembles the IT shop of the glass house days.
The recent years have been topsy-turvy – and not just on the big data front. Thanks for hanging with us on the Talking Data podcast. – Jack Vaughan
As some TechTarget reporters were finishing their last podcasts for the year, we sat down briefly and tried to view the longer picture, to look through the glass darkly toward the past.
Now, you are taught not to dwell on history from your first days in this field called journalism; people can buy books if that is what they want.
But a calendar with days rapidly dwindling might lead you to do just that, and best editorial practices be damned.
And the tentative conclusion on some of our parts was that the big mergers of 2018 don’t stack up to those of yore.
That is even though the transactions hit some pretty heady dollar amounts.
They don’t really seem on par with the big mergers of the past 20 years, these jaundiced observers ventured.
IBM buying Lotus, Oracle buying Sun — those were some game changers. They signaled big industry shifts or put bookends on identifiable tech eras. Maybe it was the outsized nature of some of the characters involved.
An ongoing move to cloud computing is behind IBM’s bid for Red Hat and Microsoft’s deal for GitHub, which are the topics discussed in this podcast. The move to cloud now seems predestined, but how it will actually transpire for these noted players will be determined by customers. Stay tuned.
This is part of a series of end-of-the-year podcasts. I joined Senior Executive Editor Ed Scannell and analyst Mike Matchett, principal and founder of the Small World Big Data consultancy, for this SearchDataManagement-hosted look back at 2018.
New apps for cloud have found a home on Azure’s cloud database. What about existing apps? On closer inspection it appears that there is work ahead. At PASS 2018, Craig Stedman encountered signs of progress therein. Kicking off the event was Microsoft’s database group leader Rohan Kumar, who, Stedman reports, discussed managed instances of SQL Server in the cloud that are closer to functional equivalence with down-home SQL Server on premises. In any case, the pace is quick. Check out the latest Talking Data Podcast and related SearchSQLServer coverage for all things PASS. – Jack Vaughan
This podcast considers how likely it is for existing users of Oracle and Microsoft to move to the cloud, as well as what obstacles they may face if they make the leap. Senior Executive Editor Craig Stedman tells us that’s still a work somewhat in progress. And I get a chance to provide a take on Oracle’s comparable moves, harking back to my days at Oracle Open World in October. Download the podcast and learn as we compare notes from our recent travels. Be there when “worlds collide.” – Jack Vaughan
Last month we ventured West to cover Oracle Open World in San Francisco. Now, in a Talking Data Podcast edition recorded live on tape from San Francisco’s Moscone Center, intrepid reporters Jack Vaughan and David Essex discuss what they saw.
Some of it was familiar – as always, Oracle’s Larry Ellison delivered a notable keynote. Some of it was new – Ellison’s discussion was much about cyber trust, impenetrable barriers and the gremlins lurking in the cloud.
Remember when the Web first caught on?
One thing I remember is people saying “yeah, it is pretty cool, but, you know, it is stateless.”
As most of what I heard on this issue was from enterprise software vendors, with all the bias that could entail, I should have taken what I was told with a grain of salt. The first big problem these folks saw with the Web was its statelessness, which made it far different from the synchronously connected clients and servers (at that time, Java servers) they were used to.
The first problem the Web presented, as the enterprise crew saw it, was how to connect the Web to the database, which – no question – was a transactional relational database.
The first response to the problem was CGI, which quickly faded, but which, alas, still shows up in a URL address window every once in a while.
There soon followed JMS message queues, Web Services and SOA, and then REST and AJAX. Then the Web turned into the Cloud, and NoSQL, Kafka, container-based microservices and Kubernetes orchestration hit the beach.
These thoughts were like an elusive butterfly in my mind as I ventured recently by Amtrak to cover the Strata Data Conference at the Jacob Javits Center on the Hudson River in New York City. Kafka certainly did seem to be a common theme in the schematics presenters displayed in the technical sessions, and the Kubernetes sessions were overflowing.
At the event I had a chance to speak with Jay Kreps, CEO and co-founder of Confluent, and one of the creators of the aforementioned Kafka publish-and-subscribe messaging bus. He told me that event processing is gaining a commanding presence on the scene. You could say it’s “what’s happening.”
All along, the general trend toward event-based microservices architectures has overlapped with new distributed data architectures, which are edging toward becoming mainstream competitors to traditional enterprise data warehouses.
You could think of Kubernetes as the other key part of the middleware replacement story that is going on today, he said. At the same time, he added, “Nothing in the enterprise goes away overnight.”
There is more from my conversation with Kreps in the latest episode of the Talking Data Podcast, including a conversation on the new architecture with expert analyst Mike Matchett of Small World Big Data and the ruminations of the author as the train to New York drew near Penn Station for Strata. – Jack Vaughan
In this episode of the Talking Data Podcast we are joined by Nicole Laskowski, senior news writer for SearchCIO.com. She tells us about a podcast series she and her colleagues have created known as Schooled in AI. This series looks at cutting-edge AI research being done at Carnegie Mellon University in order to give IT leaders in businesses a clear view on where things are headed. This discussion is preceded by Laskowski’s comments on a recent O’Reilly survey that appears to show – in what may be a surprise to many – that business folk are quite aware of the bias that AI can bring out as it reaches data-driven conclusions. Listen to this Talking Data episode, and be sure to visit Schooled in AI as well. – Jack Vaughan
For this episode of the Talking Data podcast, Mark Labbe takes a look at the MIT Startup Exchange. This program gives members of the MIT community a chance to show their wares, and as you may have guessed, those wares these days have a lot to do with AI and machine learning. Among the underlying trends, Labbe tells us, are natural language processing and geo-location. Labbe came across the MIT startup activity as part of his coverage of the recent Forrester AI Forum in Boston. Also discussed: the work of an IBM machine learning system versed in debate strategy and tactics. Listen to Talking Data and get a feel for where AI may be headed.
At times the era of big data has taken on the flavor of the old West – the kind depicted in a movie like The Treasure of the Sierra Madre. While it was seldom an outright confrontation, there’s little question that conscientious data stewards were usurped in some organizations by developers who slightly resembled freewheeling bandits, such as those who didn’t “need no badges” as they went about their business in John Huston’s film.
We are a few years into this, and now there are signs that a bit of taming is going on in the Hadoop ecosphere. The recent approaches for bringing Hadoop-style data processing into wider production seem to bespeak a change. The General Data Protection Regulation (GDPR) is in some part a driver of that change.
At last month’s DataWorks Summit in San Jose, California we spoke with Constellation Research analyst Doug Henschen, who agreed the shift of enterprises to include large-scale open-source distributed data processing in their analytics arsenal is now tempered by increased interest in data governance.
“What you see is companies re-platforming – that’s the buzz,” Henschen said in this episode of the Talking Data podcast. “Companies understand that they need a sort of next-generation information architecture.”
We have seen attempts before to add tooling to data lakes to tag and curate data, but now the push may be more fevered, and GDPR may be the impetus.
“The push for GDPR has gotten people thinking more and more about the governance aspects of that,” Henschen said. “As they are re-platforming they have an increased eye toward data governance, data lineage, access control, security — all of these good things that we have long required but haven’t necessarily nailed.”
Catch up with the big data doings in this edition of Talking Data. – Jack Vaughan
Real estate listing firm Trulia is on the cutting edge of applying computer vision. In this edition of the Talking Data podcast, we talk with the company’s vice president of engineering, Deep Varma, to learn more about how his team is applying computer vision.
As you’d expect, Trulia’s computer vision processes are built around deep learning algorithms. These are the machine learning models powering most of today’s most advanced AI applications. And while engineers have made significant advances in the functionality of these models, impactful business applications have been slower to come around.
Trulia, however, has found an interesting way to implement image recognition deep learning models to build computer vision applications that are capable of identifying and describing specific objects in images. This gives prospective home buyers an idea of what they should expect from a listing before clicking on it and enables Trulia’s recommendation engines to surface more relevant results.
Listen to the podcast to learn more about how the company is deploying deep learning models in the area of computer vision.