New cloud applications have found a home on Azure’s cloud database. What about existing apps? On closer inspection, it appears that there is work ahead. At PASS 2018, Craig Stedman encountered signs of progress on that front. Kicking off the event was Microsoft’s database group leader Rohan Kumar who, Stedman reports, discussed managed instances of SQL Server in the cloud that come closer to functional equivalence with SQL Server on premises. In any case, the pace is quick. Check out the latest Talking Data Podcast and related SearchSQLServer coverage for all things PASS. – Jack Vaughan
This podcast considers how likely it is for existing users of Oracle and Microsoft to move to the cloud, as well as what obstacles they may face if they make the leap. Senior Executive Editor Craig Stedman tells us that’s still somewhat a work in progress. And I get a chance to provide a take on Oracle’s comparable moves, harking back to my days at Oracle Open World in October. Download the podcast and learn as we compare notes from our recent travels. Be there when “worlds collide.” – Jack Vaughan
Last month we ventured West to cover Oracle Open World in San Francisco. Now, in a Talking Data Podcast edition recorded live on tape from San Francisco’s Moscone Center, intrepid reporters Jack Vaughan and David Essex discuss what they saw.
Some of it was familiar – as always, Oracle’s Larry Ellison delivered a notable keynote. Some of it was new – Ellison’s discussion was much about cyber trust, impenetrable barriers and the gremlins lurking in the cloud.
Remember when the Web first caught on?
One thing I remember is people saying “yeah, it is pretty cool, but, you know, it is stateless.”
As most of what I heard on this issue was from enterprise software vendors, with all the bias that could entail, I should have taken what I was told with a grain of salt. The first big problem these folks saw with the Web was its statelessness, which made it far different from the synchronously connected clients and servers (at that time, Java servers) they were used to.
The first problem the Web presented, as the enterprise crew saw it, was to connect the Web to the database, which – no question – was a transactional relational database.
The first response to the problem was CGI, which quickly faded, but which, alas, still shows up in a URL address window every once in a while.
There soon followed JMS message queues, Web Services and SOA, and then REST and AJAX. Then the Web turned into the Cloud, and NoSQL, Kafka, container-based microservices and Kubernetes orchestration hit the beach.
These thoughts were like an elusive butterfly in my mind as I ventured recently by Amtrak to cover the Strata Data Conference at the Jacob Javits Center on the Hudson River in New York City. Kafka certainly did seem to be a common theme in the schematics presenters displayed in the technical sessions, and the Kubernetes sessions were overflowing.
At the event I had a chance to speak with Jay Kreps, CEO and co-founder of Confluent, and one of the creators of the aforementioned Kafka publish-and-subscribe messaging bus. He told me that event processing is gaining a commanding presence on the scene. You could say it’s “what’s happening.”
All along, the general trend toward event-based microservices architectures has overlapped with new distributed data architectures, which are edging toward becoming mainstream competitors to traditional enterprise data warehouses.
You could think of Kubernetes as the other key part of the middleware replacement story that is going on today, he said. At the same time, he added, “Nothing in the enterprise goes away overnight.”
There is more from my conversation with Kreps in the latest episode of the Talking Data Podcast, including a conversation on the new architecture with expert analyst Mike Matchett of Small World Big Data and the ruminations of the author as the train to New York drew near Penn Station for Strata. – Jack Vaughan
In this episode of the Talking Data Podcast we are joined by Nicole Laskowski, senior news writer for SearchCIO.com. She tells us about a podcast series she and her colleagues have created, known as Schooled in AI. This series looks at cutting-edge AI research being done at Carnegie Mellon University in order to give IT leaders in businesses a clear view of where things are headed. This discussion is preceded by Laskowski’s comments on a recent O’Reilly survey that appears to show, in what may be a surprise to many, that business folk are quite aware of the bias that AI can bring out as it reaches data-driven conclusions. Listen to this Talking Data episode, and be sure to visit Schooled in AI as well. – Jack Vaughan
For this episode of the Talking Data podcast, Mark Labbe takes a look at the MIT Startup Exchange. This program gives members of the MIT community a chance to show their wares, and as you may have guessed, those wares these days have a lot to do with AI and machine learning. Among the underlying trends, Labbe tells us, are natural language processing and geo-location. Labbe came across the MIT startup activity as part of his coverage of the recent Forrester AI Forum in Boston. Also discussed: the work of an IBM machine learning system versed in debate strategy and tactics. Listen to Talking Data and get a feel for where AI may be headed.
At times the era of big data has taken on the flavor of the old West – the kind depicted in a movie like The Treasure of the Sierra Madre. While it was seldom an outright confrontation, there’s little question that conscientious data stewards were usurped in some organizations by developers who slightly resembled the freewheeling bandits who didn’t “need no badges” as they went about their business in John Huston’s film.
We are a few years into this, and now there are signs that a bit of taming is going on in the Hadoop ecosphere. The recent approaches for bringing Hadoop-style data processing into wider production seem to bespeak a change. The General Data Protection Regulation (GDPR) is in some part a driver of that change.
At last month’s DataWorks Summit in San Jose, California we spoke with Constellation Research analyst Doug Henschen, who agreed the shift of enterprises to include large-scale open-source distributed data processing in their analytics arsenal is now tempered by increased interest in data governance.
“What you see is companies re-platforming – that’s the buzz,” Henschen said in this episode of the Talking Data podcast. “Companies understand that they need a sort of next-generation information architecture.”
We have seen attempts before to add tooling to data lakes to tag and curate data, but now the push may be more fevered, and GDPR may be the impetus.
“The push for GDPR has gotten people thinking more and more about the governance aspects of that,” Henschen said. “As they are re-platforming they have an increased eye toward data governance, data lineage, access control, security — all of these good things that we have long required but haven’t necessarily nailed.”
Catch up with the big data doings in this edition of Talking Data. – Jack Vaughan
Real estate listing firm Trulia is on the cutting edge of applying computer vision. In this edition of the Talking Data podcast, we talk with the company’s vice president of engineering, Deep Varma, to learn more about how his team is applying computer vision.
As you’d expect, Trulia’s computer vision processes are built around deep learning algorithms. These are the machine learning models powering most of today’s most advanced AI applications. And while engineers have made significant advances in the functionality of these models, impactful business applications have been slower to come around.
Trulia, however, has found an interesting way to implement image recognition deep learning models to build computer vision applications that are capable of identifying and describing specific objects in images. This gives prospective home buyers an idea of what they should expect from a listing before clicking on it and enables Trulia’s recommendation engines to surface more relevant results.
Listen to the podcast to learn more about how the company is deploying deep learning models in the area of computer vision.
The trend that sees the SQL query engine appearing on Hadoop is just the start of a movement; the SQL query engine running on data other than HDFS may follow. If these trends portend fitful change for users, they also affect vendors.
One vendor’s journey here is particularly telling. Starburst Data might be called a ‘re-start-up.’ The company was the brainchild of some young data technicians who included Daniel Abadi, an academic researcher who helped advance the notion of column-store parallel databases in the early 2000s. In 2011, he helped form Hadapt, one of the first SQL-on-Hadoop providers.
In 2014, Hadapt was purchased by Teradata. The timing proved a bit odd, as it nearly coincided with Facebook ceding much development responsibility to Teradata for Presto, a SQL-on-Hadoop tool that the social media giant had forged in-house, and which has subsequently been endorsed by no less than Amazon for its Athena SQL engine. The former Hadapt group within Teradata shifted its efforts to improving performance for a Presto-compatible SQL query engine.
At the end of 2017, Hadapt principals within Teradata spun out to form Starburst, with Teradata’s blessing. A Starburst goal is to bring SQL engine prowess to SMBs that are still outliers in Teradata’s more familiar big player universe. An early effort for standalone Starburst has been a Cost Based Optimizer for Presto, built in collaboration with Facebook technicians. For the many lovers of SQL joins, the new optimizer supports Join Reordering and Join Distribution Choice.
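For readers who want a feel for what those two optimizer features mean, here is a toy Python sketch of cost-based join reordering and join distribution choice. It is not Presto’s or Starburst’s implementation: the table names, row counts, selectivity figure, broadcast threshold and cost formula are all invented for illustration.

```python
from itertools import permutations

# Hypothetical row-count statistics; a real optimizer reads these from table metadata.
ROW_COUNTS = {"orders": 1_000_000, "customers": 50_000, "nations": 25}

# Assumed fraction of row pairs that survive each join (illustrative only).
SELECTIVITY = 0.00001

# Assumed cutoff at or below which the smaller side is copied to every worker.
BROADCAST_LIMIT = 100_000


def plan_cost(order):
    """Sum the estimated intermediate result sizes of a left-deep join plan."""
    rows = ROW_COUNTS[order[0]]
    cost = 0
    for table in order[1:]:
        rows = rows * ROW_COUNTS[table] * SELECTIVITY
        cost += rows
    return cost


def best_join_order(tables):
    """Join reordering: try every order and keep the cheapest estimate."""
    return min(permutations(tables), key=plan_cost)


def distribution_choice(build_rows):
    """Join distribution choice: broadcast small build sides, else repartition."""
    return "BROADCAST" if build_rows <= BROADCAST_LIMIT else "PARTITIONED"
```

Under these made-up numbers, `best_join_order(["orders", "customers", "nations"])` joins the two small tables first and leaves the large `orders` table for last, while `distribution_choice(ROW_COUNTS["nations"])` picks a broadcast join. A production optimizer weighs far more than row counts, but the basic trade-off is the same.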
The picture emerging shows differences in use cases between plain vanilla Hadoop and SQL on Hadoop: Hadoop fit for the purposes of small data science groups and skunk works on one side, and Hadoop useful for the interactive needs of wider groups of SQL business analytics users on the other. We are also seeing HDFS, the file system at the base of Hadoop, giving way as more people choose to pursue these types of applications on the cloud rather than on the premises.
Listen to the latest Talking Data podcast, which features Starburst Data CEO Justin Borgman. We left a noisy restaurant to record the interview, and found a noisy Boston waterfront, with massively loud construction in full throat. Enjoy! – Jack Vaughan
It’s been said Oracle leader Larry Ellison advises his troops to focus on one competitor at a time, and in recent years that has been Amazon. What started out as an online book store eventually morphed into a general mega-store, and then, surprisingly, a mega-IT-outsourcer. In many ways it created the cloud computing formula.
Like other leading lights of enterprise computing, Oracle is in the midst of efforts to shift focus from customers’ on-premises data centers to its own cloud computing centers, and to keep those customers in the Oracle camp. Oracle’s counter thrusts to Amazon are one of the defining aspects of technology today. But it is a balancing act.
The Collaborate 2018 conference at Mandalay Resort and Casino in Las Vegas would seem an apt place to take measure of Oracle’s progress toward cloud. Recorded as the event began, this remote edition of the Talking Data podcast sorts through challenges the Redwood City, Calif.-based IT giant faces on the road to cloud.
Underlying its cloud efforts are moves in both databases and applications. Those are key columns of Collaborate, which brings together IOUG Oracle database users, OAUG Oracle eBusiness applications users and Quest JD Edwards/PeopleSoft applications users.
These database and application suites are well entrenched in organizations, usually in very large enterprises. Moving these into cloud is a multiyear project in most cases. While complex enterprise applications stay home, new applications are driving to the cloud.
A state of steady cloud movement — but less than a startling shift — seemed to be borne out at Collaborate 2018.
At the event, we asked: “How is the Oracle database and applications cloud migration going?”
“We are not seeing enough large scale movement to really tell. It’s really just one-off stuff,” Stephen Kost, CTO at Integrigy, a security software and services provider, told this reporter at the conference. “People are moving to small web applications.”
In Kost’s view, much of Oracle’s strength is in large companies that may have 1,000 databases, but he has seen that, in many cases, only a handful of those have been moved to the cloud so far.
Up the Las Vegas Strip from Collaborate that same week, perhaps not so coincidentally, another Oracle-related conference took place. NetSuite SuiteWorld 2018 was built around the cloud ERP offerings that became part of the Oracle portfolio via acquisition in 2016. As at Collaborate, much of the discussion was around embedding AI into applications.
Oracle’s purchase of NetSuite was a tacit admission that “cloud is different” and that it needed a wholly separate product line to attract small- and medium-size business customers to its applications.
It was also an admission that it saw cloud migration as a multi-year effort that needed to be addressed from several directions. In a phone call after SuiteWorld, Holger Mueller of Constellation Research told us Oracle has avoided the temptation to roll NetSuite together with its incumbent applications suites. At the same time, he said, it has been expanding NetSuite globally, and injecting elements of its AI and machine learning research and development.
That is also what Oracle has begun to do with the e-Business Suite, JD Edwards, and PeopleSoft portfolio. Still, for now, Oracle’s cloud application migration might be described as a delicate balancing act within a delicate balancing act. – Jack Vaughan