Cloud-native infrastructure company Fairwinds recently launched a SaaS product that lets DevOps teams manage multiple Kubernetes clusters.
The almost-eponymously named Fairwinds Insights uses an extensible architecture and has been launched with a curated set of open source security, reliability and auditing tools.
The initial suite of tools includes Fairwinds Polaris, Fairwinds Goldilocks and Aqua Security’s Kube-hunter.
Fairwinds Insights claims to be able to solve a few common problems faced by DevOps teams.
First, it eliminates the time-intensive process of researching, learning and deploying the Kubernetes auditing tools that are available.
Second, it automatically organises and normalises data from each tool, so engineers get prioritised recommendations across all clusters.
Finally, it enables DevOps teams to proactively manage the hand-off from development to production.
NOTE: For the record, normalised data can be defined as relational database data that has been structured in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. By other definitions, data normalisation ensures all of your data looks and reads the same way across all records in a given database (typically a relational one).
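As an illustration of what that normalisation step might look like in practice, here is a minimal Python sketch that maps findings from two differently-shaped tool outputs into one common record format and sorts them by severity. The record shapes, field names and severity scales below are assumptions for illustration, not Fairwinds' actual schemas.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    cluster: str
    source: str    # which audit tool reported it
    severity: int  # 1 = low, 3 = high
    message: str

def normalise_polaris(raw, cluster):
    # Hypothetical Polaris-style record: {"check": ..., "level": "warning"|"danger"}
    level = {"warning": 2, "danger": 3}.get(raw["level"], 1)
    return Finding(cluster, "polaris", level, raw["check"])

def normalise_kube_hunter(raw, cluster):
    # Hypothetical kube-hunter-style record: {"vulnerability": ..., "severity": "low"|"medium"|"high"}
    level = {"low": 1, "medium": 2, "high": 3}[raw["severity"]]
    return Finding(cluster, "kube-hunter", level, raw["vulnerability"])

def prioritise(findings):
    # Highest severity first, across all clusters and all tools
    return sorted(findings, key=lambda f: -f.severity)

findings = prioritise([
    normalise_polaris({"check": "cpuLimitsMissing", "level": "warning"}, "prod"),
    normalise_kube_hunter({"vulnerability": "Exposed dashboard", "severity": "high"}, "staging"),
])
print([f.message for f in findings])
```

Once every tool's output lands in the same record shape, ranking recommendations across clusters becomes a single sort rather than per-tool bookkeeping.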
The platform can integrate into deployment pipelines so misconfigurations can be identified and fixed before releasing to production.
“Many DevOps teams have sprawling Kubernetes environments and want to get a handle on it, but with lack of resources and expertise, it’s not a priority. Fairwinds Insights is the first platform that solves this problem by leveraging community-built open source tooling and operationalising it in a way DevOps teams can use at scale,” said Joe Pelletier, Fairwinds’ VP of strategy.
Fairwinds Insights is in public beta and free for any early adopter who wants to try the software during the beta period. The free tier, located at fairwinds.com/insights, is limited to a seven-day history for results and up to two clusters.
The Linux Foundation’s promotion and hosting of Delta Lake is an interesting development.
Delta Lake (wait for it… the clue is in the name) is a project focusing on improving the reliability and performance of data lakes.
Delta Lake was actually announced by unified analytics company Databricks earlier this year before this autumn becoming a Linux Foundation project with an open governance model.
The team points out that organisations in every vertical aspire to get more value from data through data science, machine learning and analytics, but they are hindered by the lack of data reliability within data lakes.
Delta Lake addresses data reliability challenges by making transactions ACID compliant, enabling concurrent reads and writes.
NOTE: ACID compliance describes database transactions that guarantee atomicity, consistency, isolation and durability — MariaDB provides a nice fully-fledged definition here if you want to read more.
The schema enforcement capability in Delta Lake is said to help ensure that the data lake is free of corrupt and non-conformant data.
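Schema enforcement is easier to picture with a small example. The sketch below shows the general schema-on-write idea using hypothetical field names: conformant records pass through, while records with missing fields or wrong types are rejected before they ever land in the table.

```python
# Hypothetical expected schema for a table of events
EXPECTED_SCHEMA = {"id": int, "event": str, "value": float}

def validate(record, schema=EXPECTED_SCHEMA):
    """Schema-on-write check: reject records whose fields or types
    do not match the table's declared schema."""
    if set(record) != set(schema):
        raise ValueError(f"field mismatch: {sorted(record)}")
    for field, expected in schema.items():
        if not isinstance(record[field], expected):
            raise ValueError(f"{field}: expected {expected.__name__}")
    return record

validate({"id": 1, "event": "click", "value": 0.5})  # conformant: accepted
try:
    validate({"id": "1", "event": "click", "value": 0.5})  # id has wrong type
except ValueError as err:
    print("rejected:", err)
```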
“Bringing Delta Lake under the neutral home of the Linux Foundation will help the open source community dependent on the project develop the technology addressing how big data is stored and processed, both on-prem and in the cloud,” said Michael Dolan, VP of strategic programs at the Linux Foundation.
“Alibaba has been a leader, contributor, consumer and supporter for various open source initiatives, especially in the big data and AI area. We have been working with Databricks on a native Hive connector for Delta Lake on the open source front and we are thrilled to see the project joining the Linux Foundation. We will continue to foster and contribute to the open source community,” said Yangqing Jia, VP of big data & AI at Alibaba.
As noted above, Delta Lake will have an open governance model that encourages participation and technical contribution and will provide a framework for long-term stewardship by an ecosystem invested in Delta Lake.
Open source security and license compliance management company WhiteSource has brought dependency update company Renovate into its stable.
All of Renovate’s current commercial offerings will now be available for free under its new name, WhiteSource Renovate.
Renovate founder Rhys Arkins explains that Renovate was developed because running user-facing applications with outdated dependencies is not a serious option for software projects – or at least it shouldn't be.
As we know, using outdated dependencies increases the likelihood of unfixed bugs and increases the quantity and impact of security vulnerabilities within software applications.
WhiteSource will continue to drive the Renovate open source project, which to date has received over 5,000 commits from more than 150 contributors.
Further, WhiteSource will now offer the existing paid offerings for free: a GitHub app, a GitLab app and a self-hosted solution — all under the WhiteSource Renovate umbrella.
“Dependency visibility and currency are essential ingredients for mature software organisations and an important complement to vulnerability and license management. We’re proud that a tool for updating dependencies is itself open source and will ensure the project continues to extend its leadership in multi-platform and language support,” said Rami Sass, CEO of WhiteSource.
WhiteSource Renovate will be integrated into the WhiteSource product portfolio, which includes WhiteSource Core and WhiteSource for Developers.
Software integration and analytics company Tibco has added Apache Pulsar as a fully supported component in its own messaging brand, TIBCO Messaging.
By way of definition and clarification then…
Apache Pulsar is a distributed ‘pub-sub’ messaging platform with a flexible messaging model and an intuitive client API.
Pub-sub (publish/subscribe) messaging is a form of asynchronous service-to-service communication used in serverless and microservices environments.
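The pattern is simple to sketch: publishers and subscribers never reference each other directly; they share only a topic name. The minimal, purely illustrative Python version below shows that decoupling (real brokers such as Pulsar add persistence, acknowledgements and asynchronous delivery on top).

```python
from collections import defaultdict

class Broker:
    """Minimal publish/subscribe sketch: publishers and subscribers
    are decoupled and only share a named topic."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # A real broker fans out asynchronously; this sketch is synchronous.
        for callback in self._subscribers[topic]:
            callback(message)

broker = Broker()
received = []
broker.subscribe("orders", received.append)                  # subscriber one
broker.subscribe("orders", lambda m: received.append(m.upper()))  # subscriber two
broker.publish("orders", "order-42 created")                 # publisher knows only the topic
print(received)
```

The publisher has no idea how many subscribers exist, which is precisely what makes the pattern a fit for loosely coupled microservices.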
Tibco positions this as a) a commitment to open source technologies, obviously… but also b) a means of making sure that users of (the undeniably quite popular) Apache pub-sub messaging system can now use Tibco Messaging.
The suggestion here is that developers will be able to create a fully integrated application integration infrastructure with the freedom to choose the right messaging tool for the job at hand.
Streaming & Messaging
Here’s the core technology proposition: users can achieve connectivity from a data distribution solution that provides the support of a streaming and messaging infrastructure — and this, therefore, allows the creation of software that spans streaming, event processing, data analytics and AI/ML.
“Our support of Apache Pulsar gives customers the freedom of choice when navigating the need for a solution to assist with the real-time processing of high volumes of data for the most demanding enterprise use cases,” said Denny Page, chief engineer and senior vice president, Tibco.
Apache Pulsar enables lightweight compute logic using APIs, without needing to run a stream processing engine. It offers native support for streaming and event processing in a single package. This ensures horizontal scalability with low latency, allowing for flexible solutions for streaming.
Further, it provides native support for geo-replication and multi-tenancy without requiring add-on components to manage.
Users are free to choose from multiple messaging and streaming options and can work with a single vendor that delivers all their messaging needs, including fully distributed, high-performance, peer-to-peer messaging; certified JMS messaging; and open source, broker-based messaging including Apache Kafka®, Apache Pulsar, and Eclipse Mosquitto.
Autumn (or Fall, depending on your level of Americanization) was a busy period… so busy in fact that the Computer Weekly Open Source Insider blog saw a number of milestone advancements go whizzing past.
Among those news items we’re catching up on as we approach the Christmas silly season is the latest update from Canonical on Ubuntu.
Canonical is positioning Ubuntu as (in its view) an operating system (OS) of choice for 'most' (it was careful not to say all) public cloud workloads, as well as the emerging categories of 'smart gateways', self-driving cars and advanced robots.
NOTE: NXP defines a smart gateway as an appliance that bridges a Wide Area Network (WAN/cloud) connection to a Local Area Network (LAN), usually via Wi-Fi and/or Ethernet in a user's home or on company premises.
With the Ubuntu 19.10 release, Canonical says that it has increased its focus on accelerating developer productivity in AI/ML, bringing forward new edge capabilities for MicroK8s and delivering its fastest GNOME desktop performance yet.
NOTE: MicroK8s is a CNCF certified upstream Kubernetes deployment that runs entirely on a workstation or edge device — being a ‘snap’ (a Canonical application packaging & delivery mechanism) it runs all Kubernetes services natively (i.e. no virtual machines) while packing the entire set of libraries and binaries needed.
Canonical CEO Mark Shuttleworth says that Ubuntu 19.10 brings enhanced edge computing capabilities with the addition of strict confinement to MicroK8s.
Strict confinement ensures complete isolation and a tightly secured production-grade Kubernetes environment, all in a small footprint ideal for edge gateways. MicroK8s add-ons – including Istio, Knative, CoreDNS, Prometheus, and Jaeger – can now be deployed securely at the edge with a single command.
The Raspberry Pi 4 Model B is supported by Ubuntu 19.10. The latest board from the Raspberry Pi Foundation offers a faster system-on-a-chip with a processor that uses the Cortex-A72 architecture (quad-core 64-bit ARMv8 at 1.5GHz) and offers up to 4GB of RAM.
Additionally here, Ubuntu 19.10 ships with the Train release of Charmed OpenStack – the 20th OpenStack release, backed by the Nautilus release of Ceph.
Shuttleworth and team insist that this marks Canonical’s long-term commitment to open infrastructure and improving the cost of cloud operations. Train provides live migration extensions to aid telcos in their infrastructure operations. Live migration allows users to move their machines from one hypervisor to another without shutting down the operating system of the machine.
Finally here, Canonical says it has thought about users running Ubuntu on older hardware — which, arguably, is contentious ground for some, as open source purists will want to position an open OS as 'more than just something you stick on an old Windows machine to bring it to life'. With GNOME 3.34, Ubuntu 19.10 is the fastest release yet, with significant performance improvements delivering what the company has called a more responsive and smooth experience, even on older hardware.
Ali Baba, the character, is easily distinguished from the other Alibaba by virtue of a) its different spelling and b) the fact that the 'other' Alibaba is a Chinese multinational conglomerate holding company specializing in e-commerce, retail, Internet and other technologies.
The only perceivable connection between Ali Baba and Alibaba is that they both like to say ‘open’ — with the former opting for sesame… and the latter opting for source.
Alibaba Cloud (the company division, not the mythical Arabian character) is the data intelligence segment of Alibaba Group.
Open source sesame
In a flourish of open source sesame-ness, Alibaba Cloud has announced that the core code of Alink, its self-developed algorithm platform, has been made available via open source on GitHub.
The platform offers a range of algorithm libraries that support both batch and stream processing, both of which are arguably pretty critical for Machine Learning (ML) and tasks such as online product recommendation and intelligent customer services.
Data analysts and software developers can access the code on GitHub here to build their own software, facilitating tasks such as statistical analysis, machine learning, real-time prediction, personalised recommendation and anomaly detection.
“As a platform that consists of various algorithms combining learning in various data processing patterns, Alink can be a valuable option for developers looking for robust big data and advanced machine learning tools,” said Yangqing Jia, president and senior fellow of data platform at Alibaba Cloud Intelligence.
Jia claims that Alibaba is one of the 'top ten contributors' to GitHub. Alibaba has gained over 690,000 stars, with about 20,000 contributors on GitHub.
Alink was developed based on Flink, a unified distributed computing engine. Based on Flink, Alink has provided what is said to be ‘seamless unification’ of batch and stream processing, offering a platform for developers to perform data analytics and machine learning tasks.
This technology supports Alibaba’s proprietary data storage and also other open source data storage, such as Kafka, HDFS and HBase.
The Computer Weekly Open Source Insider team speaks to Todd M Moore in his role as IBM VP of 'opentech' & developer advocacy and CTO for developer ecosystems, following the Open Source Summit Europe in Lyon.
Moore and his team of open source developers work with open source communities such as the Apache Software Foundation, Linux Foundation, Eclipse, OSGi, OpenStack, Cloud Foundry, Docker, JS, Node.js and more.
He currently serves as chairperson of both the OpenJS Foundation board of directors and the CNCF Governing Board.
Computer Weekly: What did you cover during your keynote?
Moore: My keynote in Lyon focused on topical areas in AI. There is so much to be done to both build trust in AI solutions and to secure them. Working in the LFAI organisation, we see the perfect opportunity to bring together the major participants to work on these issues for the good of the community at large.
In AI 'explainability' alone, there will be an explosion of algorithms and statistical analysis necessary to build a solid base. We have only scratched the surface so far. I also touched on the trends in open source AI projects, the projects to watch and the role of data governance.
Computer Weekly: Why is this topic so relevant right now?
Moore: The adoption of AI technology is a worldwide phenomenon. Studies show that the adoption of open technologies by industry participants goes hand in hand with becoming industry leaders. This is also the case with AI and it has reached the point of touching our every day lives through machine learning, image recognition, translation, speech recognition, autonomous driving, assisted decision making, etc.
This change is driven by the availability of data and substantially improved access to computational processing power. Both classical computers and those with 'GPU assists' are now available to substantially reduce model build, debug and tuning times. Software technologies to help developers and data scientists in the end-to-end development and lifecycle of models are now appearing with some strong options forming in open source. No industry or government office will go untouched by the technologies that are in development.
Computer Weekly: What is your perspective on the growth and maturity of open source software — and, how can we sustain projects and developers for decades to come?
Moore: As I have said, many options are starting to become available to developers in this area. Open source has become the route to de facto standardisation, paving the way towards rapid marketplace adoption. It preserves freedom of action for clients seeking to prevent vendor lock-in and it opens the door to rapid development and marketplace growth. Products today are based in open source, and gone are the days when a single developer or vendor can out-innovate the rest of the world by themselves.
Open source yields great software that developers can depend on. Look at the rise of containers and the rapid adoption of Linux or Kubernetes as cases in point of what happens when the world comes behind a technology.
Sustainability comes from mass adoption and the willingness of developers to commit themselves to a project. We have proven that widespread adoption fuels continuing interest and that corporations will commit resources to develop and maintain a strategic code base for decades. We need to protect against developer burnout and constantly be looking to aid in the tasks that are not glamorous such as documentation, CI/CD, code reviews etc.
Computer Weekly: What is lighting you up right now? What has your attention and is making you excited about your work?
The Irish county town of Kilkenny is known for its medieval buildings and castle, its rich history of brewing, its distinctive black marble and as the home of White House architect James Hoban.
This year's event saw NearForm Research and Espruino surprise delegates by giving out something better than plain old lanyards and name tags — the two companies came together to offer an arguably rather more exciting Machine Learning (ML)-driven smartwatch to act as attendees' conference badges.
Developers will be able to create their own AI applications for the Bangle.js device.
It comes pre-loaded with features and apps including GPS, a compass, a heart rate monitor, maps, games and gesture control of PC applications over Bluetooth.
“Bangle.js is not just about a single device, codebase or company. I believe it has the potential to bootstrap a community-driven open health platform where anyone can build or use any compatible device and everyone owns their own data. Machine Learning is a critical aspect of health technology and we’re so pleased to be further involved in the TensorFlow open source project,” said Conor O’Neill, chief product officer for NearForm.
County Waterford headquartered NearForm is known for its professional technology consultancy work with both local Irish and international companies spanning a range of industries. “Everything we do emanates from open source,” insists the company.
This first Bangle.js device can also be easily disassembled with just a screwdriver for ease of fixing and replacing its parts.
The teams also ported the Micro version of Google’s TensorFlow Lite to the watch to give it Machine Learning capabilities with input from Google’s TensorFlow community. They then designed an ML gesture detection algorithm which is built into every watch and enables the user to control applications, including PowerPoint, with hand gestures.
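The companies have not published the algorithm's internals, but the general shape of this kind of gesture detection can be sketched as a nearest-centroid classifier over windows of accelerometer readings. The training data and feature layout below are invented purely for illustration; this is not NearForm's algorithm, merely the textbook version of the idea.

```python
import math

# Toy training set: each gesture is a list of (hypothetical) feature
# windows derived from accelerometer readings.
TRAINING = {
    "swipe": [[0.1, 0.9, 0.0], [0.2, 0.8, 0.1]],
    "shake": [[0.9, 0.1, 0.8], [0.8, 0.2, 0.9]],
}

def centroid(windows):
    # Element-wise mean of all training windows for one gesture
    n = len(windows)
    return [sum(w[i] for w in windows) / n for i in range(len(windows[0]))]

CENTROIDS = {name: centroid(ws) for name, ws in TRAINING.items()}

def classify(window):
    # Pick the gesture whose centroid is nearest in Euclidean distance
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(CENTROIDS, key=lambda name: dist(window, CENTROIDS[name]))

print(classify([0.15, 0.85, 0.05]))  # nearest the "swipe" centroid
```

A production version running under TensorFlow Lite Micro would typically replace the centroids with a small trained neural network, but the input-window-to-label pipeline is the same.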
The companies explain that even 'lapsed' programmers and non-programmers can interact with Bangle.js using Blockly or low-code Node-RED.
Tibco is focused on open source and Agile this month.
The integration and analytics specialist has upped the toolset in a group of its products with a key focus on Agile agility for cloud-native deployments.
The company says it is putting AI inside (who isn't?) its enhancements to the TIBCO Connected Intelligence platform.
Matt Quinn, chief operating officer at Tibco, says that his firm's vision is that customers should use Tibco as their 'data foundation'.
In terms of cloud-native, Tibco’s API management software TIBCO Cloud Mashery is available in cloud-native deployments in public clouds, private clouds and on-premises. The company’s Mashery Local Developer Portal is now also available as a fully cloud-native deployment.
Quinn says that IT teams are faced with the increasing complexity of metadata governance — and the firm’s Cloud Metadata tool runs Tibco EBX master data management to address this.
NOTE: Metadata governance is used most often in relation to digital media, but older forms of metadata are catalogues, dictionaries and taxonomies.
Extra open source sauce
The company also continues to develop capabilities to support open source and is weaving more open offerings into its product mix.
The introduction of Tibco Messaging Manager 1.0.0, including an Apache Kafka Management Toolkit, provides a predictive and auto-completing command-line interface (CLI), which aims to simplify the setup and management of Apache Kafka. As readers will know, Kafka is used for building real-time data pipelines and high-throughput, low-latency distributed streaming applications.

Tibco Messaging components feature a common management plugin, use a common interface and allow for easier continuous integration and deployment. Tibco Messaging Manager extends the company's support for Apache Kafka and enables the Tibco Connected Intelligence Cloud platform to take advantage of Kafka for integration, event processing and real-time messaging with historical context.
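Kafka's core abstraction, the partitioned append-only log with consumer-tracked offsets, can be sketched in a few lines of Python. This is a toy single-partition version for illustration, not the real client API.

```python
class Log:
    """Toy append-only log in the style of a single Kafka partition:
    producers append, each record gets a monotonically increasing
    offset, and each consumer tracks its own read position."""
    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # the record's offset

    def read(self, offset, max_records=10):
        # Consumers pull from a position; the broker deletes nothing here
        return self._records[offset:offset + max_records]

log = Log()
for event in ["page_view", "click", "purchase"]:
    log.append(event)

offset = 0                 # a consumer's committed position
batch = log.read(offset)   # pull a batch of records
offset += len(batch)       # commit the new position after processing
print(batch, offset)
```

Because consumers own their offsets, many independent consumers can replay the same stream at their own pace, which is what makes the model suit both real-time pipelines and historical reprocessing.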
“In addition, Tibco now offers support for IoT-based machine-to-machine communication via OPC Foundation Unified Architecture in Tibco Streaming software. In support of open-source Project Flogo, Tibco announces the Project Flogo Streaming User Interface. Integrating with Tibco’s existing solutions, the Project Flogo Streaming User Interface lets developers build resource-efficient, smarter real-time streaming processing apps at the edge or in the cloud, improving the productivity of expert IT resources,” noted the company, in a press statement.
Also here, Tibco's AutoML extension for its Data Science software, via Tibco LABS, facilitates the development and selection of AI workflows. In addition, new Process Mining capabilities via Tibco LABS enable users to discover, improve and predict process behaviour from data event logs produced by operational systems.
Lastly, to further strengthen Tibco's contribution to the open-source community, the company says it has introduced an open source specification in the shape of CatalystML to capture data transformations and consume machine-learning artifacts in real time for high-throughput applications.
DataStax offers a commercially supported ‘enterprise-robust’ database built on open source Apache Cassandra.
As such, DataStax has told Computer Weekly Open Source Insider that it is actively engaged with supporting a variety of live, working, growing open source projects.
Among those projects is Apache Tinkerpop… and inside Tinkerpop is Gremlin.
What is Tinkerpop?
Apache TinkerPop is a graph computing framework for both graph databases that work with OnLine Transactional Processing (OLTP) and graph analytic systems that work with OnLine Analytical Processing (OLAP).
For extra clarification, TinkerPop is an open source, vendor-agnostic, graph computing framework distributed under the commercial-friendly Apache2 license.
According to Apache, “When a data system is TinkerPop-enabled, its users are able to model their domain as a graph and analyse that graph using the Gremlin graph traversal language. Furthermore, all TinkerPop-enabled systems integrate with one another allowing them to easily expand their offerings as well as allowing users to choose the appropriate graph technology for their application.”
TinkerPop supports in-memory graph databases through to distributed computing databases that can run in parallel across hundreds of nodes, so you can scale up as much as your data set requires you to.
What is Gremlin?
Gremlin is the most common query language used for graph databases; used across multiple graph technologies, it provides a common framework for working with graph data.
Gremlin is a functional open source graph traversal language and it works like Java in that it is composed of a virtual machine and an instruction set.
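A flavour of what a Gremlin traversal looks like: chained steps that filter and walk the graph. The toy Python sketch below mimics that step-chaining style over a hand-built adjacency list; in real Gremlin, a traversal such as g.V().has('name','alice').out('knows').values('name') is compiled to bytecode and executed by the graph engine itself, and the data here is invented for illustration.

```python
class Traversal:
    """Toy Gremlin-style traversal: each step takes the current set of
    vertices and returns a new Traversal, so steps chain fluently."""
    def __init__(self, graph, vertices):
        self.graph, self.vertices = graph, vertices

    def has(self, key, value):
        # Keep only vertices whose property matches (Gremlin's has() step)
        kept = [v for v in self.vertices
                if self.graph[v]["props"].get(key) == value]
        return Traversal(self.graph, kept)

    def out(self, label):
        # Follow outgoing edges with the given label (Gremlin's out() step)
        nxt = [target for v in self.vertices
               for (lbl, target) in self.graph[v]["edges"] if lbl == label]
        return Traversal(self.graph, nxt)

    def values(self, key):
        # Terminal step: extract a property from each current vertex
        return [self.graph[v]["props"][key] for v in self.vertices]

GRAPH = {
    "v1": {"props": {"name": "alice"}, "edges": [("knows", "v2"), ("knows", "v3")]},
    "v2": {"props": {"name": "bob"},   "edges": []},
    "v3": {"props": {"name": "carol"}, "edges": []},
}

def V():
    # Start a traversal over every vertex, like Gremlin's g.V()
    return Traversal(GRAPH, list(GRAPH))

print(V().has("name", "alice").out("knows").values("name"))
```

The fluent, composable shape of each step is what lets the same query run unchanged across any TinkerPop-enabled back end.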
DataStax on Gremlin
DataStax says that getting used to Gremlin can make it easier to understand how graphs work and how to query data.
According to an official company statement, “At DataStax, we support this project wholeheartedly – for example, the Gremlin project chair works at DataStax and the DataStax team contributes the vast majority of the commits. We will continue to support this project as it has organically grown to be the most widely adopted traversal framework for the whole community around graph.”
DataStax offers a free DataStax Academy course entitled Getting Started with TinkerPop and Gremlin at this link. The company also notes that in order to be familiar with Gremlin traversal syntax and techniques, developers need to understand how the language works… consequently, DataStax has provided a free Gremlin recipes series to offer some insight into Gremlin internals.