Autumn (or Fall, depending on your level of Americanization) was a busy period… so busy in fact that the Computer Weekly Open Source Insider blog saw a number of milestone advancements go whizzing past.
Among those news items we’re catching up on as we approach the Christmas silly season is the latest update from Canonical on Ubuntu.
Canonical is positioning Ubuntu as (in its view) the operating system (OS) of choice for ‘most’ (it was careful not to say all) public cloud workloads, as well as the emerging categories of ‘smart gateways’, self-driving cars and advanced robots.
NOTE: NXP defines a smart gateway as an appliance that bridges a Wide Area Network (WAN/cloud) connection to a Local Area Network (LAN), usually via Wi-Fi and/or Ethernet in a user’s home or on company premises.
With the release of Ubuntu 19.10, Canonical says that it has increased its focus on accelerating developer productivity in AI/ML, brought forward new edge capabilities for MicroK8s and delivered its fastest GNOME desktop performance yet.
NOTE: MicroK8s is a CNCF certified upstream Kubernetes deployment that runs entirely on a workstation or edge device — being a ‘snap’ (a Canonical application packaging & delivery mechanism) it runs all Kubernetes services natively (i.e. no virtual machines) while packing the entire set of libraries and binaries needed.
Canonical CEO Mark Shuttleworth says that Ubuntu 19.10 brings enhanced edge computing capabilities with the addition of strict confinement to MicroK8s.
Strict confinement ensures complete isolation and a tightly secured production-grade Kubernetes environment, all in a small footprint ideal for edge gateways. MicroK8s add-ons – including Istio, Knative, CoreDNS, Prometheus, and Jaeger – can now be deployed securely at the edge with a single command.
The Raspberry Pi 4 Model B is supported by Ubuntu 19.10. The latest board from the Raspberry Pi Foundation offers a faster system-on-a-chip with a processor that uses the Cortex-A72 architecture (quad-core 64-bit ARMv8 at 1.5GHz) and offers up to 4GB of RAM.
Additionally here, Ubuntu 19.10 ships with the Train release of Charmed OpenStack – the 20th OpenStack release, backed by the Nautilus release of Ceph.
Shuttleworth and team insist that this marks Canonical’s long-term commitment to open infrastructure and to reducing the cost of cloud operations. Train provides live migration extensions to aid telcos in their infrastructure operations. Live migration allows users to move their virtual machines from one hypervisor to another without shutting down the machine’s operating system.
Finally here, Canonical says it has thought about users running Ubuntu on older hardware — which, arguably, is contentious ground for some as open source purists will want to position an open OS as ‘more than just something you stick on an old Windows machine to bring it to life’ — and so with GNOME 3.34, Ubuntu 19.10 is the fastest release yet with significant performance improvements delivering what the company has called a more responsive and smooth experience, even on older hardware.
Ali Baba, the character, is easily distinguished from the other Alibaba by virtue of a) its different spelling and b) the fact that the ‘other’ Alibaba is a Chinese multinational conglomerate holding company specializing in e-commerce, retail, Internet and other technologies.
The only perceivable connection between Ali Baba and Alibaba is that they both like to say ‘open’ — with the former opting for sesame… and the latter opting for source.
Alibaba Cloud (the company division, not the mythical Arabian character) is the data intelligence segment of Alibaba Group.
Open source sesame
In a flourish of open source sesame-ness, Alibaba Cloud has announced that the core code of Alink, its self-developed algorithm platform, has been made available via open source on GitHub.
The platform offers a range of algorithm libraries that support both batch and stream processing, both of which are arguably pretty critical for Machine Learning (ML) and tasks such as online product recommendation and intelligent customer services.
Data analysts and software developers can access the code on GitHub here to build their own software, facilitating tasks such as statistical analysis, machine learning, real-time prediction, personalized recommendation and abnormality detection.
“As a platform that consists of various algorithms combining learning in various data processing patterns, Alink can be a valuable option for developers looking for robust big data and advanced machine learning tools,” said Yangqing Jia, president and senior fellow of data platform at Alibaba Cloud Intelligence.
Jia claims that Alibaba is one of the ‘top ten contributors’ to GitHub, where the company has gained over 690,000 stars and has about 20,000 contributors.
Alink was developed on top of Flink, a unified distributed computing engine. Building on Flink, Alink provides what is said to be a ‘seamless unification’ of batch and stream processing, offering a platform for developers to perform data analytics and machine learning tasks.
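To illustrate the idea behind that batch/stream unification — not Alink’s actual (Java/PyAlink) API, just a minimal, hypothetical sketch in plain Python — the same transformation logic can serve both a finite batch and an unbounded stream of records:

```python
# A toy sketch of unified batch/stream processing: one function,
# written once, runs identically over a finite batch (a list) or
# a stream (a generator yielding records one at a time).

def running_mean(records):
    """Yield the running mean after each incoming record."""
    total, count = 0.0, 0
    for value in records:
        total += value
        count += 1
        yield total / count

# Batch mode: the whole (finite) input is available up front.
batch_result = list(running_mean([2, 4, 6]))

# Stream mode: the same function consumes a generator record by record.
def sensor_stream():
    for v in (2, 4, 6):
        yield v

stream_result = list(running_mean(sensor_stream()))
assert batch_result == stream_result == [2.0, 3.0, 4.0]
```

The point of engines like Flink (and platforms built on it) is exactly this: developers write the logic once and the runtime decides how to execute it over bounded or unbounded data.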
This technology supports Alibaba’s proprietary data storage and also other open source data storage, such as Kafka, HDFS and HBase.
The Computer Weekly Open Source Insider team speaks to Todd M Moore in his role as IBM VP of open technology & developer advocacy and CTO for developer ecosystems, following the Open Source Summit Europe in Lyon.
Moore and his team of open source developers work with open source communities such as the Apache Software Foundation, the Linux Foundation, Eclipse, OSGi, OpenStack, Cloud Foundry, Docker, the JS Foundation, Node.js and more.
He currently serves as chairperson of both the OpenJS Foundation board of directors and the CNCF Governing Board.
Computer Weekly: What did you cover during your keynote?
Moore: My keynote in Lyon focused on topical areas in AI. There is so much to be done to both build trust in AI solutions and to secure them. Working in the LFAI organisation, we see the perfect opportunity to bring together the major participants to work on these issues for the good of the community at large.
In AI ‘explainability’ alone, there will be an explosion of algorithms and statistical analysis necessary to build a solid base. We have only scratched the surface so far. I also touched on the trends in open source AI projects, the projects to watch and the role of data governance.
Computer Weekly: Why is this topic so relevant right now?
Moore: The adoption of AI technology is a worldwide phenomenon. Studies show that the adoption of open technologies by industry participants goes hand in hand with becoming industry leaders. This is also the case with AI and it has reached the point of touching our everyday lives through machine learning, image recognition, translation, speech recognition, autonomous driving, assisted decision making, etc.
This change is driven by the availability of data and substantially improved access to computational processing power. Both classical computers and those with ‘GPU assists’ are now available to substantially reduce model build, debug and tuning time. Software technologies to help developers and data scientists in the end-to-end development and lifecycle of models are now appearing, with some strong options forming in open source. No industry or government office will go untouched by the technologies that are in development.
Computer Weekly: What is your perspective on the growth and maturity of open source software — and, how can we sustain projects and developers for decades to come?
Moore: As I have said, many options are starting to become available to developers in this area. Open source has become the route to de facto standardisation and paves the way towards rapid marketplace adoption. It preserves freedom of action for clients seeking to prevent vendor lock-in and it opens the door to rapid development and marketplace growth. Products today are based on open source, and gone are the days when a single developer or vendor could out-innovate the rest of the world by themselves.
Open source yields great software that developers can depend on. Look at the rise of containers and the rapid adoption of Linux or Kubernetes as cases in point of what happens when the world gets behind a technology.
Sustainability comes from mass adoption and the willingness of developers to commit themselves to a project. We have proven that widespread adoption fuels continuing interest and that corporations will commit resources to develop and maintain a strategic code base for decades. We need to protect against developer burnout and constantly be looking to aid in the tasks that are not glamorous such as documentation, CI/CD, code reviews etc.
Computer Weekly: What is lighting you up right now? What has your attention and is making you excited about your work?
The Irish county town of Kilkenny is known for its medieval buildings and castle, its rich history of brewing, its distinctive black marble and as the home of White House architect James Hoban.
This year’s event saw NearForm Research and Espruino surprise delegates by giving out something better than plain old lanyards and name tags — the two companies came together to offer an arguably rather more exciting Machine Learning (ML)-driven smartwatch, Bangle.js, to act as attendees’ conference badges.
Developers will be able to create their own AI applications for the Bangle.js device.
It comes pre-loaded with features and apps including: GPS, compass, heart rate monitor, maps, games and gesture-control of PC applications over Bluetooth.
“Bangle.js is not just about a single device, codebase or company. I believe it has the potential to bootstrap a community-driven open health platform where anyone can build or use any compatible device and everyone owns their own data. Machine Learning is a critical aspect of health technology and we’re so pleased to be further involved in the TensorFlow open source project,” said Conor O’Neill, chief product officer for NearForm.
County Waterford headquartered NearForm is known for its professional technology consultancy work with both local Irish and international companies spanning a range of industries. “Everything we do emanates from open source,” insists the company.
This first Bangle.js device can also be easily disassembled with just a screwdriver for ease of fixing and replacing its parts.
The teams also ported the Micro version of Google’s TensorFlow Lite to the watch to give it Machine Learning capabilities with input from Google’s TensorFlow community. They then designed an ML gesture detection algorithm which is built into every watch and enables the user to control applications, including PowerPoint, with hand gestures.
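The on-watch implementation is a TensorFlow Lite Micro model running on the device itself, but the basic shape of gesture detection from accelerometer data can be illustrated with a deliberately simplified, hypothetical sketch — here just thresholding peak acceleration magnitude rather than running a trained neural network:

```python
import math

# Toy gesture detector (an illustration, NOT the NearForm/Espruino
# implementation): classify a window of (x, y, z) accelerometer
# readings, in units of g, as a 'swipe' or 'idle' by checking whether
# peak acceleration magnitude exceeds a threshold.

def classify_gesture(samples, threshold=1.5):
    peak = max(math.sqrt(x * x + y * y + z * z) for x, y, z in samples)
    return "swipe" if peak > threshold else "idle"

# At rest the watch sees roughly 1g (gravity only).
idle_window = [(0.0, 0.0, 1.0)] * 5
# A sharp lateral jolt pushes the magnitude well above 1g.
swipe_window = idle_window + [(1.5, 0.5, 1.0)]

assert classify_gesture(idle_window) == "idle"
assert classify_gesture(swipe_window) == "swipe"
```

A real ML model replaces the hand-picked threshold with learned weights, which is what lets the watch distinguish many gestures rather than one.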
The companies explain that even ‘lapsed’ programmers and non-programmers can interact with Bangle.js using Blockly or low-code Node-RED.
Tibco is focused on open source and Agile this month.
The integration and analytics specialist has upped the toolset in a group of its products with a key focus on agility for cloud-native deployments.
The company says it is putting AI inside (who isn’t?) its enhancements to the TIBCO Connected Intelligence platform.
Matt Quinn, chief operating officer at Tibco, says that his firm’s vision is that customers should use Tibco as their ‘data foundation’.
In terms of cloud-native, Tibco’s API management software TIBCO Cloud Mashery is available in cloud-native deployments in public clouds, private clouds and on-premises. The company’s Mashery Local Developer Portal is now also available as a fully cloud-native deployment.
Quinn says that IT teams are faced with the increasing complexity of metadata governance — and the firm’s Cloud Metadata tool runs Tibco EBX master data management to address this.
NOTE: Metadata governance is used most often in relation to digital media, but older forms of metadata are catalogues, dictionaries and taxonomies.
Extra open source sauce
The company also continues to develop capabilities to support open source and is weaving more open offerings into its product mix.
The introduction of Tibco Messaging Manager 1.0.0, including an Apache Kafka Management Toolkit, provides a predictive and auto-completing command-line interface (CLI), which aims to simplify the setup and management of Apache Kafka. As readers will know, Kafka is used for building real-time data pipelines and high-throughput low-latency distributed streaming applications. Tibco Messaging components feature a common management plugin, use a common interface and allow for easier continuous integration and deployment. Tibco Messaging Manager extends the company’s support for Apache Kafka and enables the Tibco Connected Intelligence Cloud platform to take advantage of Kafka for integration, event processing and real-time messaging with historical context.
“In addition, Tibco now offers support for IoT-based machine-to-machine communication via OPC Foundation Unified Architecture in Tibco Streaming software. In support of open-source Project Flogo, Tibco announces the Project Flogo Streaming User Interface. Integrating with Tibco’s existing solutions, the Project Flogo Streaming User Interface lets developers build resource-efficient, smarter real-time streaming processing apps at the edge or in the cloud, improving the productivity of expert IT resources,” noted the company, in a press statement.
Also here Tibco’s AutoML extension for its Data Science software via Tibco LABS facilitates the development and selection of AI workflows. In addition, new Process Mining capabilities via Tibco LABS enable users to discover, improve, and predict process behaviour from data event logs produced by operational systems.
Lastly, to further strengthen Tibco’s contribution to the open source community, the company says it has introduced an open source specification in the shape of CatalystML to capture data transformations and consume machine learning artifacts in real time for high-throughput applications.
DataStax offers a commercially supported ‘enterprise-robust’ database built on open source Apache Cassandra.
As such, DataStax has told Computer Weekly Open Source Insider that it is actively engaged with supporting a variety of live, working, growing open source projects.
Among those projects is Apache Tinkerpop… and inside Tinkerpop is Gremlin.
What is Tinkerpop?
Apache TinkerPop is a graph computing framework for both graph databases that work with OnLine Transactional Processing (OLTP) and graph analytic systems that work with OnLine Analytical Processing (OLAP).
For extra clarification, TinkerPop is an open source, vendor-agnostic graph computing framework distributed under the commercial-friendly Apache 2 license.
According to Apache, “When a data system is TinkerPop-enabled, its users are able to model their domain as a graph and analyse that graph using the Gremlin graph traversal language. Furthermore, all TinkerPop-enabled systems integrate with one another allowing them to easily expand their offerings as well as allowing users to choose the appropriate graph technology for their application.”
TinkerPop supports in-memory graph databases through to distributed computing databases that can run in parallel across hundreds of nodes, so you can scale up as much as your data set requires you to.
What is Gremlin?
Gremlin is the most common query language used for graph databases – it’s used across multiple graph technologies, so it provides a common framework for working with graph data.
Gremlin is a functional open source graph traversal language and it works like Java in that it is composed of a virtual machine and an instruction set.
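To give a flavour of that traversal style — real applications would use the gremlinpython driver against a TinkerPop-enabled system, so what follows is a minimal, hypothetical imitation in plain Python over an in-memory graph — Gremlin queries chain steps such as V(), out() and values():

```python
# A tiny imitation of Gremlin-style chained traversal over an
# in-memory graph (adjacency dict). Each step returns a new
# Traversal, which is what makes the fluent chaining work.

class Traversal:
    def __init__(self, graph, items):
        self.graph, self.items = graph, items

    def out(self, label):
        """Follow outgoing edges carrying the given label."""
        nxt = [target
               for v in self.items
               for (lbl, target) in self.graph["edges"].get(v, [])
               if lbl == label]
        return Traversal(self.graph, nxt)

    def values(self, key):
        """Project a property value from each vertex reached."""
        return [self.graph["vertices"][v][key] for v in self.items]

def V(graph, *ids):
    return Traversal(graph, list(ids) or list(graph["vertices"]))

graph = {
    "vertices": {"marko": {"age": 29}, "josh": {"age": 32}, "vadas": {"age": 27}},
    "edges": {"marko": [("knows", "josh"), ("knows", "vadas")]},
}

# Roughly analogous to the Gremlin query: g.V('marko').out('knows').values('age')
ages = V(graph, "marko").out("knows").values("age")
assert sorted(ages) == [27, 32]
```

In real Gremlin the same chain compiles down to traversal instructions executed by the underlying graph engine, which is the virtual-machine-and-instruction-set analogy above.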
DataStax on Gremlin
DataStax says that getting used to Gremlin can make it easier to understand how graphs work and how to query data.
According to an official company statement, “At DataStax, we support this project wholeheartedly – for example, the Gremlin project chair works at DataStax and the DataStax team contributes the vast majority of the commits. We will continue to support this project as it has organically grown to be the most widely adopted traversal framework for the whole community around graph.”
DataStax offers a free DataStax Academy course entitled Getting Started with TinkerPop and Gremlin at this link. The company also notes that in order to be familiar with Gremlin traversal syntax and techniques, developers need to understand how the language works… consequently, DataStax has provided a free Gremlin recipes series to offer some insight into Gremlin internals.
Towo Labs, a Swedish startup aimed at simplifying ‘crypto self-custody’, has announced an investment from Xpring, Ripple’s developer initiative.
Xpring is described as an initiative by Ripple that will invest in, incubate, acquire and provide grants to companies and projects run by entrepreneurs.
Ripple Labs Inc. itself develops the Ripple payment protocol and exchange network.
According to Crypto Digest News, a [cryptocurrency] custodian holds and keeps assets safe — its goal is to minimise the risk of loss or theft and usually also provides additional services like account administration.
Towo Labs is now focused on the development of hardware wallet firmware with support for all XRP Ledger (XRPL) transaction types and a trustless, non-custodial web interface to the XRP Ledger.
NOTE: The XRP Ledger is a decentralised cryptographic ledger powered by a network of peer-to-peer servers to process XRP digital assets — Towo Labs founder Markus Alvila is the creator of the existing XRP Toolkit.
At the outset, Towo Labs will focus on developing hardware wallet firmware with full XRPL support for Ledger Nano S, Ledger Nano X and Trezor T with the aim of making it easier to securely sign transactions.
Open source contributions
All open source code contributions will be subject to the normal code and security reviews of the involved repository maintainers.
Today’s existing firmware only supports XRP payment transactions, which in some cases blocks further XRPL and Interledger innovation. The new firmware, however, will support the signing of cross-currency payments, trust lines, escrows, orders, payment channels, account settings and so forth.
“With full XRP support among leading hardware wallets, transactions can be prepared from untrusted devices and applications (for example, over the web) before being reviewed and securely signed inside a hardware wallet,” noted the company, in a press statement.
This added support also enables new applications like trustless, non-custodial trading interfaces to the XRPL decentralized exchange, improved self-custody for DeFi applications and hybrid multi-signing schemes requiring signatures from both hardware and software wallets.
The coming updates to the XRP Toolkit seek to achieve a trustless, non-custodial XRP Ledger web interface, where you can prepare and submit any transaction type from any device, signing using the wallet of your choice or one generated with the XRP Toolkit.
In addition to the leading hardware wallets, XRPL Labs’ signing platform Xumm will also be integrated as a signing option.
The Computer Weekly Developer Network and Open Source Insider team are big fans of Greek classics, San Francisco clam chowder, shared-nothing architectures and open source-centric real-time big data databases.
Luckily then, we’re off to Scylla Summit 2019, staged in San Francisco on November 5 and 6.
Scylla (the company) takes its name directly from Scylla [pronounced: sill-la], a sea monster of Greek mythology said to haunt and torment the rocks of a narrow strait of water opposite the Charybdis whirlpool.
Outside of Greek mythology, Scylla is essentially an open source distributed NoSQL data store that uses a sharded design on each node, meaning each CPU core handles a different subset of data.
TECHNICAL NOTE: Sharding is a type of database partitioning that separates very large databases into smaller, faster, more easily managed parts called data shards. Technically speaking, sharding is a synonym for horizontal partitioning.
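As a minimal sketch of the idea (illustrative only, not Scylla’s actual partitioner), hash-based sharding maps each row key to one of N shards, so every shard — or CPU core, in a shard-per-core design — owns a disjoint subset of the data:

```python
import hashlib

# Hash-based sharding sketch: a key's hash, taken modulo the shard
# count, deterministically picks which shard owns that key.

def shard_for(key, num_shards):
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_shards

NUM_SHARDS = 4
shards = {i: [] for i in range(NUM_SHARDS)}
for user_id in ("alice", "bob", "carol", "dave", "erin"):
    shards[shard_for(user_id, NUM_SHARDS)].append(user_id)

# The mapping is deterministic (the same key always lands on the same
# shard) and every key lands on exactly one shard.
assert shard_for("alice", NUM_SHARDS) == shard_for("alice", NUM_SHARDS)
assert sum(len(members) for members in shards.values()) == 5
```

Because no two shards share a key, each one can be served by a different core or node with no coordination on reads and writes — the shared-nothing property the next paragraph refers to.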
Scylla is fully compatible with Apache Cassandra and embraces a shared-nothing approach that increases throughput and storage capacity to as much as 10X that of Cassandra itself.
Yay, for users
Scylla Summit is heavily focused on users and use cases. As such, the 2019 Scylla User Award Categories will look to recognise the most innovative use of Scylla, the biggest node reduction with Scylla and the best Scylla cloud use case.
Other commendations in the company’s awards will include: best real-time use case; biggest big data use case; Scylla community member of the year; best use of Scylla with Kafka; Best Use of Scylla with Spark; and best use of Scylla with a graph database.
“There’s nothing we enjoy more than seeing the creative and impressive things our users are doing with Scylla. With that in mind, we presented our Scylla User Awards at last week’s Scylla Summit, where we brought the winners up on stage for a big round of applause and bestowed them with commemorative trophies,” wrote Scylla’s Bob Dever in his capacity as VP of marketing.
According to the company’s official event statement, Scylla Summit features three days of customer use cases, training sessions, product demonstrations and news from ScyllaDB.
Best & brightest
Developers share best practices, product managers learn how to reduce the cost and complexity of their infrastructure and entrepreneurs connect with the best and brightest in the community.
Scylla CEO Dor Laor insists that this year’s Scylla Summit is shaping up well and he notes that the company will roll-out first-of-their-kind features.
“We will announce our lightweight transactions and CDC capabilities, dig into our new Scylla Alternator API for DynamoDB users and hear from some of the world’s most innovative companies about how they’re putting Scylla to work. We’ll also unveil results of a major new performance test. If you thought it was something when we hit one million OPS per node — you haven’t seen anything yet. As always, Scylla Summit will be a great place to get inspired, learn from your colleagues and keep up to date with the latest advances in big data and NoSQL,” said Laor.
TECHNICAL NOTE: Laor mentions CDC in the above quote… so, in the world of databases, Change Data Capture (CDC) is a set of software design patterns used to determine (and track) the data that has changed so that action can be taken using the changed data.
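A deliberately simplified way to picture CDC (production systems typically tail the database’s write log rather than diffing snapshots, and Scylla’s own CDC works differently) is to compare two snapshots of a table keyed by primary key and emit only the deltas:

```python
# Simplified change-data-capture sketch: diff two snapshots of a table
# (dicts keyed by primary key) and emit insert/update/delete events so
# downstream consumers can act on changes rather than re-reading
# the whole table.

def capture_changes(before, after):
    changes = []
    for key, row in after.items():
        if key not in before:
            changes.append(("insert", key, row))
        elif before[key] != row:
            changes.append(("update", key, row))
    for key, row in before.items():
        if key not in after:
            changes.append(("delete", key, row))
    return changes

before = {1: {"name": "ada"}, 2: {"name": "bob"}}
after  = {1: {"name": "ada lovelace"}, 3: {"name": "cyd"}}

events = capture_changes(before, after)
assert ("update", 1, {"name": "ada lovelace"}) in events
assert ("insert", 3, {"name": "cyd"}) in events
assert ("delete", 2, {"name": "bob"}) in events
```

The stream of such events is what lets caches, search indexes or analytics systems stay in sync with the primary store without full rescans.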
“At past events, people have told us that the conversations they have at Scylla Summit have helped shape their big data strategies,” said Dor Laor, CEO of ScyllaDB. “This is a chance to share ideas with the smartest, most plugged-in members of the NoSQL and big data communities. It’s completely relaxed, massively useful and lots of fun.”
The company is also likely to announce a number of new customers and spell out its wider roadmap.
In our voyages from software conference to software conference, we technology journalists often find stories that are developing and worth sharing.
At Percona Live Europe last week, one such example came up around the open source scene that is developing in Russia, and how one of its projects is now starting to open up to international use.
Think about Russia typically… and you may not automatically think about open source software. However, the country has a strong software developer community that is looking to expand the number of projects that are used internationally.
An example of this is ClickHouse, an open source data warehouse project found on GitHub here. The technology was originally developed at Yandex, the Russian equivalent of Google.
As defined on TechTarget: a data warehouse is a ‘federated repository’ for all the data collected by an enterprise’s various operational systems – and the practice of data warehousing itself puts emphasis on the ‘capture’ of data from different sources for access and analysis.
ClickHouse claims performance exceeding that of comparable column-oriented database management systems (DBMS) currently available. As such, it processes hundreds of millions (to more than a billion) of rows — and tens of gigabytes of data per single server, per second.
According to its development team, ClickHouse allows users to add servers to their clusters when necessary without investing time or money into any additional DBMS modification.
According to the development team notes, “ClickHouse processes typical analytical queries two to three orders of magnitude faster than traditional row-oriented systems with the same available I/O throughput. The system’s columnar storage format allows fitting more hot data in RAM, which leads to shorter response times. ClickHouse is CPU efficient because of its vectorised query execution involving relevant processor instructions and runtime code generation.”
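The row-versus-column distinction the team describes can be pictured with a toy sketch (illustrative only, using made-up monitoring data): an analytical aggregate over one column only needs to touch a single contiguous array in a column store, rather than every field of every row.

```python
# Toy illustration of columnar vs row-oriented storage for analytics.

# Row-oriented: each record's fields are stored together.
rows = [
    {"ts": 1, "host": "a", "cpu": 10},
    {"ts": 2, "host": "b", "cpu": 30},
    {"ts": 3, "host": "a", "cpu": 50},
]

# Column-oriented: one contiguous array per column.
columns = {
    "ts":   [1, 2, 3],
    "host": ["a", "b", "a"],
    "cpu":  [10, 30, 50],
}

# An aggregate like AVG(cpu) scans every record in row form, but reads
# only the 'cpu' array in columnar form - an array that also compresses
# well and suits vectorised (SIMD) execution.
row_avg = sum(r["cpu"] for r in rows) / len(rows)
col_avg = sum(columns["cpu"]) / len(columns["cpu"])
assert row_avg == col_avg == 30
```

Same answer either way, but the columnar layout reads a fraction of the bytes, which is where the claimed orders-of-magnitude speedups on analytical queries come from.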
The central go-to-market proposition here is that by minimising data transfers for most types of queries, ClickHouse enables companies to manage their data and create reports without using specialised networks that are aimed at high-performance computing.
The technology, which is essentially aligned for Online Analytical Processing (OLAP), uses all available hardware to process each query as fast as possible, which amounts to a speed of more than 2 terabytes per second.
The project is starting to expand and get more adopters. As part of its monitoring product launch at the event, database monitoring and management company Percona announced that it will use ClickHouse for load testing and to monitor accessibility and other performance KPIs. Percona’s leadership team originally hails from Russia, so there are a lot of relationships there as well.
Alongside Percona, Altinity is also looking to expand use of ClickHouse over time. Robert Hodges, CEO at Altinity, describes the company as a provider of the highest ClickHouse expertise on the market to deploy and run demanding analytic applications. The company also provides software to manage ClickHouse in Kubernetes, cloud and bare-metal environments.
Explaining how his firm has developed alongside the core ClickHouse technology proposition, Hodges says that the enterprise version of ClickHouse can run on a laptop, yet be ready to scale up for significant enterprise workloads.
“ClickHouse is very efficient at processing and handling time-series data… and it has SQL features which are great at monitoring specific cloud issues such as ‘last point query’ [a way of looking at the last thing that happened in a cloud application]. Say, for example, you had a bunch of Virtual Machines (VMs) running in the cloud and you wanted to know the CPU load on them, ClickHouse is good at getting that measure to you. It’s good to know the ‘current state’ of VMs, because from that point you can then drill in and look at the load over (for example) the last two weeks and so get a sharper idea of performance status,” said Altinity’s Hodges.
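The ‘last point query’ Hodges describes — for each VM, return only its most recent reading — can be sketched in plain Python over made-up sample data (in ClickHouse SQL it might be expressed with constructs such as `ORDER BY` with `LIMIT 1 BY host`, though the exact query depends on the schema):

```python
# Last-point-query sketch: keep, per VM, only the sample with the
# greatest timestamp, i.e. the current state of each machine.

samples = [  # (timestamp, vm, cpu_percent) - hypothetical monitoring data
    (100, "vm1", 40), (100, "vm2", 15),
    (200, "vm1", 75), (200, "vm2", 20),
    (300, "vm1", 60),
]

def last_point(rows):
    latest = {}
    for ts, vm, cpu in rows:
        if vm not in latest or ts > latest[vm][0]:
            latest[vm] = (ts, cpu)
    return {vm: cpu for vm, (ts, cpu) in latest.items()}

# vm1's newest sample is at ts=300, vm2's at ts=200.
assert last_point(samples) == {"vm1": 60, "vm2": 20}
```

Once the current state is in hand, the drill-down Hodges mentions is just a range query over the same time-series for the chosen VM.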
Hodges also explains that Percona is interested in ClickHouse because it is so similar to MySQL – it has ‘surface similarities’ and good facilities for loading data in and pulling data out.
This project is an example of how open source communities can expand with new approaches to existing problems.
Enterprise applications company IFS has used its annual user conference to detail work carried out on its major application suite.
The company wants to put the ‘open’ in service management, enterprise resource management (ERP) and enterprise asset management (EAM).
The company says it has ‘evolved’ its technology foundation with 15,000+ native APIs to open paths to extensibility, integration and flexibility… because that’s what APIs do, obviously.
Of some significance here, IFS has noted that it is a new member of the OpenAPI Initiative (OAI), a consortium of experts who champion standardization on how REST APIs are described.
According to OAI official statements, the group has an open governance structure under the Linux Foundation – and the OAI is focused on creating, evolving and promoting a vendor-neutral description format.
IFS insists that it is promoting open applications to allow freedom to develop and connect data sources to drive value in a way that is ‘meaningful’ to enterprises.
“By prioritising open applications, IFS is upping the ante in terms of innovation and customer-centricity while decisively turning away from platform coercion and lock-in,” noted the company, in a press statement.
IFS offers native OData-based RESTful APIs across its entire suite of ERP, EAM and service management products, to make connecting, extending or integrating into the IFS core quicker and easier.
OData-based RESTful APIs are defined on Stack Overflow as, “A special kind of REST where we can query data uniformly from a URL. REST stands for REpresentational State Transfer which is a resource-based architectural style. OData is a web based protocol that defines a set of best practices for building and consuming RESTful web services.”
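What ‘querying data uniformly from a URL’ looks like in practice is that OData system query options such as `$filter`, `$select` and `$top` are composed into the request URL. A small illustrative sketch (the endpoint and entity names below are hypothetical, not IFS’s actual API surface):

```python
from urllib.parse import urlencode

# Build an OData query URL from standard system query options.
# $filter narrows the rows, $select picks the fields, $top limits count.

def odata_url(base, entity, filter_=None, select=None, top=None):
    opts = {}
    if filter_:
        opts["$filter"] = filter_
    if select:
        opts["$select"] = ",".join(select)
    if top:
        opts["$top"] = str(top)
    query = urlencode(opts)
    return f"{base}/{entity}" + (f"?{query}" if query else "")

url = odata_url(
    "https://example.com/odata", "CustomerOrders",
    filter_="Status eq 'Open'",
    select=["OrderNo", "Amount"],
    top=10,
)
assert url.startswith("https://example.com/odata/CustomerOrders?")
# urlencode percent-encodes '$' as %24 and the quoted literal's quotes as %27.
assert "%24filter=Status+eq+%27Open%27" in url
```

The uniformity is the point: any OData-aware client can form and parse such URLs against any compliant service without bespoke integration code.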
The APIs at IFS have been engineered in tandem with IFS’s new ‘Aurena’ user experience brand, which is now available across the full breadth of IFS applications.
“With this approach, IFS is giving its customers 15,000 new ways to flex,” IFS CEO Darren Roos said. “It goes without saying that, as excited as we are about reaching this milestone, the driving force behind our deliveries is our unwavering commitment to offer choice and value to our customers. Providing ‘open’ solutions is a critical factor in making good on this promise. The quality, pace, and focus of our product development speaks to a business that is outperforming the legacy vendors in the enterprise software space.”
Aurena user experience
The IFS Aurena user experience offering has now been extended across the entire IFS Applications suite for Service Management, ERP and EAM. It uses the same set of APIs, which are now generally available, and provides a browser-based user experience optimised for each role and user type, with a focus on employee engagement and productivity.
“IFS Aurena provides customers with a truly responsive design, allowing the entire suite to automatically adapt to different form factors as well as capabilities to design and build truly native applications targeted across iOS, Android and Windows, with support for offline scenarios and device-specific capabilities such as GPS and camera,” noted IFS chief product officer Christian Pedersen.
Among the more significant industry updates here is support for International Traffic in Arms Regulations (ITAR) compliance initiatives in the cloud.
Customers who have ITAR obligations, such as those operating in or trading with the U.S. aerospace, defence or government sectors, can deploy and use IFS software to support their ITAR compliant business needs, in an independently validated environment hosted in the Microsoft Azure Government Cloud, fully managed by IFS.