Tableau has been acquired by Salesforce, but you wouldn’t know it if you attended the organisation’s annual data-developer conference this month. Why wouldn’t you know it? Because the firm is continuing to roll out product enhancements and partner connections as normal.
Okay, you would know it: swaggering Salesforce CEO Marc Benioff took part in the event’s opening keynote, engaging in a ‘fireside chat’ with Tableau CEO Adam Selipsky.
Predictably, Benioff dominated the chat in his trademark steamroller style, addressing the [adoring?] audience almost as directly as Selipsky himself, all without even taking his trilby off. Maybe he was cold.
Regardless of CEO standoffs, Selipsky and team delivered a series of technology-rich keynotes detailing higher-level ‘data culture’ vision and product-centric functionality extensions during each morning of the Tableau Conference 2019.
Selipsky highlighted Tableau Blueprint, a methodology that helps companies create a data culture within their organisations. Tableau cites research suggesting that only 8% of companies today achieve analytics at scale.
“By building a data culture and empowering more people with analytics through improved data literacy and analytics proficiency, companies can realise an entire workforce that makes better decisions with data. Focused on agility, proficiency and community, Tableau Blueprint creates a roadmap for businesses that want to transform their organisation with a data culture to weave analytics in the fabric of their organisations and achieve better outcomes with data-informed decisions,” notes the company, in a press statement.
Organisations including the World Food Programme (WFP) use Tableau Blueprint to build a working data culture. Following Blueprint’s prescriptive guidance, WFP developed a data strategy, introduced a formal governance framework and built an engaged community of data advocates that encourage learning, sharing and exploration at every level.
“This has been a monumental year for Tableau and our customers,” said Jackie Yeaney, executive vice president of marketing at Tableau. “We’ve seen incredible enthusiasm and adoption of our smart analytics and enterprise capabilities as companies continue investing in a data culture.”
The company used its annual conference to demo new features in Artificial Intelligence (AI), Machine Learning (ML) and self-service data management. Key product updates were announced in Tableau Catalog and Tableau Prep Conductor, intended to put Tableau at the centre of an organisation’s data strategy.
The company also introduced Netflix-style, AI-powered visualisation recommendations to help users find relevant visualisations, as well as major updates to Ask Data, Tableau’s natural language processing solution, which can now interpret more complex questions, including year-over-year and geo-spatial comparisons.
Tableau also demonstrated Explain Data, which uses sophisticated statistical algorithms to allow people to instantly uncover AI-driven insights about their data. During the keynote, Tableau also introduced Metrics to help users keep an eye on performance indicators. Metrics can be individually curated and personalised to identify KPIs in a mobile-first view.
“We are building an open and extensible platform that enables our partners to build solutions for our joint customers that leverage the full power of Tableau,” said Francois Ajenstat, chief product officer at Tableau. “We’re thrilled to see the partner ecosystem expanding, giving customers the most choice and flexibility to solve their unique needs.”
Integrations highlighted at the conference included Alibaba Cloud, which released native connectors to Tableau. The Chinese public cloud provider says it is helping customers connect to all the data they need as efficiently as possible, no matter where it resides.
Tableau 2019.4 introduced three native connectors for MaxCompute, AnalyticDB and Data Lake Analytics – built with Tableau’s Connector SDK.
Other partner connections included data science company Alteryx, which released a new integration with Tableau’s Hyper API. This is meant to enable joint customers to feed data from .hyper files (a format that supports faster analytical and query performance for larger data sets) into Alteryx processes.
Also in the lineup was the new Databricks Connector, which promises enhanced performance and an optimised connection directly within Tableau. By allowing customers to more easily tap into data lakes and analyse massive datasets, the connector will power new insights based on the most up-to-date, real-time data.
Overall, Tableau Conference was a full-on couple of days extending over most of the week if we include the pre-conference tracks and the wrap-up inspirational speaker session which all companies seem to include these days.
More of the same inside Salesforce next year? Sure thing… set the table.
Poorer areas of India and Ethiopia need access to clean water, civil engineering infrastructure development to improve sanitation as well as hygiene supplies and medicine.
They need all of those things… and the intermediary charity layer that brings these essential elements of life to them needs some means of analysing its performance and monitoring how successfully it is carrying out its missions and initiatives.
One such organisation is Splash.
The Seattle-based nonprofit is dedicated to providing water, sanitation and hygiene solutions to children in urban Asia and Africa.
To coincide with its annual technology conference, data analytics visualisation company Tableau announced that the Tableau Foundation is committing US$1 million to help deliver clean water to 1 million children in India and Ethiopia.
“Our partnership with the Tableau Foundation will not only help us reach 1 million kids, helping them to lead healthier lives, but it is also transforming how we approach our mission by uncovering reliable, real-time information about the water, sanitation, and hygiene infrastructure in the schools where we work, ensuring that our solutions are sustainable over the long-term,” stated Eric Stowe, Splash founder and executive director.
Splash is working to expand its mission to two of the world’s largest cities – Kolkata, India and Addis Ababa, Ethiopia. In addition to financial support from Tableau Foundation, Splash will use the Tableau analytics technology to expand its reach and create a working model for governments in both cities.
Global head of Tableau Foundation Neal Myrick explains that poor water quality and inadequate sanitation are among the leading causes of disease, especially for children.
While most water-focused organisations concentrate on bringing clean water to rural areas, Splash focuses on urban WASH (water, sanitation and hygiene), specifically in schools.
The United Nations predicts that by 2030, the global population will increase to 8.5 billion, and by 2050, 75% of the world’s population will live in urban areas, with growth centred in developing countries. This explosive growth highlights the urgent need for improving critical water, sanitation, and hygiene (WASH) services and infrastructure.
Splash focuses on WASH because 88% of cases of diarrhea worldwide are attributable to unsafe water, inadequate sanitation or insufficient hygiene. These cases result in 1.5 million deaths each year, most of them children. With children up to 14 years of age in developing countries suffering a disproportionate share of this burden, Splash focuses on child-serving institutions.
This is a guest post for the Computer Weekly Developer Network in our Continuous Integration (CI) & Continuous Delivery (CD) series.
This contribution is written by Chen Harel in his role as VP of product at OverOps — the company is known for its work focused on enabling companies who create software to ensure rapid code changes do not impact customer experience.
Reminding us that there’s a slightly scary notion in the world of software development, Harel urges us to think about the so-called “Rule of Ten”.
Harel writes as follows…
Its premise is simple: the cost of finding and fixing software defects increases 10X the further you progress through the software delivery lifecycle.
For anyone tasked with building and delivering software, this is a terrifying thought.
Many of us have seen first-hand how software bugs in a production environment can translate to millions lost in customer transactions, tarnished brands and squandered developer productivity. The implications of a customer-impacting incident seem to grow more serious by the day.
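The Rule of Ten boils down to simple arithmetic: a defect caught at the first stage costs one unit to fix, and the cost multiplies tenfold at each subsequent stage. A back-of-the-envelope sketch (the stage names are illustrative, not a formal model):

```python
# Back-of-the-envelope illustration of the 'Rule of Ten': the cost of
# finding and fixing a defect grows tenfold at each stage of the
# software delivery lifecycle. The stage names are illustrative.

stages = ["requirements", "development", "testing", "staging", "production"]
costs = {stage: 10 ** i for i, stage in enumerate(stages)}

# A bug that costs 1 unit to fix at the requirements stage costs
# 10,000 units once it reaches production.
```

On this model, catching a bug even one stage earlier is an order-of-magnitude saving, which is the whole argument for shifting quality left.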
As a result [of the impact of heinous software bugs], many teams are adopting what’s called a ‘shift-left’ approach, focusing their attention on quality earlier in the software delivery lifecycle to help detect bugs when they are least costly.
While most organisations adopt CI/CD tooling and practices for the purpose of agility, if done correctly, CI/CD can also play a crucial role in making these shift-left quality initiatives successful. After all, you can keep pumping code through your shiny new pipeline at breakneck speed, but if it’s poor quality code you’re just accelerating a lot of operational headaches.
Code Quality Gates
Incorporating test automation in your CI/CD pipeline can increase release velocity without jeopardising the production environment or creating technical debt – as long as you’re analysing your code for the right things. After making sure you have meaningful tests and code coverage, I’ve found that the best way to optimise a CI/CD pipeline for quality is by gating your builds based on severe error types. The concept of quality gates isn’t new, but this particular application within the CI/CD pipeline is one of its more powerful uses.
For my team, it’s the key to preventing Sev1s — so some examples of useful quality gates include:
- New Errors – Did the release introduce any errors that didn’t previously exist?
- Increasing Error Rate – Did the rate of a known error increase dramatically in this release?
- Slowdowns – Did the release introduce any slow or failing transactions?
These gates should be configured based on the code-level issues that matter to your application. If you aren’t sure what the right threshold is, there are numerous open source plugins that provide out-of-the-box recommended quality gates, which you can later tune as you continue to learn how different issues impact your system.
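As a minimal sketch of how the first two gates might be scripted as a build step (the error-report format and function names here are hypothetical, not any particular plugin’s API):

```python
# Minimal sketch of a build quality gate: compare error reports from the
# previous release and the candidate build, and fail the build when a
# gate trips. The report format (error name -> occurrence count) is a
# hypothetical example, not a real CI tool's schema.

def gate_build(previous, candidate, rate_multiplier=2.0):
    """Return (passed, reasons) for a candidate build's error report."""
    reasons = []

    # Gate 1: new errors that didn't previously exist
    new_errors = set(candidate) - set(previous)
    if new_errors:
        reasons.append(f"new errors introduced: {sorted(new_errors)}")

    # Gate 2: known errors whose rate increased dramatically
    for name, rate in candidate.items():
        old = previous.get(name)
        if old is not None and rate > old * rate_multiplier:
            reasons.append(f"error rate spike for {name}: {old} -> {rate}")

    return (not reasons, reasons)

previous = {"NullPointerException": 3, "TimeoutError": 1}
candidate = {"NullPointerException": 3, "TimeoutError": 9, "IOError": 2}

passed, reasons = gate_build(previous, candidate)
```

A real pipeline would run a check like this after the test stage and fail the job (blocking the release) whenever `passed` is false.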
Commit to committal
By evaluating thresholds of this nature within your CI/CD pipeline, you gain several advantages. Each code commit allows you to begin testing pieces of code for an upcoming release earlier in the process, increasing the speed of feedback and leaving more time to find and fix defects without disrupting deadlines.
It also reduces reliance on traditional manual testing methods and adds accountability to your development team without derailing release progress or sparking finger-pointing.
CI/CD is more than just a means of increasing agility. With the right quality gates, it can mean the difference between hoping your code will work in production… and knowing that it will.
Tableau is known for its specialism in business intelligence, big data analytics and (perhaps most famously) for its interactive data dashboard technologies.
But, Tableau still needs partners.
Among the colluding collaborators the company welcomed to its annual conference in Las Vegas this month was Rockset.
As a company, Rockset is best described as focused on ‘serverless search and analytics’ that works to enable ‘real-time SQL’ on NoSQL data – the technology is serverless because it does not require provisioning, capacity planning or server administration in the cloud.
Rockset used its time at the conference to announce the capability to build interactive live Tableau dashboards on NoSQL data, without requiring users to write any lines of code.
Making it possible to capture NoSQL data from sources such as Apache Kafka and Amazon DynamoDB, Rockset provides access to new data types and formats. The core concept here is to allow Tableau users to get operational monitoring and analytics on their business data.
ETL & data pipelines
The company suggests that businesses often fail to capture the full story of their data because they fail to capture NoSQL data from databases like DynamoDB and event-streaming platforms like Kafka. In an effort to bring NoSQL data into business intelligence, considerable data engineering effort goes into complex ETL and data pipelines, which are difficult to build, expensive to maintain and experience hours of latency – this, suggests Rockset, makes real-time analytics extremely hard.
“As unstructured and semi-structured data needs continue to evolve, NoSQL databases like DynamoDB and event streaming platforms like Kafka are quickly gaining momentum, bringing with them the additional complexity of less structured, higher velocity data,” said Venkat Venkataramani, co-founder and CEO of Rockset.
“Businesses need to make sense of their NoSQL data while moving from batch to real-time analytics, but legacy ETL pipelines are ill-suited to meet this need. Rockset enables the ability to do fast SQL on NoSQL, enabling Tableau users to easily analyse data from sources such as DynamoDB and Kafka.”
Rockset says it delivers millisecond-latency SQL directly on raw data, including nested JSON, XML, Parquet and CSV.
This enables developers and data engineers to build live dashboards and visualisations that query operational data from databases, data lakes and event streaming platforms via Java Database Connectivity (JDBC).
The newly redesigned Tableau Partner Network aims to provide a consistent, predictable foundation for partners across the globe as they grow their business and expand their offerings in close partnership with Tableau. The program’s three tracks – Reseller, Services and Technology – align with these core partner business models.
Tableau’s revamped program will better clarify the breadth and depth of its partner ecosystem to customers. For partners, new self-service capabilities that simplify engagement and better align resources will help them serve Tableau customers.
Also among the more vocal Tableau partners this year was enterprise AI company DataRobot.
The company used its time at the Tableau 2019 conference to announce an enhanced integration that aims to make it easier for analysts to visualise predictions within their Tableau dashboards. The features will allow Tableau users to publish predictions from DataRobot into a Tableau Data Source (TDS) that can be visualised alongside other business data.
The additional functionality gives business analysts without technical data science training (but with deep domain expertise) the ability to identify hidden patterns that would previously go undetected and analyse the cause-&-effect of different variables on a predicted outcome.
“With the demand for AI far exceeding the capacity of available data scientists and hundreds of potential use cases for AI across every business, it’s critical that forward-thinking companies scale AI efforts by broadening the pool of users who can participate in AI initiatives,” said Adam Weinstein, vice president of product management at DataRobot.
DataRobot says that its Enterprise AI Platform provides built-in best practices and guardrails for automation across the entire ‘AI lifecycle’ (AI now has a lifecycle, apparently).
The AI lifecycle (if indeed it exists) is the journey that companies take as they move from historical to predictive data analysis.
This is a guest post for the Computer Weekly Developer Network in our Continuous Integration (CI) & Continuous Delivery (CD) series.
Sinha says that CI/CD has meant that in large organisations, where the software deployment process once took days, today the build and test processes each take about five minutes — and when deployments are triggered, they take about an additional 10 minutes.
Sinha writes as follows…
Amazon managed to decrease the number of simultaneous outages and increase revenue by releasing code every 11.7 seconds on average. Netflix isn’t as fast – its developers release code only several times a day, yet it still manages to adjust to its customers’ needs.
What this means is that CI/CD has the most impact on products that need regular feature updates and, at the same time, continuous bug fixes.
In other words, when changing code is routine, CI/CD is an efficient way of developing software, enabling more frequent, meaningful and faster deployments, with updates released at any time in a sustainable way.
A CI/CD pipeline is a connected sequence of development processes broken down into stages, helping developers get quick feedback at different levels. CI is the first stage of the pipeline, laying the foundation for the continuous delivery and continuous deployment stages.
- Continuous Integration = Build, Test, Merge
- Continuous Delivery = Automatically release to repository
- Continuous Deployment = Automatically deploy to production
Continuous Integration is the process of building code and testing it automatically, making sure that all changes are regularly collected and integrated.
Successful CI is a practice that helps individual software developers make new code changes frequently (even daily) and integrate those changes into a shared repository without breaking or conflicting with the application’s existing code.
It enables multiple developers to work simultaneously on different features of the same application and merge their work with the existing code in a common place.
Once code is merged, the updates are validated by automatically building the application and running automated tests at various levels of the entire application. This ensures that the new changes have not broken existing code. If any conflict is discovered during automated testing, CI makes it possible to fix the offending bugs quickly and often.
Continuous delivery is a process that picks up from the point where CI ends. It allows teams to release updates from the main repository to customers. Regular delivery – whether monthly, weekly, daily or as often as possible – helps to streamline and automate the whole process, laying the foundation for continuous deployment.
Continuous deployment is an automated version of continuous delivery: after the continuous delivery stage, new changes are released to customers automatically. This eliminates the risk of human error, and only a failed test can prevent new changes from going through.
Such automation allows the development team to do away with release-day pressure and to deploy changes at the click of a button. Releasing code changes as soon as possible is advisable, as fixes made in smaller batches are easier than fixes made in larger ones. In addition, it helps to get user feedback faster and see results in a shorter time.
Hence, the difference between continuous delivery and continuous deployment comes down to automation: continuous deployment releases, unlike continuous delivery releases, are pushed out automatically.
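The distinction can be boiled down to a toy pipeline sketch, in which the only difference between delivery and deployment is whether the final push to production happens automatically (stage names and the log format are illustrative, not any real CI tool’s API):

```python
# Toy pipeline illustrating the delivery vs deployment distinction:
# both run build, test and release stages; only continuous deployment
# pushes to production automatically, with no human gate.

def run_pipeline(commit, auto_deploy=False):
    log = []
    log.append(f"build {commit}")                   # CI: build the change
    log.append(f"test {commit}")                    # CI: automated tests
    log.append(f"release {commit} to repository")   # continuous delivery
    if auto_deploy:                                 # continuous deployment
        log.append(f"deploy {commit} to production")
    return log

delivery = run_pipeline("abc123")                       # stops at the repository
deployment = run_pipeline("abc123", auto_deploy=True)   # goes all the way
```

In the delivery case, a human decides when the released build goes live; in the deployment case, a passing pipeline is the decision.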
Benefits of CI/CD
The integration and deployment of new code change into the existing application through automation at a faster pace not only makes the life of developers easier but also makes them more efficient. Some clear advantages to using this approach include:
Shorter release cycles: Back-to-back continuous releases using CD are quicker and easier.
Real-time visibility: End-user involvement and real-time feedback during CD increase the visibility of the development process.
Faster software builds for issue detection and quicker resolutions: The easy integration of new code changes through CI and deployment using CD helps to release changes earlier than would be possible without automation.
Code quality improvement: CI/CD reduces the human intervention that leads to conflicts. Automated integration into a common repository allows multiple builds a day with proper coordination between developers, reducing integration costs. It also helps to address issues in real time, leading to easier and on-time resolution.
In general, CI/CD automation not only makes developers more responsive but also minimises the cost of the software, leaving more room for further development.
The quick turnaround and real-world adaptability offered by CI/CD tools make them mandatory for any product-based technology company. Organisations that do not adapt to this hyper-reactive development and deployment cycle are under real threat of being left behind.
This is a guest post for the Computer Weekly Developer Network in our Continuous Integration (CI) & Continuous Delivery (CD) series.
This contribution is written by Marcus Merrell in his role as director of technical services at Sauce Labs – the company is known for its continuous, automated and live testing capabilities which can be used for cloud, web and mobile applications.
The Sauce Labs director says it can be tempting to look at the mechanics of the modern CI/CD pipeline and think it’s a black and white process, the success or failure of which is defined by coding skills and technology purchases.
Merrell writes as follows…
The reality for those on the ground, however, is considerably different, and considerably greyer than any notion of a black and white process.
While having skilled coders and arming them with the right development, testing and integration technologies is indeed important, soft skills and intangibles are often what make or break most development teams.
Sustainable soft skills
Take adaptability, for example. If there’s one constant in the development world, it’s change. Changes to organisational structure and business priorities happen all the time. Changes in customer behaviour and product requirements are equally recurrent. As are changes to the overall market. In the midst of change, it’s more important to have a development team that adapts well to new processes, leaves their collective ego out of it and takes constructive criticism from peers, than to have a team of superior coders and developers who are unable or (worse) unwilling to adapt to the inevitable cycles of change. That’s why, when building out your CI/CD team, soft skills are every bit as important (if not more) as coding acumen.
The good news for tech leaders and practitioners alike is that adaptability is a skill that can be honed and advanced by spending time to understand the roles and needs of other functional teams within the organisation, as well as the roles and needs of your customers.
Try to get your developers close to the customer, whether through your ‘customer success’ team, or by participating in customer advisory boards.
The more CI/CD teams understand and empathise with the challenges their customers and colleagues face, the more adaptable they’ll become.
But even adaptability won’t be enough to overcome modern development challenges if everyone on your team has the same resume or CV. The most successful CI/CD teams in the world embrace diversity and foster a culture of inclusion.
It’s impossible to overstate how important it is to have varying perspectives and life experiences on your development and delivery teams. Your customers are diverse, and your development team needs to be as well. You can’t put yourself into the mind of someone with a completely different background and set of life experiences from you. To develop and deliver software that meets your customers’ needs, you have to understand their needs. To understand those needs, you need people on your team who share their perspectives.
Most teams already have the requisite skills, invest in the right technologies, and instill the right processes and procedures.
What they typically do not have are the critical soft skills, and that will make the difference between a struggling team and a high-performing one.
ScyllaDB, the firm behind the Scylla NoSQL database, claims to be thinking big — and so, the organisation has used its annual Scylla Summit conference to detail a whole selection box of new features designed to serve real-time big data applications.
Scylla has unveiled new capabilities including Lightweight Transactions, Change Data Capture (CDC) and an Incremental Compaction Strategy.
CEO and cofounder of ScyllaDB Dor Laor explains that he knows his firm has become known for its approach to speed and reliability for latency-sensitive big data applications… and for its ability to help reduce storage costs.
Latency-sensitive applications are (perhaps obviously) chunks of enterprise software that cannot work effectively with any extended period of latency (or wait time), typically because they serve a real-time data need in some live operational/transactional deployment.
“But performance [for low-latency] is just part of what makes Scylla so powerful. With these latest features, we’re extending Scylla’s benefits to exciting new use cases and opening the door to a wide range of new functionality,” said Laor.
Among the new features is Lightweight Transactions (LWT), a development that is already committed into Scylla’s main source code branch.
LWT works to deliver ‘transaction isolation’ similar to that of traditional relational databases, which the company says will help bring Scylla to a new range of business-critical use cases. In database systems, isolation determines how transaction integrity is visible to other users and other systems… so it’s a good form of lock down and control where needed.
Going deeper here… LWT ensures, with a reasonable degree of certainty, that whenever you [i.e. your application] read a record, you see the version that was last written. Without LWT, you might read a record (call it Record A) off one node of the cluster just as someone was updating the record on another node. With LWT, the database updates Record A on all nodes at the same time, so the nodes don’t (or very rarely) disagree, and two applications querying the same record are much less likely to see two different versions.
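The primitive underneath this kind of conditional write is compare-and-set: a write is applied only if the record still holds the value the writer last read. As an illustrative toy model (an in-memory dictionary, not Scylla’s actual API):

```python
# Toy in-memory model of compare-and-set, the conditional-write primitive
# behind lightweight transactions. Not Scylla's API - purely illustrative.

class Store:
    def __init__(self):
        self.records = {}

    def read(self, key):
        return self.records.get(key)

    def compare_and_set(self, key, expected, new):
        """Apply the write only if the current value matches 'expected'."""
        if self.records.get(key) == expected:
            self.records[key] = new
            return True
        return False

store = Store()
store.compare_and_set("record_a", None, "v1")       # initial insert succeeds

seen = store.read("record_a")                       # writer A reads "v1"
store.compare_and_set("record_a", "v1", "v2")       # writer B updates first
ok = store.compare_and_set("record_a", seen, "v3")  # writer A's stale write fails
```

Without the conditional check, writer A’s stale write would silently overwrite writer B’s update; with it, the conflict is detected and A can re-read and retry.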
The company will also soon release a Scylla Open Source version that includes this LWT feature.
Change Data Capture (CDC)
Scylla reminds us that the modern application stack is no longer a monolith.
Because of this core truth, we see microservices that need to constantly push and pull data from their persistence layer. CDC enables Scylla users to stream updates to datasets with external systems, such as analytics or ETL tools.
Scylla CDC identifies and exports only new or modified data, instead of requiring a full database scan. Beyond its efficiency, CDC allows organisations to use Scylla tables interchangeably, opening new possibilities for users to consume data. “CDC is already committed into Scylla’s main source code branch. We will soon release a Scylla Open Source version that includes this feature,” noted the company, in a press statement.
Incremental Compaction Strategy (ICS)
Reducing storage costs by what are said to be up to 40%, Incremental Compaction will soon be available with Scylla Enterprise and Scylla Cloud.
While compaction reduces disk space usage by removing unused and old data from database files, the operation itself typically suffers from space amplification. ICS lowers costs significantly by improving this operation.
Finally, there’s a DynamoDB-compatible API for Scylla Cloud: Project Alternator, which presents an alternative to Amazon DynamoDB… and the technology is now available in beta for Scylla Cloud, the company’s powerful database as a service (DBaaS).
Applications written for DynamoDB can now run on Scylla Cloud without requiring code changes. This enables DynamoDB users to quickly transition to Scylla Cloud to significantly reduce costs, improve performance, and take advantage of Scylla’s cloud and hybrid topologies.
In more detail, PagerDuty offers what it calls a Digital Operations Management Platform that aims to integrate machine data alongside human intelligence on the road to building interactive applications that benefit from optimised response orchestration, continuous development and delivery.
We need to remember that CI/CD is not just Agile iteration, but a direct comparison isn’t quite right either – it’s a little bit like asking for the differences between a car and an engine.
CI/CD can definitely enhance and support Agile methodologies, but it’s not directly linked. You don’t necessarily need to be doing Agile in a structured way to get the benefits of CI/CD. CI/CD is also a great way to support feedback loops, but they aren’t required.
That’s not to say I don’t support or endorse good feedback loops – they are critical to software success. But they’re slightly orthogonal to CI/CD and shouldn’t be conflated.
You can absolutely do CI/CD without feedback loops – CI/CD is about how changes are tested and deployed. Your CI/CD implementation will still “work” even if you aren’t getting feedback – it’s about how the software is built and shipped. I can write a great CI/CD pipeline that has nothing to do with feedback, and it will work well and accomplish its goal, which is to ensure that the software is shippable at any given time.
No plug-&-play CI/CD
Nothing in CI/CD by itself will result in fewer bugs.
You can’t just ‘install some CI/CD’ and increase quality. Having useful, high-quality tests as part of your pipeline is what will help reduce bugs – or at least make those bugs cheaper to fix. But bugs will happen. We want to detect them as close to the introduction of the bug as possible and CI/CD provides the ability to do so; but it won’t do it ‘automatically’. You need to include proper tests (including security tests, for example) at all stages of the pipeline… and as early in the process as possible. Your pipeline is only as good as you build it, so keep reviewing it continually, just as you would review your application code. If a bug escapes into production, the postmortem on that incident should include any potential improvements to your deployment process, so a similar bug is caught earlier next time.
So how frequent should CI/CD cycles be? Well, how frequent is an easy point to answer – because every commit should trigger the pipeline. Smaller batch sizes always help.
That being said, executing your CI/CD pipeline doesn’t mean you release/deploy to production! If you have rolled up a week’s worth of changes into one merge, it becomes a lot harder to see what in that change caused an issue. There is no magic frequency number – you should be able to deploy/release as quickly and as often as your business requires. CI/CD is about ensuring that all changes are shippable at any given time and available to be released to production when the business requires it.
A question of concurrency
If your deployments and builds require a lot of computing and are taking longer than you would like, looking at ways to parallelise them is definitely helpful, but by no means required. I would caution folks against over-optimising for an issue until it actually comes up.
Start small and scale when you do hit the cases that require you to scale.
I find that I reiterate the distinction between how CI/CD contrasts with continuous deployment quite often. The only difference between continuous delivery and continuous deployment is whether the last step – the push to production – is either automatic or a human gate. Not everyone is ready for continuous deployment (or may never need it!) but everyone can, and should, engage in continuous delivery. We should be testing the deployment of every change, just like we perform functional tests of every change. That’s continuous delivery.
It’s also important to note that the difference between functional and unit tests has nothing to do with whether they exist in CI/CD or elsewhere. The difference is in what is being tested.
That being said, unit tests tend to exist in the “CI” part, which is to say they are testing the code without it being deployed somewhere. The functional tests require the application to be running, which is more of the “CD” side of the equation.
A document trail is only as valuable as its readability by others – and how often it is actually looked at.
Stack traces aren’t always helpful for someone other than the engineer who wrote the code – having your tests provide meaningful, human-readable output is key.
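A small, hypothetical illustration of that point: the assertion message below turns a bare stack trace into something a reader outside the original team can act on.

```python
# Sketch: a bare `assert actual == expected` yields only a stack trace;
# an assertion message explains the failure in domain terms.
def check_inventory(expected, actual, sku):
    assert actual == expected, (
        f"Inventory mismatch for SKU {sku}: expected {expected} units, "
        f"found {actual} -- was a reservation not released?"
    )

check_inventory(5, 5, "SKU-1")  # passes silently
```

When this check fails, the test output names the SKU, the counts and a likely cause, rather than leaving the reader to reverse-engineer the code.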
This is a guest post for the Computer Weekly Developer Network written by Mathew Lodge in his capacity as CEO of Diffblue — the company is known for its work as it aims to ‘revolutionise’ software engineering through Artificial Intelligence (AI) and offer what it calls an #AIforCode experience.
Diffblue’s Lodge reminds us that, in the last decade, automation of software development pipelines has rapidly taken off as more teams have adopted DevOps practices and cloud-native architecture.
Automated software pipelines have led to The Rise of The Bots: robot assistants within the continuous integration loop that automate tedious and repetitive tasks like updating project dependencies to avoid security flaws.
Lodge says that today, bots can generate ‘pull requests’ (a pull request is a method of submitting contributions to an open or other development project) to update dependencies and those requests are reviewed by other bots and, if they pass the tests, automatically merged.
Lodge writes as follows…
The crucial part that makes all of this [AI coding] work is tests.
Without tests to quickly validate commits, automated pipelines will risk automatically promoting junk – which is much harder and slower to fix later in the software delivery process.
In his canonical 2006 article on Continuous Integration, Martin Fowler pithily notes that: “Imperfect tests, run frequently, are much better than perfect tests that are never written at all.”
AI for code: developing at scale
Writing tests is like eating healthily and drinking enough water: everyone aspires to do it, but life tends to get in the way.
It’s often the least enjoyable part of development and it takes time and attention away from the more interesting stuff. So automation seems like it would be a great fit – except that the rules-based automation that works well for dependency-checking bots does not work well for automating test generation; it’s a much harder problem to solve.
AI-based test-writing approaches that apply new algorithms to the problem have emerged in the last few years. Machine learning-based tools can look at browser-based UI testing code, compare it to the Document Object Model… and suggest how to fix failing tests based on training data from analysing millions of UI tests.
But much more code will have to be written by AI to move the needle.
Gartner has estimated that by 2021, demand for application development will grow five times faster than tech teams can deliver. So we’re now seeing the emergence of AI that writes full unit test code, by analysing the program to understand what it does and connecting inputs to outputs.
While the tests aren’t perfect, as no one has solved the halting problem and other well-known challenges in program analysis, the tests are good enough – and infinitely better than perfect tests that were never written.
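As an illustration of “connecting inputs to outputs” – the `leap_year` function and the generated-style test are invented for this sketch, not output from any specific tool – a test-writing AI would observe concrete input/output pairs and pin them down as assertions:

```python
# Illustration: the kind of unit test a test-generation tool might emit
# after observing how leap_year maps sample inputs to outputs.
def leap_year(y):
    return y % 4 == 0 and (y % 100 != 0 or y % 400 == 0)

def test_leap_year_generated():
    # generated-style assertions pinning observed input/output pairs
    assert leap_year(2000) is True   # divisible by 400
    assert leap_year(1900) is False  # divisible by 100 but not 400
    assert leap_year(2024) is True   # ordinary leap year
    assert leap_year(2023) is False  # common year
```

Tests like these don’t prove the function correct, but they lock in current behaviour so a future regression fails the pipeline immediately.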
Benefits beyond automation
AI for code can do more than simply increase the speed at which developers work: it can actually improve the quality of the finished software product, and reduce the amount of required debugging. It can quickly complete repetitive tasks, without losing interest or making mistakes as humans sometimes do. Automating the boring (but necessary) parts of the job can also prevent burnout and increase job satisfaction at a time when companies have to compete for the best talent.
With AI for code, the developers of tomorrow will have more freedom to innovate in the way only people can – benefits that go beyond what’s possible with automation alone. Expect the future of software development to be increasingly AI-assisted.
Recent discussions with database company Scylla threw up the term close-to-the-metal, or some simply say close-to-metal.
But what does close-to-the-metal mean?
The Computer Weekly Developer Network team has gathered a handful of comments and definitions on this subject and lists them (in part, with full credit and links) below for your reference.
Essentially, close-to-the-metal means database software that works in close proximity to and with knowledge of the actual instruction set and addresses of the hardware system that it is built to run on.
This means that the database (or potentially other software program type) itself can work to ‘squeeze’ out as much power for any given hardware estate (the process of scaling up) before it then needs to expand with further processing and analytics nodes (the process of scaling out).
As noted by wikic2, close-to-the-metal (or close-to-the-hardware) means we’re deep in the guts of the system. “The C [programming language’s] memory management is considered close-to-the-metal compared to other application languages because one can easily see and do mathematics on actual hardware RAM addresses (or something pretty close to them).”
The above-linked definition suggests that close-to-the-metal can sacrifice hardware choice through lock-in and may introduce risk because there is no interface layer to protect against silly or dangerous ranges, settings, or values.
Roger DiPaolo provides an additional (and much-needed) piece of extra colour here when he says that close-to-the-metal, in programming terms, means that the language compiles (or assembles) all the way down to native machine code for the CPU it is to run on.
“This is so that the code has no ‘layers’ it has to go through to get to the CPU at run time (such as a Virtual Machine or interpreter). A close-to-the-metal language has the facilities for directly manipulating RAM and hardware registers. C and C++ can both do this.”
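To give a flavour of “doing mathematics on actual RAM addresses” without writing C itself, this sketch uses Python’s standard `ctypes` module (Python’s window onto C-level memory) to take a buffer’s raw address and read a byte via pointer arithmetic – purely an illustration of the concept the quotes above describe:

```python
import ctypes

# Allocate a raw C buffer and obtain its actual memory address.
buf = ctypes.create_string_buffer(b"hello")
addr = ctypes.addressof(buf)

# Pointer arithmetic on the raw address: read the byte at addr + 1.
second = ctypes.c_char.from_address(addr + 1).value
assert second == b"e"
```

A genuinely close-to-the-metal language does this natively with no interpreter in between; here the interpreter is exactly the ‘layer’ DiPaolo says such languages avoid.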
So, trade-offs or not, this is the approach Scylla has taken to building its core technology proposition.
The company claims that independent tests show a cluster of Scylla servers reading 1 billion rows per second (RPS) – performance the firm says ranks ‘far beyond’ the capabilities of a database using persistent storage.
“Everyone thought you’d need an in-memory database to hit [MOPS] numbers like that,“ said Dor Laor, CEO of ScyllaDB. “It shows the power of Scylla’s close-to-the-metal architecture. For 99.9% of applications, Scylla delivers all the power [a customer] will ever need, on workloads other NoSQL databases can’t touch and at a fraction of the cost of an in-memory solution. Scylla delivers real-time performance on infrastructure organisations are already using.”
Bare-metal platform provider Packet partnered with Scylla to conduct the test on 83 of its n2.xlarge servers, each running a meaty 28 physical cores.
The benchmark populated the database with randomly generated temperature readings from 1 million simulated temperature sensors that reported every minute for 365 days, producing a total of 526 billion data points.
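The quoted total is easy to sanity-check from the benchmark’s own parameters: one reading per sensor per minute, over a year.

```python
# Quick arithmetic check on the benchmark's data volume.
sensors = 1_000_000
minutes_per_year = 60 * 24 * 365       # 525,600 minutes in a 365-day year
data_points = sensors * minutes_per_year
print(data_points)                     # 525600000000, i.e. ~526 billion
```

That works out to 525.6 billion readings, matching the roughly 526 billion figure quoted.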