Coffee Talk: Java, News, Stories and Opinions


July 3, 2017  11:26 AM

Advancing JVM performance with the LLVM compiler

cameronmcnz Cameron McKenzie Profile: cameronmcnz

The following is a transcript of an interview between TheServerSide’s Cameron W. McKenzie and Azul Systems’ CTO Gil Tene.

Cameron McKenzie: I always like talking to Gil Tene, the CTO of Azul Systems.

Before I jump on the phone, PR reps often send me a PowerPoint of what we’re supposed to talk about. But with Tene, I always figure that if I can jump in with a quick question before he gets into the PowerPoint presentation, I can get him to answer some interesting questions that I want the answers to. He’s a technical guy and he’s prepared to get technical about Java and the JVM.

Now, the reason for our latest talk was Azul Systems’ 17.3 release of Zing, which includes a new LLVM-based just-in-time compiler code-named Falcon. Apparently, it’s incredibly fast, like all of Azul Systems’ JVMs typically are.

But before we got into discussing Azul Systems’ Falcon just-in-time compiler, I thought I’d do a bit of bear-baiting with Gil and tell him that I was sorry that in this new age of serverless computing and cloud and containers, and a world where nobody actually buys hardware anymore, it must be difficult flogging a high-performance JVM when nobody’s going to need to download one and install it locally. Well, anyways, Gil wasn’t having any of it.

Gil Tene: So, the way I look at it is actually we don’t really care because we have a bunch of people running Zing on Amazon, so where the hardware comes from and whether it’s a cloud environment or a public cloud or private cloud, a hybrid cloud, or a data center, whatever you want to call it, as long as people are running Java software, we’ve got places where we can sell our JVM. And that doesn’t seem to be happening less, it seems to be happening more.

Cameron McKenzie: Now, I was really just joking around with that first question, but that brought us into a discussion about using Java and Zing in the cloud. And actually, I’m interested in that. How are people using Java and JVMs they’ve purchased in the cloud? Is it mostly EC2 instances or is there some other unique way that people are using the cloud to leverage high-performance JVMs like Zing?

Gil Tene: It is running on EC2 instances. In practical terms, most of what is being run on Amazon today is run as virtual instances on the public cloud. They end up looking like normal servers running Linux on an x86 somewhere, but they run on Amazon, and they do it very efficiently and very elastically; they are very operationally dynamic. And whether it’s Amazon or Azure or the Google Cloud, we’re seeing all of those happening.

But in many of those cases, that’s just a starting point where instead of getting a server or running your own virtualized environment, you just do it on Amazon.

The next step is usually that you operationally adapt to using the model, so people no longer have to plan and know how much hardware they’re going to need in three months time, because they can turn it on anytime they want. So they can empower teams to turn on a hundred machines on the weekend because they think it’s needed, and if they were wrong they’ll turn them off. But that’s no longer some dramatic thing to do. Doing it in a company internal data center? It’s a very different thing from a planning perspective.

But from our point of view, that all looks the same, right? Zing and Zulu run just fine in those environments. And whether people consume them on Amazon or Azure or in their own servers, to us it all looks the same.

Cameron McKenzie: Now, cloud computing and virtualization is all really cool, but we’re here to talk about performance. So what do you see these days in terms of bare iron deployments or bare metal deployments or people actually deploying to bare metal and if so, when are they doing it?

Gil Tene: We do see bare metal deployments. You know, we have a very wide mix of customers, so we have everything from e-commerce and analytics and customers that run their own stuff, to banks obviously, that do a lot of stuff themselves. There is more and more of a move towards virtualization in some sort of cloud, whether it’s internal or external. So I’d say that a lot of what we see today is virtualized, but we do see a bunch of the bare metal in latency-sensitive environments or in dedicated super environments. So for example, a lot of people will run dedicated machines for databases or for low-latency trading or for messaging because they don’t want to take the hit for what the virtualized infrastructure might do to them if they don’t.

But having said that, we’re seeing some really good results from people on consistency and latency and everything else running just on the higher-end Amazon. So for example, Cassandra is one of the workloads that fits very well with Zing and we see a lot of turnkey deployments. If you want Cassandra, you turn Zing on and you’re happy, you don’t look back. In an Amazon, that type of cookie-cutter deployment works very well. We tend to see that the typical instances that people use for Cassandra in Amazon with or without us is they’ll move to the latest greatest things that Amazon offers. I think the i3 class of Amazon instances right now are the most popular for Cassandra.

Cameron McKenzie: Now, I believe that the reason we’re talking today is because there is some big news from Azul. So what is the big news?

Gil Tene: The big news for us was the latest release of Zing. We are introducing a brand-new JIT compiler to the JVM, and it is based on LLVM. The reason this is big news, we think, especially in the JVM community, is that the current JIT compiler that’s in use was first introduced 20 years ago. So it’s aging. And we’ve been working with it and within it for most of that time, so we know it very well. But a few years ago, we decided to make the long-term investment in building a brand-new JIT compiler in order to be able to go beyond what we could before. And we chose to use LLVM as the basis for that compiler.

Java had a very rapid acceleration of performance in the first few years, from the late ’90s to the early 2000s, but it’s been a very flat growth curve since then. Performance has improved year over year, but not by a lot, not by the way that we’d like it to. With LLVM, you have a very mature compiler. C and C++ compilers use it, Swift from Apple is based on it, Objective-C as well, and the Rust language is based on it. And you’ll see a lot of exotic things done with it as well, like database query optimizations and all kinds of interesting analytics. It’s a general compiler and optimization framework that has been built for other people to build things with.

It was built over the last decade, so we were lucky enough that it was mature by the time we were making a choice in how to build a new compiler. It incorporates a tremendous amount of work in terms of optimizations that we probably would have never been able to invest in ourselves.

To give you a concrete example of this, the latest CPUs from Intel, the ones currently running whether on bare metal or powering most Amazon servers today, have some really cool new vector optimization capabilities. There are new vector registers and new instructions, and you can do some really nice things with them. But that’s only useful if you have an optimizer that’s able to make use of those instructions when it knows they’re there.

With Falcon, our LLVM-based compiler, you take regular Java loops that would run normally on previous hardware, and when our JVM runs on new hardware, it recognizes the capabilities and basically produces much better loops that use the vector instructions to run faster. And here, you’re talking about factors that could be 50%, 100%, or sometimes even two or three times faster, because those instructions are that much faster. The cool thing for us is not that we sat there and thought of how to use the latest Broadwell chip instructions, it’s that LLVM does that for us without us having to work hard.
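To make that concrete, here is a minimal sketch of the kind of plain Java loop an optimizer like this could, in principle, turn into vector code. The class and method names are invented for illustration; this is not Azul’s code, just the shape of loop a vectorizing JIT looks for:

public class VectorFriendlyLoop {

   // A simple, dependency-free loop over primitive arrays. A vectorizing
   // JIT can, in principle, compile this body down to wide SIMD
   // instructions that process several elements per iteration,
   // without any change to the Java source.
   static void scaleAndAdd(float[] a, float[] b, float[] out, float factor) {
      for (int i = 0; i < out.length; i++) {
         out[i] = a[i] * factor + b[i];
      }
   }
}

The loop is written as ordinary Java; whether it ends up as scalar or vector machine code is entirely the compiler’s decision.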

Intel has put work into LLVM over the last two years to make sure that the backend optimizers know how to do this stuff. And we just need to bring the code to the right form and the rest is taken care of by other people’s work. So that’s a concrete example of extreme leverage. As the processor hits the market, we already have the optimizations for it. So it’s a great demonstration of how a runtime like a JVM can run the exact same code and, when you put it on new hardware, it’s not just the better clock speed and not just slightly faster, it can actually use the new instructions to literally run the code better, and you don’t have to change anything to do it.

Cameron McKenzie: Now, whenever I talk about high-performance JVM computing, I always feel the need to talk about potential JVM pauses and garbage collection. Is there anything new in terms of JVM garbage collection algorithms with this latest release of Zing?

Gil Tene: Garbage collection is not big news at this point, mostly because we’ve already solved it. To us, garbage collection is simply a solved problem. And I do realize that that often sounds like what marketing people would say, but I’m the CTO, and I stand behind that statement.

With our C4 collector in Zing, we’re basically eliminating all the concerns that people have with garbage collections that are above, say, half a millisecond in size. That pretty much means everybody except low-latency traders simply don’t have to worry about it anymore.

When it comes to low-latency traders, we sometimes have to have some conversations about tuning. But with everybody else, they stop even thinking about the question. Now, that’s been the state of Zing for a while now, but the nice thing for us with Falcon and the LLVM compiler is we get to optimize better. So because we have a lot more freedom to build new optimizations and do them more rapidly, the velocity of the resulting optimizations is higher for us with LLVM.

We’re able to optimize around our garbage collection code better and get even faster code for the Java applications running it. But from a garbage collection perspective, it’s the same as it was in our previous release and the one before that because those were close to as perfect as we could get them.

Cameron McKenzie: Now, one of the complaints people who use JVMs often have is the startup time. So I was wondering if there’s anything new in terms of the technologies you put into your JVM to improve JVM startup? And for that matter, I was wondering what you’re thinking about Project Jigsaw and how the new modularity that’s coming in with Java 9 might impact the startup of Java applications.

Gil Tene: So those are two separate questions. And you probably saw in our material that we have a feature called ReadyNow! that deals with the startup issue for Java. It’s something we’ve had for a couple of years now. But, again, with the Falcon release, we’re able to do a much better job. Basically, the JVM gets up to speed with a much better vertical rise right when it starts.

The ReadyNow! feature is focused on applications that basically want to reduce the number of operations that go slow before you get to go fast, whether it’s when you start up a new server in the cluster and you don’t want the first 10,000 database queries to go slow before they go fast, or whether it’s when you roll out new code in a continuous deployment environment where you update your servers 20 times a day, so you roll out code continuously and, again, you don’t want the first 10,000 or 20,000 web requests for every instance to go slow before they get to go fast. Or the extreme examples of trading where, at market open conditions, you don’t want to be running your highest-volume and most volatile trades at interpreted Java speed before they become optimized.

In all of those cases, ReadyNow! is basically focused on having the JVM hyper-optimize the code right when it starts rather than profile and learn and only optimize after it runs. And we do it with a very simple to explain technique, it’s not that simple to implement, but it’s basically we save previous run profiles and we start a run assuming or learning from the previous run’s behavior rather than having to learn from scratch again for the first thousand operations. And that allows us to run basically fast code, either from the first transaction or the tenth transaction, but not from the ten-thousandth transaction. That’s a feature in Zing we’re very proud of.

To the other part of your question about startup behavior, I think that Java 9 is bringing in some interesting features that could over time affect startup behavior. It’s not just the Jigsaw parts; it’s certainly the idea that you could perform some sort of analysis on code enclosed in modules and try to optimize some of it for startup.

Cameron McKenzie: So, anyways, if you want to find out more about high-performance JVM computing, head over to Azul’s website. And if you want to hear more of Gil’s insights, follow him on Twitter, @giltene.
You can follow Cameron McKenzie on Twitter: @cameronmckenzie

December 13, 2018  1:30 PM

Learn Java lambda syntax quickly with these examples

cameronmcnz Cameron McKenzie Profile: cameronmcnz

For those who are new to functional programming, basic Java lambda syntax can be a bit intimidating at first. Once you break lambda expressions down into their component parts, though, the syntax quickly makes sense and becomes quite natural.

The goal of a lambda expression in Java is to implement a single method. All Java methods have an argument list and a body, so it should come as no surprise that these two elements are an important part of Java lambda syntax. Furthermore, the Java lambda syntax separates these two elements with an arrow. So to learn Java lambda syntax, you need to be familiar with its three component parts:

  1. The argument list
  2. The arrow
  3. The method body

To apply these concepts, we first need a functional interface. A functional interface is an interface that only defines a single method that must be implemented. Here is the functional interface we will use for this example:

interface SingleArgument {
   public void foo(String s);
}

An implementation of this method requires a String to be passed in and a body that performs some logic on the String. We will break it down into its constituent elements in a moment, but for now, here’s a very basic example in which a lambda provides an implementation to the SingleArgument interface, along with a couple of invocations of the interface’s foo method:

SingleArgument sa1 =  n -> System.out.print(n);
sa1.foo("Let us all");
sa1.foo(" learn lambda syntax");

The following is a complete class implementing this logic:

package com.mcnz.lambda;

public class LearnJavaLambdaSyntax {
	
   public static void main(String args[]) {	
      SingleArgument sa1 =  n -> System.out.print(n);
      sa1.foo("Let us all");
      sa1.foo(" learn lambda syntax");
   }
}

interface SingleArgument {
   public void foo(String s);
}

Concise and verbose Java lambda syntax

The implementation demonstrated here is highly abbreviated. This can sometimes make it a bit difficult for newcomers to learn Java lambda syntax. It is sometimes helpful, then, to add a bit more ceremony to the code. One enhancement that can make it easier to learn Java lambda syntax is to put round brackets around the argument list and include type declarations on the left-hand side:

SingleArgument sa2 =  (String n) -> System.out.print(n) ;

Furthermore, you can put curly braces around the content on the right-hand side and end each statement with a semi-colon.

SingleArgument sa3 =  (String n) -> { System.out.print(n); } ;

Compare these different approaches to learn Java lambda syntax.

Multi-line lambda expression syntax

In fact, if your method implementation has more than a single statement, semi-colons and curly braces become a requirement. For example, if we wanted to use a regular expression to strip out all of the whitespace before printing out a given piece of text, our Java lambda syntax would look like this:

(String n) -> {
    n = n.replaceAll("\\s","");
    System.out.print(n);
}

Multi-argument lambda functions

In this example, the method in the functional interface has only one argument, but multiple arguments are completely valid, so long as the number of arguments in the lambda expression matches the number in the method of the functional interface. And since Java is a strongly typed language, the object types must be a polymorphic match as well.

Take the following functional interface as an example:

interface MultipleArguments {
   public void bar(String s, int i);
}

The highly ceremonial Java lambda syntax for implementing this functional interface is as follows:

MultipleArguments ma1 = (String p, int x) -> {
   System.out.printf("%s wants %s slices of pie.\n", p, x);
};

As you can see, this lambda expression leverages multiple arguments, not just one.

I described this example as being highly ceremonial because we can significantly reduce its verbosity. We can remove the type declarations on the left, and we can remove the curly braces and the semi-colon on the right since there is only one instruction in the method implementation. A more concise use of Java lambda syntax is as follows:

( p, x ) -> System.out.printf ( "%s wants %s slices.\n", p, x )

As you can see, Java lambda syntax is quite a bit different from anything traditional JDK developers are used to, but at the same time, when you break it down, it’s easy to see how all the pieces fit together. With a bit of practice, developers quickly learn to love Java lambda syntax.

Here is the full listing of code used in this example:

package com.mcnz.lambda;

public class LearnJavaLambdaSyntax {
	
  public static void main(String args[]) {
		
    SingleArgument sa1 =  n -> System.out.print(n);
    sa1.foo("Let us all ");
    sa1.foo("learn Java lambda syntax.\n");

    SingleArgument sa2 =  (String n) -> System.out.print(n);
    sa2.foo("Java lambda syntax ");
    sa2.foo("isn't hard.\n");
		
    SingleArgument sa3 =  (String n) -> { System.out.print(n); };
    sa3.foo("You just need a few ");
    sa3.foo("good Java lambda examples.\n");
		
    SingleArgument sa4 =  (String n) -> {
      n = n.replaceAll("\\s","");
      System.out.print(n);
    };
    sa4.foo("This Java lambda example ");
    sa4.foo("will not print with whitespace.\n");
		
    MultipleArguments ma1 = (String p, int x) -> {
      System.out.printf("%s1 wants %s2 slices of pie.\n", p, x);
    };
    ma1.bar("Cameron ", 3);
    ma1.bar("Callie", 4);
		
    MultipleArguments ma2 = 
      ( p, x ) -> System.out.printf ( "%s1 wants %s2 slices.\n", p, x );
    ma2.bar("Brandyn", 1);
    ma2.bar("Carter", 2);
	
  }

}

interface SingleArgument {
  public void foo(String s);
}

interface MultipleArguments {
 public void bar(String s, int i);
}

When this Java lambda syntax example runs, the full printout is:

Let us all learn Java lambda syntax.
Java lambda syntax isn't hard.
You just need a few good Java lambda examples.
ThisJavalambdaexamplewillnotprintwithwhitespace.Cameron 1 wants 32 slices of pie.
Callie1 wants 42 slices of pie.
Brandyn1 wants 12 slices.
Carter1 wants 22 slices.

You can find the source code used in this tutorial on GitHub.


December 1, 2018  11:33 PM

What is a lambda expression and from where did the term ‘lambda’ elute?

cameronmcnz Cameron McKenzie Profile: cameronmcnz

Due to various language constraints, lambda expressions had, until recently, never made it into the Java language. The concept had long been baked into other languages, such as Groovy and Ruby. That all changed with Java 8. As organizations slowly move to Java 8 and Java 11 platforms, more and more developers are getting the opportunity to use lambda expressions — or lambda functions, as they’re also called — in their code. This has generated a great deal of excitement but also confusion. Many developers have questions. So, why is this new language feature called a lambda function?

Why are they called ‘lambda functions?’

The term lambda function actually has its roots in lambda calculus. In mathematics, a lambda function is one in which a single set of assigned variables is mapped to a calculation. Here are a few algebraic lambda functions. Anyone who took high school math should recognize them.

(x) = x²
(x, y) = x + y
(x, y, z) = x³ - y² + z

For the first equation, if x is 3, the function would evaluate to 9. If x and y are both 2 for the second function, the result is 4. If x, y and z are 1, 2 and 3 in the third function, the calculated result is zero.

As you can see, a single set of variables is mapped onto a function, which generates a result. The corollary in computer science is to take a set of variables and map those variables to a single function. Let’s place extra emphasis on the word single. Lambdas work when there is only a single function to implement. The concept of a lambda completely falls apart in computer science when multiple methods get thrown into the mix.
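To see how that maps onto Java, here is a minimal sketch of a single-method interface and a lambda that maps the variables to one calculation, much like the (x, y) = x + y function above. The interface and variable names are invented for this example:

interface TwoVariableFunction {
   int apply(int x, int y);
}

// The lambda maps the variable set (x, y) to a single calculation: x + y
TwoVariableFunction sum = (x, y) -> x + y;
System.out.println(sum.apply(2, 2));   // prints 4, just like the math above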

The anonymous nature of lambda functions

A second point worth mentioning is that lambda functions are anonymous and unnamed. That’s not an obvious point when dealing with mathematical functions, but if you look at the first function listed earlier, the following was sufficient to explain everything that was going on:

(x) = x²

There was no need to give the function a name, such as:

basicParabola (x) = x²

In this sense, lambda functions are unnamed and anonymous.

Lambda functions in Java

This discussion on the etymology of lambda functions is interesting, but the real question is how these mathematical concepts translate into Java.


An example of a lambda function in a Java program.

In Java, there are many, many places in which a piece of code needs a single method implementation. And there are many interfaces in the Java API where only a single method needs to be implemented. Also known as functional interfaces, commonly used single-method interfaces include Flushable, Runnable, Callable, Comparator, ActionListener, FileFilter, XAConnection and RowSetWriter. Using any of these interfaces in Java can be somewhat cumbersome. For example, Comparator is a functional interface that allows you to rank objects for easy sorting. Code for sorting an array prior to Java 8 would look something like this:

Integer[] numbers = {5, 12, 11, 7};
Arrays.sort(numbers, new Comparator<Integer>() {
   public int compare(Integer a, Integer b) {
      return b - a;
   }             
});

System.out.println(Arrays.toString(numbers));

When you use a lambda function, the verbosity goes away, and the result is this:

Integer[] numbers = {5, 12, 11, 7};
Arrays.sort(numbers, (a, b) -> b-a);
System.out.println(Arrays.toString(numbers));

An implementation of the Comparator interface with and without a lambda function.

When you first learn to use lambda expressions, sometimes it’s easier to assign the lambda expression to the interface it’s implementing in a separate step. The prior example might read a bit clearer if coded with an additional line:

Integer[] numbers = {5, 12, 11, 7};
Comparator<Integer> theComparator = (a, b) -> b - a;
Arrays.sort(numbers, theComparator); 
System.out.println(Arrays.toString(numbers));

As you can see, lambda expressions make code much more concise. Lambda expressions make code much easier to write and maintain and have relatively few drawbacks. The main drawback is that the syntax may seem somewhat cryptic to new users. After a little bit of time with lambdas, though, it becomes natural. Developers will wonder how they ever managed to write code without them.

The code for these Lambda expressions in Java example can be found on GitHub.


November 30, 2018  5:21 PM

DeepCode and AI tools poised to revolutionize static code analysis

George Lawton Profile: George Lawton

Developers use static analysis tools to identify problems in their code sooner in the development lifecycle. However, the overall architecture of these tools has only changed incrementally with the addition of new rules crafted by experts. Researchers are now, though, starting to use AI to automatically generate much more elaborate rule sets for parsing code. This can help identify more problems earlier in the lifecycle and provide better feedback.

Some companies, like the game maker Ubisoft, are already working on these kinds of tools internally. A team of researchers at ETH Zurich is now making a similar AI tool, called DeepCode, available for mainstream adoption. It analyzes Java, JavaScript and Python code using about 250,000 rules, compared to about 4,000 for traditional static analyzer tools. We caught up with Boris Paskalev, CEO at DeepCode, to find out how this works and what’s next.

What experiences and related work informed your decision in using AI to improve software development?

Boris Paskalev: The idea for using AI to improve software came from longer-term research done at the Secure, Reliable, and Intelligent Systems Lab at the Department of Computer Science, ETH Zurich (http://www.sri.inf.ethz.ch). During a period of several years, we explored a number of concepts, built several research systems based on these (some of which are widely used), and received various awards. We observed the enormous impact our technology could have on software construction. As a result, we started DeepCode with the vision of pushing the limit of these AI techniques and bringing those benefits to every software developer worldwide.

How does DeepCode compare with other approaches like static or dynamic analysis in terms of usage, performance, or the kinds of problems it can identify?

Paskalev: DeepCode relies on a creative and non-trivial combination of static analysis and custom machine learning algorithms. Unlike traditional static analysis, it does not rely on manually hardcoded rules, but learns these automatically from data and uses them to analyze your program. This concept of never-ending learning enables the system to constantly improve with more data, without supervision.

DeepCode also enables zero-configuration analysis, which means one can simply point a repository at DeepCode and get results several seconds later, without the need to compile the program or locate all external code. These features are especially desirable in an enterprise setting, where running the code via dynamic analysis or trying to perform standard static analysis can be very time-consuming and difficult.

How does DeepCode fit into the developer workflow, and how does this contrast with other approaches for finding similar bugs, such as identifying a problem in QA or after code is released?

Paskalev: Currently, we optimized DeepCode to report issues at code review time, as this is a serious pain point in the software creation lifecycle. However, it is possible to integrate DeepCode at any step of the lifecycle.

How does DeepCode compare, contrast, and complement JSNice, Nice2Predict, and DeGuard?

Paskalev: JSNice and DeGuard are systems we created which target the specific problem of code layout deobfuscation. DeepCode is a more general system which aims to automatically find a wide range of issues in code. This makes DeepCode applicable not only when trying to understand someone else’s code (e.g. to audit it for security), but also when writing and committing new code.

What other research on using AI to explore bugs have you come across, and how does DeepCode compare and contrast with these?

Paskalev: The field of using AI for code is fairly new but growing. However, we are currently not aware of any system with the capabilities of DeepCode. Unlike other systems that try to use AI methods directly over code, DeepCode is based on AI that is actually able to learn interpretable rules. This means the rules can be examined by a human and easily integrated into an analyzer.

Can you say more about the process of parsing code with the AI tools and building up the rule-set? What kinds of AI or other analytics techniques are used?

Paskalev: DeepCode is based on custom AI and semantic analysis techniques specifically designed to learn rules and other information from code, as opposed to other data (e.g., images, videos) which are less effective when dealing with code.

How do you go about classifying code as a mistake?

Paskalev: Our AI engine learns rules based on patterns that others have fixed in the past and understands what problem each fix addressed based on the commit messages and bug databases. Then, it uses the learned rules to analyze your code; any rules that trigger are reported to the developer.
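To give a feel for the kind of pattern such a learned rule might capture, here is a made-up Java illustration; it is not one of DeepCode’s actual rules, and the username variable and grantAccess() method are invented names:

// Before the fix: == compares object identity, which is usually a bug
if (username == "admin") {
   grantAccess();
}

// After the fix, the way countless commits correct it: compare contents
if ("admin".equals(username)) {
   grantAccess();
}

An engine trained on enough commits that make this exact change, along with messages like “fix string comparison,” can learn both the pattern and the explanation that goes with it.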

What have you learned about making recommendations for fixing bugs?

Paskalev: We learned that simply localizing the bug is not enough. The real challenge is to explain the issue and provide actionable feedback on what the problem actually is. DeepCode connects the report to how others have fixed a similar issue, which is an important step towards that goal.

What languages does it support now, and what is involved in adding support for new ones?

Paskalev: Currently, DeepCode supports Java, JavaScript, and Python. Adding a language requires adding a parser and extending our semantic code analyzer to handle special features of the language. Because of the particular way DeepCode is architected, we can add a language every few months.

How does DeepCode differ from traditional static analysis tools?

Paskalev: Static analysis tools available out there often come with a set of hardcoded rules that aim to capture what is considered “bad” in code. Then, they detect these rules in your code. Over the last decade, many companies have created such tools, e.g. Coverity, Grammatech, JetBrains, SonarSource, and others. That type of approach typically gets you to a few thousand rules across tens of programming languages.

250,000 rules seems like a lot compared to 4,000. Is it the case that it can identify more types of problems, or that it can provide greater granularity in identifying how to rectify an issue, or perhaps a little bit of both?

Paskalev: We identify many types of issues beyond what existing hardcoded-rule analyzers cover. We also provide a more detailed explanation of what the issue is and how others have fixed a similar problem. This enables users to more quickly figure out what fix they should apply.

What categories of problems does it identify now – is it just different categories of bugs or can it find opportunities for performance improvement?

Paskalev: DeepCode finds bugs, security issues, possible performance improvements and also code style issues. We learn these from commits in open source code and we use natural language processing to understand the issue that the commits fix.

Can DeepCode be used for code or architecture refactoring? Is that something you are looking at doing in the future?

Paskalev: Some of our suggestions are indeed suggesting refactoring of code, but not yet on a project-wide architectural level. Our platform’s utility is to enable any service that requires a deep understanding of your code to be quickly and easily created. We are already scoping the launch of several exciting services that some of our early adopters have asked for.

How do you expect the use and technology of DeepCode to evolve and the use of AI as part of improving developer workflow in general?

Paskalev: Our platform is constantly getting better. This will enable developers to work on much larger projects and scopes with the same or less effort while minimizing the risk of defects and costly production problems.


November 21, 2018  12:01 AM

Continuous integration benefits: Why adopting a CI/CD tool like Jenkins makes sense

cameronmcnz Cameron McKenzie Profile: cameronmcnz

At a recent Java meetup in downtown Toronto, a few old-school Java EE developers assembled in a birds of a feather session discussing current DevOps trends. A big part of the discussion centered around the DevOps journeys organizations are currently undertaking. The first thing old-school operations and development pros like to question when they TripTik their DevOps roadmap is the value of a continuous integration (CI) tool. It’s probably not a discussion that would interest daily readers of DevOpsAgenda, but for traditional enterprise Java developers, who are slowly moving the big banks and insurance companies forward, the discussion over Jenkins and the benefits of a CI tool, in general, tends to be an enlightening one.

On the night of the meetup, a few of the birds had to return to their nest as the discussion went on long, so I promised I’d recap the finer points of our Jenkins discussion. Jenkins, as we discussed, has helped a variety of medium to large scale enterprises. Here’s how these organizations have streamlined their build, deployment, testing and automation processes and why they went with Jenkins as their CI tool.

tl;dr (Too long, didn’t read)

Jenkins is a simple, straight-forward, open source tool. It compresses the entire development, build, integration and deployment process into an artifact known as a pipeline. This utility and ease of use means it tends to get enthusiastically adopted by developers, testers, operations personnel and even members of the business team.

/tl;dr

What’s the deal with Jenkins? Consistency in deployment

Consistency is one of the biggest benefits from a CI tool like Jenkins, especially when it comes to application deployment. By adding parameters to a Jenkins task, it can be easily configured for different environments and different users. This results in a single Jenkins deployment pipeline that all stakeholders can share. Everyone uses the same build and deploy configuration, and as a result, there is no indispensable employee who has sole control of key steps in the integration process.

And since every user shares the exact same integration pipeline, there are no mysteries in the deployment chain. This eliminates the often opaque steps many organizations follow when kicking off a deployment. And more to the point, it eliminates the wasteful downtime that occurs when poorly documented steps fail, and you don’t have the expertise to fix it.

Jenkins reinforces build integrity

We should note that while every stakeholder uses the same integration and deployment pipeline, that pipeline will be parameterized based on the role and environment in which a user works. Integration steps, such as switching between source code repositories or flipping between a test and production database, are repeated constantly. This reinforces the integrity of the build and eliminates surprises during the deployment process.

The efficient delivery stream

The inevitable result of using Jenkins is a move towards a DevOps-based approach to development and deployment. This factors error-prone manual steps out of the build process. Automation makes deployments more consistent. The end result is an efficient delivery stream that spends less time under operational siege. Deployments happen more frequently, and features and fixes get integrated more often.

Why Jenkins? Here’s what I’ve seen it do

Over the past several years, I’ve watched organizations embark on what C-level executives like to describe as their “DevOps journey.” I simply see it as a way to make the build, test and deployment process highly automated, infinitely repeatable and consistent across environments. But if people want to slap a “DevOps journey” label on it, I’m happy to go with it.

There are plenty of moving parts in a DevOps transition, but from what I’ve seen, the most important one is the introduction of a CI tool such as Hudson, Concourse, Bamboo or Jenkins. Jenkins is open source and a continuation of Oracle’s Hudson CI, and it enjoys more community support than all other CI tools combined. Furthermore, enterprise support is available from companies such as CloudBees if required. Each environment I’ve supported, though, found Jenkins more than sufficient to meet our needs.

The weird thing about Jenkins is how deceptively simple it is, while at the same time, being incredibly powerful.

All Jenkins does is pull together and consolidate build steps. Jenkins can be configured to pull from a source code repository, such as a Git repo hosted on CodeCommit. It will run and report on the status of unit and code quality tests. And assuming all of the requisite tests pass, Jenkins will build the application using a tool such as Apache Ant, Gradle or Maven. The assembled application — whether it’s a microservice deployed to a container or a portlet deployed to the WebSphere Portal Server — can then optionally be sent to a Nexus or Artifactory repository or moved into production.

SCM integration of builds

While all of this may sound impressive on the surface, it’s actually not that revolutionary. Every organization has a mechanism for moving code into production. Plenty of code has gone into production without using a CI tool. But there is something intrinsically valuable about having all of the deployment steps consolidated within a single tool, with all the underlying data saved in a structured format. It’s easier to track changes and revert back to previous processes if all of the underlying files are stored in a source code repository.

Code quality tool standardization

Jenkins helps standardize the use of various testing tools in the build and development process. And we’re not talking about compulsory testing tools like JUnit or JMock, but code quality tools like SonarQube, problem determination tools like PMD, and even performance and load testing tools. It only takes one developer to make a product like Checkstyle or PMD part of the Jenkins build and check that change into Git. So the next time another developer on the team rebases, their own local Jenkins installation will start using that testing tool as well. Furthermore, the fact that the tests are handled outside of a development tool like Eclipse, and instead run on every commit, rather than every save, tends to make developers more productive, rather than slowing down their creative process.

Pipelines as workflows

We should note that while Jenkins nudges organizations to automate as many build steps as possible, CI pipelines can also be built with hard stops that require manual intervention. In fact, the old way to say Jenkins “pipeline” was workflow. So you can configure a CI build to require a member of the user acceptance testing (UAT) team to log into Jenkins in order for a job to move on to the next stage in the process. Or a member of the operations team can have final say over whether or not an assembled application goes into production or not. Automation is always a goal, but Jenkins doesn’t dismiss the reality that manual deployment steps always exist.

Jenkins will build your DevOps culture

Many evangelists talk about the need to change organizational culture in order to embrace CI tools or perform a DevOps transition. I have always objected to that thinking.

We can define culture as a set of repeated behaviors which groups of loosely associated people perform. Organizations don’t need to change their culture in order to start doing CI and CD. Instead, what I’ve found is that when you introduce good tools like Jenkins to development and operations teams, they embrace the tool and its capabilities, and as a result, the tool changes their behaviors. The culture change isn’t a prerequisite, but instead, is a result of operations and development teams finding a better and more productive way of doing their jobs.

Plenty more to say

Anyways, I promised my Java meetup mates a quick description of how integrating Jenkins at various enterprises has helped to make builds more consistent, repeatable and frequent. As you can tell, I’m not very good at “quick descriptions.” And even at this point, there is plenty more I could say about how a good CI tool impacts the software development lifecycle. But I think that covers the basics. If a DevOps transition is in your future, be sure to plot an early Jenkins pitstop on your DevOps roadmap.


November 2, 2018  8:13 PM

Just git reset and push when you need to undo previous local commits

cameronmcnz Cameron McKenzie Profile: cameronmcnz

I published an article recently on how to perform a hard git reset, but one of the questions that I was repeatedly asked on social media was what happens after you do a hard git reset on local commits and then publish the changes to your remote GitHub or GitLab repository? When you do a git reset and push, does the entire commit history get published, including the commits that happened subsequent to the reset point, or are the commits that Git rolled back ignored?

How to git reset hard and push

When working locally, it’s not exactly clear what happens when you git reset to a previous commit and push those commits to a remote repository. So to demonstrate exactly what happens when you git reset and push, I’m first going to create an empty, remote GitHub repository named git-reset-explained. It will contain nothing but a readme.md file, an MIT licence, a .gitignore file and a single commit.


Remote repository for git reset and push.

Cloning the git-reset-explained repo

With the remote GitHub repository created, I will locally clone the repo and begin working inside of it.

/c/ git reset hard and push /
$ git clone https://github.com/cameronmcnz/git-reset-explained.git
Cloning into 'git-reset-explained'...
$ cd git*
/c/ git reset hard and push / git reset explained

Creating a local commit history

From within the cloned repo, I will create five new files, adding a new commit each time.

/c/ git reset hard and push / git reset explained
$ touch alpha.html
$ git add . && git commit -m "Local commit #1"
$ touch beta.html
$ git add . && git commit -m "Local commit #2"
$ touch charlie.html
$ git add . && git commit -m "Local commit #3"
$ touch depeche.html
$ git add . && git commit -m "Local commit #4"
$ touch enid.html
$ git add . && git commit -m "Local commit #5"

A call to the reflog shows the history of HEAD as the git commit commands were issued:

/c/ git reset hard and push / git reset explained
$ git reflog
014df6a (HEAD -> master) HEAD@{0}: commit: Local commit #5
6237772 HEAD@{1}: commit: Local commit #4
593794d HEAD@{2}: commit: Local commit #3
b1a6865 HEAD@{3}: commit: Local commit #2
8a3358e HEAD@{4}: commit: Local commit #1
d072c0a (origin/master, origin/HEAD) HEAD@{5}: clone

git reset to a previous commit

Now if I was to perform a hard git reset and shift HEAD to the third local commit, commits 4 and 5 should disappear, right? The git reset should remove those commits from my commit history and take me right back to the reset point, right? Let’s see what actually happens when we issue the command to git reset local commits.

/c/ git reset hard and push / git reset explained
$  git reset --hard 593794d
HEAD is now at 593794d Local commit #3

Now let’s see what the reflog looks like:

/c/ git reset hard and push / git reset explained
$ git reflog
593794d (HEAD -> master) HEAD@{0}: reset: moving to 593794d
014df6a HEAD@{1}: commit: Local commit #5
6237772 HEAD@{2}: commit: Local commit #4
593794d (HEAD -> master) HEAD@{3}: commit: Local commit #3
b1a6865 HEAD@{4}: commit: Local commit #2
8a3358e HEAD@{5}: commit: Local commit #1
d072c0a (origin/master, origin/HEAD) HEAD@{6}: clone

git reset hard and push

As you can see from the git reflog command, commits 014df6a and 6237772 are still hanging around. When you git reset local commits, those commits don’t disappear.

Knowing Git’s propensity to store everything, this isn’t a particularly unexpected result. The real question is, what would happen if you were to git reset to a previous commit and push to a remote repository? Would the two local commits git leapfrogged over get pushed as well, or would they remain isolated locally? To find out, we simply push to the remote origin:

/c/ git reset hard and push / git reset explained
$ git push origin
Counting objects: 7, done.
To github.com/cameronmcnz/git-reset-explained.git
d072c0a..593794d master -> master

After the push, when we look at the commit history on GitHub, we notice there are only four commits, the server-side commit that created the GitHub repository, and the three local commits that we published. When the git reset and push happened, the fourth and fifth local commits were not pushed to the server, essentially deleting any history of them ever existing.


Result of a git reset hard and remote push.

git reset vs revert

So what did we learn? Well, that when we git reset to a previous commit and push to a remote repository, no trace of the undone commits is published. That’s in stark contrast to a git revert, in which the revert command itself creates a new commit, and none of the past commit history is lost. So if you ever want to undo a previous commit with Git, reset is the right Git command to use, not revert.


Want to learn more about Git?

Are you new to Git and interested in learning more about distributed version control? Here are some Git tutorials and Jenkins-Git integration examples designed to help you quickly learn Git, DVCS and other popular DevOps tools such as Jenkins and Maven.

 


November 1, 2018  10:09 PM

To the brave new world of reactive systems and back

Uladzimir Profile: Uladzimir

Reactivity is surely an important topic, though I believe we’ve spent too much time talking about reactive programming, while only briefly mentioning the other implementation — reactive systems. It’s time to reevaluate.

Reactive systems deserve particular attention in today’s IT world, because we need complex, highly distributed applications that can handle a high load.

Before we explore reactive systems, their basic principles and practical application, let’s quickly brush up on the core idea of reactivity.

A quick warm-up on reactivity

The big idea behind reactivity is to create applications that will gracefully deal with modern data that is often fast, high volume, and highly variable.

Reactive systems are not the same thing as reactive programming. Reactive programming is used at the code level, while reactive systems deal with architecture. And they don’t imply the obligatory use of reactive programming.

In 2014, the Reactive Manifesto 2.0 boiled the basic concepts of modern reactive systems down to four fundamental principles:

  • Responsiveness: Be available for users and, whatever happens (overload, failure, etc.), be ready to respond to them.
  • Resilience: Stay immune to faults, disruptions, and extremely high loads.
  • Elasticity: Use resources efficiently and balance machine performance — scaling a single machine up or down vertically — or easily regulate the number of machines involved — scaling horizontally up or down — depending on the load.
  • Message-driven character: Embrace completely non-blocking communication by sending immutable messages to addressable recipients.

Though the principles often get enumerated as a list, they are closely interrelated and can be modified to read as follows:

A reactive system is based on message-driven communication that guarantees exceptionally loose coupling of the components and thus enables the system’s elasticity and resilience, both of which contribute to its high availability (responsiveness) to the user in case of overloads, disruptions, and failures.

What you need to build a reactive system

Reactive systems are about architectural decisions. There is no need to use a specific language to create a reactive application. There’s also no obligatory need to call on any particular framework or tool.

There are frameworks, however, that adhere to the reactive philosophy and make a system’s implementation simpler. For example, you can leverage the benefits of Akka actors, Lagom framework, or Axon.

As we mentioned, reactive systems are based on certain design decisions, some of which are detailed in the book Reactive Design Patterns by Brian Hanafee, Jamie Allen, and Roland Kuhn. We’ll give you a taste of several popular patterns that make a system reactive.

  • Multi-level organization: Push potentially unstable or dangerous components down to lower levels — ones that are often overloaded, vulnerable to frequent changes or exposed to third parties. That way, if failures or disruptions occur, there will always be a higher-level component to continue the work or to tell a user that something is wrong.
  • Message queues: Separate data consumers from data producers. Back pressure mechanisms allow a reactive system — not a user — to control the speed of the environment. In order not to shock the server with a massive data flow, the back pressure mechanisms let the server extract messages from the queue at a convenient and safe speed (see the sketch after this list).
  • Pulse patterns: An accountable server should send health check responses to a responsible server at regular intervals. This prevents the messages from going into the void in case of an unnoticed server failure.
  • Data replication: Different data consistency patterns — active-passive, consensus-based, conflict-free — maintain the system availability in case of failures and crashes in database clusters.
  • Safety locks: Continuously track the state of servers. In case of too many breakdowns or highly increased latency, safety locks automatically exempt the server from the process and let it recover. 
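Here is a deliberately simplified Java sketch of the back-pressure idea behind the message-queue pattern above. It uses a bounded queue so that a slow consumer, not the incoming load, sets the pace; real reactive systems usually signal demand asynchronously (for example, Reactive Streams’ request(n)) rather than blocking a producer thread, so treat this only as an illustration of the principle.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BackPressureSketch {

   public static void main(String[] args) throws InterruptedException {
      // Bounded mailbox: once it is full, the producer must wait, so the
      // consumer -- not the incoming load -- dictates the pace.
      BlockingQueue<String> mailbox = new ArrayBlockingQueue<>(100);

      Thread producer = new Thread(() -> {
         try {
            for (int i = 0; i < 1_000; i++) {
               mailbox.put("message-" + i);   // blocks while the queue is full
            }
         } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
         }
      });

      Thread consumer = new Thread(() -> {
         try {
            for (int i = 0; i < 1_000; i++) {
               String message = mailbox.take(); // drains at a safe, steady speed
               Thread.sleep(1);                 // simulate slower processing
            }
         } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
         }
      });

      producer.start();
      consumer.start();
      producer.join();
      consumer.join();
      System.out.println("All messages processed without overwhelming the consumer.");
   }
}

The bounded capacity is the whole trick: the producer can never get more than 100 messages ahead of the consumer, so a burst of traffic is absorbed by the queue instead of overwhelming the processing side.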

When to switch to a reactive approach

Disclaimer: In the vast majority of cases, building a reactive architecture is rather costly and requires a lot of effort and time. It requires the introduction of mediator components, data replication, etc. If you choose the reactive approach, make sure your application really needs it.

Simply put, you adopt reactive architecture when you need its benefits. You turn to it when mission-critical applications can’t fail or you need to tackle extremely heavy loads. If you build an application with more than 100,000 users and want impeccable UX with increased responsiveness and high availability, then a reactive architecture may be worth it.


November 1, 2018  9:37 PM

How to choose the right virtual reality development engine

charlesdearing Profile: charlesdearing

The promise of virtual 3D worlds has captivated programmers for decades. Virtual reality (VR), once a faraway fiction, is becoming a reality. Failures like Nintendo’s infamous Virtual Boy are now a distant memory, and major successes like PSVR and Google Cardboard have become the norm. In fact, Statista projects incredible growth for VR, estimating that the market will expand to $40 billion by 2020.

It’s more feasible now than ever to create your own VR applications. The cost to participate in VR, for both consumers and developers, has lowered dramatically in recent years. A plethora of tools is available for new development teams to enter the fray as well.

One of the most important elements of your VR gaming development process is the engine you use to build with. Unless you have unlimited time and resources, it’s in your best interest to use a commercial engine rather than create one yourself.

Develop VR apps with premium engines

Akin to other development environments, there are many free-to-use and open source engines at your disposal. You’ll have plenty of options to choose from, but you’ll need to educate yourself about these tools to ensure you make the best choice for your project’s specifications.

The Unreal Engine and Unity have been used for a myriad of 3D video games and VR applications. They have been classic choices for video game development and even for mobile app development.

The Unreal Engine is free-to-use and allows development teams to create their own interactive applications at no cost. The caveat, however, is that you’ll have to share a small percentage of your profits with the Unreal team.

How to build VR apps without coding

Interestingly, you can code your entire application with simple logic through Unreal Engine’s Blueprint Visual Scripting functionality. With Blueprints, you can design programmatic actions, methods, and computer behavior without writing a single line of code.

You won’t find this design feature on any other major engine. If you have a design-heavy team, filled with more analytical designers and artists than programmers, you may see the appeal of Unreal Engine.

Unity is a developer favorite

Unity, a similar if underpowered engine compared to Unreal, costs a small upfront fee, but you won’t have to pay out any royalties once you’ve finished your application. In order to use Unity, though, you’ll also need to have a team with strong C# skills.

If you don’t have a strong background in C# or the funds to bring on a more experienced C# programmer, you should strongly consider using the Unreal Engine. If your team has the programming ability and design ability, Unity can be a great and relatively low-cost option that sacrifices little in terms of quality.

There are great open source options, too

If you’re looking for the lowest cost possible, you’ll want to investigate completely free engines. Godot may be serviceable, but VR compatibility is not completely assured. You’ll have to devote more time and resources to fit the engine to your needs.

Completely open source VR-ready engines are also available for use. Apertus VR is one such example. It’s a set of embeddable libraries that can easily be inserted into existing projects. Open Source Virtual Reality (OSVR) is another VR framework that can help you begin developing your own games. Both OSVR and Apertus VR are fairly new creations, however, and you may experience bugs and other issues you wouldn’t with Unity or Unreal.

VR applications are incredibly hard work, but with a bit of persistence and some help from experienced developers, you’ll get the hang of the VR development process.

While you can’t control a great deal of what happens within the development process itself, you should make absolutely sure that you select the right engine or VR framework. Take the time to weigh the pros and cons of the tools available. It’s the most important decision you’ll make in the development process.


November 1, 2018  3:08 PM

How to learn software development tools faster

George Lawton Profile: George Lawton

The rapid pace of innovation in app development tools means developers have a much richer toolbox for writing better code. It also demands a much faster pace of learning. We caught up with Ken Goetz, vice president of Global Training Services at Red Hat, to find out how developers can code at the top of their game.

What is the state of the art for teaching developers and employees how to use software or execute particular tasks or workflows?

Ken Goetz: There are lots of industry words to describe modern learning trends, including microlearning, peer-to-peer, blended and on-demand. In today’s age of digital learning, I favor learning that is heavier on experiences vs. lecture, immersive with high production value (e.g. involving students in a compelling story that invokes real-life experiences), and training that is equal to one’s skill level. On this last point, the focus is on building training that is not too easy, which results in boredom, and not too hard, which results in frustration. When it comes to training, failure is actually a good thing.  

What are the best practices, tools, and strategies for learning new programming languages, APIs, or principles?

Goetz: We are finding a growing importance in peer-to-peer learning. If you buy into the 70:20:10 model of learning, then 20% of how we learn is based on socializing the material with others. As training becomes increasingly self-paced, we have gone in the exact opposite direction. So, building avenues for IT professionals to socialize what they’re learning is a growing need.  

What are some of the techniques and useful metrics for assessing the impact of particular learning approaches?

Goetz: Most training programs are stuck in the middle ages of using end of class surveys to measure effectiveness. But this method is flawed, because students haven’t had the chance to evaluate the learning to see how their new skills are helping them back at work. The only way to observe impact from training is to observe it some time after the training was completed. In this way, it’s possible to measure organizational impact. Are the employees more efficient, more effective and has this translated into benefits for the enterprise (e.g. lower turnover, faster app development and less downtime)? We conducted such a survey here at Red Hat in collaboration with the IDC, which demonstrated that Red Hat Training has nearly a 4x ROI over a three year period. It’s thrilling to see how big the return can be from modest investments in training.

Learning how to use a software tool has traditionally been done using a manual, screen recording, or help menu, and the user has to go back and forth between these and the app. Can better guidance be integrated directly into the application experience itself, like interactive macros or screencasts?

Goetz: We are certainly seeing more and more of this as software inevitably gets smarter and better. The exact nature depends on the software and how it works. Offering screencasts from a command line doesn’t create a very seamless integration, but there are all sorts of help functions built into command line applications that have been around for a long time. I have worked on building custom learning that can be accessed directly from within SaaS applications, creating a truly on-demand, moment-of-need learning function. I think these advancements are exciting but largely constrained by the type of software in question.

DevOps and continuous deployment have done a lot on the technical side to make it easier to push out software quickly, but what are enterprises doing to bring similar agility to documentation? How do you keep the screenshots up to date, or automatically incorporate new features or updates into manuals? How can a software company also make this easily available to enterprise customers?

Goetz: There are several tools that can help with this process. Keeping screenshots up to date in a DevOps world, where companies like Amazon are deploying new features on a near constant basis, may never be a winning battle. As soon as you publish the screenshot, the screen has changed.

What are some of the different approaches for learning how to code better from within the software itself or for coaching developers a better approach to solving a coding problem after the fact?

Goetz: The major paradigm shift occurring in software development is the use of DevOps principles and processes that allow for continuous check-in of code, automated regression testing and incremental publishing that provide a real-time performance loop for the developer.

What are some of the best practices for measuring the performance of a particular training approach?

Goetz: From its inception, Red Hat has had a strong performance-driven approach to validate training effectiveness. Red Hat’s certification exams are hands-on tests of an individual’s ability to execute critical, real-world tasks. Candidates earning a Red Hat credential have demonstrated not only comprehension but the ability to apply that knowledge in practice. In recent years, these concepts have been extended into the training as well. At the end of each chapter, the hands-on lab presents the learner with a practical application and challenges the individual to complete it with limited prompting. The student is then able to run a grading script that goes beyond pass/fail to offer sub-task-specific feedback on success. This is without question an industry best practice and not something found in most technical training.

Could companies gamify the process of learning to use software, do enterprise processes, or perform creative tasks more efficiently?

Goetz: First, we need to differentiate between gamification — where features of games are used to make learning more immersive — and game-based learning, in which the learning experience is actually a game itself. Both have merit and applicability because they invoke emotional experiences that can positively impact learning. Game designers are some of the best curriculum developers in the world.


November 1, 2018  2:44 PM

Reinhold advocates adding fiber to your Java diet in Oracle Code One keynote

cameronmcnz Cameron McKenzie Profile: cameronmcnz

At last year’s JavaOne, Mark Reinhold, Chief Architect, Java Platform Group, introduced a forthcoming rapid release cadence in which a new version of Java would be produced every six months. The release train would always leave the station on time. If a planned feature wasn’t ready, it’d have to catch the next ride. It was an ambitious promise, especially given the fact that three years and six months separated the release of Java 8 and Java 9.

But despite being ambitious, the plan went off without a hitch. A year has passed since the original announcement, and in his Java language keynote at Oracle Code One, you could tell there was a bit of pride, if not subtle boasting, about the fact that the promise was kept. Java 10 and Java 11 were both released as planned in March and September of 2018.

“A moment of silence for Java 9 please – the last final massive JDK release.”
-Mark Reinhold, Chief Architect, Java Platform Group

Reinhold only briefly discussed some of the new features that got baked into the 2018 releases, with special emphasis on the fact that the rapid releases even included a significant language change, namely the inclusion of the var keyword. “Java 10 shipped on time in March of 2018, it contained merely 12 JEPs, but these weren’t trivial,” said Reinhold. “And Java 11 shipped just a few weeks ago. It contained 17 JEPs along with many bug fixes and enhancements.”
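
As a minimal illustration of that language change (this sketch is mine, not code from the keynote), local variable type inference lets the compiler work out a local variable’s type from its initializer:

import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class VarDemo {
    public static void main(String[] args) {
        // Before Java 10, the type had to be spelled out on both sides.
        Map<String, List<String>> releasesBefore = new HashMap<>();
        releasesBefore.put("2014", List.of("Java 8"));

        // With Java 10's var (JEP 286), the compiler infers the type from
        // the initializer. It applies only to local variables; fields,
        // method parameters and return types are unchanged.
        var releases = new HashMap<String, List<String>>();
        releases.put("2018", List.of("Java 10", "Java 11"));

        for (var entry : releases.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}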

But you could tell that what Reinhold wanted to talk about most, which mapped nicely to what the audience was there to hear about, was the new features and facilities that the future of Java has in store. Reinhold highlighted four projects in particular, namely Amber, Loom, Panama and Valhalla.

Project Amber

According to Reinhold, Project Amber is all about rightsizing language ceremony. In this new era of machine learning and data-driven microservices, it’s important for developers to be able to express themselves clearly and concisely through their code. Project Amber attempts to address that in a more meaningful way than simply templating boilerplate code.

In his keynote, Reinhold performed a batch of live coding, demonstrating the following three Project Amber features:

  1. Local variable type inference
  2. Raw string literals that don’t require escape sequencing, which are available as a preview feature in the JDK 12 early access release
  3. Switch expressions with enums and type inference, which will enhance case-based conditional logic (a sketch follows this list)
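
As a rough sketch of that third feature (again, not Reinhold’s own demo code), a switch used as an expression yields a value directly, with no fall-through, and the compiler checks that every enum constant is handled. The syntax below follows the preview form proposed for JDK 12 and could still change:

public class SwitchDemo {
    enum Day { MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY, SUNDAY }

    public static void main(String[] args) {
        var day = Day.SATURDAY;

        // Each arrow yields a value; there is no break and no fall-through,
        // and the compiler verifies that every Day constant is covered.
        int letters = switch (day) {
            case MONDAY, FRIDAY, SUNDAY -> 6;
            case TUESDAY -> 7;
            case THURSDAY, SATURDAY -> 8;
            case WEDNESDAY -> 9;
        };

        System.out.println(day + " has " + letters + " letters");
    }
}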

Project Loom

Working with threads has always been a bit of a mess in Java. From their useless priority settings to the meaningless ThreadGroup, the age-old API leaves plenty of room for improvement. “Threads have a lot of baggage,” said Reinhold. “They have a lot of things that don’t make sense in the modern world. In fact, they have a lot of things that didn’t make sense when they were introduced.”

Demoing a recent build of Project Loom, Reinhold showed how easy it is to run concurrent tasks on fibers, much to the delight of every software developer who has ever struggled with I/O blocking and concurrency issues. A full implementation of this new, lightweight thread construct appears to be only a release or two away.
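
Reinhold’s demo used an early Loom prototype API, which isn’t reproduced here. For context, this hedged sketch shows the pattern Loom wants to make cheap: today, every concurrently blocking task occupies a relatively heavyweight OS thread, so a bounded pool is the usual way to cap concurrency, while a fiber-per-task model would remove that ceiling.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BlockingTasksToday {
    public static void main(String[] args) throws InterruptedException {
        // Each platform thread maps to an OS thread, so a fixed pool
        // limits how many blocking tasks can run at once.
        ExecutorService pool = Executors.newFixedThreadPool(200);

        for (int i = 0; i < 10_000; i++) {
            pool.submit(() -> {
                try {
                    // Stand-in for a blocking I/O call; while it sleeps,
                    // it pins one of the 200 OS threads.
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);

        // Under Project Loom, each of the 10,000 tasks could get its own
        // lightweight, JVM-managed fiber, so blocking no longer ties up
        // a scarce OS thread.
    }
}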

Project Panama

Panama is the isthmus that connects North and South America, while also providing a canal that connects the Atlantic with the Pacific. It’s a fitting name for a project about connecting native code written in languages like C++ and Go with what’s running on the JVM. Panama is about improving the connection between native code, foreign data and Java.

“Many people know the pain of JNI, the Java Native Interface,” said Reinhold. Project Panama promises to make integrating with libraries written in other languages not only easier, but capable of significantly increased performance over the frustratingly throttled JNI bridge.
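
To make that pain concrete, here is a hedged sketch of what even a trivial native call looks like through JNI today. The library name and native method are hypothetical, and the matching C implementation (not shown) would have to be compiled and shipped separately, which is exactly the friction Panama targets.

public class NativeDot {
    static {
        // Loads libnativedot.so / nativedot.dll from java.library.path;
        // the C implementation behind it is written and built separately.
        System.loadLibrary("nativedot");
    }

    // Calls cross the JNI bridge, where arrays are copied or pinned,
    // which is a large part of the overhead Panama wants to remove.
    public static native double dot(double[] a, double[] b);

    public static void main(String[] args) {
        double result = dot(new double[] {1, 2, 3}, new double[] {4, 5, 6});
        System.out.println("dot = " + result);
    }
}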

Project Valhalla

As everyone who codes Java knows, the language will not scale linearly. The JDK will scale, which is why languages like Scala and Kotlin are so popular. But the manner in which Java uses pointers and mutable data means throwing twice as many resources at an application under heavy load will not result in anything near a doubling of the throughput. But all of that is about to change.

“Today’s processors are vastly different than they were in 1995. The cost of a cache miss has increased by a factor of 200 and in some cases 1000. Processors got faster but the speed of light didn’t.”
-Mark Reinhold, Chief Architect, Java Platform Group

Project Valhalla introduces value types, a mechanism to allow the data used by Java programs to be managed much more efficiently at runtime. Value types are pure data aggregates that have no identity. At runtime, their data trees can be flattened out into a bytecode pancake. When Project Valhalla is finally incorporated into the JDK, the whole performance landscape will change.

“Chasing pointers is costly,” said Reinhold. Objects have identity; they can be attached to a sync monitor and they have internal state. When your applications create a massive number of objects, as big data systems or artificial neural networks do, the impact on performance can be significant.

Like all good magic tricks, Project Valhalla’s is deceptively simple. All it requires is a single keyword, value, added to the class declaration, and this small addition completely changes how an instance’s data is managed. There are, of course, small caveats that go along with the value keyword’s use, but that’s to be expected. In a live coding demonstration, Reinhold added the value keyword to a class that performed matrix calculations, and the result was almost a threefold increase in executed instructions per cycle, along with notable changes in how memory was allocated and how garbage collection routines behaved.
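
As a hedged sketch of the idea (the syntax was still a prototype at the time of the keynote and won’t compile on a shipping JDK), the value modifier turns a class into a pure, identity-free data aggregate:

// Experimental Project Valhalla sketch; prototype syntax, not valid
// on shipping JDKs.
public value class Complex {
    // No identity and no sync monitor, so the JVM is free to flatten
    // an array of Complex values into contiguous memory instead of an
    // array of pointers to separate heap objects.
    private final double re;
    private final double im;

    public Complex(double re, double im) {
        this.re = re;
        this.im = im;
    }

    public Complex plus(Complex other) {
        return new Complex(re + other.re, im + other.im);
    }
}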

The future of Java holds in store a number of impressive improvements that will make programming easier and the applications we create faster. And the great thing about the new rapid release cadence is the fact that we won’t have to wait for all of these things to be finalized before we get to use them. When features are complete, they’ll trickle their way into the upcoming release cycle, with many of these features only being one or two release cycles away.


November 1, 2018  11:08 AM

Ballerina language promises to improve app integration

George Lawton Profile: George Lawton

The Ballerina language is new to the programming world. It promises to streamline integration development for enterprise apps. It supports several primitives designed to reduce much of the burden Java developers typically face when coding against APIs through JSON, REST, and XML interfaces.

It’s still early, but the Ballerina language has attracted the support of Google, WSO2, Bitnami, and the Apache OpenWhisk group. The companies recently convened in San Francisco for the inaugural Ballerinacon to discuss the open source project and other details that support the language. What will Ballerina mean for Java developers? TheServerSide caught up with Tyler Jewell, CEO of WSO2, to find out.

So why do we need another programming language?

Tyler Jewell: The Ballerina language was born out of frustration with programming frameworks and integration products that embed programming logic within YAML, XML, or other configuration-based files. These approaches disrupted the developer flow, requiring special purpose tools and debuggers that took developers away from focusing on iterative development.

One had to either choose robust, complex, and heavy server products for managing integrations or use a general-purpose language with a framework that varied by programming language and objectives. There hasn’t been a way to get the agility of rapid code development while also running microintegration servers for message brokering, service hosting, and transaction coordination.

Ballerina is an attempt to combine the agility of a type-safe programming language with the syntax of integration sequence diagrams. Once compiled, the resulting binaries embed microengines that perform inline integration semantics such as mediation, orchestration, transformations, asynchrony, event generation and transactions.

Finally, working with the Ballerina language is intended to be cloud-native. The language has constructs that define the architectural environment, so the compiler understands the logical environment the application will be running within. This enables the compiler to generate numerous runtime environment artifacts that are typically generated by continuous integration solutions.

How are organizations using the Ballerina language and who are some of the leading companies adopting or supporting it?

Jewell: Like most languages, Ballerina will take a number of years to nurture its early community. The first production-ready version of the language was made available in May, and there are numerous early contributors and users.

Today, the early adoptions have happened in two areas:

  1. Ballerina now powers WSO2’s API microgateway engine, which enables per-API gateways and management. Dozens of WSO2’s enterprise customers are actively deploying and using it.
  2. Early cloud-native and Kubernetes adopters who have complex integration scenarios do not want an ESB within their orchestrator. Ballerina is a more distributed form of an ESB.

We have seen early contributions to the ecosystem from the community, and Google, Bitnami, Honeycomb, and Apache OpenWhisk were all early contributors and speakers at yesterday’s Ballerinacon.

There are Fortune 500 companies that have contracted support for deploying Ballerina on major applications, but they are not yet ready to share with the public who they are.

How does the Ballerina language compare and contrast with other languages like Java, Go or Scala?

Jewell: Ballerina’s language design principles are to focus on simplifying issues tied to integrating systems over a network. As such, the core design principles are:

Sequence Diagrammatic

Ballerina’s underlying language semantics were designed by modeling how independent parties communicate via structured interactions. As a result, every Ballerina program can be displayed as a sequence diagram of its flow with endpoints, including synchronous and asynchronous calls. The Ballerina Composer is an included tool for creating Ballerina services with sequence diagrams. Sequence diagrams are a reflection of how designers and architects think about and document interconnected systems. Ballerina’s syntax is structured to let any tool or system derive a sequence diagram, and in turn the way a developer thinks when writing Ballerina code encourages strong interaction best practices. This theory is elaborated upon in Sanjiva Weerawarana’s blog.

Concurrency Workers

The Ballerina language’s execution model is composed of lightweight parallel execution units known as workers. Workers follow a fully non-blocking policy in which no function, such as an HTTP I/O call awaiting a response, locks an executing thread. These semantics yield a concurrency model in which workers are independent concurrent actors that do not share state but can interact using messages. Workers and fork/join language semantics abstract the underlying non-blocking approach to enable a simpler concurrency programming model.

Network Aware Type Safety

Ballerina has a structural type system with primitive, object, union, and tuple types. Network systems return messages with different payload types and errors, and Ballerina’s type system embraces this variability with an approach based on union types. This type-safe model incorporates type inference at assignment to provide numerous compile-time integrity checks for network-bound payloads.

DevOps Ready

Over the past 15 years, best practices and expectations for the toolset a language provides have evolved. Now, a language is not ready for adoption unless it includes a unit test framework, a build system, dependency management and versioning, and a way to share modules of reusable code. Ballerina includes all of these subsystems as part of its core distribution, so there is no risk of community drift, which is what happens when the ecosystem has to build tools on top of a language instead of those tools being designed within the language.

Environment Aware

The Ballerina language and its components are intended to be used within distributed, event-driven architectures. As a result, each service written in Ballerina resides in an environment that may also include legacy services, service meshes, orchestrators, API gateways, identity gateways, message brokers and databases. Ballerina’s language and annotation extensions are intentionally environment-aware, treating these other components as syntactical objects and their relationships as decorated annotations. By having the language and build system be aware of the other components surrounding a service, we can generate essential artifact code ahead of CI/CD, perform data and integrity checks on network-bound payloads, and pre-package dependent but not yet deployed components as part of the Ballerina binary.

In what ways do you see Ballerina complementing or replacing other programming languages?

Jewell: More than 50% of the time and cost for digital transformation and API projects within enterprises is now integration. When resilient logic and microservices need to be built, Ballerina doesn’t impose much of the scaffolding tied to data formats, network interactions, and resilience. This makes the runtime lighter and developer productivity for coding the solutions higher. Developers will continue to use the languages that they are comfortable with, but we see Ballerina as providing a simpler experience for microservices, API development, system administrator network scripting, and composite development.

What are some of the specific features of the Ballerina language that make it useful for integration?

Jewell: Let me highlight a few of the important ones:

  1. Network-aware type systems make it easier to program data types against remote APIs.
  2. Intuitive mapping of complex data structures into primitive value and union types makes data transformation statically typed and intuitive to follow.
  3. Treating services and endpoints as first-class constructs, so that network locations and APIs are understood by the compiler and runtime.
  4. Built-in compiler support to layer in failover, retries, circuit breakers, load balancing, and distributed microtransactions for communicating with endpoints.
  5. Broad and native support for a variety of over-the-wire protocols, enabling service and endpoint abstractions that feel local.

What new concepts would a typical Java developer need to learn to make the most of Ballerina over and above the syntax and grammar?

Jewell: It’s more about what things a Java developer no longer needs to understand! The underlying value type system makes JSON, XML, tables, records, maps, and errors primitives, so they do not require libraries to work with these fundamental data structures. This enables developers to do a lot of data structure manipulation using simple constructs within the source code, something that is unavailable in most other languages.
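
For contrast, this is the kind of library-driven ceremony a Java developer goes through today just to pull one field out of a JSON payload. The sketch assumes the Jackson library is on the classpath, and the payload itself is made up for illustration:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonInJavaToday {
    public static void main(String[] args) throws Exception {
        String payload = "{\"language\":\"Ballerina\",\"typeSafe\":true}";

        // JSON is not a Java primitive: a third-party library and its
        // object mapper are needed before the data can even be read.
        ObjectMapper mapper = new ObjectMapper();
        JsonNode root = mapper.readTree(payload);

        System.out.println("language = " + root.get("language").asText());
    }
}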

Also, the union type system makes it possible for a single value to be one of several different types, such as “string | error,” since working over a network creates many situations where a single request can return different kinds of payloads, each suited to a different data structure. So union types enable someone to make a single request and have the response mapped into any number of different types. Developers then need to learn some techniques for determining the actual type returned when a union type is present.

What are some of the best practices for learning and implementing Ballerina apps?

Jewell: The language developers have done a great job of providing two sets of examples:

  1. More than 100 Ballerina by Examples, which go into each nuance of the language’s syntax with a complete example.
  2. A growing list of Ballerina by Guides, which provide an end-to-end development experience for tackling different types of enterprise integration scenarios.

What are some of the current or planned tools and IDEs for the Ballerina language?

Jewell: Ballerina has:

  1. A fairly advanced VS Code plugin that offers a wide range of language server IntelliSense capabilities.
  2. A similar capability for IntelliJ.
  3. The Ballerina Composer, which lets you visually map any Ballerina service as a sequence diagram along with monitoring execution.
  4. A debugger that works with any Ballerina runtime.
  5. Built-in package management, Ballerina Central for sharing packages, package versioning, and package build management.
  6. Test frameworks for running unit tests and mocks for hosted services written in Ballerina.
  7. A documentation framework for auto-generating different types of documentation from within Ballerina code.

How do you expect the future of the Ballerina language to evolve?

Jewell: The Ballerina language is production ready now, but the syntax has not yet achieved a 1.0 lock. The designers are working towards a 1.0 release, where they will then provide long-term backwards compatibility. We hope that this level of stability is reached by the end of the year.

We anticipate that the language will continue to evolve, with investments made to optimize its capabilities for stateful services, serverless execution, complex multi-tier compensations, and running in large-scale orchestration systems.

