Coffee Talk: Java, News, Stories and Opinions


July 3, 2017  11:26 AM

Advancing JVM performance with the LLVM compiler

Cameron McKenzie (cameronmcnz)

The following is a transcript of an interview between TheServerSide’s Cameron W. McKenzie and Azul Systems’ CTO Gil Tene.

Cameron McKenzie: I always like talking to Gil Tene, the CTO of Azul Systems.

Before jumping on the phone, PR reps often send me a PowerPoint of what we’re supposed to talk about. But with Tene, I always figure that if I can jump in with a quick question before he gets into the PowerPoint presentation, I can get him to answer some interesting questions that I want the answers to. He’s a technical guy and he’s prepared to get technical about Java and the JVM.

Now, the reason for our latest talk was Azul Systems’ 17.3 release of Zing, which includes an LLVM-based just-in-time compiler code-named Falcon. Apparently, it’s incredibly fast, as Azul Systems’ JVMs typically are.

But before we got into discussing Azul Systems’ Falcon just-in-time compiler, I thought I’d do a bit of bear-baiting with Gil and tell him that I was sorry that in this new age of serverless computing and cloud and containers, and a world where nobody actually buys hardware anymore, that it must be difficult flogging a high-performance JVM when nobody’s going to need to download one and to install it locally. Well, anyways, Gil wasn’t having any of it.

Gil Tene: So, the way I look at it is actually we don’t really care because we have a bunch of people running Zing on Amazon, so where the hardware comes from and whether it’s a cloud environment or a public cloud or private cloud, a hybrid cloud, or a data center, whatever you want to call it, as long as people are running Java software, we’ve got places where we can sell our JVM. And that doesn’t seem to be happening less, it seems to be happening more.

Cameron McKenzie: Now, I was really just joking around with that first question, but that brought us into a discussion about using Java and Zing in the cloud. And actually, I’m interested in that. How are people using Java and JVMs they’ve purchased in the cloud? Is it mostly EC2 instances or is there some other unique way that people are using the cloud to leverage high-performance JVMs like Zing?

Gil Tene: It is running on EC2 instances. In practical terms, most of what is being run on Amazon today, it is run as virtual instances running on the public cloud. They end up looking like normal servers running Linux on an x86 somewhere, but they run on Amazon, and they do it very efficiently and very elastically, they are very operationally dynamic. And whether it’s Amazon or Azure or the Google Cloud, we’re seeing all of those happening.

But in many of those cases, that’s just a starting point where instead of getting a server or running your own virtualized environment, you just do it on Amazon.

The next step is usually that you operationally adapt to using the model, so people no longer have to plan and know how much hardware they’re going to need in three months’ time, because they can turn it on anytime they want. So they can empower teams to turn on a hundred machines on the weekend because they think it’s needed, and if they were wrong they’ll turn them off. But that’s no longer some dramatic thing to do. Doing it in a company internal data center? It’s a very different thing from a planning perspective.

But from our point of view, that all looks the same, right? Zing and Zulu run just fine in those environments. And whether people consume them on Amazon or Azure or in their own servers, to us it all looks the same.

Cameron McKenzie: Now, cloud computing and virtualization is all really cool, but we’re here to talk about performance. So what do you see these days in terms of bare iron deployments or bare metal deployments or people actually deploying to bare metal and if so, when are they doing it?

Gil Tene: We do see bare metal deployments. You know, we have a very wide mix of customers, so we have everything from e-commerce and analytics and customers that run their own stuff, to banks obviously, that do a lot of stuff themselves. There is more and more of a move towards virtualization in some sort of cloud, whether it’s internal or external. So I’d say that a lot of what we see today is virtualized, but we do see a bunch of the bare metal in latency-sensitive environments or in dedicated super environments. So for example, a lot of people will run dedicated machines for databases or for low-latency trading or for messaging because they don’t want to take the hit for what the virtualized infrastructure might do to them if they don’t.

But having said that, we’re seeing some really good results from people on consistency and latency and everything else running just on the higher-end Amazon. So for example, Cassandra is one of the workloads that fits very well with Zing and we see a lot of turnkey deployments. If you want Cassandra, you turn Zing on and you’re happy, you don’t look back. In an Amazon, that type of cookie-cutter deployment works very well. We tend to see that the typical instances that people use for Cassandra in Amazon with or without us is they’ll move to the latest greatest things that Amazon offers. I think the i3 class of Amazon instances right now are the most popular for Cassandra.

Cameron McKenzie: Now, I believe that the reason we’re talking today is because there is some big news from Azul. So what is the big news?

Gil Tene: The big news for us was the latest release of Zing. We are introducing a brand-new JIT compiler to the JVM, and it is based on LLVM. The reason this is big news, we think, especially in the JVM community, is that the current JIT compiler that’s in use was first introduced 20 years ago. So it’s aging. And we’ve been working with it and within it for most of that time, so we know it very well. But a few years ago, we decided to make the long-term investment in building a brand-new JIT compiler in order to be able to go beyond what we could before. And we chose to use LLVM as the basis for that compiler.

Java had a very rapid acceleration of performance in the first few years, from the late ’90s to the early 2000s, but it’s been a very flat growth curve since then. Performance has improved year over year, but not by a lot, not by the way that we’d like it to. With LLVM, you have a very mature compiler. C and C++ compilers use it, Swift from Apple is based on it, Objective-C as well, the Rust language from Mozilla is based on it. And you’ll see a lot of exotic things done with it as well, like database query optimizations and all kinds of interesting analytics. It’s a general compiler and optimization framework that has been built for other people to build things with.

It was built over the last decade, so we were lucky enough that it was mature by the time we were making a choice in how to build a new compiler. It incorporates a tremendous amount of work in terms of optimizations that we probably would have never been able to invest in ourselves.

To give you a concrete example of this, the latest CPUs from Intel, the current ones that run, whether they’re bare metal or powered mostly on Amazon servers today, have some really cool new vector optimization capabilities. There are new vector registers, new instructions and you could do some really nice things with them. But that’s only useful if you have some optimizer that’s able to make use of those instructions when it knows they’re there.

With Falcon, our LLVM-based compiler, you take regular Java loops that would run normally on previous hardware, and when our JVM runs on new hardware, it recognizes the capabilities and basically produces much better loops that use the vector instructions to run faster. And here, you’re talking about factors that could be 50%, 100%, or sometimes 2 times or 3 times faster even, because those instructions are that much faster. The cool thing for us is not that we sat there and thought of how to use the latest Broadwell chip instructions, it’s that LLVM does that for us without us having to work hard.

Intel has put work into LLVM over the last two years to make sure that the backend optimizers know how to do the stuff. And we just need to bring the code to the right form and the rest is taken care of by other people’s work. So that’s a concrete example of extreme leverage. As the processor hits the market, we already have the optimizations for it. So it’s a great demonstration of how a runtime like a JVM could run the exact same code and when you put it on a new hardware, it’s not just the better clock speed and not just slightly faster, it can actually use the instructions to literally run the code better, and you don’t have to change anything to do it.
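Editor's illustration: the kind of "regular Java loop" Tene describes might look like the sketch below. It is plain Java with no vector-specific code. Whether a given JIT compiler actually emits vector instructions for it depends on the JVM and the CPU, which this code cannot demonstrate. The class and method names are hypothetical and do not come from Azul.

```java
// Hypothetical example: a simple counted loop over arrays with no
// dependency between iterations, the shape that auto-vectorizing
// compilers generally look for.
final class VectorFriendlyLoop {

    // Computes out[i] = a[i] * scale + b[i] for every element.
    static float[] scaleAndAdd(float[] a, float[] b, float scale) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        float[] out = new float[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] * scale + b[i];
        }
        return out;
    }

    public static void main(String[] args) {
        float[] result = scaleAndAdd(new float[] {1f, 2f, 3f},
                                     new float[] {10f, 20f, 30f}, 2f);
        System.out.println(java.util.Arrays.toString(result));
    }
}
```

The source code stays the same on old and new hardware; any speedup would come entirely from what the JIT compiler generates at run time.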

Cameron McKenzie: Now, whenever I talk about high-performance JVM computing, I always feel the need to talk about potential JVM pauses and garbage collection. Is there anything new in terms of JVM garbage collection algorithms with this latest release of Zing?

Gil Tene: Garbage collection is not big news at this point, mostly because we’ve already solved it. To us, garbage collection is simply a solved problem. And I do realize that that often sounds like what marketing people would say, but I’m the CTO, and I stand behind that statement.

With our C4 collector in Zing, we’re basically eliminating all the concerns that people have with garbage collections that are above, say, half a millisecond in size. That pretty much means everybody except low-latency traders simply don’t have to worry about it anymore.

When it comes to low-latency traders, we sometimes have to have some conversations about tuning. But with everybody else, they stop even thinking about the question. Now, that’s been the state of Zing for a while now, but the nice thing for us with Falcon and the LLVM compiler is we get to optimize better. So because we have a lot more freedom to build new optimizations and do them more rapidly, the velocity of the resulting optimizations is higher for us with LLVM.

We’re able to optimize around our garbage collection code better and get even faster code for the Java applications running it. But from a garbage collection perspective, it’s the same as it was in our previous release and the one before that because those were close to as perfect as we could get them.

Cameron McKenzie: Now, one of the complaints people that use JVMs often have is the startup time. So I was wondering if there’s anything that was new in terms of the technologies you put into your JVM to improve JVM startup? And for that matter, I was wondering what you’re thinking about Project Jigsaw and how the new modularity that’s coming in with Java 9 might impact the startup of Java applications.

Gil Tene: So those are two separate questions. And you probably saw in our material that we have a feature called ReadyNow! that deals with the startup issue for Java. It’s something we’ve had for a couple of years now. But, again, with the Falcon release, we’re able to do a much better job. Basically, we have a much better vertical rise right when the JVM starts to speed.

The ReadyNow! feature is focused on applications that basically want to reduce the number of operations that go slow before you get to go fast, whether it’s when you start up a new server in the cluster and you don’t want the first 10,000 database queries to go slow before they go fast, or whether it’s when you roll out new code in a continuous deployment environment where you update your servers 20 times a day so you roll out code continuously and, again, you don’t want the first 10,000 or 20,000 web requests for every instance to go slow before they get to go fast. Or the extreme examples of trading where at market open conditions, you don’t want to be running your high volume and most volatile trades in interpreter Java speed before they become optimized.

In all of those cases, ReadyNow! is basically focused on having the JVM hyper-optimize the code right when it starts rather than profile and learn and only optimize after it runs. And we do it with a very simple to explain technique, it’s not that simple to implement, but it’s basically we save previous run profiles and we start a run assuming or learning from the previous run’s behavior rather than having to learn from scratch again for the first thousand operations. And that allows us to run basically fast code, either from the first transaction or the tenth transaction, but not from the ten-thousandth transaction. That’s a feature in Zing we’re very proud of.

To the other part of your question about startup behavior, I think that Java 9 is bringing in some interesting features that could over time affect startup behavior. It’s not just the Jigsaw parts, it’s certainly the idea that you could perform some sort of analysis on code-enclosed modules and try to optimize some of it for startup.

Cameron McKenzie: So, anyways, if you want to find out more about high-performance JVM computing, head over to Azul’s website. And if you want to hear more of Gil’s insights, follow him on Twitter, @giltene.
You can follow Cameron McKenzie on Twitter: @cameronmckenzie

April 5, 2019  9:52 PM

How to install Tomcat as your Java application server

Cameron McKenzie (cameronmcnz)

If you’re interested in Java based web development, you’ll more than likely need to install Tomcat. This Tomcat installation tutorial will take you through the prerequisites, show you where to download Tomcat, help you configure the requisite Tomcat environment variables and finally kick off the Tomcat server and run a couple of example Servlets and JSPs to prove a successful installation.

Tomcat prerequisites

There are minimal prerequisites to install Tomcat. All you need is a version 1.8 installation of the JDK or newer with the JAVA_HOME environment variable set up, and optionally the JDK’s bin folder added to the Windows PATH. Here is a Java installation tutorial if that prerequisite is yet to be met.

If you are unsure as to whether the JDK is installed — or what version it is — simply open up a command prompt and type java -version. If the JDK is installed, this command will display version and build details.

C:\example\tomcat-install\bin>java -version
java version "1.8.0"
Java(TM) SE Runtime Environment (build pwa6480sr3fp20-20161019_02(SR3 FP20))
IBM J9 VM (build 2.8, JRE 1.8.0 Windows 10 amd64-64 Compressed References 20161013_322271 (JIT enabled, AOT enabled)
J9VM - R28_Java8_SR3_20161013_1635_B322271
JIT - tr.r14.java.green_20161011_125790
GC - R28_Java8_SR3_20161013_1635_B322271_CMPRSS
J9CL - 20161013_322271)
JCL - 20161018_01 based on Oracle jdk8u111-b14

Download Tomcat

You can obtain Tomcat from the project’s download page at Apache.org. Find the zip file that matches your computer’s architecture. This example of how to install Tomcat is on a 64-bit Windows Xeon machine, so I have chosen the 64-bit option.

Unzip the file and rename the folder to tomcat-9. Then copy the tomcat-9 folder out of the \downloads directory and into a more suitable place on your file system. In this Tomcat tutorial, I’ve moved the tomcat-9 folder into the C:\_tools directory.

Figure: Tomcat installation home directory

Tomcat environment variables

Applications that use Tomcat seek out the application server’s location by inspecting the CATALINA_HOME environment variable value. So, create a new environment variable named CATALINA_HOME and have it point to C:\_tools\tomcat-9.

To make Tomcat utilities such as startup.bat and shutdown.bat universally available to command prompts and Bash shells, you can put Tomcat’s \bin directory on the Windows PATH, but this isn’t required.
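If you want to double-check the CATALINA_HOME setting before you start the server, a small Java program can do it. The following is a minimal sketch, not part of Tomcat: the class and method names are hypothetical, and it assumes a Windows-style install where bin\startup.bat exists under the Tomcat home.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: checks that a directory looks like a Tomcat home
// by testing for the startup script this tutorial runs later.
final class CatalinaHomeCheck {

    static boolean looksLikeTomcatHome(Path home) {
        return Files.isDirectory(home)
                && Files.isRegularFile(home.resolve("bin").resolve("startup.bat"));
    }

    public static void main(String[] args) {
        String value = System.getenv("CATALINA_HOME");
        if (value == null || value.isEmpty()) {
            System.out.println("CATALINA_HOME is not set");
            return;
        }
        Path home = Paths.get(value);
        System.out.println(home + (looksLikeTomcatHome(home)
                ? " looks like a Tomcat installation"
                : " does not contain bin/startup.bat"));
    }
}
```

Run it from a new command prompt, because a prompt that was open before you created the variable will not see the change.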

Figure: Set CATALINA_HOME for Tomcat installation

How to start the Tomcat server

At this point, it is time to start Tomcat. Simply open a Command Prompt in Tomcat’s \bin directory and run the startup.bat command. This will start Tomcat and make it accessible through http://localhost:8080

example@tutorial MINGW64 /c/example/tomcat-9/bin
$ ./startup.bat
Using CATALINA_BASE: "C:\_tools\tomcat-9"
Using CATALINA_HOME: "C:\_tools\tomcat-9"
Using CATALINA_TMPDIR: "C:\_tools\tomcat-9\temp"
Using JRE_HOME: "C:\IBM\WebSphere\AppServer\java\8.0"
Using CLASSPATH: "C:\_tools\tomcat-9\bin\bootstrap.jar;C:\_tools\tomcat-9\bin\tomcat-juli.jar"

After you verify that the Apache Tomcat landing page appears at localhost:8080, navigate to http://localhost:8080/examples/jsp/ and look for the option to execute the Snoop servlet. This Tomcat example Servlet will print out various details about the browser and your HTTP request. Some values may come back as null, but that is okay. So long as the page appears, you have verified that the Tomcat install works.

Figure: Tomcat installation verification

And that’s it. That’s all you need to do to install Tomcat on a Windows machine.


March 26, 2019  8:38 PM

How to learn new technology in a corporate environment

BobReselman

Here’s how it usually goes when it comes to technical training in a corporate environment. A company decides to implement a new technology. The powers-that-be look around to determine if the IT staff has the knowledge and skills necessary to adopt the technology in question. If the staff is found wanting, management usually decides to hire a training company to deliver an intensive training session on the technology. The session typically runs three days to a week, but never longer.

The company sends the employees to the training. The employees get trained. The technology gets implemented.

Right?

Wrong.

And, it’s wrong in so many ways. Allow me to elaborate on how to learn new technology in the corporate world.

Training vs. education

The first and foremost wrongness about the situation described above is that the thing that’s being called technical training isn’t really about training at all. Training is the process of instilling behavior in a subject in response to an event or expectation. Taking the Pavlovian approach, you can train a dog to salivate upon hearing a bell ring. Those of us with kids have gone through the whole process of potty training: getting the child to notify you when the urge strikes.

More advanced training, such as teaching a teenager to drive or a pilot to land on an aircraft carrier, requires more skills and attention, but the end goal remains the same.

Training is all well and good when it comes to driving or landing a plane, because the goal, and the process to achieve it, are both well known. But, when it comes to how to learn new technology, the notion of training doesn’t fully apply.

You can’t train a deployment engineer to create and maintain an efficient Kubernetes cluster any more than you can train a chef to create a dish worthy of a 3-star Michelin rating. The process that gets this to happen is something different altogether. It’s called education.

Most tech requires education

Education takes place on a much broader cognitive landscape than training. Most training is confined to the lower end of the cognitive hierarchy. Education goes deeper, and targets advanced thinking and abstraction. Education wants you to acquire the skills and knowledge necessary to create new ideas and adapt to unusual circumstances.

You cannot train your way to innovation. Training, by nature, is not that concerned with creativity or cleverness. On the other hand, education is. Thus, when you take a look around to see what really matters in IT (creativity, innovation and efficiency), education becomes paramount.

Activities and tasks where IT staff can be trained are just candidates for the next round of automation. If you want your staff to be viable in modern IT, especially when you consider how quickly tech moves forward, then a proper education really counts.

Think about it. The term is life-long learning, not life-long training.

Retention is key

Let’s say we accept the fact that for a company to effectively adopt a new piece of technology, its employees must be educated about it, not trained. Then, one might ask, what’s so bad about employees attending week-long intensive sessions? The problem is that it’s very difficult for learners to absorb and retain new information presented in these crash courses over a short period of time. Learning the information is one thing, retaining it is another.

You can be subjected to an intensive class that covers all aspects of a new technology and might even quickly get the hang of the tech. But to retain what you’ve learned, you need to use it every day or it will all slip away. For the process to be effective, it must be continuous.

Sadly, many companies don’t plan properly. Employees will be sent out for training over the course of a week, with no scheduled follow-ups to monitor progress. Some employees might be assigned to immediately use the new technology. Others have to wait months to get a shot at it. By that time, all they learned will be lost.

Is there a better way for companies to get the most bang for their “training” bucks? Yes, there is.

Save money, but keep your employees educated

The week-long, intensive training session has been a conventional standard in the corporate education playbook for years. But does it work? Without some hard data in front of me, it’s hard to say. Nonetheless, maybe it’s time for companies to reconsider its effectiveness.

Now, please know that I say this with some hesitation because I make a portion of my living from these intensive classes. Although, I will say in my defense that I’ve cut back on this work since I came to the realization that there’s a better way.

So, what is this better way to learn new technology?

You first need to realize that most IT employees worth their salt are pretty good at learning new technologies on their own. They know how they learn, what books to read, which YouTube videos to watch and have a structured mindset.

You’ll also need to realize that all good things take time. This is important, so let me say it again: all good things take time.

There are a limited number of people that can quickly acquire a long-term understanding of a new technology through various means. Most of us require a good deal of ongoing daily exposure and practice with the tech to get good at it.

As a result, I find that the better way to conduct technical education in a corporate environment is to provide employees with the time they need to get competent with the tech at hand.

If it’s an accomplished self-learner, give him or her the time to dabble. If it’s an employee that needs a structured learning experience, do a one-day intensive basics class followed up by once a week sessions that take place over an extended timespan, say three months. These sessions can be led by a third-party expert or by someone in-house. The most important thing is that employees get to work with the tech in a consistent, continuous manner over a long period of time so that they properly retain the information.

The choice is yours. You can continue to send employees to one-week intensive classes with the expectation they will learn everything they need to know for their positions.

Or, you can go with a cost-effective approach that gives employees the time to get a firm grasp on the tech at hand.

Me? I’ll go with the wise spend every time.


March 25, 2019  4:45 PM

How not to write a Git commit message

Cameron McKenzie (cameronmcnz)

I’m working on an article that outlines how to write a good Git commit message, along with a variety of Git commit message conventions and rules that developers should follow. But, as I write about the best practices developers should follow, I constantly find myself in an internal discussion of what developers should not do.

I want the original article to contain a list of best practices, not a list of things not to do. So, I’ve trimmed the Git commit worst practices parts out of it and decided to list them here.

Git commit anti-patterns

What makes a bad Git commit message? What are some things developers shouldn’t do? Here’s my top 10 list:

  1. Don’t go beyond 50 characters in the subject line. It should be easy to succinctly describe any Git commit.

    TAGRI

    They ain’t gonna read your long Git commit message.

  2. Don’t use passive voice or past tense when you annotate commits. Always use the active voice.
  3. Don’t add unnecessary capitalization to the subject line. Standard rules for capitalization aside, only capitalize the first letter of the first word in the subject line. Definitely don’t shout in all caps, snake_case or worst of all, SCREAMING_SNAKE_CASE. Also, don’t put a period at the end of the subject line.
  4. Don’t try and format your commit message with superfluous asterisks, ampersands and hash marks.
  5. Don’t forget that someone troubleshooting might use a Unix utility that doesn’t automatically perform text wrapping. So, add a line break at or around the 70-character mark in the body.
  6. Don’t describe the low-level code you wrote. If someone wants to see the code you wrote, they’ll do a git diff. Commit messages should describe context and purpose, not implementation.
  7. Don’t forget to separate the commit body and the subject line with a blank line.
  8. Don’t simply reference a JIRA ticket in your commit. People shouldn’t have to open a bug tracking tool to know why you made a change to the codebase.
  9. Don’t say something nasty about another member of the team, even if you don’t push the branch. Local commits have a tendency to unexpectedly make it into the central code base. A subject line of ‘My idiot team lead made me do this‘ likely won’t go over well in your annual review.
  10. Don’t go on ad nauseam in the body of the commit. No matter how brilliant your prose, the TAGRI rule always applies in the software development world.
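Several of these rules are mechanical enough to check in code. The sketch below is one possible way to flag a few of them (rules 1, 3, 5 and 7). It is a hypothetical helper, not a Git feature or an existing tool, and the class name, method name and messages are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical checker for a few of the rules in the list above.
final class CommitMessageCheck {

    static final int MAX_SUBJECT_LENGTH = 50;   // rule 1
    static final int MAX_BODY_LINE_LENGTH = 70; // rule 5, "around" 70

    // Returns a description of each rule the message appears to break.
    static List<String> findProblems(String message) {
        List<String> problems = new ArrayList<>();
        String[] lines = message.split("\r?\n", -1);
        String subject = lines[0];

        if (subject.length() > MAX_SUBJECT_LENGTH) {
            problems.add("subject longer than 50 characters");
        }
        if (subject.endsWith(".")) {
            problems.add("subject ends with a period");
        }
        // All caps: upper-casing changes nothing, but the text has letters.
        if (subject.equals(subject.toUpperCase())
                && !subject.equals(subject.toLowerCase())) {
            problems.add("subject is all caps");
        }
        if (lines.length > 1 && !lines[1].isEmpty()) {
            problems.add("no blank line between subject and body");
        }
        for (int i = 2; i < lines.length; i++) {
            if (lines[i].length() > MAX_BODY_LINE_LENGTH) {
                problems.add("body line " + (i + 1) + " longer than 70 characters");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(findProblems("FIX THE LOGIN BUG.\nmore text"));
        System.out.println(findProblems(
                "Fix login redirect loop\n\nExpired sessions now return to the sign-in page."));
    }
}
```

A check like this could run from a commit-msg hook, but the judgment calls in the list, such as describing purpose rather than implementation, still need a human reviewer.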

What gets your Git goat?

This is a list of the top 10 Git commit mistakes I see fellow developers make, but it is by no means a complete compendium. As a developer, what things do you see developers do with Git that really get your goat? I’d be interested to hear what other developers see in the field, so please share your Git commit horror stories in the comments.


March 18, 2019  3:31 PM

How Instacart works around buggy Elasticsearch queries

George Lawton

Enterprises that use Elasticsearch to find dynamic information in other apps are struggling to identify errant code that stalls enterprise apps. In theory, application performance monitoring tools should help. But, it wasn’t enough for Instacart to identify the queries that consistently created problems for their consumers and shoppers, said John Meagher, senior software engineer, search infrastructure at Instacart.

Simply scaling up their Elasticsearch instance didn’t solve the problem. So, Meagher decided to find a better way to find out what was responsible for their performance issues. As it turned out, a small number of poorly coded Elasticsearch queries were responsible for most of their problems. Once they found a better way to monitor queries, those errors were reduced by 90% and a lot of other problems went away too.

Building a digital grocer

A key element of Instacart’s business was the creation of the world’s largest, constantly updated digital catalog of grocery items. Consumers can access the catalog through mobile and web apps when they order food from one or more stores. It’s also used to guide shoppers, who purchase food on behalf of the consumers, through store aisles. The app needs to present a different view of the information to customers and shoppers.

Elasticsearch sits at the core of this whole process. It makes it easy to surface a dynamic view of the available food and presents different options for consumers and shoppers. Instacart standardized its catalog management on top of Elasticsearch because it’s highly scalable in a way that makes it easy to update items and subsequent information. For example, they wanted a platform that allows one team to update nutritional information and description of a product, and also allow the store to update its inventory.

Elasticsearch makes it easy for developers to code logic that dynamically aggregates and generates information in response to complex queries on the fly. However, when Elasticsearch goes down, everyone else’s services do too. Instacart’s Elasticsearch catalog features almost 600 million items that are updated about 750 times per second. Its distributed architecture makes it easier to spread queries across clusters. As a result, each cluster only has to handle about 500 queries per second, while the entire infrastructure handles about 15,000 queries per second.

The pain of buggy Elasticsearch queries

Instacart’s main application includes some outdated code from its founding along with code from new developers. As a result, it’s hard to find the buggy code when a problem emerges. “It feels like we are trying to find a needle in the haystack when looking for what is causing a problem in a cluster, only it’s worse. It’s more like trying to find one needle in a pile of other needles,” Meagher said.

In early 2018, Instacart would see tens of thousands of time out errors per day. Many components of the Instacart app wouldn’t wait for Elasticsearch queries to come back, and they’d time out early. Some of the particularly bad queries would see as low as a 10% success rate, and a few had a 0% success rate. The site would often go down on the weekends during peak shopping periods, and cause major issues for the app.

The biggest problem area with these code issues was a lack of visibility. Instacart developers could see bulk aggregated errors and latency, but couldn’t get the proper visibility into the code that caused the problem. In most cases, Instacart staff would just get reports that Elasticsearch was slow, but they wouldn’t be able to show if their Elasticsearch infrastructure was working or not. And, it was also challenging to see if specific queries or APIs were behind the problems.

Create a bigger picture

Instacart had a variety of tools that provided some part of the big code problem picture, but not the whole thing. They used Kibana to visualize cluster performance, New Relic and other APM tools to track app performance and error reporting tools to look at raw logs. For example, when a bad query hit, it would jam up the queue and all the other queries would slow down. It was hard to find the one at the root of the problem.

Meagher led the development of a new type of Elasticsearch monitoring tool, called ESHero, to make it easier to diagnose which queries would create bottlenecks. The tool’s main insight was to provide a way to aggregate information across server applications that were Elasticsearch cluster clients.

They used a collection of Ruby applications that ran on each application server, pulled the data into a central repository and then used machine learning to make sense of it. The tool provided a way to instrument all the calls to the Elasticsearch cluster, and the collected data could be further explored via Elasticsearch queries.

An important element of ESHero was to find a way to identify particular queries. However, the challenge was that each query’s payload was slightly different. Meagher’s team found a way to strip out the dynamic information and replace it with a query ID associated with a specific application call. They also added in other data such as collection time and where in the code a query was called from.
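To make the idea concrete, here is a rough sketch of how a query could be reduced to a template and given a stable ID. This is not ESHero’s code, which has not been published. The regular expressions, the class name and the sample field names are assumptions for illustration, and the sketch only handles simple string and number values that follow a colon.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical sketch of query fingerprinting; not taken from ESHero.
final class QueryFingerprint {

    // Replaces literal string and number values with a placeholder so that
    // queries differing only in their dynamic values share one template.
    static String toTemplate(String queryJson) {
        return queryJson
                .replaceAll(":\\s*\"(?:[^\"\\\\]|\\\\.)*\"", ":\"?\"")
                .replaceAll(":\\s*-?\\d+(?:\\.\\d+)?", ":?")
                .replaceAll("\\s+", "");
    }

    // Derives a stable ID from the template, so equal templates get equal IDs.
    static String queryId(String queryJson) {
        byte[] bytes = toTemplate(queryJson).getBytes(StandardCharsets.UTF_8);
        return UUID.nameUUIDFromBytes(bytes).toString();
    }

    public static void main(String[] args) {
        // Made-up payloads: same shape, different dynamic values.
        String first = "{\"term\": {\"product_id\": 123}, \"size\": 10}";
        String second = "{\"term\": {\"product_id\": 456}, \"size\": 25}";
        System.out.println(toTemplate(first));
        System.out.println(queryId(first).equals(queryId(second)));
    }
}
```

With an ID like this, timeouts and latency can be grouped per query shape instead of per individual request.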

Once they finished the first iteration, Meagher was surprised to find that the Elasticsearch clusters were basically healthy. The problems, however, were mostly caused by the spillover impact of poorly coded queries.

These insights gave them a way to prioritize development on the worst-performing queries, and to think about ways to retry good ones. For example, a small number of queries dominate shopping patterns and when these stall, so does the user experience. So, the team decided to focus on aggressively retrying the stalled queries, but they found that an arbitrary number of retries is extremely dangerous. If a well-formed query experiences a 10% success rate, further retries can create more problems.
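One common way to keep retries from piling onto a struggling cluster is to cap the number of attempts and wait longer before each new one. The sketch below only illustrates that idea and is not Instacart's implementation; the class name BoundedRetry, the method callWithRetries and its parameters are hypothetical.

```java
import java.util.function.Supplier;

/* Hypothetical sketch of a capped retry with a doubling delay */
class BoundedRetry {

  static <T> T callWithRetries(Supplier<T> call, int maxAttempts, long baseDelayMillis) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    RuntimeException lastFailure = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return call.get();
      } catch (RuntimeException e) {
        lastFailure = e;
        if (attempt < maxAttempts) {
          // Double the wait after every failed attempt
          sleep(baseDelayMillis << (attempt - 1));
        }
      }
    }
    // Every attempt failed: give up instead of retrying forever
    throw lastFailure;
  }

  private static void sleep(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting to retry", e);
    }
  }
}
```

The cap is the important part: once maxAttempts is reached, the last failure is rethrown so the caller can decide what to do.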

After Meagher’s team identified and fixed the worst code, Instacart went from about 60,000 time outs per day to about 2,000.

Before they started this work, the site went down almost every weekend. “Now when my partner went on paternity leave for three months, I was happy to be on call,” said Meagher. Instacart hasn’t open-sourced ESHero yet, but Meagher said he would be happy to work with others interested in the deployment of similar tools in their own organization.


February 28, 2019  12:17 AM

A simple Java Supplier interface example for those new to functional programming

cameronmcnz Cameron McKenzie Profile: cameronmcnz

There are only half a dozen classes you really need to master to become competent in the world of functional programming. The java.util.function package contains well over 40 different components, but if you can garner a good understanding of consumers, predicates, functions, unary types and suppliers, knowledge of the rest of the API simply falls into place. In this functional programming tutorial, we will work through a Java Supplier interface example.

Consumer vs Supplier interfaces

Java’s functional Supplier interface can be used any time a function needs to generate a result without any data passed into it. Now, contrast that with Java’s functional Consumer interface, which does the opposite. A Consumer accepts data, but returns no result to the calling program.

Java’s Consumer and Supplier interfaces are functional complements to one another. If a function needs to both accept data and return data as a response, then use the Function interface, because it combines the capabilities of both the Consumer and Supplier interfaces.
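The contrast between the three interfaces is easiest to see side by side. The following is a minimal sketch; the class name FunctionalTrio and the constant names are made up for illustration.

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

/* Illustrative sketch: Supplier vs Function vs Consumer */
class FunctionalTrio {

  // Supplier: takes nothing, returns a result
  static final Supplier<String> GREETING = () -> "hello";

  // Function: takes an argument, returns a result
  static final Function<String, Integer> LENGTH = s -> s.length();

  // Consumer: takes an argument, returns nothing
  static final Consumer<String> PRINTER = s -> System.out.println(s);

  public static void main(String[] args) {
    String word = GREETING.get();
    int length = LENGTH.apply(word);
    PRINTER.accept(word + " has " + length + " letters");
  }
}
```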

Before we jump into a Java Supplier interface example, it’s a good idea to first reference the JavaDoc in order to see exactly how the API designers describe how the component should be used:

Supplier JavaDoc

java.util.function.Supplier<T>

Type Parameters: T – the type of results supplied by this supplier

@FunctionalInterface
public interface Supplier<T>
This represents a supplier of results. There is no requirement that a new or distinct result be returned when the supplier is invoked.

This functional interface has a single method named get().

As you can see from the JavaDoc, a complete Supplier interface example can be coded simply when you implement the component and override the get() method. The only caveat is that the get() method takes no arguments and returns an object as a result. For this Supplier interface tutorial, we will demonstrate how the functional component works by creating a class named RandomDigitSupplier which does exactly what the name implies; it generates a random digit and returns it to the calling program. The randomly generated numbers range from zero to nine, because nextInt(10) excludes its upper bound.

supplier interface example

A simple, Java Supplier interface example and test class without Lambda expressions.

Java Supplier interface tutorial

As you can see, the code for the class that implements Java’s Supplier interface is fairly simple. The only requirements are the class declaration and the implementation of the get() method. Within the get() method, the Random class from Java’s util package is used to generate the random digit, but that’s about as complicated as this Java Supplier interface example gets.

package com.mcnz.supplier.example;

import java.util.function.Supplier;
import java.util.*;

/* Java's Functional Supplier interface example */
class RandomDigitSupplier implements Supplier<Integer> {

  @Override
  public Integer get() {
    Integer i = new Random().nextInt(10);
    return i;
  }

}

To test the Supplier interface, we simply code a for loop which creates a new RandomDigitSupplier instance on each iteration, and prints out the random value.

/* Test class for the Java Supplier interface example */
public class SupplierExampleRunner {

  public static void main(String[] args) {
    for (int i = 0; i<10; i++) {
      RandomDigitSupplier rds = new RandomDigitSupplier();
      int randomDigit = rds.get();
      System.out.print(randomDigit + " :: " );
    }
  }
}

A test run of the Supplier interface example generated the following results:

5 :: 4 :: 0 :: 0 :: 9 :: 0 :: 4 :: 2 :: 4 :: 5 ::

Lambda and Supplier interface example

Of course, since Supplier is a functional interface, you can eliminate much of the overhead involved with the creation of a separate class that implements the interface and codes a concrete get() method. Instead, simply provide an implementation through a Lambda expression. This approach allows all of the code above to be condensed into the following:

package com.mcnz.supplier.example;

import java.util.function.Supplier;
import java.util.*;

/* Test class for the Java Supplier interface example */
public class SupplierExampleRunner {

  public static void main(String[] args) {
    for (int i = 0; i<10; i++) {
      //RandomDigitSupplier rds = new RandomDigitSupplier();
      Supplier<Integer> rds = () -> new Random().nextInt(10);
      int randomDigit = rds.get();
      System.out.print(randomDigit + " :: " );
    }
  }
}

As you can see, the use of a Lambda expression greatly reduces the ceremony required to write functional code.

When to use the Java Supplier interface

At first glance, developers without any functional programming experience often wonder when a component that seems so simple and straightforward is useful. The first answer to that question lies within Java’s functional package, which defines primitive specializations such as LongSupplier, IntSupplier, DoubleSupplier and BooleanSupplier.

The second answer is Java’s stream package, which represents the heart and soul of functional programming in Java. For example, the of method of the Collector interface takes a Supplier as an argument, as do the collect and generate methods of the Stream interface. As developers get deeper into the world of functional programming in Java, they discover that the Supplier interface, along with its primitive variants, is peppered throughout the Java API.
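To make that concrete, here is a small sketch of a Supplier and an IntSupplier handed to the stream API. Stream.generate, IntStream.generate and Collectors.toCollection are standard JDK methods; the class name SupplierInStreams and its two helper methods are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.IntSupplier;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

/* Illustrative sketch: Suppliers passed to the stream API */
class SupplierInStreams {

  /* Stream.generate pulls each element from a Supplier */
  static List<Integer> randomDigits(int count) {
    Random random = new Random();
    Supplier<Integer> digit = () -> random.nextInt(10);
    // toCollection also takes a Supplier, here one that creates the list
    return Stream.generate(digit)
        .limit(count)
        .collect(Collectors.toCollection(ArrayList::new));
  }

  /* IntSupplier is the primitive counterpart used by IntStream */
  static int sumOfDigits(int count) {
    Random random = new Random();
    IntSupplier digit = () -> random.nextInt(10);
    return IntStream.generate(digit).limit(count).sum();
  }
}
```

Note that generate produces an infinite stream, so the limit call is what stops it.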

Functional programming in Java is a new concept to those entrenched in traditional, server-side development. But, if developers can familiarize themselves with fundamental components such as the Supplier interface, they can begin to understand more advanced concepts and implement them into their code in a relatively easy fashion.


February 20, 2019  4:35 PM

Here’s how to get by without Concurrent Mark Sweep

RamLakshmanann Profile: RamLakshmanann
Uncategorized

As part of JEP-291, the popular Concurrent Mark Sweep garbage collection algorithm has been deprecated by Java Development Kit 9. This decision was made to both reduce the maintenance burden of garbage collection (GC) code and to accelerate new development.

As a result, if you launch an application from Java 9 or later with the -XX:+UseConcMarkSweepGC argument to activate the Concurrent Mark Sweep GC algorithm, you will see the following warning:

Java HotSpot(TM) 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.

Why was Concurrent Mark Sweep deprecated?

If there’s a lot of baggage to carry, especially in Java code, it’s hard to move forward. With Concurrent Mark Sweep, this baggage is exactly what led to its demise. It is a highly configurable, sophisticated algorithm that adds a lot of complexity to the GC code base in the Java Development Kit (JDK). Only when the JDK development team can simplify the GC code base can it accelerate and innovate in the GC arena.

Let’s explore the number of Java Virtual Machine (JVM) arguments that can be passed to each GC algorithm to demonstrate how complex they are:

  • Common to all: 50
  • Parallel: 6
  • CMS: 72
  • G1: 26
  • ZGC: 8

There are around 50 GC-related arguments that can be passed to any JVM. On top of these 50 arguments, CMS alone accepts 72 additional arguments, which is significantly more than any other GC algorithm. This adds significant coding complexity for a JDK team that must support all these arguments.

If you currently use CMS as your GC algorithm, what are your options?

The three most compelling options are:

  1. Switch to the G1 GC algorithm
  2. Switch to the Z GC algorithm (Early access in JDK 11, 12)
  3. Continue with CMS, and deal with the deprecation warnings

Switch to G1 GC algorithm

G1 GC has been the default GC algorithm since Java 9, and you may safely consider an application move to this algorithm. It could provide better performance characteristics compared to Concurrent Mark Sweep, and it’s much easier to tune this algorithm because it has a smaller number of arguments. Also, it provides an option to eliminate duplicate strings from memory. If you can eliminate these duplicate strings, it can help you reduce the overall memory footprint.
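When you switch collectors, it helps to confirm which one the JVM actually picked. A minimal sketch, assuming a standard HotSpot JVM: the class name ActiveCollectors is made up, while ManagementFactory.getGarbageCollectorMXBeans() is part of the standard java.lang.management API.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/* Illustrative sketch: list the garbage collectors active in this JVM */
class ActiveCollectors {

  static List<String> names() {
    List<String> names = new ArrayList<>();
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
      names.add(gc.getName());
    }
    return names;
  }

  public static void main(String[] args) {
    names().forEach(System.out::println);
  }
}
```

Launched with java -XX:+UseG1GC -XX:+UseStringDeduplication ActiveCollectors, the printed names should identify G1; the exact strings can vary between JVM versions and vendors.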

Switch to Z GC algorithm

Z GC is a scalable, low-latency garbage collector whose main goal is to keep GC pause times under 10 ms. Early access of the Z GC algorithm is available in Java 11 and 12, so if your application runs on one of those releases, you can consider this as a Concurrent Mark Sweep alternative.

Continue with Concurrent Mark Sweep

For certain applications, Concurrent Mark Sweep delivers better results compared to the G1 GC algorithm, even with a lot of tuning. If you’ve explored the alternatives and seen that Concurrent Mark Sweep provides the best result for your applications, you should just stick with your current GC algorithm.

There have been discussions about keeping Concurrent Mark Sweep alive on the OpenJDK JDK9-dev mailing list, so it may not be gone forever. Certain features and APIs that were deprecated 20 years ago in Java 1.1 are still used in modern day applications, which means that deprecation isn’t always the end of the road. You can continue to run on Concurrent Mark Sweep, but be aware that it may be removed completely in any future release.

Note that each application is unique, so don’t get carried away by journals and other literature that talk about GC tuning and tweaking. When you introduce a new GC setting, you should complete thorough testing, record benchmark performance characteristics and study key performance indicators to make an informed decision.


February 19, 2019  5:55 PM

A simple Java Function interface example: Learn Functional programming fast

cameronmcnz Cameron McKenzie Profile: cameronmcnz

If you want to master functional programming, the best place to start is with the Java Function interface. This example will show you four different ways to implement this functional interface in your code, starting with an actual class and ending with very concise code written as a lambda expression.

The Java Function interface is quite simple. It takes a single Java object as an argument, and returns a single Java object when the method concludes. Any method you can conjure up that takes an object and returns an object fulfills the Java Function contract.

How to use Java’s Function interface

For this Java Function interface example, we will provide a single method named “apply” that takes an Integer as an argument, squares it and returns the result as a String. Before we begin, let’s take a quick look at the official JavaDoc for java.util.function.Function:

java.util.function.Function
Java Interface Function<T,R>

Parameter Types:
T - the type of the input to the function
R - the type of the result of the function

Popular Subinterface of Function: UnaryOperator<T>

The JavaDoc also indicates that the Function interface has a single, non-default method named apply which takes a single argument and returns a single result:

R apply(T t)

People often assume that there is something magical about the interfaces defined in the java.util.function package, but there’s not. They are just normal interfaces, and as such, we can always create a class that explicitly implements them.

class FunctionClass implements Function<Integer, String> {
  public String apply(Integer t) {
    return Integer.toString(t*t);
  }
}

The FunctionClass defined here implements Function and provides an implementation for the apply method. We could then use this in any class with standard syntax rules.

Function<Integer, String> functionClass = new FunctionClass();
System.out.println(functionClass.apply(2));

When you run the above code, the output is 4.

Java Function interface example

Similarly, we can write a Java Function example that uses an inner class to implement the apply method:

Function<Integer, String> innerClass = new Function<Integer, String>(){
  public String apply(Integer t) {
    return Integer.toString(t*t);
  }
};
System.out.println(innerClass.apply(3));

When you run the inner class example, the output is 9.

Java’s Function and lambda expression example

Of course, the whole idea of the functional interface is to incorporate lambda expressions into the mix. Here’s the Java Function example implemented with a rather verbose lambda expression:

Function<Integer, String> verboseLambda = (Integer x)-> { return Integer.toString(x*x); };
System.out.println(verboseLambda.apply(5));

This implementation will print out the value 25. Of course, this implementation is also very wordy. With a Java lambda expression, the parameter type isn’t required on the left hand side of the arrow, and if the lambda body is a single expression, both the braces and the return keyword can be omitted. So a more concise lambda expression that implements this Java Function interface would look like this:

Function<Integer, String> conciseLambda = x -> Integer.toString(x*x);
System.out.println(conciseLambda.apply(5));

When the code that implements Java Function with a concise lambda expression runs, the program prints out 25.

That’s all you really need to know about the java.util.function.Function interface. It is a very simple component that simply requires the implementation of one method — named apply — that is passed in a single object, runs to completion and returns another Java object. It’s just that simple.

Don’t overcomplicate Java’s Function interface

Some people can find the simplicity of the Java Function interface to be a bit confusing. After all, a method that takes something and returns something else seems to be so incredibly vague and abstract that it almost seems meaningless.

However, seasoned Java developers know that sometimes the simplest of language constructs can turn out to be the most powerful, as is the case with object models designed with abstract classes and interfaces. Power through simplicity is exactly the point when it comes to the various components listed in the java.util.function package.

Java Function Example

The Java Function interface is used in many parts of the java.util.function API package

You will run into the Function interface in a variety of places, especially when you start advanced functional programming with the Java Streams API. Powerful methods such as map and flatMap take a Java Function as a parameter, while reduce takes the closely related BinaryOperator. So if you plan on any map-reduce programming with Java, Functions will become one of your best friends.
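As a short sketch of where Function shows up in the Streams API, the example below reuses the squaring function from this article with map, passes a second Function to flatMap, and uses reduce for contrast. The class name FunctionInStreams and the helper method names are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/* Illustrative sketch: Function instances passed to stream methods */
class FunctionInStreams {

  static final Function<Integer, String> SQUARE = x -> Integer.toString(x * x);

  /* map applies a Function to every element */
  static List<String> squares(List<Integer> numbers) {
    return numbers.stream().map(SQUARE).collect(Collectors.toList());
  }

  /* flatMap takes a Function that returns a stream per element */
  static List<String> words(List<String> lines) {
    Function<String, Stream<String>> split = line -> Arrays.stream(line.split(" "));
    return lines.stream().flatMap(split).collect(Collectors.toList());
  }

  /* reduce takes a BinaryOperator, which combines two values into one */
  static int sum(List<Integer> numbers) {
    return numbers.stream().reduce(0, Integer::sum);
  }
}
```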


February 13, 2019  5:37 PM

Don’t struggle to learn new programming languages

George Lawton Profile: George Lawton

Modern application developers are often tasked to learn new programming languages and patterns to improve their skills. The classic do-it-yourself approach with books or tutorial videos is great, but it still requires the developer to set up a programming environment to put that newfound knowledge to use in a more practical setting.

Fahim Ul Haq and Naeem Ul Haq created a new interactive platform called Educative that makes it easier to learn new programming language skills inside of a pre-built development environment. I caught up with Fahim Ul Haq to find out what they have discovered about how developers try to learn new languages faster.

How did you come up with the idea for Educative?

Fahim Ul Haq: The idea for Educative evolved in a couple of phases. We’re obviously developers ourselves, so we really felt the struggle of trying to update our skills with the currently existing tools.

We first dabbled with interactive learning for developers when we launched a mobile app to teach developers as a side project. The app became popular and we would sometimes receive requests from developers to create more content like this. But with our day jobs at Facebook and Microsoft, that wasn’t possible.

Then in 2014, one of the largest publishers in America approached us to write a book for software engineers, building on the app we’d developed. We wanted to create a free companion website with interactive learnings, but the publisher wasn’t interested in that. Even though they rejected the idea, it gave us the inspiration to create a platform where developers could learn interactively.

Once we started exploring the idea and talking to potential authors, we got one unanimous piece of feedback: authors liked the idea of creating interactive training for developers, but it seemed like a lot of work, compared to making a video tutorial.

So we came up with Educative: a platform that provides interactive learning for software developers, powered by an authoring platform that makes it extremely easy to create content.

How does Educative build on the work of other interactive training programs or approaches to provision fully configured training environments like CodeEnvy or what Sensei has done with security training?

Ul Haq: I think all these different solutions come out of a simultaneous recognition of the same need: a developer learning resource that tracks with all the advances in technology we’re seeing today. That’s really the underlying theme here. Educative and the two tools you mentioned are applying similar approaches — but to different niches. That makes it somewhat difficult for us to build directly on their work, but we’re keeping a close eye on them and would love to see what we can learn from each other.

What are your thoughts on how to measure the results of this sort of training method to quantify the speed of learning compared to other approaches?

Ul Haq: This is something we want to do in the very near future. I think it’s just a case of the current metrics we have on demand for our product and the anecdotal evidence for its effectiveness being so strong, that we haven’t yet felt a need to objectively study its benefits. It will become more urgent as we scale up.

What have you learned about how to organize software training programs that improve the process to learn new programming languages?

Ul Haq: There are no one-size-fits-all solutions for how anyone learns anything, particularly for how developers learn to program. This applies on two levels. The first one is more obvious: trying to learn to code through videos is just frustrating for so many people. There’s an assumption that people progress more or less at the same linear pace — so going back and forth in a video, re-watching parts or skimming through parts, is just so cumbersome. This is where our platform really helps.

However, the second level is that even on our platform there’s a need for different levels of difficulty on the programming problems. For example, some people will inevitably learn quicker, and just find our practice problems too easy. That’s why we’re planning to launch adaptive learning in the future — to put such people on personalized accelerated tracks. As they answer practice problems, we’re able to get data on how they’re performing and adjust the level of problems they’re served up accordingly.

What are the biggest stumbling blocks that developers face when they learn new programming languages?

Ul Haq: I would say they’re largely the same ones anyone faces when trying to learn a new skill. It’s time-consuming and puts you out of your comfort zone. It’s so much easier to just stick with what you know and not pursue further learning. This effect is magnified in the developer world, where the resources for learning new skills are sometimes highly technical and unfriendly. Those are roadblocks that we are working to overcome with Educative.

What advice might you give prospective course authors?

Ul Haq: Just keep it simple. Teach like you talk normally. Sometimes when someone knows a subject backwards and forwards, it’s very easy to forget what it felt like to be a new learner and just start speaking in this abstract jargon. High-level programming languages get very abstract, very quickly. We encourage authors to use real-world examples to put those abstract concepts in a more easy-to-understand context. Ideally, you want your learners to not just know something, but know how to apply it.

How do you expect the kinds of interactive training tools for developers to evolve, not just for Educative, but for software development in general?

Ul Haq: I expect that interactive tools like Educative will become the new norm, not just for customers, but for corporations as well. Too many smart people are investing too much time and money for outdated methods to survive forever. I think that in the future people will be taught by machines that know exactly what kind of content it takes to keep them engaged, and serve up personalized, interactive material to optimize for their growth. It sounds scary, but it just makes too much sense to not do.


February 4, 2019  7:29 PM

A quick look at inferred types and the Java var keyword

cameronmcnz Cameron McKenzie Profile: cameronmcnz

The biggest language change packaged with the Java 10 release, aka JDK 18.3, was the introduction of the inferred type. This addition, combined with the ability to use the long reserved Java ‘var’ keyword in your code, will have a significant impact on how programs are both read and written.

The case for the Java var keyword

Java has always had a weird syntax to declare variables. A manifest type declaration on the left side must polymorphically match up with the object type provided on the right hand side of the equation. This creates a somewhat verbose and, dare I say it, clunky syntax for what is an exceptionally common task.

Java declaration

Java variable declaration without the var keyword

A Java var keyword example

As you can see from that simple code snippet, traditionally developed Java code lends itself to verbosity. But with the use of the var reserved word and type inference, the code can be cleaned up quite a bit.

Java var keyword

The use of Java inferred types with the var keyword.

With this new syntax, the object type does not need to be explicitly declared on the left hand side of the initialization. Instead, the object type can simply be inferred if you look at the right hand side of the equation, thus the term inferred type. Of course, the right hand side of the equation always has the final say on what type of object is created, so this Java 10 feature doesn’t really change how the Java language works, nor will it have any impact on how code will be interpreted.
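To illustrate the difference, here is a small sketch that builds the same map twice, once with explicit types and once with var. It is an illustration written for this explanation and requires Java 10 or later; the class name VarExample and the variable names are made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/* Illustrative sketch: explicit types vs inferred types with var */
class VarExample {

  /* Traditional declarations repeat the type on both sides */
  static Map<String, List<Integer>> explicit() {
    HashMap<String, List<Integer>> scores = new HashMap<String, List<Integer>>();
    ArrayList<Integer> values = new ArrayList<Integer>();
    values.add(10);
    scores.put("alice", values);
    return scores;
  }

  /* With var, the compiler infers the type from the right hand side */
  static Map<String, List<Integer>> inferred() {
    var scores = new HashMap<String, List<Integer>>();
    var values = new ArrayList<Integer>();
    values.add(10);
    scores.put("alice", values);
    return scores;
  }
}
```

Both methods compile to variables of exactly the same types; var only changes what the developer has to type and read.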

In the end, the language change simply drives towards the goal to make Java, a language often criticized for being far too verbose, more readable.


January 31, 2019  7:49 PM

Compare, contrast your image recognition tool options

YanaYel1na Profile: YanaYel1na
Uncategorized

If you’ve been presented with an opportunity to work with machine learning tools with advanced image recognition functionality, you’d be wise not to pass it up, even if you’re new to this technology. An array of high-profile tech giants have made their image recognition tools available to developers, so there is no need to build a neural network from scratch.

Here’s an overview of three mature image recognition and detection tools from some tech giants for you to consider, and help choose the optimal one to meet your development needs.

Google Cloud Vision

With Google’s visual recognition API, you can easily add advanced computer vision functionality to your application:

  • Face, landmark and logo detection helps recognize multiple faces and related attributes such as emotions or headwear (note that facial recognition is not supported here), natural and handmade structures as well as product logos within one picture. A user can perform image analysis on a file located in Google Cloud Storage or on the web.
  • Optical character recognition (OCR) can be used to spot and extract text within a file of various formats, from PDF and TIFF to PNG and GIF. The tool also automatically identifies a vast array of languages and can detect handwriting.
  • Label detection and content moderation allows a user to establish categories and also spot explicit material — such as adult or violent content — within an image.
  • Object localizer and image attribute functionality helps identify the exact place and type of object in an image as well as detect its general attributes such as dominant colors or cropping vertices.

After you enable the Cloud Vision API for your project, you can start to implement it with a variety of programming languages via client libraries. The image recognition tool also offers AutoML Vision, which lets you train high-quality custom machine learning models without the need for prior experience.

Clarifai

Clarifai’s API is another image recognition tool that doesn’t require any machine learning knowledge prior to implementation. It can recognize images and also perform thorough video analysis.

A user can start to make image or video predictions with the Clarifai API after they specify a parameter. For example, if you input a “color” model, the system will provide predictions about the dominant colors in an image. You can either use Clarifai’s pre-built models or train your own.

Clarifai video analysis processes one video frame per second, and provides a list of predicted concepts for every second of video. The user will need to input the parameter to begin, and split a video into different components if it exceeds maximum size limits.

Clarifai also offers additional tools for further experimentation and analysis. Explorer is a web application where you can introduce additional inputs, preview your applications and also create and train new models with your own images and concepts. The Model Evaluation tool can provide relevant performance metrics on custom-built models.

Amazon Rekognition

Amazon Rekognition is another image recognition tool to consider. Rekognition provides functionality similar to that of its counterparts, and also adds in facial comparison and celebrity recognition from a variety of pre-built categories, such as entertainment, business, sports and politics.

With Rekognition Image, the service can measure the likelihood of a face appearing in multiple pictures, and also verify a user against a reference photo in near real time.

Apart from image recognition, Amazon also offers near real-time analysis of streaming video. The system automatically extracts rich metadata from Amazon Rekognition Video and outputs it to a Kinesis data stream to detect objects and faces, create a searchable video library and carry out content moderation.

Which tool should you choose?

Each tool provides its own set of features that can potentially meet your image recognition demands. Here is a chart that compares Cloud Vision, Clarifai and Rekognition on several important parameters.

  Google Cloud Vision Clarifai Amazon Rekognition
Face analysis
Facial recognition x
Object and label detection
Explicit content identification
OCR
Video analysis and scene recognition x
Activity detection x
Image attributes
Client libraries
  Google Cloud Vision: Python, Ruby, PHP, C#, Java, Go, Node.js, Objective-C, Swift
  Clarifai: Python, Ruby, PHP, C#, Java, JavaScript, Objective-C, Haskell, R
  Amazon Rekognition: Python, Ruby, PHP, Java, JavaScript, Node.js, .Net
OS
  Google Cloud Vision: Linux, macOS, Windows, iOS, Android
  Clarifai: Linux, macOS, Windows, iOS, Android, IoT
  Amazon Rekognition: Linux, macOS, Windows, iOS, Android, IoT
Multilingual

The image recognition tool space is crowded with tools that can potentially enhance your product. Weigh all of your options and compare their different features before you make a decision. If one of these tools doesn’t fit, consider some alternatives such as Watson Visual Recognition from IBM or Ditto Labs.


Yana Yelina is a Technology Writer at Oxagile, a custom software development company with a focus on building machine learning solutions. Her articles have been featured on KDNuggets, ITProPortal, Business2Community, to name a few. Yana is passionate about the untapped potential of technology and explores the perks it can bring businesses of every stripe. You can reach Yana at yana.yelina@oxagile.com or connect via LinkedIn or Twitter.

