Coffee Talk: Java, News, Stories and Opinions


July 3, 2017  11:26 AM

Advancing JVM performance with the LLVM compiler

Cameron McKenzie (cameronmcnz)

The following is a transcript of an interview between TheServerSide’s Cameron W. McKenzie and Azul Systems’ CTO Gil Tene.

Cameron McKenzie: I always like talking to Gil Tene, the CTO of Azul Systems.

Before jumping on the phone, PR reps often send me a PowerPoint of what we’re supposed to talk about. But with Tene, I always figure that if I can jump in with a quick question before he gets into the PowerPoint presentation, I can get him to answer some interesting questions that I want the answers to. He’s a technical guy and he’s prepared to get technical about Java and the JVM.

Now, the reason for our latest talk was Azul Systems’ 17.3 release of Zing, which includes an LLVM-based just-in-time compiler, code-named Falcon. Apparently, it’s incredibly fast, like all of Azul Systems’ JVMs typically are.

But before we got into discussing Azul Systems’ Falcon just-in-time compiler, I thought I’d do a bit of bear-baiting with Gil and tell him that I was sorry that in this new age of serverless computing and cloud and containers, and a world where nobody actually buys hardware anymore, it must be difficult flogging a high-performance JVM when nobody’s going to need to download one and install it locally. Well, anyways, Gil wasn’t having any of it.

Gil Tene: So, the way I look at it is actually we don’t really care because we have a bunch of people running Zing on Amazon, so where the hardware comes from and whether it’s a cloud environment or a public cloud or private cloud, a hybrid cloud, or a data center, whatever you want to call it, as long as people are running Java software, we’ve got places where we can sell our JVM. And that doesn’t seem to be happening less, it seems to be happening more.

Cameron McKenzie: Now, I was really just joking around with that first question, but that brought us into a discussion about using Java and Zing in the cloud. And actually, I’m interested in that. How are people using Java and JVMs they’ve purchased in the cloud? Is it mostly EC2 instances or is there some other unique way that people are using the cloud to leverage high-performance JVMs like Zing?

Gil Tene: It is running on EC2 instances. In practical terms, most of what is being run on Amazon today, it is run as virtual instances running on the public cloud. They end up looking like normal servers running Linux on an x86 somewhere, but they run on Amazon, and they do it very efficiently and very elastically, they are very operationally dynamic. And whether it’s Amazon or Azure or the Google Cloud, we’re seeing all of those happening.

But in many of those cases, that’s just a starting point where instead of getting a server or running your own virtualized environment, you just do it on Amazon.

The next step is usually that you operationally adapt to using the model, so people no longer have to plan and know how much hardware they’re going to need in three months’ time, because they can turn it on anytime they want. So they can empower teams to turn on a hundred machines on the weekend because they think it’s needed, and if they were wrong they’ll turn them off. But that’s no longer some dramatic thing to do. Doing it in a company internal data center? It’s a very different thing from a planning perspective.

But from our point of view, that all looks the same, right? Zing and Zulu run just fine in those environments. And whether people consume them on Amazon or Azure or in their own servers, to us it all looks the same.

Cameron McKenzie: Now, cloud computing and virtualization is all really cool, but we’re here to talk about performance. So what do you see these days in terms of bare iron deployments or bare metal deployments or people actually deploying to bare metal and if so, when are they doing it?

Gil Tene: We do see bare metal deployments. You know, we have a very wide mix of customers, so we have everything from e-commerce and analytics and customers that run their own stuff, to banks obviously, that do a lot of stuff themselves. There is more and more of a move towards virtualization in some sort of cloud, whether it’s internal or external. So I’d say that a lot of what we see today is virtualized, but we do see a bunch of the bare metal in latency-sensitive environments or in dedicated super environments. So for example, a lot of people will run dedicated machines for databases or for low-latency trading or for messaging because they don’t want to take the hit for what the virtualized infrastructure might do to them if they don’t.

But having said that, we’re seeing some really good results from people on consistency and latency and everything else running just on the higher-end Amazon instances. So for example, Cassandra is one of the workloads that fits very well with Zing and we see a lot of turnkey deployments. If you want Cassandra, you turn Zing on and you’re happy, you don’t look back. On Amazon, that type of cookie-cutter deployment works very well. We tend to see that people running Cassandra on Amazon, with or without us, move to the latest and greatest instances that Amazon offers. I think the i3 class of Amazon instances right now are the most popular for Cassandra.

Cameron McKenzie: Now, I believe that the reason we’re talking today is because there is some big news from Azul. So what is the big news?

Gil Tene: The big news for us was the latest release of Zing. We are introducing a brand-new JIT compiler to the JVM, and it is based on LLVM. The reason this is big news, we think, especially in the JVM community, is that the current JIT compiler that’s in use was first introduced 20 years ago. So it’s aging. And we’ve been working with it and within it for most of that time, so we know it very well. But a few years ago, we decided to make the long-term investment in building a brand-new JIT compiler in order to be able to go beyond what we could before. And we chose to use LLVM as the basis for that compiler.

Java had a very rapid acceleration of performance in the first few years, from the late ’90s to the early 2000s, but it’s been a very flat growth curve since then. Performance has improved year over year, but not by a lot, not in the way that we’d like it to. With LLVM, you have a very mature compiler. C and C++ compilers use it, Swift from Apple is based on it, Objective-C as well, the Rust language from Mozilla is based on it. And you’ll see a lot of exotic things done with it as well, like database query optimizations and all kinds of interesting analytics. It’s a general compiler and optimization framework that has been built for other people to build things with.

It was built over the last decade, so we were lucky enough that it was mature by the time we were making a choice in how to build a new compiler. It incorporates a tremendous amount of work in terms of optimizations that we probably would have never been able to invest in ourselves.

To give you a concrete example of this, the latest CPUs from Intel, the current ones that run, whether they’re bare metal or powered mostly on Amazon servers today, have some really cool new vector optimization capabilities. There’s new vector registers, new instructions and you could do some really nice things with them. But that’s only useful if you have some optimizer that’s able to make use of those instructions when they know it’s there.

With Falcon, our LLVM-based compiler, you take regular Java loops that would run normally on previous hardware, and when our JVM runs on new hardware, it recognizes the capabilities and basically produces much better loops that use the vector instructions to run faster. And here, you’re talking about factors that could be 50%, 100%, or sometimes 2 times or 3 times faster even, because those instructions are that much faster. The cool thing for us is not that we sat there and thought of how to use the latest Broadwell chip instructions, it’s that LLVM does that for us without us having to work hard.

Intel has put work into LLVM over the last two years to make sure that the backend optimizers know how to do the stuff. And we just need to bring the code to the right form and the rest is taken care of by other people’s work. So that’s a concrete example of extreme leverage. As the processor hits the market, we already have the optimizations for it. So it’s a great demonstration of how a runtime like a JVM could run the exact same code and when you put it on new hardware, it’s not just the better clock speed and not just slightly faster, it can actually use the instructions to literally run the code better, and you don’t have to change anything to do it.
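Editor’s note: to make the kind of loop Tene describes concrete, here is a minimal sketch of a plain Java loop with the shape that an auto-vectorizing JIT compiler typically looks for. The class and method names are hypothetical and do not come from the interview. Whether a given JVM (Falcon included) actually emits vector instructions for it depends on the JVM and the hardware; nothing in the source requests it.

```java
// Hypothetical illustration; names are not from the interview.
class VectorizableLoop {

    // Element-wise multiply-add. The loop body has no branches and no
    // dependency between iterations, so a vectorizing JIT is free to
    // process several elements per instruction.
    // Assumes all four arrays have the same length.
    static void multiplyAdd(float[] a, float[] b, float[] c, float[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] * b[i] + c[i];
        }
    }

    public static void main(String[] args) {
        int n = 1 << 20;
        float[] a = new float[n];
        float[] b = new float[n];
        float[] c = new float[n];
        float[] out = new float[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
            b[i] = 2f;
            c[i] = 1f;
        }
        // Call the method repeatedly so the JIT has a chance to compile it.
        for (int rep = 0; rep < 200; rep++) {
            multiplyAdd(a, b, c, out);
        }
        System.out.println(out[3]);
    }
}
```

The Java source stays the same on every machine; any speedup would come entirely from what the JIT chooses to generate at run time.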

Cameron McKenzie: Now, whenever I talk about high-performance JVM computing, I always feel the need to talk about potential JVM pauses and garbage collection. Is there anything new in terms of JVM garbage collection algorithms with this latest release of Zing?

Gil Tene: Garbage collection is not big news at this point, mostly because we’ve already solved it. To us, garbage collection is simply a solved problem. And I do realize that that often sounds like what marketing people would say, but I’m the CTO, and I stand behind that statement.

With our C4 collector in Zing, we’re basically eliminating all the concerns that people have with garbage collections that are above, say, half a millisecond in size. That pretty much means everybody except low-latency traders simply don’t have to worry about it anymore.

When it comes to low-latency traders, we sometimes have to have some conversations about tuning. But with everybody else, they stop even thinking about the question. Now, that’s been the state of Zing for a while now, but the nice thing for us with Falcon and the LLVM compiler is we get to optimize better. So because we have a lot more freedom to build new optimizations and do them more rapidly, the velocity of the resulting optimizations is higher for us with LLVM.

We’re able to optimize around our garbage collection code better and get even faster code for the Java applications running it. But from a garbage collection perspective, it’s the same as it was in our previous release and the one before that because those were close to as perfect as we could get them.

Cameron McKenzie: Now, one of the complaints people that use JVMs often have is the startup time. So I was wondering if there’s anything that was new in terms of the technologies you put into your JVM to improve JVM startup? And for that matter, I was wondering what you’re thinking about Project Jigsaw and how the new modularity that’s coming in with Java 9 might impact the startup of Java applications.

Gil Tene: So those are two separate questions. And you probably saw in our material that we have a feature called ReadyNow! that deals with the startup issue for Java. It’s something we’ve had for a couple of years now. But, again, with the Falcon release, we’re able to do a much better job. Basically, we have a much better vertical rise right when the JVM starts to speed.

The ReadyNow! feature is focused on applications that basically want to reduce the number of operations that go slow before you get to go fast, whether it’s when you start up a new server in the cluster and you don’t want the first 10,000 database queries to go slow before they go fast, or whether it’s when you roll out new code in a continuous deployment environment where you update your servers 20 times a day so you roll out code continuously and, again, you don’t want the first 10,000 or 20,000 web requests for every instance to go slow before they get to go fast. Or the extreme examples of trading where at market open conditions, you don’t want to be running your high volume and most volatile trades at interpreted Java speed before they become optimized.

In all of those cases, ReadyNow! is basically focused on having the JVM hyper-optimize the code right when it starts rather than profile and learn and only optimize after it runs. And we do it with a very simple to explain technique, it’s not that simple to implement, but it’s basically we save previous run profiles and we start a run assuming or learning from the previous run’s behavior rather than having to learn from scratch again for the first thousand operations. And that allows us to run basically fast code, either from the first transaction or the tenth transaction, but not from the ten-thousandth transaction. That’s a feature in Zing we’re very proud of.

To the other part of your question about startup behavior, I think that Java 9 is bringing in some interesting features that could over time affect startup behavior. It’s not just the Jigsaw parts, it’s certainly the idea that you could perform some sort of analysis on code-enclosed modules and try to optimize some of it for startup.

Cameron McKenzie: So, anyways, if you want to find out more about high-performance JVM computing, head over to Azul’s website. And if you want to hear more of Gil’s insights, follow him on Twitter, @giltene.
You can follow Cameron McKenzie on Twitter: @cameronmckenzie

June 13, 2019  3:35 PM

How to troubleshoot a JVM OutOfMemoryError problem

RamLakshmanann

There aren’t any magical tools that will fix an OutOfMemoryError for you, but there are some options available that will help automate your ability to troubleshoot and identify the root cause.

Follow these three steps to deal with this JVM memory error and get on the way to recovery:

  1. Capture a JVM heap dump
  2. Restart the application
  3. Diagnose the problem

1. Capture the heap dump

A heap dump is a snapshot of what’s in your Java program’s memory at a given point in time. It contains details about the objects that are present in memory, the actual data within those objects, the references those objects maintain to other objects and other information. Capturing a heap dump is a vital step to fix an OutOfMemoryError, but heap dumps do present some challenges, as their contents can be difficult to read and decipher.

In an optimal situation, you want to capture a heap dump at the moment of or just prior to an OutOfMemoryError to diagnose the cause, but this isn’t exactly easy. However, you can automate this heap dump process. Tell the JVM to create a heap dump by adding the following options to its startup parameters:

-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/crashes/my-heap-dump.hprof
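If you also want a dump on demand rather than only at the moment of failure, a heap dump can be requested from inside the application. The following is a minimal sketch, assuming a HotSpot-based JVM that exposes the HotSpotDiagnosticMXBean; the class name HeapDumper and the temporary output location are hypothetical choices for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; assumes a HotSpot-based JVM.
class HeapDumper {

    // Writes a heap dump to the given file. The file must not exist yet,
    // and recent JDKs expect the name to end in ".hprof".
    static void dumpHeap(Path target, boolean liveObjectsOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(target.toString(), liveObjectsOnly);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("heap-dumps");
        Path target = dir.resolve("manual-dump.hprof");
        dumpHeap(target, true); // true = dump only reachable objects
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

The resulting .hprof file can be opened with the same analyzers used for dumps produced by the startup options above.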

2. Restart the troublesome application

Most of the time, an OutOfMemoryError won’t crash the application, but it could put the application in an unstable state. A restart would be a prudent move in this situation, since requests served from an unstable application instance can produce erroneous results.

And, you can automate this restart process as well. Simply write a “restart-myapp.sh” script, which will bounce your application. Provide command line arguments to the JVM that trigger it to run the following script when you encounter the exception:

-XX:OnOutOfMemoryError=/scripts/restart-myapp.sh

When you pass this argument, the JVM will invoke the “/scripts/restart-myapp.sh” script when an OutOfMemoryError is first thrown. Thus, your application will be automatically restarted right after it experiences an OutOfMemoryError.

3. Diagnose the problem

Now that you have captured the heap dump — which is needed to troubleshoot the problem — and restarted the application — to reduce the outage impact — the next step is troubleshooting.

As mentioned above, understanding the contents of a heap dump can be tricky, but there are heap analyzer tools that simplify the process. Some options include Eclipse Memory Analyzer (MAT), Oracle’s jhat or HeapHero.

These tools generate a memory analysis report, highlight the objects that consume the most memory and hopefully help identify objects that create a memory leak.
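To illustrate the sort of finding such a report tends to surface, here is a sketch of one common leak pattern: a static collection that only ever grows. The class and method names are hypothetical, and a real leak is rarely this obvious.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example of a leak an analyzer would flag.
class LeakyCache {

    // A static list stays reachable for the lifetime of the class, so
    // nothing added here can be garbage collected until it is removed.
    private static final List<byte[]> RETAINED = new ArrayList<>();

    // Simulates per-request work that stores a payload and never frees it.
    static void handleRequest(int payloadBytes) {
        RETAINED.add(new byte[payloadBytes]);
    }

    // Total number of bytes held by the static list.
    static long retainedBytes() {
        long total = 0;
        for (byte[] chunk : RETAINED) {
            total += chunk.length;
        }
        return total;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1_000; i++) {
            handleRequest(1024);
        }
        System.out.println("Retained bytes: " + retainedBytes());
    }
}
```

In a heap analyzer, a structure like RETAINED would typically show up as a single object holding a large and growing share of the heap.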

It can be extremely frustrating when your applications encounter a runtime error. You’ll need patience, a memory heap dump and the proper tools to analyze the problem to fix the OutOfMemoryError and other pesky exceptions of a similar ilk.


June 10, 2019  3:31 PM

How to deal with a remote code execution vulnerability

JudithMyerson

Visual Studio Code is a free source code editor developed by Microsoft for Windows, macOS and Linux. On February 12, 2019, Symantec Security Center found a serious remote code execution vulnerability (CVE-2019-0728) in MS Visual Studio Code. This vulnerability ties into another one back in June of 2018, when an untrusted search path vulnerability (CVE-2018-0597) was reported.

In April 2019, Visual Studio Code for Linux was made available as a snap that can run across more than 40 Linux distribution variations. The editor comes with Git built in to help developers manage version control in DevOps when the source code is ready for deployment to a production server. The source code is a type of server-side script that can only be compiled on the server.

Remote code execution vulnerability severity   

Both remote code execution vulnerabilities create a total loss of confidentiality, integrity and availability. They come with a Common Vulnerability Scoring System 3.0 rating of 7.8 on a 0-10 scale.

The first vulnerability could allow an unauthorized attacker to execute arbitrary code in the context of the current user. A successful attack would require the user to take some action before the vulnerability can be exploited, such as installing a malicious extension. Failed exploit attempts will likely result in denial-of-service conditions.

The second vulnerability could allow the attacker to gain privileges via a Trojan DLL in an unspecified directory.

A programming-savvy attacker could target the SecurityHeaders, which sends a report on HTTP security headers and server information back to a local browser. An attacker could exploit the default value broadcast for the IIS version in use as such:

Server:  Microsoft-IIS/8.0

where Server is the HTTP server header and Microsoft-IIS/8.0 is the default value.

The attacker could also exploit the preloading of the HTTP Strict Transport Security (HSTS) header. When the preload directive is added to the header, all subdomains are included for a specified period of time. The main risk associated with this vulnerability is that the specified period could be up to a year. A developer wouldn’t be able to shorten this setting to 90 days to fix subdomain problems, and an update may not propagate until after the original maximum time directive expires.

Remote code execution vulnerability risk mitigation steps

Here are some recommendations on how to mitigate the latent remote code execution vulnerabilities.

Automatic downloads: Set up a default setting of automatic downloads of Visual Studio Code updates.

Access rights: Grant minimal access rights to individuals and team members — such as read only, read and write. Avoid allowing members, except the administrator leader, to have full access rights.

Network traffic: Run network intrusion detection systems (IDSs) to monitor network traffic for malicious activity that may occur after an attacker exploits the Visual Studio Code vulnerabilities. Ensure the IDSs are free of vulnerabilities as well.

Analysis report: After you implement the HTTP response headers as mentioned above, follow these three steps to receive an analysis report. First, transfer the latest version of a script from a local machine to a server. Second, enter any website address in a local browser to implement HTTP response headers in the script. And third, head over to Security Headers or another website to analyze the report sent back to the browser. The report includes an overall grade for all security headers. Note that it discloses server information by default and doesn’t warn about the risks of using the preloading list in an HTTP security header.

Server information: Avoid broadcasting default server information. IIS 8.0 software allows the developer to add a new value to the script like Web.config before it deploys to the server. The report from SecurityHeaders should show the new information like this:

Server: Hello World!

To suppress the HTTP server header from sending to a local browser, the developer should use IIS 10, which is shipped with Windows 10, Windows Server 2016 and other options. You only need one code line in the script to suppress the header, the removeServerHeader attribute, which can be set to true.

<security>
  <requestFiltering removeServerHeader="true" />
</security>

The language used to compile the script is one of the VB or C variants. Non-Windows platforms may not have the capability to remove or suppress the HTTP server header.

Preloading list: Exclude the preload directive from the HTTP Strict Transport Security header to avoid preloading a list of all subdomains. The max-age directive is expressed in seconds; the value 31536000 used here is one year.

<customHeaders>
  <add name="X-Xss-Protection" value="1;mode=block" />
  <add name="X-Frame-Options" value="sameorigin" />
  <add name="Strict-Transport-Security" value="max-age=31536000" />
</customHeaders>

If a preloaded list is used, start with a lower maximum age expiry time — 30 days — to make sure all the subdomains have HTTPS support. It’s better to wait until the time frame expires in 30 days than in a year to fix a problem.
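Because max-age is given in seconds, the one-year and 30-day values are easy to get wrong by hand. The following sketch derives both from a duration; the class and method names are hypothetical and only illustrate the arithmetic.

```java
import java.time.Duration;

// Hypothetical helper that only illustrates the max-age arithmetic.
class HstsHeader {

    // Builds a Strict-Transport-Security value for the given lifetime.
    // The max-age directive is expressed in whole seconds.
    static String maxAge(Duration lifetime) {
        return "max-age=" + lifetime.getSeconds();
    }

    public static void main(String[] args) {
        System.out.println(maxAge(Duration.ofDays(365))); // max-age=31536000
        System.out.println(maxAge(Duration.ofDays(30)));  // max-age=2592000
    }
}
```

The 30-day value, max-age=2592000, is the one to start with if you follow the advice above.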

Alternatively, use an HTTPS front end for an HTTP-only server — which should be done before you secure the back-end server.


May 28, 2019  9:44 PM

Why is programming so hard to master?

BobReselman

Why is programming so hard? Because it’s no longer about programming.

Allow me to elaborate.

I wrote my first line of professional code back in 1987. It was an application written in BASIC that did lease calculations for computer rentals. (Yes, back then computers were so expensive it made sense to lease them by the month. Today, we practically give them away.) The program worked when you selected a computer from a list, provided the number of months for the lease terms and the program calculated the monthly payments. The program also had a feature that allowed you to print a hard copy of the results.

In terms of the work I had to do, 90% of my effort was the actual programming. The remaining 10% involved creating the executable file, copying it onto floppy disk and then installing the code on the computers of the other people in my office.

It took me about a week to write the program. Admittedly, it wasn’t exactly rocket-science programming and when I look back, it wasn’t very good programming either. But, it worked and I got paid: win-win, so to speak.

Fast forward 30 years to today. Last week I wrote a program for a class I teach. The program is called WiseSayings. It’s a web app that responds upon request with a random saying from a list of wise sayings.

[Figure: Connection to WiseSayings]

It took me about 30 minutes to write the code, including application data retrieval and configuration. Yet, just programming the app wasn’t enough. Here’s just the beginning of why programming is so hard. Containers are very popular these days, so I had to create the Dockerfile that allows users to run WiseSayings in a Docker container.

But, there was more. Not only did I need to create the Dockerfile, but I also had to post the container image on DockerHub to make it easier for others to use. This means an image build, followed by a push after I logged into my DockerHub account.

So far, so good, right? Wrong!

As an ambitious coder, I imagined that millions of people would want to use my app. So, I needed to make it easy to scale, which meant WiseSayings had to run under Kubernetes. I wrote a deployment.yaml to create the Pod and ReplicaSet so my containerized WiseSayings app will run in the cloud and, at the least, a service.yaml to provide web access to the logic in the pods from outside the Kubernetes cluster.

If I want to provide security and routing, I’ll need to create a Kubernetes secret or two, a TLS certificate and an ingress.yaml to manage it all. I could go on. We haven’t even talked about web page creation to render my application’s response, nor have we talked about multiple-language support for the app. Who knows, maybe some of my anticipated millions of users will be in China.

How things have changed

My main point on why programming is so hard is this: 30 years ago, all I had to know to create a program was the programming language BASIC and how to structure code into subroutines, which is what we called functions and methods back then. Printing was a bit harder because printer drivers weren’t part of the operating system and your programs needed to know a whole lot about the printers they used. But, that was it. Most of my work revolved around how to express the specific application logic in code.

Today, to create my little WiseSayings app, not only do I need to know a programming language — in this case, JavaScript that runs under Node.js — but I also need to have a basic understanding of how the internet works, as well as how to fiddle with things such as status codes and all the other name-value pairs I can stuff into an HTTP header. Then, I need to know Docker and the basics of Kubernetes. I’d also like to add that there isn’t much in the basics of Kubernetes that’s actually basic. When you work with any Kubernetes API resource, it takes time to really master, even for something as fundamental as a pod.

Now you can really see why programming is so hard.

It still takes about half an hour to write the actual code and get it up on GitHub, but I now add hours to make my program available to my users. My old means of distribution involved copying the executable file onto a floppy disk, walking over to a user and copying that file from the disk onto the desktop computer. What used to take minutes for a local code distribution now makes up the bulk of my “programming” activity, regardless of whether the code goes to a user on the other side of the office or halfway around the world.

Now, don’t get me wrong. Under no circumstance do I want to go back to the days of BASIC and floppy disks. The programs we make today go way beyond anything I could have imagined 30 years ago when I did BASIC programming on an IBM AT running DOS 3.3. I think it’s beyond cool that we’ve made it so you can point your cellphone camera at a newspaper and have the device read the text out loud to you in real time. I like watching the Merchant of Venice any time I want on YouTube with scene summaries available on my iPad. (Yes, sometimes I find it hard to follow the language of the Bard.)

These are amazing achievements, but they come at a price. While commercial software has always required the coordinated efforts of many, these days, even the simple stuff is hard and the implications are profound.

In the old days, knowledge of a programming language and a rudimentary understanding of software design was enough to get you on the playing field. Today, you need to know networking, deployment tools, automated provisioning, testing in its variety of forms (from unit testing to performance testing on a distributed scale) and the details of a multitude of development frameworks.

To use a basketball analogy, in the past all you needed to play was a ball, a hoop and the ability to dribble, pass and shoot. Today you also need to know all of that, plus how to sell tickets and run the concession stands. It’s a lot of work.

Is it worth it? Of course. But, the added complexity makes the profession a lot harder to get into. Maybe this is a good thing. Medicine, engineering and nuclear physics have always been “hard to do” professions. Work in those fields has extraordinary benefits when done well and grave consequences when done poorly. Software development is now in that league.

Today, software runs more of the world. Soon it will run most of the world. Maybe it’s time to set a high bar and make it as hard as possible to play. Yet, it’s sad to think that when the next version of me comes along, that person will have to do a lot more than write a simple program in BASIC to get started. I was fortunate to have the opportunity to play and in doing so, software changed my life. Others might not be so lucky.


May 10, 2019  6:11 PM

Five ways to fix Git’s ‘fatal: repository not found’ error

Cameron McKenzie (cameronmcnz)

There’s nothing worse than joining a new development team and eagerly cloning the existing source code repo only to run head first into Git’s ‘fatal: repository not found’ error. For those who struggle with that problem, here are five potential fixes to the frustrating repository not found error message.


1. You did not authenticate

If you attempt to connect to a private GitHub or BitBucket repository and fail to authenticate, you will receive the repository not found error. To ensure you are indeed authenticating, connect to the repository and include your username and password in the Git URL:

git clone https://mcnz:githubpass@github.com/cameronmcnz/private-github-repo.git

2. Your password has changed

Have you changed your password lately? If you connect from a Microsoft-based workstation, the Windows Credentials Manager may transparently submit an old password every time you clone, pull or fetch. Make sure your computer doesn’t cache any old, out of date passwords and cause your connection to be rejected.

3. You are not a collaborator

You may authenticate successfully against GitHub or GitLab, but if you haven’t been made a collaborator on the project, you won’t be able to see the repository and will again trigger the fatal: repository not found error. If you’re not a collaborator on the project, contact one of the repository administrators and have them add you to that role.

4. Incorrect case or a word misspelled

If your source code management tool is hosted on a Linux distribution, the repository name may be case sensitive. Also watch out for creative repository spellings, such as a zero instead of the letter O, or a one in place of the letter L. If you can copy and paste the git clone command from provided documentation, do that.
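As a rough illustration of that check, the sketch below flags two repository names that differ only by letter case or by the look-alike characters mentioned above. The class and method names are hypothetical, and the substitution list is limited to the 0/O and 1/L swaps named in this section.

```java
import java.util.Locale;

// Hypothetical helper; covers only the case and look-alike swaps named above.
class RepoNameCheck {

    // Folds case and maps the look-alike digits 0 and 1 to the letters o and l.
    static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT).replace('0', 'o').replace('1', 'l');
    }

    // True when the names are not identical but become identical once case
    // and the 0/O and 1/L substitutions are ignored.
    static boolean looksLikeTypo(String typed, String actual) {
        return !typed.equals(actual) && normalize(typed).equals(normalize(actual));
    }

    public static void main(String[] args) {
        System.out.println(looksLikeTypo("Private-GitHub-Repo", "private-github-repo"));
        System.out.println(looksLikeTypo("c00l-repo", "cool-repo"));
        System.out.println(looksLikeTypo("other-repo", "cool-repo"));
    }
}
```

A match here suggests the clone URL was retyped rather than copied, which is the situation the advice above is meant to avoid.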

5. The git repository has been deleted

If the repository was deleted or renamed, you’ll obviously hit a Git repository not found error when you attempt to clone or fetch from it. If all else fails, check with the team lead to ensure that the remote repository does indeed still exist. One way to fix that problem is to log into your DVCS tool as an administrator and actually create the Git repository.

If you have any more insights on why developers might run into Git’s ‘fatal: repository not found’ error, please add your thoughts to the comments.


May 9, 2019  8:42 PM

What I learned from the Google I/O 2019 keynote address

BarryBurd

Before the start of the Google I/O 2019 keynote address, I wondered what I’d learn in my role as an application developer. But when the keynote begins, I find myself thinking more like a consumer than a developer. Instead of thinking, “What new tools will help me create better apps,” I’m thinking about the new features that will make my life easier as a user of all these apps.

The keynote starts with a demo of a restaurant menu app. You point your phone at a menu and the Lens app shows you an augmented reality version of it. The app indicates the most popular items on the menu and, if you’re willing to share some data, the app highlights items that you might particularly like. For a strict junketarian like me (someone who eats junk food), the app would highlight cheeseburgers and chocolate desserts.

If you tap on a menu item, the app shows you pictures of it along with pricing and dietary information. When you’ve finished your meal, the app calculates the tip and can even split the bill among friends.

Lens has always been a fascinating app, but now it offers translation capabilities. You can point your phone at text in a foreign language, and the app will overlay the text with a translated version in your language that mimics the size, color and font of the original text. In addition to the visual translation, Lens can also read the text out loud in your native language.

Last year, the keynote speakers stunned attendees with a demonstration of the Duplex app. A computer-generated voice booked an appointment time with a live person at a hair salon. I was skeptical, and I said so in my report for TheServerSide. But this year, they’ve announced the upcoming deployment of Duplex on Pixel phones in 44 states.

Do I trust an app to schedule my appointment for a root canal? The app asks for my approval before it finalizes an appointment, but I’m not sure that I want to keep steering the app away from early morning appointments with a friendly reminder of “Hey Google, let me sleep as long as I want.”

More from Google I/O 2019

One of the overriding developments from the Google I/O 2019 keynote is the ability to perform speech recognition locally on a user’s phone. Google engineers have reduced the size of the voice model from 100 GB to half a gigabyte. In the near future, you’ll get help from Google Assistant without sending any data to the cloud and can talk to the Assistant with your WiFi and cellular data connections turned off.

Users will see a noticeably faster response time from Google Assistant. In a demo, the presenter used the Assistant to open the Photos app, select a particular photo, and share it in a text message. This happened so quickly that the conversation with Google Assistant seemed to be effortless. Best of all, there was no need to repeat the wake phrase “Hey, Google” for each new command.

Google Maps is becoming smarter too. When Maps is in walking mode, the app will replace its street map image with an augmented reality view of the scene. If you hold your phone in front of you, the app shows you what the rear-facing camera sees, and adds labels to help you decide where to go next. In the near future, maybe you’ll see people on the streets with their phones right in front of their faces. Who knows, maybe it’s time to revive Google Glass.

In the keynote and the breakout sessions, I’m surprised to hear so much about foldable phones. With Samsung’s Galaxy Fold troubles, I got the impression that folding phones were more than a year away. But the Google I/O speakers talked about Android’s imminent folding phone display features, and discussed models that will be available in the next few months.

Google I/O 2019 is a feast for the consumer side of my identity. I came to the conference as a cool-headed developer, but I’m participating as an excited, wide-eyed user.


May 1, 2019  1:45 PM

An example of UnaryOperator in functional Lambda expressions

cameronmcnz Cameron McKenzie Profile: cameronmcnz

The implementation of Java 8 Lambda expressions required the introduction of a number of new interfaces with esoteric names that can be somewhat intimidating to developers without any experience in functional programming. One such interface is the functional UnaryOperator interface.

It may be academically named, but it is incredibly simple in terms of its purpose and implementation.

The function of the UnaryOperator

The function of the UnaryOperator interface is to take an object, do something with it and then return an object of the same type. That’s the unary nature of the function. One object type goes in, and the exact same type goes out.

For a more technical discussion, you can see from the UnaryOperator JavaDoc that the component extends the Function interface and defines a single method named apply.

java.util.function.UnaryOperator
@FunctionalInterface
public interface UnaryOperator<T> extends Function<T,T>

T apply(T t)
Applies this function to the given argument.

Parameter Types:
T - the input given to the function
T - the result of running the function

For example, perhaps you wanted to strip out all of the non-numeric characters from a String. In that case, a String that contains a bunch of digits and letters would go into the UnaryOperator, and a String with nothing but numbers would be returned. A String goes in and a String comes out. That’s a UnaryOperator in action.

Definition of the term unary.

Implementation of the UnaryOperator example

To show you an old-school UnaryOperator example that doesn’t use a lambda expression, we will create a single class named UnaryOperatorExample and provide the required apply method. The apply method is the single method required by all classes that implement the UnaryOperator interface.

We will use generics in the class declaration, <String>, to indicate this UnaryOperator’s apply method works exclusively on String objects, but the UnaryOperator interface is certainly not limited to text-based data. You can genericize the interface with any valid Java class.

package com.mcnz.lambda;

import java.util.function.UnaryOperator;

// Create class that implements the UnaryOperator interface
public class UnaryOperatorExample implements UnaryOperator<String> {
  // A String goes in and a String comes out
  @Override
  public String apply(String text) {
    return text + ".txt";
  }
}

class UnaryOperatorTest {
  public static void main(String[] args) {
    UnaryOperatorExample uoe = new UnaryOperatorExample();

    String text = "lambda-tutorial";
    String newText = uoe.apply(text);
    System.out.println(newText);
  }
}

When the class is executed, the result is the text string lambda-tutorial.txt written to the console.

Example UnaryOperator Lambda expression

If you implement the UnaryOperator interface with a complete Java class, it will create completely valid code, but it defeats the purpose of working with a functional interface. The whole idea of functional programming is to write code that uses very sparse and concise lambda expressions. With a lambda expression, we can completely eliminate the need for the UnaryOperatorExample class and rewrite the entire application as such:

package com.mcnz.lambda;
import java.util.function.UnaryOperator;
// A UnaryOperator Lambda expression example
class UnaryOperatorTest {
  public static void main(String args[]){
    UnaryOperator<String> extensionAdder = (String text) -> { return text + ".txt";} ;
    String newText = extensionAdder.apply("example-function");
    System.out.println(newText);
  }
}

One of the goals of the lambda expression framework is to simplify the Java language and eliminate as much ceremony from the code as possible. As such, it should come as no surprise to discover that we can simplify our lambda expression further by rewriting the line of code that declares extensionAdder:

UnaryOperator<String> extensionAdder = (text) -> text + ".txt" ;
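The same concise form covers the digit-stripping scenario mentioned earlier. The following is only a sketch: the class name DigitsOnlyExample and the constant DIGITS_ONLY are made up for illustration, and the regular expression assumes that "numeric" means the ASCII digits 0 through 9.

```java
import java.util.function.UnaryOperator;

public class DigitsOnlyExample {
  // Hypothetical operator: a String goes in, and a String with only digits comes out
  static final UnaryOperator<String> DIGITS_ONLY = text -> text.replaceAll("[^0-9]", "");

  public static void main(String[] args) {
    // Letters are removed and the digits keep their original order
    System.out.println(DIGITS_ONLY.apply("a1b2c3")); // prints 123
  }
}
```

Because the operator takes and returns the same type, its result can be passed straight back into another UnaryOperator<String> without any conversion.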

Java API use of the UnaryOperator function

With functions now tightly embedded throughout the Java API, interfaces such as the Consumer interface and the current UnaryOperator tend to pop up everywhere. One of the most notable uses of UnaryOperator is as an argument to the iterate method of the Stream class.

static <T> Stream<T> iterate(T seed, UnaryOperator<T> f)

For the uninitiated, a method signature like this can be intimidating, but as this UnaryOperator example has demonstrated, the implementation of a lambda expression that simply takes and returns an object of the same data type really couldn’t be easier. And that’s the whole idea behind the lambda project — that in the end, Java programs will be both easier to read and easier to write.
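To make the iterate signature a little more concrete, here is a minimal sketch of how a UnaryOperator might be passed to Stream.iterate. The class name IterateExample and the helper firstThree are hypothetical names chosen for this illustration. The limit call is there because iterate produces an infinite stream.

```java
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class IterateExample {
  // Hypothetical helper: starts at the seed and applies the operator repeatedly,
  // keeping only the first three elements of the otherwise infinite stream
  static List<String> firstThree(String seed, UnaryOperator<String> f) {
    return Stream.iterate(seed, f).limit(3).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    UnaryOperator<String> extensionAdder = text -> text + ".txt";
    // prints [notes, notes.txt, notes.txt.txt]
    System.out.println(firstThree("notes", extensionAdder));
  }
}
```

The first element is the seed itself, and each element after that is the result of applying the operator to the previous one.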


April 25, 2019  2:34 PM

How to write a screen scraper application with HtmlUnit

cameronmcnz Cameron McKenzie Profile: cameronmcnz

I recently published an article on screen scraping with Java, and a few Twitter followers pondered why I used JSoup instead of the popular, browser-less web testing framework HtmlUnit. I didn’t have a specific reason, so I decided to reproduce the exact same screen scraper application tutorial with HtmlUnit instead of JSoup.

The original tutorial simply pulled a few pieces of information from the GitHub interview questions article I wrote. It pulled the page title, the author name and a list of all the links on the page. This tutorial will do the exact same thing, just differently.

HtmlUnit Maven POM entries

The first step to use HtmlUnit is to create a Maven-based project and add the appropriate GAV to the dependencies section of the POM file. Here’s an example of a complete Maven POM file with the HtmlUnit GAV included in the dependencies.

<project xmlns="http://maven.apache.org/POM/4.0.0"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mcnz.screen.scraper</groupId>
  <artifactId>java-screen-scraper</artifactId>
  <version>1.0</version>

  <dependencies>
    <dependency>
      <groupId>net.sourceforge.htmlunit</groupId>
      <artifactId>htmlunit</artifactId>
      <version>2.34.1</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>1.8</source>
          <target>1.8</target>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

The HtmlUnit screen scraper code

The next step in the HtmlUnit screen scraper application creation process is to produce a Java class with a main method, declare the URL of the site you want HtmlUnit to scrape, and then create an instance of the HtmlUnit WebClient.

package com.mcnz.screen.scraper;
import com.gargoylesoftware.htmlunit.*;
import com.gargoylesoftware.htmlunit.html.*;

public class HtmlUnitScraper {
	
  public static void main(String args[]) throws Exception {
		
    String url = "http://www.theserverside.com/video/IBM-Watson-Content-Hub-has-problems-before-you-even-start";
    WebClient webClient = new WebClient();
    webClient.getOptions().setUseInsecureSSL(true);
    webClient.getOptions().setCssEnabled(false);
    webClient.getOptions().setJavaScriptEnabled(false);

  }
}

The HtmlUnit API

The getPage(URL) method of the WebClient class will parse the provided URL and return an HtmlPage object that represents the web page. However, CSS, JavaScript and a lack of a properly configured SSL keystore can cause the getPage(URL) method to fail. It’s prudent when you prototype to disable CSS and JavaScript processing and to allow insecure SSL before you obtain the HtmlPage object.

webClient.getOptions().setUseInsecureSSL(true);
webClient.getOptions().setCssEnabled(false);
webClient.getOptions().setJavaScriptEnabled(false);
HtmlPage htmlPage = webClient.getPage(url);

In the previous example, we tested our Java screen scraper capabilities by capturing the title of the web page being parsed. To do that with the HtmlUnit screen scraper, we simply invoke the getTitleText() method on the htmlPage instance:

System.out.println(htmlPage.getTitleText());

Run the Java screen scraper application

You can compile the code and run the application at this point and it will output the title of the page:

Tough sample GitHub interview questions and answers for job candidates

The CSS selector for the segment of the page that displays the author’s name is #author > div > a. If you pass this CSS selector to the querySelector(String) method, it will return a DomNode instance which can be used to inspect the result of the CSS selection. Calling the asText() method on the DomNode returns the name of the article’s author:

DomNode domNode = htmlPage.querySelector("#author > div > a");
System.out.println(domNode.asText());

The last significant achievement of the original article was to print out every anchor link on the page. To achieve this with the HtmlUnit Java screen scraper, call the getAnchors method of the HtmlPage instance. This returns a list of HtmlAnchor instances. We can then iterate through the list and output the URLs associated with the links by calling the getAttribute method:

List<HtmlAnchor> anchors = htmlPage.getAnchors();
for (HtmlAnchor anchor : anchors) {
  System.out.println(anchor.getAttribute("href"));
}

When the class runs, the following is the output:

Java screen scraper with HtmlUnit

Tough sample GitHub interview questions and answers for job candidates.
Cameron McKenzie.

Link: https://www.theserverside.com/video/Tips-and-tricks-on-how-to-use-Jenkins-Git-Plugin
Link: https://www.theserverside.com/video/Tackle-these-10-sample-DevOps-interview-questions-and-answers
Link: https://www.theserverside.com/video/A-RESTful-APIs-tutorial-Learn-key-web-service-design-principles

JSoup vs HtmlUnit as a screen scraper

So what do I think about the two separate approaches? Well, if I were to write a Java screen scraper of my own, I’d likely choose HtmlUnit. There are a number of utility methods built into the API, such as the getAnchors() method of the HtmlPage, that make performing common tasks easier. The API is updated regularly by its maintainers, and many developers already know how to use the API because it’s commonly used as a unit testing framework for Java web apps. Finally, HtmlUnit has some advanced features for processing CSS and JavaScript, which allow for a variety of peripheral applications of the technology.

Overall, both APIs are excellent choices to implement a Java screen scraper. You really can’t go wrong with either one.

You can find the complete code for the HtmlUnit screen scraper application on GitHub.

HtmlUnit screen scraper application


April 23, 2019  8:34 PM

Top 5 software development best practices you need to know

DmitryReshetchenko Profile: DmitryReshetchenko

Software is everywhere, but the process to create a new software product can be complicated and challenging. That’s why software development best practices are important and can help reduce costs and speed up processes.

Without goals, a software project doesn’t have direction. Projects should start with a clear definition of the planned software’s goals, a discussion of those goals with stakeholders and an evaluation of expectations and risks. Simultaneously, you should be ready for various challenges that can come up, and implement strategies to keep the development process on course.

Best practices aren’t always a revelation of thought. Sometimes they are obvious. But as obvious as they might be, they are often overlooked, and developers need to be reminded of them. These software development best practices are obligatory for all software development projects.

Top five software development best practices

  1. Simplicity

Any software should be created in the most efficient way without unnecessary complexity. Simpler answers are usually more correct, and that idea fits the development process well. Simplicity goes hand in hand with smaller coding principles such as Don’t Repeat Yourself (DRY) or You Aren’t Gonna Need It (YAGNI).

  2. Coherence

Teamwork is vital for big projects and it’s impossible without a high level of consistency. Code coherence stands for the creation of, and adherence to, a common writing style for all employees who develop software. With a shared style, managers or other coders shouldn’t be able to tell who the author of a given fragment is. When the whole codebase has the same style, it’s coherent.

Consistency helps a lot because colleagues will be able to test, edit or continue each other’s work. Conversely, inconsistent projects can confuse your team and slow down the development process. Here are some tools that will help you enforce a single style:

  • EditorConfig: A system for the unification of code written with different IDEs
  • ESLint: A highly customizable linter based on Node.js
  • JSCS: A linter and formatting tool for JavaScript
  • HTML Tidy: A linter for HTML that also finds errors
  • Stylelint: A linter for CSS with various plugins

  3. Testing

Testing is essential for any product and at any stage. From the very first test run to the final evaluations, you should always test the product.

Thanks to modern approaches and the rise of machine learning, engineers have access to powerful tools such as automated algorithms to run millions of tests each second. Strategic thinking helps when you have to choose a testing type: functional, performance, integration or unit. If you choose the tools and testing types carefully, you can find a host of bugs and other issues that can ideally be flushed out before you deploy your product. But don’t focus only on test-driven development. Keep users and their needs in mind as well.

  4. Maintenance

Unlike physical entities, software has the potential to be immortal. Nevertheless, this would only be possible with good maintenance including regular updates, more tests and analysis. You’ve probably seen a warning before about an application that isn’t compatible with your device. Careful maintenance can get rid of these alerts and keep apps compatible with new hardware.

This principle is a bit controversial as not all teams or developers want to waste time on product compatibility with everything. However, you should focus on maintaining fresh code to allow your software to work on new devices. Thus, your product will meet the needs of more customers and help old applications to remain useful.

  5. Analysis

Apart from the pre-launch evaluation conducted by QA engineers and dedicated software developers, let me suggest you focus on performance analysis post-launch. Even the most elaborate code that results in a seemingly perfect match with your client isn’t guaranteed to work properly. There are a number of factors that can affect these results. Ideally, you’d like to have an analytics department to evaluate your numbers, but outsourced specialists will also work.

Methodologies and best practices

Apart from the aforementioned approaches, there are some other software development best practices to consider. Minor principles such as these can help play a role in a successful deployment:

  • Agile: This approach can help optimize your work. It is based on several development iterations that involve constant testing and result evaluation
  • Repositories: Tools such as Git are helpful to track versions, move back to previous iterations, synchronize work and merge changes
  • Accuracy over speed: Focus on correct code instead of fast code. Later it will be easier to speed up processes than to rewrite everything
  • Experience sharing: Consider exchanging ideas and results with other developers to get external reviews if your project isn’t confidential

Finally, let me propose a slightly paradoxical statement: you don’t have to blindly follow best practices all the time. Time-proven ideas work fine for traditional processes when developers want to create common software without unique features.

But game-changing apps or innovative projects require fresh thinking. Surely, these software development best practices are fairly obvious and cover the most basic practices, but it’s better to find or build a software development team with a perfect balance between best market approaches and new ideas.


April 22, 2019  2:58 PM

How to force Maven JDK 1.8 support through the POM file

cameronmcnz Cameron McKenzie Profile: cameronmcnz

Maven and Eclipse have always had a rocky relationship, and a common pain point between the two is how to force Maven JDK 1.8 support in new Eclipse projects. Without jumping through a few configuration hoops, the antiquated Java 1.5 version persistently remains the default.

The Eclipse IDE became popular before the Maven revolution really took hold, so the IDE’s support for the build tool always felt like an afterthought. Even with recent releases like Eclipse Photon, tasks such as importing Maven projects or creating anything more than a basic Maven project are less than graceful.

Maven JDK 1.8 use in Eclipse

A reminder of this rocky relationship is the fact that Eclipse and Maven have a habit of forcing Java 5 compliance on new applications, even if JDK 1.8 is the only JVM installed on the development machine. Under Java 1.5 compliance, Lambda expressions, streams and other language features newer than generics won’t compile.

Some developers try to change the Java compiler setting within Eclipse from Java 1.5 to Java 1.8, but that only works temporarily. As soon as a new Maven build takes place, the JDK version reverts back to 1.5. The only way to make this a permanent change is to edit the POM and force Eclipse and Maven to use Java 1.8.

Eclipse Maven JDK 1.8 compliance and support

Of course, there is a fairly simple answer to this problem. That’s why everyone loves Maven. The build tool always has a simple resolution on offer. To force Eclipse and Maven JDK 1.8 compliance on your new projects, simply add the following to your POM file:

<!-- Force Eclipse and Maven to use a Java 8 JDK compiler-->
<properties>
  <maven.compiler.target>1.8</maven.compiler.target>
  <maven.compiler.source>1.8</maven.compiler.source>
</properties>
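One hedged way to confirm the setting took effect is to add a small class that relies on Java 8 features and see whether the project still builds. The class below is only a sketch with a made-up name, Java8Check. It uses a lambda expression and the Stream API, so it should fail to compile under 1.5 compliance and compile cleanly once the POM properties are picked up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class Java8Check {
  // Uses a lambda and the Stream API, both of which need source level 1.8 or newer
  static List<String> upperCase(List<String> words) {
    return words.stream().map(w -> w.toUpperCase()).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // prints [MAVEN, ECLIPSE]
    System.out.println(upperCase(Arrays.asList("maven", "eclipse")));
  }
}
```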

Maven Java 1.8 plugin support

An alternate, albeit slightly more verbose, approach to tell Eclipse and Maven to use Java 8 or newer compilers is to configure the Apache Maven compiler plugin:

<!-- Build plugin to force Maven JDK 1.8 compliance -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <version>3.8.0</version>
      <!-- Force Maven to use Java 1.8 -->
      <configuration>
        <source>1.8</source>
        <target>1.8</target>
      </configuration>
    </plugin>
  </plugins>
</build>

Eclipse Java 1.8 compiler setting

Once the maven-compiler-plugin change is made to the POM, you can open up Eclipse’s Java compiler properties page and notice that JDK compliance has changed from JDK 1.5 to 1.8.

When the POM updates to Maven JDK 1.8 support, the Eclipse Java compiler page reflects the change.

It’s a tedious problem but it’s also one that is easily fixed. Furthermore, you only need to edit the POM file once on a project to force Eclipse and Maven to use Java 8. It’s really not all that big an inconvenience to overcome.


April 5, 2019  9:52 PM

How to install Tomcat as your Java application server

cameronmcnz Cameron McKenzie Profile: cameronmcnz

If you’re interested in Java based web development, you’ll more than likely need to install Tomcat. This Tomcat installation tutorial will take you through the prerequisites, show you where to download Tomcat, help you configure the requisite Tomcat environment variables and finally kick off the Tomcat server and run a couple of example Servlets and JSPs to prove a successful installation.

Tomcat prerequisites

There are minimal prerequisites to install Tomcat. All you need is a version 1.8 installation of the JDK or newer with the JAVA_HOME environment variable set up, and optionally the JDK’s bin folder added to the Windows PATH. Here is a Java installation tutorial if that prerequisite is yet to be met.

If you are unsure as to whether the JDK is installed — or what version it is — simply open up a command prompt and type java -version. If the JDK is installed, this command will display version and build details.

C:\example\tomcat-install\bin>java -version
java version "1.8.0"
Java(TM) SE Runtime Environment (build pwa6480sr3fp20-20161019_02(SR3 FP20))
IBM J9 VM (build 2.8, JRE 1.8.0 Windows 10 amd64-64 Compressed References 20161013_322271 (JIT enabled, AOT enabled)
J9VM - R28_Java8_SR3_20161013_1635_B322271
JIT - tr.r14.java.green_20161011_125790
GC - R28_Java8_SR3_20161013_1635_B322271_CMPRSS
J9CL - 20161013_322271)
JCL - 20161018_01 based on Oracle jdk8u111-b14

Download Tomcat

You can obtain Tomcat from the project’s download page at Apache.org. Find the zip file that matches your computer’s architecture. This example of how to install Tomcat is on a 64-bit Windows Xeon machine, so I have chosen the 64-bit option.

Unzip the file and rename the folder to tomcat-9. Then copy the tomcat-9 folder out of the \downloads directory and into a more suitable place on your file system. In this Tomcat tutorial, I’ve moved the tomcat-9 folder into the C:\_tools directory.

Tomcat installation home directory

Tomcat environment variables

Applications that use Tomcat seek out the application server’s location by inspecting the CATALINA_HOME environment variable value. So, create a new environment variable named CATALINA_HOME and have it point to C:\_tools\tomcat-9.

To make Tomcat utilities such as startup.bat and shutdown.bat universally available to command prompts and Bash shells, you can put Tomcat’s \bin directory on the Windows PATH, but this isn’t required.

Set CATALINA_HOME for Tomcat installation

How to start the Tomcat server

At this point, it is time to start Tomcat. Simply open a Command Prompt in Tomcat’s \bin directory and run the startup.bat command. This will start Tomcat and make it accessible through http://localhost:8080

example@tutorial MINGW64 /c/example/tomcat-9/bin
$ ./startup.bat
Using CATALINA_BASE: "C:\_tools\tomcat-9"
Using CATALINA_HOME: "C:\_tools\tomcat-9"
Using CATALINA_TMPDIR: "C:\_tools\tomcat-9\temp"
Using JRE_HOME: "C:\IBM\WebSphere\AppServer\java\8.0"
Using CLASSPATH: "C:\_tools\tomcat-9\bin\bootstrap.jar;C:\_tools\tomcat-9\bin\tomcat-juli.jar"

After you verify that the Apache Tomcat landing page appears at localhost:8080, navigate to http://localhost:8080/examples/jsp/ and look for the option to execute the Snoop servlet. This Tomcat example Servlet will print out various details about the browser and your HTTP request. Some values may come back as null, but that is okay. So long as the page appears, you have verified the Tomcat install.
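If you would rather script the check than open a browser, something along these lines might do. It is only a sketch: the class name TomcatCheck and the method isUp are made up for this example, and it assumes Tomcat is listening on the default http://localhost:8080 address used in this tutorial.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class TomcatCheck {
  // Returns true when the address answers with HTTP 200, and false on any
  // I/O problem such as a refused connection, a timeout or a malformed URL
  static boolean isUp(String address) {
    try {
      HttpURLConnection connection = (HttpURLConnection) new URL(address).openConnection();
      connection.setConnectTimeout(2000);
      connection.setReadTimeout(2000);
      connection.setRequestMethod("GET");
      int status = connection.getResponseCode();
      connection.disconnect();
      return status == HttpURLConnection.HTTP_OK;
    } catch (IOException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("Tomcat responding: " + isUp("http://localhost:8080"));
  }
}
```

A result of false only means that no HTTP 200 response came back within the timeout. It doesn’t say why, so the Tomcat console output is still the place to look for the cause.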

Tomcat installation verification

And that’s it. That’s all you need to do to install Tomcat on a Windows machine.

