A Kubernetes Tale: Part I — Building your Java project

A practical (and somewhat opinionated) guide to kick-off your project with Docker and Helm


Around three years ago, while I was working at Debut, we started using Kubernetes. We made the decision after realising we needed a flexible space to deploy services that were critical for the platform to grow. It was a whole new world of containerisation on top of the already widely famous Docker. From the very beginning I've been learning how to tackle many aspects of running a production-ready cluster — this is a little bit of my experience so far that I wanted to share.

Beyond any specific technology, the main points you can draw from this series are:

  • Building — wrapping a Spring boot application as a Docker-ready image, with a scalable environment-based configuration.
  • Running and orchestration — on a Kubernetes cluster
  • Configuration management — using Helm to manage releases and secrets
  • CI/CD pipelines — using CircleCI
  • [Appendix] Monitoring — using New Relic

The tech stack used for these examples is:

  • Spring Boot 2.3 + JDK 11 + Kotlin 1.3* + Gradle 6.4
  • Kubernetes 1.15 (on GCP)
  • Helm 3.2.4
  • New Relic agent 5.9.0

If you're using the same stack with different versions, the steps described here may not work exactly as expected.

* No real reason to use Kotlin here. You could simply use good ol’ Java. Or you could try Kotlin. (but only after reading this guide)

Disclaimer: I actually started this article a year and a half ago, so there might be things that were true back then that aren't anymore (like going to the pub for a pint). I double-, triple-, 10x-checked everything, but something might have fallen through the cracks.

Generating our application

Our first step is to create an executable version of our application. In this case, we’ll be creating a Spring Boot application exposed through a REST API. To create an application, you can use Spring initializr.

For the purpose of this example, we’ll simply add two dependencies:

  • Spring Web: enables the application to expose the REST controllers we need.
  • Spring Actuator: includes a set of production-ready features. We'll use the health functionality later on.

Download and open your fresh new project in a terminal. First thing is to check that it runs correctly. To do this go to the directory where you unzipped it and run:

./gradlew bootRun

Gradlew stands for Gradle wrapper — it allows you to port your project without requiring any Gradle installation in your local environment.

Once your application starts, run a curl against the health endpoint to check it's responding correctly.
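Assuming Spring Boot's default port 8080 (yours may differ), the check and its typical response look like this:

```shell
# Hit the health endpoint exposed by Spring Actuator
curl -s http://localhost:8080/actuator/health
# → {"status":"UP"}
```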

This endpoint is automatically exposed by Spring Actuator. It analyses the different resources the application depends on to provide an easy way of assessing its health: DB access, free disk space, Hystrix circuit breakers, etc. [Spoiler alert: this endpoint will come in quite handy for Part II]

Whilst in your local environment you can run Gradle for simplicity, in a Docker image you want to include as few dependencies as possible, so that’s why we’ll have to run it in a different way.

Running the application

Running the application standalone

To run this application in a Docker container we’ll have to use a fat jar (also known as uber-jar). This is a jar file that contains all our project classes and resources plus all its dependencies.

A full list of the subtasks run by the build task: assemble, bootBuildImage, bootJar, build, buildDependents, buildNeeded…

When you run the ./gradlew build task it will trigger the build + test + jar subtasks. It actually runs a few more subtasks but you get the gist of it. You can get a full description of these under the little Gradle elephant tab in IntelliJ IDEA.

If you then check under build/libs/ you should see the generated jar files.

If you want to check the contents of the jar you can run jar tf build/libs/dockeriser-0.0.1-SNAPSHOT.jar

This fat jar is the one we'll execute in a Docker container — to run it, simply execute:

java -jar build/libs/dockeriser-0.0.1-SNAPSHOT.jar

As you can see, this will fire the same startup we used before, this time without invoking Gradle.

Running the application with different flavours

We’ll leverage Spring profiles to have the flexibility of configuring the application in different ways, depending on the environment. So for this we’ll create 4 profiles:

  • Local — for our local environment
  • Dev — for development environments
  • Prod — for production environments
  • Container — to execute the application in a container

How are these profiles materialised in the app? By creating 4 additional application-*.yaml (or application-*.properties) files:

  • application-local.yaml
  • application-dev.yaml
  • application-prod.yaml
  • application-container.yaml

To tell Spring Boot to start using any of these profiles, we need to set the spring.profiles.active property to the given name we chose (multiple profiles can be used). So for example, to run our application with dev and container profiles, we need to run it as:

java -Dspring.profiles.active=dev,container -jar build/libs/dockeriser-0.0.1-SNAPSHOT.jar
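As a sketch, each profile file only overrides what differs per environment — the property values below are made up for illustration:

```yaml
# application-dev.yaml — hypothetical values, for illustration only
logging:
  level:
    root: DEBUG
server:
  port: 8080
```

application-prod.yaml would carry the production equivalents (e.g. a root logging level of WARN), while application-container.yaml holds whatever is specific to running inside a container.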

Why a “container” profile?

A good use case for profiles is obviously environment-based configuration. Depending on this, you might need different DB access URLs, storage bucket IDs or simply a specific configuration you want to apply to one environment without affecting the rest (e.g. logging verbosity). However, there might be cases where you don't need to do this, because you can either externalise that configuration or multiple environments share the same set of parameters. I'll use an example in one of the next articles of the series and it'll all make sense, I promise.

Perhaps you noticed I said configuration but didn't say credentials, and there's a good reason for that. Regardless of this particular implementation, a good piece of advice: never, ever write sensitive information like usernames or passwords in your code. Every time you do, there's an infosec folk suffering somewhere in the world and the god of bad developers deletes all the semicolons from your code (OK, you'd be saved if you used Kotlin, but you still get the point).

If you’re lost, don’t panic: it’ll make more sense in Part IV using Helm.

The main course: creating a Dockerfile

Docker is a tool to build and run applications in containers — in case you've never heard of Docker, think of these containers as super-lightweight VMs, and of the Dockerfiles as the specs that define how they should run. By default these containers are stateless, meaning they can be destroyed and started from scratch based solely on their specs — think of the Dockerfile as the container's shopping list.

Another good feature (which took me some time to get my head around) is that they enforce the Single Responsibility Principle: each container runs only one application. The application must run in the foreground so Docker can detect it's still running correctly. Each Dockerfile can depend on a parent image, which usually provides a predefined set of OS, tools, etc.

This is the Dockerfile we’ll be using (and as you can see, it just contains the bare minimum required to run), with some tweaks for Production already in there:
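The original file isn't embedded here, so the following is a reconstruction built from the four instructions and the JVM flags discussed below — the base image and the 75.0 memory percentage are assumptions, so adjust them to your setup:

```dockerfile
# Base image is an assumption — any JRE 11 image will do
FROM adoptopenjdk:11-jre-hotspot

# The port our Spring Boot application listens on
EXPOSE 8080

# Copy the fat jar produced by ./gradlew build
ADD build/libs/dockeriser-0.0.1-SNAPSHOT.jar /app.jar

# Run in the foreground with the production tweaks explained below
ENTRYPOINT ["java", "-server", \
            "-XX:MaxRAMPercentage=75.0", \
            "-XX:+UseParallelGC", \
            "-Dspring.profiles.active=container", \
            "-Duser.timezone=Europe/London", \
            "-Djava.security.egd=file:/dev/./urandom", \
            "-jar", "/app.jar"]
```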

The JVM flags

The flags used to run the application follow some good practices I've collected for this type of application:

  • -server tweaks some aspects of the VM to support the type of traffic a server-side application handles.
  • -XX:MaxRAMPercentage is a new flag introduced in Java 11 (then backported to Java 8) that will limit the RAM used by the JVM as a fraction of the overall available memory (if you’re using older versions of Java, you might need to use MaxRAMFraction instead, which is not as flexible as this one but it should do the trick). Take into account that, while the heap is the part of the JVM that will take most of the RAM, there are some other parts, such as thread allocation and code cache that will require memory.

In a lot of documentation you'll probably see -Xmx and -Xms used (which set limits for the heap), and that's usually the case for older versions of Java. With the introduction of MaxRAMPercentage these can be avoided, at least initially; there are other flags, such as MaxHeapSize, that complement it in that case.

  • -XX:+UseParallelGC replaces the default JDK 11 garbage collector (G1) with one better suited to smaller, high-throughput applications. You might want to stick with G1 if that's not your case. For the full list of available GCs, you can refer to Oracle's documentation.
  • -Dspring.profiles.active=container as I mentioned before, the container profile will allow us to have the same configuration that will be required when running our application in a container environment, so it makes sense to anchor it independently of dev/staging/prod. If you noticed, I’m not including the environment yet, so we can deploy the same Docker image in multiple clusters across envs. This configuration will be later externalised and managed by a setting in Kubernetes.
  • -Duser.timezone=Europe/London sets a default timezone regardless of the VM's locale. This can be useful when deploying the same application in different regions, without causing differences in the way your application interprets timezones when rendering/storing datetime fields. On its own this might not be enough (e.g. JDBC requires an extra setting to recognise it), but I'll leave the long explanation for another article. Just keep this in mind!

Special mention to properties that are not required but I think are important to point out:

  • -XX:+UseContainerSupport (enabled by default). Before the introduction of this flag it was really tricky to limit the memory available to the JVM when running in a container, so on older versions you might need to activate it.
  • -Dspring.devtools.restart.enabled=false tells Spring to disable LiveReload, useful during development for faster restarts when changing code but detrimental for a production instance. According to the docs, running the application the way we are means this is automatically applied (devtools isn't even included as part of the dependencies).
  • -Djava.security.egd=file:/dev/./urandom tells the VM to use a non-blocking stream of higher entropy to generate random values. Huh? Yeah, I know, it seems like a concatenation of random words but I swear it's not. It's a stream that uses signal noise from different parts of your computer to produce strings that look "more random". Want to see what it looks like? Just run cat /dev/urandom. If you're on Windows, sorry mate, cannot help you there. There have been a few fixes on this matter since JDK 8, so check whether you still need this flag before adding it.
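That last flag matters because java.security.SecureRandom draws from this entropy source. A quick, pure-JDK check (nothing project-specific) that random bytes come back without blocking:

```java
import java.security.SecureRandom;

public class UrandomDemo {
    public static void main(String[] args) {
        // new SecureRandom() uses the JDK's default entropy source —
        // non-blocking on modern JVMs, the one -Djava.security.egd points at
        SecureRandom rng = new SecureRandom();
        byte[] seed = new byte[8];
        rng.nextBytes(seed);
        System.out.println("Got " + seed.length + " random bytes");
    }
}
```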

Important note: these values are never a silver bullet. The parameters you use need to be tweaked to the very specifics of your application: throughput, amount of exchanged data and concurrency patterns. The idea is that you know what aspects to take into account — put differently, pay more attention to the variable names that are being tweaked rather than the actual values. For a comprehensive list of the available variables, check this page out.
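If you want to see the value a given run actually settled on — for instance, the heap cap that MaxRAMPercentage produced — a tiny throwaway probe (not part of the project) does the job:

```java
public class MemoryCheck {
    public static void main(String[] args) {
        // maxMemory() reports the heap limit the JVM settled on; run this
        // with -XX:MaxRAMPercentage inside a memory-capped container to
        // check that the flag was actually picked up
        long maxHeapMiB = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap: " + maxHeapMiB + " MiB");
    }
}
```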

Building and running your Docker image

Once this Dockerfile is generated, it needs to be built (-t specifies a tag) into an image:

docker build -t dockeriser .

The first time it will take longer than all the following builds. Docker works with layers, where each instruction is treated and cached independently, making it particularly useful to speed up build times. In the Dockerfile above, there are 4 instructions: FROM, EXPOSE, ADD and ENTRYPOINT, so this image will have 4 layers. The next time you build your application the only layer that will have to be rebuilt is the one with ADD since your build generated a new jar file. All the rest will be reused.

If you want to see the available images in your local Docker repository, just execute docker images.

Once the Docker image is built, it can be run by executing:

docker run -P dockeriser:latest

If you run docker ps you'll see a detailed list of the running containers: image name, container ID and age, bound ports, etc.

The -P flag tells the Docker daemon to publish any exposed container port. This will allow us to connect from our computer to the container. As you can see under PORTS, our local port 32768 is bound to the container’s 8080 port. You can try executing the same curl (pointing to the local port) we used before against it and you’ll be able to access it. If you don’t publish the container ports, they will be inaccessible from outside of that instance.

If you want to expose the port using a specific port of your local machine, use -p<local-port>:<container-port>, such as -p8000:8080.

In the example above, if you run the build command again, all the subsequent images that you generate will be tagged as latest. If you want to generate a new version with a new release, here’s a quirky way of doing that automatically.

docker build -t dockeriser:`date +%Y%m%d_%H%M` .

Automate, automate, automate

Lastly, one piece of advice: automate, automate, automate. It can seem like an obvious thing to say, but it’s not uncommon to see projects where there’s documentation (or sometimes not even that) with the detailed explanation of all the steps to follow to build and deploy an application. Don’t be that person. Please. I’m serious.

So far there are a few steps that we could easily automate with a bash script: building the Spring Boot app, and building the Docker image.

./gradlew build
docker build -t dockeriser:`date +%Y%m%d_%H%M` .

You can find the full version of the shell file in the project repo linked below. Even if these are the only two things you actually need to do, if you're going to automate the process you want it to be fail-safe. What can go wrong in the example above? For starters, the Java build can fail if the code doesn't compile, or if tests fail — and if you can't generate a correct jar, your Docker build will fail (or worse, it could reuse a previous jar you might have already generated). Then a few other things can go wrong: what if the person running this doesn't have Docker installed? Or what if it's installed but the daemon isn't running? In neither case will this script work correctly.
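A fail-safe sketch addressing those checks could look like this (the structure and messages here are illustrative — the version in the repo factors its helpers into _helpers.sh):

```shell
#!/usr/bin/env bash
# Abort on the first failing command — a broken Gradle build
# will never reach the docker build step
set -euo pipefail

# Fail fast if Docker isn't installed at all
if ! command -v docker >/dev/null 2>&1; then
  echo "Docker is not installed" >&2
  exit 1
fi

# ...or if it's installed but the daemon isn't running
if ! docker info >/dev/null 2>&1; then
  echo "Docker daemon is not running" >&2
  exit 1
fi

# Compile, run tests and produce the fat jar
./gradlew build

# Build the image with a timestamp-based tag
docker build -t dockeriser:"$(date +%Y%m%d_%H%M)" .
```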

You could argue "hey, DUDE, mr. automation, you forgot to include a script to actually run the project". And you'd be right. I just don't care about that step in particular, because I rarely run an application like this with Docker directly. I run them, as you'll see in Part II, on a Kubernetes cluster (or minikube locally) for a couple of reasons.

Here's the full project in case you want to clone it and play around with it (you'll also find a _helpers.sh file containing the auxiliary functions used by the build script). You can check the commits section to see the different steps described previously.

And voilà! There you go, a Docker image ready to be deployed and run in a container. Hope you’ve found this article interesting. Do you have a question? Feedback? You implemented this and you do it differently? You found an error? Don’t hesitate to leave a comment!
