My first *real* Kubernetes cluster.

July 27, 2025

Hey there! You are reading this post from my Kubernetes cluster! Crazy, right? (Not really anything out of this world…)

I had been thinking about and planning how to pull this off for quite some time. After learning Docker, then Docker Compose, and later Docker Swarm, I thought I wouldn't need anything else, even once I had heard about Kubernetes. Docker Swarm always covered my needs: load balancing, failover, and safe update rollouts (or rollbacks).

The Need for Scale

Things started to change after I dove deeper into CI/CD with GitHub Actions and saw how convenient it was to self-host runners inside our VCN. Deploying them was as simple as a docker stack deploy with a custom DOCKER_HOST set.

However, my team and I started having to wait for each other's runs to finish. I saw the need for multiple instances. The problem was simple to solve:

Add more instances!

But it didn't sound cool to keep 10 instances idling all week just to cover the few occasions when we would have 5+ runs happening simultaneously.

Discovering a Solution: ARC

Then I discovered ARC (Actions Runner Controller), which was community-created and had just been adopted by GitHub themselves. This made the documentation very confusing: I had three places to read about it, all with different instructions. As someone who had never used Kubernetes, I felt lost, but I knew what I had to do.

The best place to start learning about something you don't understand is, of course, YouTube!

I found some guides on setting everything up and installed the necessary tools:

  • minikube
  • helm
  • kubectl

In minutes, I had the environment set up correctly. Then I got the GitHub runners configured and customized them to have a DinD (Docker-in-Docker) environment. And that was it! It was working. We now always had one idle runner, and if we ever needed more, it would spin up another instance in just six seconds.
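The behaviour described here (one runner always kept idle, more added on demand) can be modelled in a few lines. This is only a simplified Python sketch of the scaling decision, not ARC's actual code: the function name, the idle_buffer parameter, and the cap of 10 runners are assumptions made for illustration.

```python
def desired_runners(busy_jobs: int, idle_buffer: int = 1, max_runners: int = 10) -> int:
    """Return how many runners should exist for the current load.

    Hypothetical model: keep `idle_buffer` spare runners on top of the
    jobs currently running, but never exceed `max_runners`.
    """
    if busy_jobs < 0:
        raise ValueError("busy_jobs cannot be negative")
    return min(max_runners, busy_jobs + idle_buffer)


if __name__ == "__main__":
    # Quiet day, normal day, busy day, and a burst above the cap.
    for jobs in (0, 1, 5, 12):
        print(f"{jobs} running jobs -> {desired_runners(jobs)} runners")
```

With nothing running, the model still keeps one runner warm, which is the whole point compared with a fixed pool of idle instances.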

But it still didn't feel like enough for me. I had reached my goal, but I had also discovered a whole new system: Kubernetes. It's catchy, scalable, a buzzword, a hot topic. And I felt I didn't know anything about it. I had followed someone's instructions, but that didn't mean I understood them. I had a thirst to learn more.

Building the "Real" Cluster

Oracle Cloud has a crazy free tier (4 OCPUs and 24 GB of RAM) that I once signed up for to host a Minecraft server with some old friends. I normally used it as a single server for my projects.

One day I saw that they also offer OKE (Oracle Kubernetes Engine) with a Basic cluster if you sign up for their "Pay as you go" plan.

So, during my PTO, I decided to bite the bullet! Oracle has an easy "Quick setup" that spins up a new VCN, a load balancer, and a node pool. I decided to split the free-tier allowance across two nodes of 2 OCPUs and 12 GB of RAM each, which is enough for me.

First, I set up my own Docker registry to manage my Docker images, like the one for this website! Then, I decided to set up ARC again (not gonna lie, I'm a big fan of GitHub Actions).

Adopting GitOps with ArgoCD

I started looking into how I could automate deployments and stuff. I wanted to do it right. After a bit of very intensive studying with Gemini, Claude, ChatGPT, Grok, and Google searches (it's hard to abandon bad habits), I found ArgoCD, which uses a GitOps workflow.

GitOps was exactly what I was looking for. Everything I read also explained why always auto-deploying from the latest tag is bad practice.
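To make the "latest tag" point concrete, here is a minimal Python sketch that checks whether an image reference is pinned to a fixed tag or digest. The function name and the example references (registry.example.com:5000/site and so on) are made up for illustration; this is not something ArgoCD provides, just a check one could run over manifests before committing them.

```python
def is_pinned(image: str) -> bool:
    """Return True if the image reference names a fixed tag or a digest."""
    if "@" in image:
        # A digest such as name@sha256:... always points at one exact image.
        return True
    # Only the last path component can carry a tag; a colon earlier in the
    # string belongs to a registry port (e.g. registry.example.com:5000).
    last_component = image.rsplit("/", 1)[-1]
    if ":" not in last_component:
        # No tag at all means the runtime falls back to "latest".
        return False
    tag = last_component.split(":", 1)[1]
    return tag not in ("", "latest")


if __name__ == "__main__":
    # Hypothetical image references, not the ones used for this site.
    for ref in (
        "registry.example.com:5000/site:1.4.2",
        "registry.example.com:5000/site",
        "site:latest",
    ):
        print(ref, "->", "pinned" if is_pinned(ref) else "floating")
```

A pinned reference means the Git commit fully describes what is running, so a rollback is just a revert of that commit.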

The First Hurdle: ImagePullBackOff

I scraped my website together again, added some final touches, and created a blog where I could “blah, blah, blah…” stuff. I think it will be funny and cringe to look back at this in a couple of years.

And that was it. I created the project in ArgoCD, hit deploy, and…

ImagePullBackOff

Great! It took some hours of debugging, but the cause was that Docker's build-and-push GitHub Action, when set up to use a Docker registry for caching, does not use the values in /etc/hosts. It was trying to push from the outside network.
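One way to spot this kind of problem is to check which address a registry hostname actually resolves to from inside the environment doing the build. The following Python sketch is a debugging aid under assumptions, not what I ran at the time: the helper names are invented, and the demo uses localhost as a stand-in for the real registry hostname.

```python
import ipaddress
import socket


def is_internal(address: str) -> bool:
    """Return True for private or loopback addresses such as 10.0.0.5."""
    return ipaddress.ip_address(address).is_private


def resolve(hostname: str) -> list[str]:
    """Return the distinct IP addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None)
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # "localhost" is a placeholder; swap in the registry hostname to test.
    try:
        for ip in resolve("localhost"):
            print(ip, "internal" if is_internal(ip) else "external")
    except OSError as exc:
        print("could not resolve:", exc)
```

If the name resolves to a public address where the build runs, the traffic is leaving the private network, whatever /etc/hosts says on the host.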

The Real Bottleneck

Which brings me to the biggest bottleneck of this project: the load balancer!

Don't get me wrong, it's great and works out of the box, but Oracle only offers 10 Mbps of bandwidth for free. Pushing a Docker image takes forever, even one built in multiple stages with Alpine variants and Google's distroless image as the final base. I tried everything (custom hosts, DNS, internal routes), but it wouldn't budge, to the point where I gave up on caching. It takes a few minutes to build a new image, but I am happy with the final output.
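To put the 10 Mbps limit in perspective, here is a back-of-the-envelope Python calculation of the best-case transfer time. The image sizes are made-up examples, not measurements from my cluster, and real pushes will be slower because of protocol overhead and layer handling.

```python
def transfer_seconds(size_mb: float, link_mbps: float) -> float:
    """Ideal time to move size_mb megabytes over a link_mbps megabit/s link."""
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    megabits = size_mb * 8  # 1 byte = 8 bits
    return megabits / link_mbps


if __name__ == "__main__":
    # Hypothetical image sizes in megabytes.
    for size_mb in (20, 100, 500):
        seconds = transfer_seconds(size_mb, link_mbps=10)
        print(f"{size_mb} MB at 10 Mbps: about {seconds:.0f} s")
```

Since 10 Mbps is only 1.25 MB per second, even a small image costs real time on every push, which is why keeping images small matters so much here.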

Conclusion

And finally, I felt that I actually knew at least something about Kubernetes: deploying a cluster, managing it, and debugging it.

That’s it, folks. Thanks for reading my first actual real post. 🙂