Rant: AWS App Runners

Sun Nov 12 2023

Last weekend I attended HackNotts '84, a hackathon at the University of Nottingham where you get 24 hours to build anything you want. For my hack, I decided to remake Tetris with a multiplayer twist: anybody can join the host's game and control their board by scanning a QR code that opens a website with controls. By about 9 pm, I had completed all of the Tetris logic and written a basic frontend in Vue and a basic backend in Node.js, and needed somewhere to deploy them for testing.

Now, I shall preface this rant by saying I am not your typical AWS user. I only use AWS for backups to S3, and I never consider serverless for personal projects - I prefer setting up my own servers and deploying applications manually. However, I figured that as I was at a hackathon, I may as well try something new.

Trying something new

Having played with big cloud providers and serverless at work, I immediately set my eyes on AWS App Runners.

Waffle

The features which I was excited for and stood out to me were:

  1. Automatic container deployment - all I needed to do was push a container image to a container repository and it would provision a server for me
  2. No network configuration needed - not having to care about security or mess about with DNS = big win
  3. ClickOps - a change of pace from having to SSH into a server to deal with broken nginx configurations or to manually change Docker parameters

Excited at the prospect of not having to tediously manage my own VPS, I rushed over to create an Amazon Elastic Container Registry and pushed a backend image to it.

Initial setup

After spending a few minutes clicking about on the cloud console, I had successfully managed to deploy my backend image to an App Runner, which at the time only had a single GET route. AWS provided me with a URL to access it, and it was publicly available on the Internet.

I quickly wrote a Dockerfile for the frontend, using the Vite dev server instead of creating a production build, just so I could get it up and running. I pushed it to ECR and created a new App Runner and... it worked!

Now, I just needed to set a couple of environment variables to have it communicate with the backend and had to update the Dockerfile to make it use a production build as opposed to the Vite dev server. Shouldn't be too hard, right?

Healthcheck hell

I set the new environment variables, updated the Dockerfile, pushed the image, and...

What?

Looking at the event logs, it seems that the issue was a failed healthcheck.

One of the steps involved in changing over to a production build is changing the exposed port in the container image. Vite uses port 5173 for the dev server (4173 for preview), and for the production build, I was using nginx to serve the website, which listens on port 80. It would make sense for the healthcheck to fail.

I figured it was a simple fix - I'd simply change the port on AWS from 5173 to 80.

I updated the port, hit apply, and...

It failed again. It would be another 20 minutes until the rollback completed, at which point I was back where I started. By now, I had spent almost an hour trying to deploy this, and I was starting to lose my mind.

This time the issue was it was trying to check port 80, on the version using the Vite dev server, not the production build, as that deployment failed. Of course that was going to fail. As it turns out, I had arrived at an impossible situation to rectify. I cannot change the image, otherwise the healthcheck will run against the original port, and I cannot change the port, otherwise the healthcheck will run against the original image.

Now, the astute reader may think: why not disable the healthchecks, or change both the image and port at the same time?

I will address the former question first, and it has quite a simple answer. You simply cannot disable the healthchecks.

The only configuration options for healthchecks. There is no way to disable them.

For the second question, well, it turns out the joke's on you, because you cannot do them both at the same time! For some inexplicable reason, if you pause a service, you cannot update its configuration or deploy a different version of the application.

Having read this, I was getting a little bit frustrated. I deleted the App Runner and re-created it with the production image, with port 80 set.
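The deadlock can be sketched as a toy model in TypeScript. This is only an illustration of the behaviour as I experienced it, not App Runner's actual API: the names `ServiceState`, `listeningPort`, `isHealthy` and `deploy`, and the image tags, are all made up for the example, and it assumes the dev image listens on Vite's default port 5173.

```typescript
// Toy model of the healthcheck deadlock; none of this is a real AWS API.
interface ServiceState {
  image: string;           // which container image is deployed
  healthCheckPort: number; // which port the platform probes
}

// Hypothetical lookup: the port each image actually listens on.
const listeningPort: Record<string, number> = {
  "frontend:dev": 5173, // Vite dev server default
  "frontend:prod": 80,  // nginx default
};

// The healthcheck passes only if something listens on the probed port.
function isHealthy(state: ServiceState): boolean {
  return listeningPort[state.image] === state.healthCheckPort;
}

// Apply a change; if the healthcheck fails, roll back to the old state.
function deploy(current: ServiceState, change: Partial<ServiceState>): ServiceState {
  const candidate: ServiceState = { ...current, ...change };
  return isHealthy(candidate) ? candidate : current;
}

const start: ServiceState = { image: "frontend:dev", healthCheckPort: 5173 };

// Changing only the image: probe hits 5173, nginx is on 80, so it rolls back.
const afterImageChange = deploy(start, { image: "frontend:prod" });

// Changing only the port: probe hits 80, Vite is on 5173, so it rolls back.
const afterPortChange = deploy(start, { healthCheckPort: 80 });

// Setting both together is the only combination that passes in this model.
const afterBoth = deploy(start, { image: "frontend:prod", healthCheckPort: 80 });

console.log(afterImageChange, afterPortChange, afterBoth);
```

In this model, either change on its own leaves the service exactly where it started. Only setting the image and the port together passes, which is effectively what deleting and re-creating the service achieved.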

WebSockets

With the frontend now running and correctly pointing towards the backend, I was finally starting to feel a bit of respite.

It was, however, short-lived. I noticed immediately that my application was not working as intended.

First, a bit of background. The intended flow starts when a user clicks a button on the frontend to request a new session from the backend. This user is known as the "host", and the request is implemented as a single GET request. The host then receives a session code and displays a QR code for other players to join. Next, the host opens a WebSocket connection to the backend, passing the session code, to receive real-time updates. This would enable the multiplayer Tetris gameplay.
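A rough sketch of that flow, in TypeScript, might look like the following. It is not the code from the hack: the in-memory store, the `/join` and `/ws` paths, the `session` query parameter and the example hostnames are all assumptions made up for illustration.

```typescript
// Hypothetical in-memory session store; the real storage is not shown here.
const sessions = new Set<string>();

// Characters for session codes, skipping easily confused ones such as 0/O and 1/I.
const ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789";

// What the single GET request would do: mint and remember a session code.
// Collisions are ignored to keep the sketch short.
function createSession(length = 6): string {
  let code = "";
  for (let i = 0; i < length; i++) {
    code += ALPHABET.charAt(Math.floor(Math.random() * ALPHABET.length));
  }
  sessions.add(code);
  return code;
}

// URL a QR code could encode so other players land on the controls page.
function joinUrl(frontendBase: string, code: string): string {
  const url = new URL("/join", frontendBase);
  url.searchParams.set("session", code);
  return url.toString();
}

// WebSocket URL the host would open to receive real-time updates.
function socketUrl(backendBase: string, code: string): string {
  // http -> ws and https -> wss
  const url = new URL("/ws", backendBase.replace(/^http/, "ws"));
  url.searchParams.set("session", code);
  return url.toString();
}

const code = createSession();
console.log(joinUrl("https://frontend.example", code));
console.log(socketUrl("https://backend.example", code));
```

The first step is a plain HTTP request, while the last one depends on the hosting platform accepting a WebSocket upgrade.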

When testing, I managed to get up to receiving a session code. However, when trying to open a WebSocket connection, it immediately failed.

I thought for a moment that this was an issue with my environment variables. Perhaps I entered the wrong value for one of them. I opened the management console and checked, and they were all correctly set.

A simple Google search later, I found my problem. It turns out that AWS App Runners do not support WebSockets.

Fantastic.

By now, it was 10:30 pm. At this point, as you can probably guess, I was suppressing the urge to launch my laptop out the window, and myself along with it.

Doing it myself

After this revelation, I took everything down, deleted the App Runners, and cleared the container registry. I logged into Hetzner, spun up a cheap CPX11 VPS, and installed everything manually.

In just under 10 minutes, I had managed to do it all myself, and it was working perfectly. I did not care enough to put a DNS record in to set up HTTPS, and I simply shared the raw IP address with people.

Closing thoughts

While my hack did not win any prizes, it was still a good experience to build. The key lesson for me is to never use AWS App Runners again. I always hear a lot of noise about serverless containers (or serverless computing in general), but this experience has once again put me off using them for personal projects.

It seems like the AWS services which present themselves as more "streamlined" are actually more annoying to use in production. A lot of the complexity is hidden by opaque settings, and the time it takes for changes to take effect just adds to the frustration.

In any case, I may give serverless another shot in the future. But, for now, I will stick to manually setting up my own servers for my personal projects.

