I wanted to put together a new site recently, as I'm thinking about looking for a job soon. This was a bit of a treat - a personal site is always an excuse to mess around with the latest Gee Whiz toolsets without worrying about whether they fit a client's needs, or whether they actually make any kind of sense.
My requirements were:

1. Static output - no server process to maintain
2. The freedom to build custom pages from scratch
3. Instant transitions between pages
4. Content authored in markdown
I checked the calendar and it's 2018, so obviously I decided to use React.
Why not Ghost, the excellent blogging platform? It's not static. I didn't want to run a persistent server process just to deliver dumb, unchanging content¹.
Why not Jekyll, or Hugo, the excellent static blog generators? I wanted the freedom to write my own pages from scratch (like a portfolio page) rather than mess around with passing themes through an engine.
These would work if I were just delivering pages, but I wanted instantaneous transitions between pages, which means JS. And going the SPA route - forcing a user to download a big blob of JS, in order to download some data, in order to render a page in their browser - is even slower and more wasteful than running their request through a serverside process.
The solution was a well-maintained and highly customizable build tool for React called React Static. It allows me to run a build command locally and upload the resulting assets to S3, so the initial document that hits the user's browser is already the prerendered HTML page they requested. React (and the data React will use to hydrate the other pages to which they may want to navigate) are downloaded in the background.
With the addition of a markdown-to-JSON library called jdown, that took care of requirements 1 through 4.
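jdown's job is just to turn a folder of markdown files into plain data the routes can consume. This isn't jdown itself, but a stdlib-only sketch of that shape - front-matter fields become JSON keys, and the body is kept as a string (jdown goes further and renders it to HTML):

```javascript
// A sketch, not jdown: assumes simple `key: value` front matter
// between `---` fences, with the markdown body following.
const parseMarkdown = (raw) => {
  const match = raw.match(/^---\n([\s\S]*?)\n---\n([\s\S]*)$/);
  if (!match) return { contents: raw }; // no front matter: whole file is the body
  const fields = {};
  for (const line of match[1].split('\n')) {
    const sep = line.indexOf(':');
    if (sep > -1) fields[line.slice(0, sep).trim()] = line.slice(sep + 1).trim();
  }
  return { ...fields, contents: match[2] };
};
```

Each post then arrives in the route data as an object like `{ title, date, contents }`, ready to hand to a React component.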
Unfortunately, URIs were still ugly. React Static seems to have been designed to be delivered by an Apache-style server, where a request for the URI `/posts/` would be rewritten to the file `/posts/index.html` by a module like `mod_dir`. It generates a `/dist/` output folder with this kind of structure, which isn't going to work on a system that isn't running Apache/nginx - like any document store. The project actually recommends S3 as a hosting option (behind Netlify, which also looks nice), but ignores this problem.
Even if React Static did something more S3-friendly with the directory - generating `/dist/posts.html` rather than `/dist/posts/index.html`, for instance - that wouldn't really serve our purposes. S3 allows exactly two request destination rewrites for a bucket: one for the root bucket URL, and the other for errors. Relying on the default behavior would mean that users arriving directly at https://bonner.jp/work - or refreshing, or using the browser's back button - would not get the prerendered page they wanted.
Instead, their browser would show them the HTML for the index page (or error page, depending on how S3 was configured), and after React finished downloading and checking the URI, the desired page would be rendered. Unacceptable!
So customizing React Static was not a solution. Finding another storage solution that supported this structure might be, but I'm not aware of anything as cheap and powerful as S3/CloudFront.
Luckily, AWS recently released a service called Lambda@Edge that lets you execute arbitrary code at the cache - before the request ever hits the origin server - and modify the request object accordingly.
So once the S3 content is behind a CloudFront CDN, it's a simple step to write a regex that appends `/index.html` to an inbound page request, discarding a trailing slash if it's present, and ignoring assets.
A hastily written regex to exclude asset requests:
Now a user's request to CloudFront for e.g. https://bonner.jp/work will grab the asset in S3 at `bonner.jp/work/index.html` - returning the "Work" page, after which the JS needed to render everything else² downloads in the background. Voila.
If you'd like to see how it turned out, well, click around. The source code is here on GitHub. Thanks for reading.