While AWS remains the market leader for public cloud providers, I have personally found Azure to be significantly more security-conscious and pleasurable to work with. As part of the learning process, I’ve slowly been migrating and deploying services onto Azure – including this blog.
This post is going to focus on deploying a simple, static website onto an Azure storage account (AWS S3 equivalent). As part of this deployment, I will front the storage account with a content delivery network (CDN), enable a valid HTTPS certificate, configure reasonable caching defaults, and set up continuous integration for deployment via CircleCI. The goal of this project is to have a remarkably secure website with minimal time, energy, and resources committed to maintaining it.
I will caveat this post by stating that I am decidedly not a web developer, and most of these web technologies are outside my area of expertise.
So the first question we should answer is: why do we want a static web page? There are a few compelling reasons why static web pages are so attractive: they are fast to serve, cheap to host, present a minimal attack surface, and require very little ongoing maintenance.
The static web page zeitgeist likely originated with the creation of Jekyll, a static site generator (SSG) which powers GitHub Pages functionality. Since Jekyll, SSGs have exploded in popularity and created a rich ecosystem of frameworks. While you can craft an artisanal static website by hand, these frameworks make it trivial to get started and deploy a new project.
There are a variety of SSG frameworks available, but Hugo, Jekyll, and Gatsby.js are perhaps the most well-known and popular. Each of these has its own language preferences, features, and benefits, but all serve the same purpose. A cross-comparison of these frameworks is outside the scope of this post (and outside the depth of my knowledge), but I ultimately selected Jekyll for my personal blog.
Once you’ve selected a framework and found a free theme that appeals to you, you’ll need to get a local development environment ready.
As most of my devices run with application whitelisting enabled, I needed to spin up a local development environment. If you're not foolish enough to run application whitelisting, if you use another operating system (e.g. macOS), or if you already have a local development environment, this section may not be useful to you. Feel free to skip it.
For Jekyll, we’ll need a few components installed:
In my instance, I spun up a new Windows 10 developer environment in Hyper-V, but you could just as easily do this on your host.
Install WSL using PowerShell.
Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux
Install Ubuntu 16.04 LTS via PowerShell and Reboot.
Invoke-WebRequest -Uri https://aka.ms/wsl-ubuntu-1604 -OutFile Ubuntu.appx -UseBasicParsing
Add-AppxPackage .\Ubuntu.appx
Update WSL.
sudo apt-get update && sudo apt-get upgrade -y
Install Basic Tools.
sudo apt-get install gnupg2
sudo apt-add-repository ppa:brightbox/ruby-ng
sudo apt-get update
sudo apt-get install ruby2.5 ruby2.5-dev build-essential dh-autoreconf
gem update
gem install jekyll bundler html-proofer
At the end of this process, you should have a Jekyll-compatible environment ready and either a new site, or a templated site, ready for configuration. This is the part where you actually make your blog, configure your template, and add content.
Protip: To test your local changes, you can use the jekyll serve command. This will open up a listener on http://127.0.0.1:4000 where you can preview your changes.
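If you're starting from scratch rather than a downloaded theme, a minimal sketch of scaffolding and previewing a new site looks something like this (the site name is just an example):

jekyll new myblog        # scaffold a new Jekyll site
cd myblog
bundle install           # install the theme and plugin gems
bundle exec jekyll serve # build and serve the site at http://127.0.0.1:4000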
Now that we’ve got a rough skeleton for our blog, we’ll throw it in a GitHub.com repository.
First, we’ll create a local .gitignore file for your repository and add the following contents:
_site
.sass-cache
node_modules
.jekyll-cache/
.jekyll-metadata
This will allow us to version control the web site content without uploading the actual HTML pages. We’ll generate these from the source files as part of our CI/CD pipeline.
Next, we’ll commit everything and upload it to our repository. We can now version control all changes to our web site using our GitHub repository. An example of what this looks like is below.
Private GitHub repo for blog.dane.io.
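For reference, the initial commit and push look roughly like the following; the remote URL is a placeholder for your own private repository:

git init
git add .
git commit -m "Initial blog skeleton"
git remote add origin git@github.com:<your-username>/<your-repo>.git
git push -u origin master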
We’ll come back and make some additional changes to our repository later but, for now, we’re going to move over to our Azure account and get that configured.
Protip: If you’re new to Azure, you can register for a free account and get $200 worth of credit and 12 months of some free services.
Double Protip: If you’re a Visual Studio subscriber, you get $50 a month in Azure credits in addition to software access (e.g. Windows 10, Server 2019) and other benefits. You might consider purchasing a subscription, or convincing your workplace to sponsor it, if you intend to play with Azure and the Windows platform long-term.
We’ll need an Azure account for hosting our web site. If you use Office365, you already have an Azure account. If not, you’ll need to get one. Go ahead and login or create your Azure account now.
Note: By default, all storage accounts in Azure are encrypted using server side encryption (SSE) by Microsoft. We don’t need to do anything special for encryption at rest.
Next, we’ll create our storage account for hosting our static website. As our website content is stored in a GitHub repository, we don’t need to worry about backups, redundancy, or other availability or integrity protection mechanisms at the storage account level.
Storage account configuration.
Static website configuration.
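If you prefer scripting this, the portal steps above map roughly to the following Azure CLI commands; the resource group name and location are examples, and the storage account name must be your own:

az group create --name blog-rg --location eastus
az storage account create --name daneio --resource-group blog-rg --kind StorageV2 --sku Standard_LRS
az storage blob service-properties update --account-name daneio --static-website --index-document index.html --404-document 404.html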
We now have a storage account ready for hosting our static website content. If you use the Storage Explorer, you’ll notice that a default $web container now exists and is ready to serve up our website.
Static website container ($web).
If we only wanted to serve out of the bucket itself, we could simply configure a domain name and stop here. However, we want to do a few more things before we can call this project finished. Let’s go set up our custom domain name.
If you’re into vanity domains (and who isn’t?), you might consider using a custom domain or subdomain for your website. As I use Azure for managing my DNS, I’ll configure it to use the blog.dane.io subdomain.
Navigate to the newly created instance and grab the name server information:
Name server 1: ns1-06.azure-dns.com.
Name server 2: ns2-06.azure-dns.net.
Name server 3: ns3-06.azure-dns.org.
Name server 4: ns4-06.azure-dns.info.
Validation of DNS changes.
We are now using Azure to manage our DNS. We’ll be able to create the custom records for our CDN by creating a record set within the Azure DNS console.
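For reference, the DNS zone can also be created and inspected from the Azure CLI; a rough sketch, reusing the example resource group from earlier:

az network dns zone create --resource-group blog-rg --name dane.io
az network dns zone show --resource-group blog-rg --name dane.io --query nameServers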
We’re going to front our website with an Azure content delivery network (CDN) to improve speeds, reduce bandwidth usage of our bucket, and distribute our content to geographically distributed points of presence.
Custom origin information.
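The CDN profile and endpoint can also be stood up from the CLI. A rough sketch follows; the profile name and SKU are examples, and the origin hostname is a placeholder for your storage account's static website endpoint (shown on the static website blade):

az cdn profile create --resource-group blog-rg --name daneio-cdn --sku Standard_Microsoft
az cdn endpoint create --resource-group blog-rg --profile-name daneio-cdn --name daneio --origin daneio.z13.web.core.windows.net --origin-host-header daneio.z13.web.core.windows.net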
Next, we’re going to configure our DNS record to point to the CDN endpoint that we specified above (e.g. https://daneio.azureedge.net).
This will redirect any requests to our subdomain (e.g. blog.dane.io) to the Azure CDN endpoint.
Successfully validated DNS record.
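The equivalent CLI command for the record looks roughly like this, assuming the zone created earlier:

az network dns record-set cname set-record --resource-group blog-rg --zone-name dane.io --record-set-name blog --cname daneio.azureedge.net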
Once the record has been created, we’ll need to associate the domain with our Azure CDN endpoint.
Once we have confirmed the DNS record, we can have Azure provision and manage a digital certificate for us.
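Both of these steps (associating the domain and enabling the managed certificate) can also be done from the CLI; a sketch with the example names used earlier:

az cdn custom-domain create --resource-group blog-rg --profile-name daneio-cdn --endpoint-name daneio --name blog-dane-io --hostname blog.dane.io
az cdn custom-domain enable-https --resource-group blog-rg --profile-name daneio-cdn --endpoint-name daneio --name blog-dane-io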
Once this has been kicked off, it may take a few hours for the TLS certificate to be provisioned.
Successfully issued TLS certificate.
Next, we’re going to ensure that compression is enabled for content delivered via the Azure CDN. While images are likely already compressed, we can save some bandwidth and improve delivery speed by compressing other MIME formats.
By default, fonts, XML, plaintext, CSV, HTML, and other MIME formats will be compressed. You may add additional MIME types to this list to provide compression on-the-fly.
MIME types compressed during CDN delivery.
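If you'd rather manage the compression settings from the CLI, something like the following should work on recent CLI versions; the flags and the MIME type list here are illustrative, not exhaustive:

az cdn endpoint update --resource-group blog-rg --profile-name daneio-cdn --name daneio --enable-compression true --content-types-to-compress "text/html" "text/css" "application/javascript" "image/svg+xml"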
Next, we’re going to configure our CDN cache. This is especially important as assets cached via the CDN will be retained until the time-to-live (TTL) expires. If we fail to configure reasonable caching, updates to our website will be painful.
By default, Azure storage accounts set a cache on a per-object basis with a default of 7 days. While this is fine for static content (e.g. image assets, fonts), it will be a very poor experience for updates to HTML pages. While we could set the TTL for each object individually as we add it to the bucket, there is a really lazy way to solve this problem.
We’ll set the general CDN caching rule for our CDN:
Default caching behavior.
Next, we’ll create some custom cache rules using the rules engine. Our goal is to give frequently changing content (e.g. HTML) a short TTL while caching large, static assets for much longer.
Managing the cache in this way ensures that we can centrally adjust values instead of setting them on a per-object basis in the storage account.
To do so, perform the following:
In this instance, I have configured two specific rules:

- HTML content (anything outside of the assets folder) is given a short TTL of roughly 5 minutes.
- Content under the /assets/ folder is given a 7 day TTL. We’ll use this folder for images, JavaScript, CSS, etc.

This combination of short and long TTLs ensures that our CDN is only delivering compressed text (e.g. HTML, CSS) on a frequent basis, but all large and static assets (e.g. images, gifs, fonts) are cached. When we make production changes to our website, it takes around 5 minutes for the HTML CDN cache to expire and be refreshed, making for a seamless user browsing experience.
Custom caching behavior rules.
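For those scripting against the Microsoft CDN SKU, the asset rule above can be expressed with the rules-engine CLI along these lines; the rule name and order are illustrative, and the duration format is days.hours:minutes:seconds:

az cdn endpoint rule add --resource-group blog-rg --profile-name daneio-cdn --name daneio --order 2 --rule-name "CacheAssets" --match-variable UrlPath --operator BeginsWith --match-values "/assets/" --action-name CacheExpiration --cache-behavior Override --cache-duration 7.00:00:00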
Note: Due to issues with the Microsoft CDN and Twitter card support, I switched over to Standard Akamai. Unfortunately, the Akamai CDN does not allow custom header manipulation. As such, I’m leaving this documentation for those who might still need to use the Microsoft CDN.
Next, we’ll want to configure a few basic security features:
While many of these are not strictly necessary given the static nature of the website, they’re fairly trivial to add and deploy. We’ll do it for completeness’ sake.
We’ll start with HTTPS redirection. This is important because, without it, the CDN will happily serve content over plain HTTP.
To do so, perform the following:
Create a rule: if the request protocol equals HTTP, then URL redirect found (302) to protocol HTTPS.
Custom EnforceHTTPS rule.
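The same rule can be created from the CLI on the Microsoft SKU; this mirrors the documented rules-engine example, with our example names substituted in:

az cdn endpoint rule add --resource-group blog-rg --profile-name daneio-cdn --name daneio --order 1 --rule-name "EnforceHTTPS" --match-variable RequestScheme --operator Equal --match-values HTTP --action-name UrlRedirect --redirect-type Found --redirect-protocol Https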
Next, we’ll configure HSTS and a Content Security Policy for the website. HSTS ensures that browsers will only connect to the website over HTTPS, and the CSP will help prevent cross-site scripting (XSS), as much of a rarity as that might be.
To do so, perform the following:
- Append the response header Strict-Transport-Security with value max-age=315360000; preload.
- Append the response header Content-Security-Policy with value default-src 'self'; script-src 'self' 'unsafe-inline'; style-src 'self' 'unsafe-inline'.
HSTS and CSP configured.
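These header actions can also be attached to a rules-engine rule from the CLI. A hedged sketch, assuming a rule named SecurityHeaders has already been created with az cdn endpoint rule add and that your CLI version supports the header parameters shown:

az cdn endpoint rule action add --resource-group blog-rg --profile-name daneio-cdn --name daneio --rule-name "SecurityHeaders" --action-name ModifyResponseHeader --header-action Append --header-name Strict-Transport-Security --header-value "max-age=315360000; preload"
az cdn endpoint rule action add --resource-group blog-rg --profile-name daneio-cdn --name daneio --rule-name "SecurityHeaders" --action-name ModifyResponseHeader --header-action Append --header-name Content-Security-Policy --header-value "default-src 'self'; script-src 'self' 'unsafe-inline'; style-src 'self' 'unsafe-inline'"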
Next, we’ll prevent our site from being embedded on other websites (e.g. X-Frame-Options), prevent MIME sniffing (X-Content-Type-Options), and configure a referrer policy.
To do so, perform the following:
- If the request method equals GET, then modify the response header to append X-Content-Type-Options with value nosniff.
- Modify the response header to append Referrer-Policy with value strict-origin-when-cross-origin.
- Modify the response header to append X-Frame-Options with value DENY.
X-Frame-Options and HSTS.
We’ll go ahead and do a quick scan via Security Headers and validate things look good:
While not an A+, it’s good enough for Government work.
Note: It’s really easy to spill secrets via CircleCI and GitHub. I highly recommend you keep your repository private to reduce the likelihood of accidental misconfiguration.
Once we have built our website, configured the storage account, configured the Azure CDN, and have a valid TLS certificate, we’re ready to hook everything together. We’ll first configure a CircleCI project for our GitHub repository:
CircleCI now has a deploy key from the GitHub repository, and we’ve disabled building of forked pull requests. This is especially important if your repository is public, as adversaries can potentially steal secrets from environment variables in your CircleCI node if those settings are left enabled.
Also known as the “wreck my world” buttons.
Next, we’re going to go grab some credentials for our storage account in Azure:
This is your access key. Keep it safe; anyone with access to this key will be able to do whatever they’d like to your storage account. We’re going to go ahead and give it to CircleCI so it’ll be able to modify the bucket (and pray CircleCI never has a breach).
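If you prefer not to click through the portal, the same key can be pulled with the Azure CLI; the resource group name is the example from earlier:

az storage account keys list --resource-group blog-rg --account-name daneio --query "[0].value" --output tsv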
To do so, perform the following:
- Add an environment variable AZURE_STORAGE_ACCOUNT with value daneio (or whatever your bucket name is).
- Add an environment variable AZURE_STORAGE_KEY with value <paste your key here>.
Using environment variables keeps credentials out of files in your repository.
Now we have CircleCI configured and ready to rock. The last step here will be generating a CircleCI YAML file for controlling when to build containers with our code. This is part art-form, part science, and may take a few (dozen) tries to get it right. I’ve included a copy of my current config.yml file below, which I’ll explain in further detail. Whether you use mine, grab a premade one, or make your own, you’ll need to throw it in your GitHub repository as .circleci/config.yml.
version: 2
jobs:
  build:
    docker:
      - image: circleci/ruby:latest
    working_directory: ~/repo
    steps:
      - checkout
      - restore_cache:
          keys:
            - rubygems-v2-{{ checksum "Gemfile.lock" }}
            - rubygems-v2-fallback
      - run:
          name: Install Dependencies
          command: |
            bundle install --jobs=4 --retry=3 --path vendor/bundle && bundle clean
      - save_cache:
          key: rubygems-v2-{{ checksum "Gemfile.lock" }}
          paths:
            - vendor/bundle
      - run:
          name: Jekyll build
          command: bundle exec jekyll build
      - run:
          name: HTMLProofer tests
          command: |
            bundle exec htmlproofer ./_site \
              --allow-missing-href \
              --allow-hash-href \
              --check-favicon \
              --check-html \
              --disable-external \
              --only-4xx
      - run:
          name: Cleanup filters
          command: |
            rm -f gulpfile.js jekyll-theme-clean-blog.gemspec LICENSE README.md package-lock.json package.json
      - persist_to_workspace:
          root: ./
          paths:
            - _site
  deploy:
    docker:
      - image: circleci/python:latest
    working_directory: ~/repo
    steps:
      - attach_workspace:
          at: ./
      - run:
          name: Install Azure CLI
          command: curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
      - run:
          name: Upload to Azure bucket
          command: az storage blob sync --source ./_site --container='$web'
workflows:
  version: 2
  Production Deployment:
    jobs:
      - build
      - deploy:
          requires:
            - build
          filters:
            branches:
              only: master
This YAML file is configured with two specific jobs: build and deploy.
The build job runs against every commit to the repository and performs the following:
- Installs the Ruby dependencies and builds the Jekyll site into the _site folder locally within the CircleCI container.
- Runs the HTMLProofer tests against the generated pages.
- The _site folder is saved for later use by the deploy job.

The deploy job only runs against changes to the master branch and performs the following:
- Attaches the workspace containing the previously built _site folder.
- Installs the AzureCLI tool.
- Uses the AzureCLI tool to synchronize the storage account with the files we have locally.

The final step of this project is configuring branch protection and status checks for our GitHub repository. This will force us to use pull requests for merging to master, and force successful CircleCI builds as part of that pull request. This will hopefully prevent us from pushing something broken into production by relying on our CircleCI build jobs as a gate.
To do so, perform the following:
Branch protections requiring a build CI Job to pass.
When properly configured, every commit to our website will automatically perform the build and identify any Jekyll or HTML issues. When we feel comfortable with the final results and merge to the master branch, the deploy job will execute, updating our website in production. The synchronize command will manage all of our file uploads and deletes, making this CI job rather trivial to maintain.
To perform an update to the website, simply commit your changes to a branch, open a pull request, and merge it to master once the build check passes. The deploy CircleCI job will run. You’re done.
Pull requests need to pass a CI job to deploy.
Successful build and deploy.
Lastly, how much does all of this cost? Well, so far, it’s cost about $0.25 for a handful of days. Most of the costs incurred have been from figuring out the services and experimenting with CircleCI jobs (e.g. syncing lots of files to storage.)
I anticipate that (a) it will typically cost between $10 and $15 per month, and (b) some clown will likely decide to try and drive up the costs substantially through malicious abuse.
Luckily, you can set a spending limit on Azure subscriptions to prevent costs from going through the roof. We’ll see how this shakes out after a month or two of operation.
Temporary pricing chart.