CORS Headers in Nginx

Update: Before going much further, note that there is now a much more comprehensive CORS walkthrough for nginx at enable-cors.org, so check that out before following the steps below.

If you’ve deployed even a mildly complex web application in the last few years, you’ve probably had to care about CORS headers. They allow a web page to make requests to another origin: a different domain, or the same domain on another scheme or port. Without them, your browser will block those cross-origin requests, and things won’t load.

They’re relatively simple to implement. You just add a header:

Access-Control-Allow-Origin: https://www.alexjs.im

to the HTTP responses of assets you’d like to call in your webapp. Thanks to Michiel Kalkman’s gist you can easily achieve this in Nginx – with something relatively standards compliant, too.
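As a minimal sketch of what that looks like in an Nginx config (this is not the gist itself, and the /assets/ path is just a placeholder), the header can be attached with add_header:

```nginx
# Illustrative only: the /assets/ location is a placeholder path.
location /assets/ {
    # Allow pages served from this one origin to read these responses.
    add_header Access-Control-Allow-Origin "https://www.alexjs.im";
}
```

Note that add_header only applies to a fixed set of success and redirect status codes by default; newer nginx versions accept an extra "always" parameter if the header is needed on error responses too.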

The problem, it seems, is that despite the W3C spec and RFC 6454 prescribing the use of a list of origins, not all browsers (e.g. Firefox) support multiple domains in an Access-Control-Allow-Origin header:

Access-Control-Allow-Origin: https://www.alexjs.im https://www.alexsmith.org

The easiest solution is to use a wildcard:

Access-Control-Allow-Origin: *

However, that has security implications, since any site can then read those responses. The best compromise I’ve found is to implement a simple whitelist in the Nginx config and match the request’s Origin header against it. I’ve put this in a public gist, and I’m testing it for deployment now.
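For illustration, here is one way such a whitelist could look. This is a sketch under assumptions rather than a copy of the gist: it uses a map block, the variable name $cors_origin and the server_name are made up, and the two origins are the ones from the example above.

```nginx
# Sketch only. $cors_origin is a hypothetical variable name.
# map blocks belong in the http context.
map $http_origin $cors_origin {
    default "";
    # Echo the origin back only when it matches the whitelist regex.
    "~^https://www\.(alexjs\.im|alexsmith\.org)$" $http_origin;
}

server {
    listen 80;
    server_name assets.example.org;   # placeholder

    location / {
        # When $cors_origin is empty, nginx emits no header at all.
        add_header Access-Control-Allow-Origin $cors_origin;
        # Tell caches that the response depends on the requesting origin.
        add_header Vary Origin;
    }
}
```

Because the response only ever carries a single origin, this sidesteps the browser support problem with multi-origin headers while still being stricter than the wildcard.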

I’ve not yet done any performance testing, so I’m not sure how efficient the Nginx regex engine is and what the overall effect on throughput/capacity is. I’ll probably forget to update this post with a bit of information once that’s complete.

Update:

This has been in production for a couple of months now, and we haven’t had any performance issues. For the throughput we require (<10 req/s), a single m1.small handles the load comfortably, so the nginx regex engine seems efficient enough, at least at this scale.

Making nginx ignore query string parameters

When using nginx as a caching proxy, I found myself needing to ignore particular query string parameters, both in the cache key and in the request passed to the backend. In this particular situation the parameter I wanted to ignore was ‘uid’. An example URI would be:

http://myapplication.fqdn/foo.ext?env=bar&uid=baz&node=qux

or

http://myapplication.fqdn/foo.ext?uid=bar

To ignore this, at the top of my site configuration I put:

proxy_cache_key         "$scheme$host$uri$is_args$args";

in the server stanza:

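# Strip uid=... from the query string, wherever it appears.
if ($args ~ (.*?)(?:^|(&))uid=[^&]*(?:(\2.*)|&(.*))?) {
    set $args $1$3$4;
}
# If any arguments remain, prefix them with "?" so they can be
# appended straight onto $uri in proxy_pass.
if ($args ~ (^\w)) {
    set $args ?$args;
}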

and the location stanza:

proxy_pass              http://appservers$uri$args;
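Put together, the three fragments might sit in a config roughly like the sketch below. The cache path, the zone name appcache, the upstream address and the listen/server_name lines are placeholders I've assumed for illustration; only the cache key, the two if blocks and the proxy_pass line come from above.

```nginx
# http context. Cache path, zone name and sizes are assumed placeholders.
proxy_cache_path /var/cache/nginx/app keys_zone=appcache:10m;
proxy_cache_key  "$scheme$host$uri$is_args$args";

upstream appservers {
    server 127.0.0.1:8080;   # placeholder backend
}

server {
    listen 80;
    server_name myapplication.fqdn;

    # Strip uid=... from the query string, wherever it appears.
    if ($args ~ (.*?)(?:^|(&))uid=[^&]*(?:(\2.*)|&(.*))?) {
        set $args $1$3$4;
    }
    # If any arguments remain, prefix them with "?".
    if ($args ~ (^\w)) {
        set $args ?$args;
    }

    location / {
        proxy_cache appcache;
        # $args already carries its own leading "?" (or is empty).
        proxy_pass  http://appservers$uri$args;
    }
}
```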

So now my backend servers see:

GET /foo.ext?env=bar&node=qux

or

GET /foo.ext

and very few hits get through to the backend anyway, as the cache key is built from the filtered arguments, so requests that differ only in uid share a single cache entry.

Easy.

EDIT: The ‘easy’ bit is a lie, it seems. Thanks to @davidgl for pulling me out of regex hell. Several of the revisions here were made with his help.