On Content-Security-Policy Headers

Content-Security-Policy is an HTTP response header that can act as an extra barrier to common site hack hijinks like XSS. Think of it as a whitelist for assets — scripts, styles, images, media, objects, fonts — all the things that can go rogue and turn your site into a Canadian pharmacy or attackbot.

Web browsers that support CSP will consult your rules before loading or executing the linked assets or embedded code in your document, and block anything that didn't make the cut.

This might seem redundant, but that's because it is redundant. I said "extra" barrier, not "only" barrier. Pay attention!

Threat #1: Known Unknowns

I don't have the figures in front of me, but something like 100% of all modern web sites blindly pull in software packages written by Athena-knows-who from Poseidon-knows-where. Those dependencies have their own dependencies which have their own dependencies, et cetera, ad nauseam.

This is made very easy by modern development tools like Composer, NPM, and Cargo. And it usually begins innocently enough.

Say you are making an e-commerce web site that needs to hook into the Authorize.net payment gateway. Following Authnet's official developer guide, you would proceed to install their PHP SDK library as follows:

composer require "authorizenet/authorizenet"

Composer, helpful assistant that it is, takes care of the details for you. Don't worry about it.

./composer.json has been created
Loading composer repositories with package information
Updating dependencies (including require-dev)
Package operations: 11 installs, 0 updates, 0 removals
  - Installing symfony/yaml (3.4.x-dev 25c192f): Downloading (100%)         
  - Installing phpoption/phpoption (1.5.0): Downloading (100%)         
  - Installing doctrine/instantiator (dev-master 8520afa): Downloading (100%)         
  - Installing doctrine/lexer (dev-master cc709ba): Downloading (100%)         
  - Installing doctrine/annotations (dev-master fe71864): Downloading (100%)         
  - Installing phpcollection/phpcollection (0.5.0): Downloading (100%)         
  - Installing jms/metadata (1.6.0): Downloading (100%)         
  - Installing jms/parser-lib (dev-master 6067cc6): Downloading (100%)         
  - Installing jms/serializer (dev-master 62c7ff6): Downloading (100%)         
  - Installing goetas-webservices/xsd2php-runtime (dev-master f62d40a): Downloading (100%)         
  - Installing authorizenet/authorizenet (dev-master 3d816a1): Downloading (100%)

Great! It all works!

Your project has now increased in size by 1,333 files, but that's only 76,098 lines of code. It shouldn't take long for you to thoroughly inspect, right?

Now, I know what you're thinking. "I am responsible and when I use Composer I make sure to set the --no-dev flag."

Boy, you sure are responsible! Here's what that does for you:

Loading composer repositories with package information
Updating dependencies
Package operations: 11 installs, 0 updates, 0 removals
  - Installing symfony/yaml (3.4.x-dev 25c192f): Downloading (100%)         
  - Installing phpoption/phpoption (1.5.0): Downloading (100%)         
  - Installing doctrine/instantiator (dev-master 8520afa): Downloading (100%)         
  - Installing doctrine/lexer (dev-master cc709ba): Downloading (100%)         
  - Installing doctrine/annotations (dev-master fe71864): Downloading (100%)         
  - Installing phpcollection/phpcollection (0.5.0): Downloading (100%)         
  - Installing jms/metadata (1.6.0): Downloading (100%)         
  - Installing jms/parser-lib (dev-master 6067cc6): Downloading (100%)         
  - Installing jms/serializer (dev-master 62c7ff6): Downloading (100%)         
  - Installing goetas-webservices/xsd2php-runtime (dev-master f62d40a): Downloading (100%)         
  - Installing authorizenet/authorizenet (dev-master 3d816a1): Downloading (100%)

Of course, in a perfect world, you would pay attention to details like these. If you stumble across some lazy piece of shit library like Authorize.net's SDK, you'd sigh and resolve to write it your own damn self. But even in a best-case scenario, if you're using code you didn't write yourself (which for anything more complicated than a one-pager you probably are), the risk meter will tick ever so slightly upward.

Threat #2: The Need for Speed

Perfectionist that you are, you've run your new site through a performance analyzer like Google's PageSpeed Insights. The results say CDN, CDN, CDN, CDN! The third-party JavaScript frameworks you're using will load faster, it argues, if the scripts are served from a specialized Content Delivery Network. (Because your server sucks.)

You don't have a CDN subscription handy, but that's okay because just about every popular script in the world is already publicly hosted somewhere. Just run a quick Google search, copy-and-paste the URLs into your code, and you're rolling!

PageSpeed congratulates you, and you move on with your life.

The problem is, you linked to http://www.j-query.org/jquery-2.2.4.min.js and not https://code.jquery.com/jquery-2.2.4.min.js. All your jQuery code still works as expected, but the cuckoo library you're calling has a hidden payload, and it's watching everything you and your users do.

Again, in a perfect world, you would decide whether or not the use of a CDN makes sense for you, and if it does, you would carefully vet the source to make sure it is blessed by the library developers, or something you control yourself.

Threat #3: The Data Abyss

Before your shiny new web site can be launched, it must be connected to services like Google Analytics, Facebook Platform, Salesforce, and FullStory! Sure, you'll never actually review the mountains of data those companies are collecting, but it just feels right giving it to them.

Unfortunately these data mining services happen to mine data. Oh hey, you know what kind of data e-commerce sites have? Credit cards! You'd think there would be some sort of logic built into analytics and session-replay scripts used by major companies to avoid this sort of leak, but, well, not always.

Beyond trust, however, there's a bigger issue with these sorts of services: using them might necessarily require you to downgrade your site's security. Pardot forms posting to vanity URLs, for example, do not accept encrypted connections, so neither can your site. Google Tag Manager requires variable inline scripts, demanding either looser CSP protection or additional server load (kiss static cache goodbye!).

Once more to the perfect world: you know you don't actually have to add these tracking scripts and conversion pixels, right? I mean, if you rely on the data, that's one thing, but if not? Fuck it. Leave them out.

Threat #4: You've Been Hacked

Let's face it. Hacks happen. Maybe you were careless and missed a few critical updates, maybe you were just an unlucky early victim of a new zero-day exploit, or maybe one small piece of code doesn't properly sanitize user input. Regardless, your site now includes a tidy little inline script that listens for keystrokes and sends all that data back to its evil master.

Eventually you'll figure this out and fix it, but until then, every visitor to your site is potentially exposed. No matter the response time, this is bad PR.

So About That Extra Protection…

Content Security Policies are redundant protection, yes. Your server should have a good understanding of the pristine content you intended it to serve, so it can pass that information along with whatever content it is actually serving. If magic content suddenly appears, the browser can shut it down before it can do any harm. This might break your site, but at least it will help prevent your site from breaking your visitors' computers.

Got it? Okay, let's move on to the rules.

One important thing to understand about Content Security Policies is that they are uniquely tailored to each site. There is no one-size-fits-all solution. A small site that is lovingly crafted by hand should be able to cover all its bases very explicitly, while a slapdash WordPress site with 50 plugins might have to settle for much looser policies.

The first step is to think about the kinds of assets your site might be serving. The main asset type directives for CSP are as follows:

style-src - Styles and CSS.
script-src - JavaScript (of the text/javascript or application/javascript variety).
object-src - Relics of a bygone era like Java Applets and Flash.
media-src - Audio and video.
img-src - Images.
frame-src - Frames and iFrames.
font-src - Web fonts.

Now think about where those might be coming from. Are there any common patterns? If so, we'll use those to populate a default-src, which as you might have guessed serves as a sort of starting point or fallback rule for pieces that aren't otherwise specified.

Let's talk about source values. Each directive can contain any number of whitelisted sources. There are a few magic entries that apply to some directives and not others, but for the most part they're the same. A more comprehensive reference can be viewed here.

* - Wildcard, i.e. anything goes.
'none' - Kill 'em all!
'self' - Anything same-origin is OK.
data: - A data-URI, such as a base64-encoded image.
https: - Anything using the HTTPS protocol.
domain.com, *.domain.com, https://domain.com - domain.com (same scheme as the page, or HTTPS if the page is plain HTTP), all subdomains of domain.com (same scheme rule), domain.com (HTTPS only) respectively. Note: *.domain.com does not match the bare domain.com; if you need to whitelist a domain and its subdomains, you gotta use both.
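
To make the wildcard gotcha concrete, here is a rough Python sketch of host matching. It is a simplification, not what browsers actually run: ports and paths are ignored, and a source without a scheme is compared on host alone. The function name host_source_matches is made up for illustration.

```python
from urllib.parse import urlsplit


def host_source_matches(source: str, url: str) -> bool:
    """Simplified check of a single host source against a URL.

    Assumption: only an optional scheme plus a host (with an optional
    leading "*." wildcard) is handled. Ports and paths are ignored.
    """
    target = urlsplit(url)
    target_host = (target.hostname or "").lower()

    scheme, sep, host = source.partition("://")
    if not sep:
        # No scheme in the source: this sketch compares the host only.
        # Real browsers also take the page's own scheme into account.
        host = source
    elif scheme.lower() != target.scheme.lower():
        return False

    host = host.lower()
    if host.startswith("*."):
        # "*.domain.com" requires at least one extra label in front,
        # so the bare "domain.com" never matches.
        return target_host.endswith(host[1:])
    return target_host == host
```

With this sketch, host_source_matches("*.domain.com", "https://domain.com/app.js") comes back False, which is exactly why you need to list both the bare domain and the wildcard.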

For scripts and stylesheets specifically, there are a few additional magic values:

'unsafe-eval' - Allow scripts to run eval().
'unsafe-inline' - Allow all inline scripts and/or styles.
'nonce-XXX' - Allow inline or linked assets with the Nonce XXX. (Example below.)
'sha256-XXX' - Allow inline or linked asset with a base64-encoded SHA256 content hash of XXX. (Example below.)

default-src

For most sites, the default-src should include 'self' because if you can't trust yourself, who can you trust? It is probably also a good idea to hardcode your domain with an "https" if your site requires SSL, like https://domain.com. And lastly, if you treat "www" and "non-www" the same, add something like https://www.domain.com or https://*.domain.com, the latter handling all subdomains in one fell swoop.

Putting it all together, the default would look like:

default-src 'self' https://domain.com https://*.domain.com

If you specified no other directives, your site would be allowed to link to external images, styles, scripts, etc., so long as they're hosted by you or one of your subdomains.

Inline Headaches

Ah ha, but Content-Security-Policy has an important gotcha: inline scripts and styles, like <script>var foo='bar';</script> are not treated as originating from 'self'. That might seem counter-intuitive, but it is actually pretty critical as that's where most XSS happens.

For handling inline scripts, there are a few tricks:

Externalize Them

If the contents of an inline script or style tag are static, you can simply move them to an external file (hosted by you) and link to them the usual way. For the previous example, that would be like:

<script src="foo.js"></script>

With foo.js containing:

var foo='bar';

Nonce Them

If your site content and CSP header are always served dynamically, you can take advantage of the 'nonce-XXX' source value. When your CMS initializes for a request, generate a unique, unguessable Nonce (random, not a counter or timestamp). For the sake of argument, say the Nonce is "ABC123". The CSP header would contain 'nonce-ABC123' for script-src and/or style-src.

Then on the HTML side, for any inline scripts or styles you have, add this to their opening tags:

<script nonce="ABC123">...</script>
<style nonce="ABC123">...</style>

You can also Nonceify external scripts, if you'd prefer to whitelist one particular script instead of the whole host.

The main drawback to the Nonce approach is that you have to generate unique Nonces on every page serve, and the headers have to be set accordingly. This means that you probably won't be able to pass headers via Nginx or Apache, and you won't be able to serve statically-cached content.
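
For a feel of the moving parts, here is a minimal Python sketch of the Nonce dance. It assumes you control both the header and the markup for each response. The helper names (make_nonce, nonce_policy, inline_script) are made up for illustration, and the policy is deliberately tiny.

```python
import secrets


def make_nonce() -> str:
    # 16 random bytes from the OS CSPRNG, base64url-encoded (22 characters).
    return secrets.token_urlsafe(16)


def nonce_policy(nonce: str) -> str:
    # Hypothetical minimal policy: same-origin files plus nonce'd inline code.
    return (
        f"script-src 'self' 'nonce-{nonce}'; "
        f"style-src 'self' 'nonce-{nonce}'"
    )


def inline_script(nonce: str, body: str) -> str:
    # The nonce attribute must carry the same value the header announced.
    return f'<script nonce="{nonce}">{body}</script>'
```

Call make_nonce() once per response, pass the result to both nonce_policy() and inline_script(), and throw it away afterward. Reusing a Nonce across responses defeats the point.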

Hash Them

If your site is serving static content and you want to leverage cache, whitelisting individual cryptographic hashes is a good alternative to Nonces. The best way to do this is to calculate a SHA256 hash of the full contents (including whitespace) of each script or style tag (minus the tag itself). The corresponding CSP value will look like 'sha256-HASHVALUE'.

As for calculating the hash, you can run the following:

echo -n "var foo='bar';" | openssl sha256 -binary | openssl base64
# epc1V+GyMXLoKNQnEDBB8Hp59TrPTTc5Bfl/pZo673E=

Putting it all together, the CSP source would be 'sha256-epc1V+GyMXLoKNQnEDBB8Hp59TrPTTc5Bfl/pZo673E='.

Alternatively, if you already have a CSP header blocking undeclared inline scripts, Google Chrome's Debug Console will tell you the expected hash.
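
If you would rather compute hashes from a build script than from a shell, the same calculation can be sketched in Python with nothing but the standard library. The function name csp_hash is made up, and the sketch assumes your pages are served as UTF-8.

```python
import base64
import hashlib


def csp_hash(inline_body: str) -> str:
    """Return a quoted 'sha256-...' source value for an inline script or style.

    Pass the exact text between the opening and closing tags, whitespace
    included. Assumes the document is served as UTF-8.
    """
    digest = hashlib.sha256(inline_body.encode("utf-8")).digest()
    return "'sha256-" + base64.b64encode(digest).decode("ascii") + "'"
```

Since it hashes the same bytes, csp_hash("var foo='bar';") should line up with the openssl pipeline. Keep in mind that a single stray space or newline inside the tag changes the hash completely.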

Ditch Them

As mentioned earlier, chances are at least some of the scripts and styles being printed inline on a site aren't actually useful. Maybe they're from a plugin that was never uninstalled, or a freebie data gobbler like Analytics that is never consulted. In such cases, a little spring cleaning is satisfying.

Ignore Them

Sometimes there just isn't anything that can be done. If a site's content is CMS-driven, there could be any number of plugins or authors adding surprises that can't be programmatically accounted for. Or maybe a site is built with a JavaScript framework and requires inlined data to render everything properly.

That, ladies and gentlemen, is what 'unsafe-inline' is for. This value will moot a large part of the XSS mitigation potential of CSP headers, but if you don't have a choice, it'll make shit work.

Okay, Let's Do It

Enough chat. Let's build a rule! The format is easy: <directive> <source> <source>; <directive> <source> <source>…

Here is an example, with comments and line breaks for readability only.

# The default is the current site, and any SSL connection to domain.com (including subdomains).
default-src 'self' https://domain.com https://*.domain.com;
# For scripts, maybe we want to whitelist Google Analytics, the Apocalypse Meow "noopener" script, and allow eval() so Vue.JS doesn't explode.
script-src 'self' https://domain.com https://*.domain.com https://www.google-analytics.com 'sha256-r0zZw59lcD47g8gJg/nnDgkfQ4Msa3nt2oWZEz6M/ZA=' 'unsafe-eval';
# Images also might need Analytics, but also Gravatar and base64-encoded data-URI streams.
img-src 'self' https://domain.com https://*.domain.com https://www.google-analytics.com https://*.gravatar.com data:;
# Maybe inline styles aren't really a concern. Most CSS-related exploits work best against browsers that don't know CSP anyway.
style-src 'self' https://domain.com https://*.domain.com 'unsafe-inline';
# Objects? No. This isn't 1995.
object-src 'none';

You'll notice font-src and a few others were left off. As such, they'll just inherit the default.
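
The inheritance rule can be sketched in a few lines of Python. This is a simplification with made-up function names: it only models the fallback for asset-type directives like font-src. Some other directives (form-action, for example) do not fall back to default-src at all.

```python
from typing import Dict, List, Optional


def parse_policy(policy: str) -> Dict[str, List[str]]:
    """Split a policy string into {directive: [sources]}."""
    directives: Dict[str, List[str]] = {}
    for chunk in policy.split(";"):
        parts = chunk.split()
        if parts:
            # If a directive is repeated, keep the first occurrence.
            directives.setdefault(parts[0].lower(), parts[1:])
    return directives


def effective_sources(policy: str, directive: str) -> Optional[List[str]]:
    """Sources that apply to an asset-type directive.

    Falls back to default-src when the directive is absent, and returns
    None when neither is set (meaning the policy places no restriction).
    """
    directives = parse_policy(policy)
    if directive in directives:
        return directives[directive]
    return directives.get("default-src")
```

Note that a specific directive replaces the default rather than adding to it, which is why the example policy repeats 'self' and the domain.com entries on every line.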

In general, you'll probably want to set up a good default rule, specific rules for the obvious pieces, and leave the rest open-ended, filling them in as you come across them.

Goddamn It, How Do I Set the Header?!

Right! So, the header name is Content-Security-Policy. The value would be everything in the block above, minus the comments and line breaks.
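
If you like keeping the commented, multi-line version around in a config file, squashing it into a header value is easy to script. A rough Python sketch, assuming one directive per line and comments on their own lines starting with "#" (that comment style is a convention for humans, not part of CSP); header_value is a made-up name:

```python
def header_value(annotated_policy: str) -> str:
    """Collapse a commented, one-directive-per-line policy into one line."""
    directives = []
    for line in annotated_policy.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            # Skip blank lines and human-only comments.
            continue
        # Drop the trailing semicolon; separators are added back on join.
        directives.append(line.rstrip(";").strip())
    return "; ".join(directives)
```

Feed it the annotated policy text and use the result as the value of the Content-Security-Policy header.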

For dynamic sites, the headers could be sent along with the generated page. For example, in PHP you would do:

header("Content-Security-Policy: default-src 'self' https://domain.com https://*.domain.com; script-src 'self'…", true);
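
Not a PHP shop? The same idea works anywhere you can touch response headers. Here is a rough Python sketch for a WSGI app. The names with_csp and POLICY are made up, and the policy string is only a placeholder.

```python
# Placeholder policy; substitute your own.
POLICY = "default-src 'self'; object-src 'none'"


def with_csp(app, policy=POLICY):
    """Wrap a WSGI app so every response carries the CSP header."""

    def wrapped(environ, start_response):
        def start_with_csp(status, headers, exc_info=None):
            # Replace any CSP header the app set itself, then add ours.
            headers = [
                (name, value)
                for name, value in headers
                if name.lower() != "content-security-policy"
            ]
            headers.append(("Content-Security-Policy", policy))
            return start_response(status, headers, exc_info)

        return app(environ, start_with_csp)

    return wrapped
```

Wrapping the application once, as in application = with_csp(application), keeps the policy in a single place instead of scattering header calls across templates.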

Headers could also be sent directly from the server software, though it can be much more difficult to send different headers for different requests this way. For Nginx servers, you would do this:

add_header Content-Security-Policy "default-src 'self' https://domain.com https://*.domain.com; script-src 'self'…";

That's it!

One More Thing

Once you have some content security policies in place, be sure to go back through your web site. Policy errors should be logged to the Debug Console, so just keep an eye out and make adjustments as needed.

You might also want to specify a directive I failed to mention earlier: report-uri /my-reporting-page. If set, a browser will report any errors it encounters to the specified URL. The data will be in JSON format, something like the following:

{
  "csp-report": {
    "document-uri": "http://example.com/signup.html",
    "referrer": "",
    "blocked-uri": "http://example.com/css/style.css",
    "violated-directive": "style-src cdn.example.com",
    "original-policy": "default-src 'none'; style-src cdn.example.com; report-uri /my-reporting-page"
  }
}

This can be very helpful, but can also be very spammy, so please use with caution.
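
One way to tame the spam is to boil each report down to the interesting bits and count duplicates rather than logging every one. A minimal Python sketch, assuming the JSON shape shown above; summarize_report and tally are made-up names, and wiring this to an actual endpoint is left to your framework.

```python
import json
from collections import Counter
from typing import Dict, Iterable


def summarize_report(raw_body: bytes) -> Dict[str, str]:
    """Pull the useful fields out of one report request body."""
    payload = json.loads(raw_body.decode("utf-8"))
    report = payload.get("csp-report", {})
    return {
        "page": report.get("document-uri", ""),
        "blocked": report.get("blocked-uri", ""),
        "directive": report.get("violated-directive", ""),
    }


def tally(summaries: Iterable[Dict[str, str]]) -> Counter:
    """Count reports per (directive, blocked URL) pair."""
    return Counter((s["directive"], s["blocked"]) for s in summaries)
```

A thousand visitors tripping over the same blocked stylesheet then show up as one line with a count of 1000 instead of a thousand log entries.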

	Josh Stoik 14 January 2018