Matt Andrews

Web cookies & the E-Privacy Directive: alternatives and workarounds

24 Apr 2012

A sample of a website's cookies

It hardly sounds like the most stimulating of legal documents. As a title, Directive 2002/58 on Privacy and Electronic Communications lacks the punch of, say, SOPA or PIPA, although its potential impact on European society could be comparable.

Having gained the dubious privilege of a catchy nickname (the E-Privacy Directive), it’s already been marked out as a subject of discussion and controversy. Put simply, the directive aims to regulate, for the first time, unsolicited spam, personally identifiable website traffic data and, most controversially, cookies.

If you’ve managed to read this far it’s probable you’re already familiar with cookies, but for those of a less technical bent, a cookie is essentially a small text file that sits on your computer storing information linking you and a specific website. A website can set a cookie on your computer which contains information (say, for example, the date you last logged in), and that same website (and only that website) can read it back out again when you visit in the future.
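
To make that round trip concrete, here is a minimal sketch of how a name/value pair might be written into a `Set-Cookie` response header and read back from the `Cookie` request header on a later visit. The function names and the `last_login_date` cookie are illustrative assumptions, not code from any particular site, and real cookies usually carry further attributes (path, domain, security flags) that are left out here.

```typescript
// Sketch only: a cookie is a name/value pair carried in HTTP headers.
// The helper names and the cookie name are hypothetical.

/** Build the value of a Set-Cookie response header. */
function serializeCookie(
  cookieName: string,
  cookieValue: string,
  maxAgeSeconds?: number
): string {
  let header =
    encodeURIComponent(cookieName) + "=" + encodeURIComponent(cookieValue);
  if (maxAgeSeconds !== undefined) {
    // Max-Age tells the browser how long to keep the cookie, in seconds.
    header += "; Max-Age=" + maxAgeSeconds;
  }
  return header;
}

/** Parse the Cookie request header the browser sends on later visits. */
function parseCookieHeader(header: string): Record<string, string> {
  const cookies: Record<string, string> = {};
  for (const part of header.split(";")) {
    const trimmed = part.trim();
    const eq = trimmed.indexOf("=");
    if (eq <= 0) {
      continue; // skip empty or malformed fragments
    }
    const key = decodeURIComponent(trimmed.slice(0, eq));
    cookies[key] = decodeURIComponent(trimmed.slice(eq + 1));
  }
  return cookies;
}
```

For instance, `serializeCookie("last_login_date", "2012-05-25", 2592000)` produces `last_login_date=2012-05-25; Max-Age=2592000`, where 2592000 is 30 days expressed in seconds.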

So why the controversy? The directive aims to require websites to ensure their users are “made aware of information being placed on the terminal equipment they are using”. While it makes a few exceptions (specifically for shopping cart applications where it’s reasonable to assume some tracking of purchased items), in general it aims to protect users by ensuring they “have the opportunity to refuse to have a cookie or similar device stored” on their computer.

Remember me

For the privacy conscious, this all sounds reasonable and straightforward. Why wouldn’t you want to be informed when a website was tracking information about you? From the perspective of web developers, however, this becomes significantly more troublesome. Currently, a cookie can be set completely transparently, with users ideally being unaware that the process is taking place. This isn’t for nefarious, sneaky data-stealing purposes; it’s to ensure the user’s experience as they browse is seamless and streamlined. Almost every time you tick a “remember me” box on a login screen, you’re implicitly asking for a cookie to be stored on your computer to achieve this functionality.

This new legislation requires, somewhat vaguely, that websites provide a “[method] to request consent”. Some sites have already begun to experiment with how this could work: a popup box requesting permission to set cookies is currently one of the more popular ideas, although it’s unclear how this aligns with the directive’s further requirement that this experience “should be made as user-friendly as possible”. Many of the more complex websites today set dozens of cookies on every page impression.
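
One way a site might approach this, sketched below under heavy assumptions, is to hold back non-essential cookies until the visitor has answered a consent prompt. The `ConsentGate` class, its method names and the idea of flagging some cookies as essential (loosely modelled on the shopping cart exception mentioned earlier) are all hypothetical; nothing here is prescribed by the directive or the ICO.

```typescript
// Hypothetical sketch: queue non-essential cookies until consent is given.

type CookieWrite = { cookieName: string; cookieValue: string };

class ConsentGate {
  private consented = false;
  private pending: CookieWrite[] = [];
  private write: (cookie: CookieWrite) => void;

  // `write` is whatever actually stores the cookie; in a browser it
  // might assign to document.cookie.
  constructor(write: (cookie: CookieWrite) => void) {
    this.write = write;
  }

  /** Store the cookie now if allowed, otherwise hold it back. */
  set(cookieName: string, cookieValue: string, essential: boolean = false): void {
    const cookie: CookieWrite = { cookieName, cookieValue };
    if (essential || this.consented) {
      this.write(cookie);
    } else {
      this.pending.push(cookie);
    }
  }

  /** The visitor accepted: flush everything that was held back. */
  grant(): void {
    this.consented = true;
    for (const cookie of this.pending) {
      this.write(cookie);
    }
    this.pending = [];
  }

  /** The visitor refused: discard everything that was held back. */
  refuse(): void {
    this.consented = false;
    this.pending = [];
  }
}
```

A page that sets dozens of cookies could route them all through one gate like this, so the visitor is asked once rather than once per cookie.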

You may be wondering when this new law comes into effect. The answer, somewhat bizarrely, is 25 May 2011. The law was passed, and, well, nobody did anything. In the developer community there was initially an air of disbelief and confusion about whether this was really going to happen. The government were equally stumped, with legal experts unable to offer advice on dealing with the implementation since nobody had worked out how to actually implement it. The Information Commissioner’s Office (ICO) quickly agreed that an effective start date for the new law would be one year later, i.e. 25 May 2012. As that date approaches, website owners are beginning to scramble into action to avoid becoming the subject of potential litigation.

It should be clear already that nobody wants to pepper their website with permission dialogs and request popups. While some guilty culprits may baulk at the idea of revealing just how much tracking they’re doing, others may be more concerned by the impact on their business. The prospect of running, say, an online shop, competing against US and other markets where the directive doesn’t apply, seems particularly unfair.

There are some possible ways out, though. The advent of the much-misunderstood HTML5 offers some new techniques that may be used to circumvent the need for cookies. This is still uncertain, however, as the directive is deliberately vague about the technologies it applies to. If web developers simply switch over to other, similar techniques, it seems likely the ICO and others will simply clarify the scope of the law and request permission for them, too.

The ICO website's effort at asking users to approve cookies

One such workaround is the pair of sessionStorage and localStorage APIs, known together as “web storage” (or “DOM storage”). These are features supported by many newer browsers which allow developers to store small “key/value pairs” inside the web browser. The data can be read back either during that particular browsing session only (sessionStorage) or in a “persistent” state (localStorage), meaning it can still be read after the browser has been closed and reopened. A “key/value pair”, since you ask, is essentially how a cookie works currently. The key is the name of the data being stored, e.g. “last_login_date”, and the value is simply the specific data associated with that key, such as “2012-05-25”.
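
As a rough illustration, the sketch below stores and reads the “last_login_date” pair through the `getItem`, `setItem` and `removeItem` methods that both `sessionStorage` and `localStorage` provide. The `KeyValueStore` interface and the `MemoryStore` stand-in are assumptions added so the example can run outside a browser; on a real page you would pass `window.localStorage` or `window.sessionStorage` instead.

```typescript
// The subset of the browser's Storage interface this sketch relies on.
// Both window.sessionStorage and window.localStorage satisfy it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical stand-in so the sketch runs outside a browser.
class MemoryStore implements KeyValueStore {
  private items: Record<string, string> = Object.create(null);

  getItem(key: string): string | null {
    const found = this.items[key];
    return found === undefined ? null : found;
  }

  setItem(key: string, value: string): void {
    this.items[key] = String(value); // web storage only holds strings
  }

  removeItem(key: string): void {
    delete this.items[key];
  }
}

const LAST_LOGIN_KEY = "last_login_date";

/** Save the date of the most recent login under a fixed key. */
function rememberLastLogin(store: KeyValueStore, isoDate: string): void {
  store.setItem(LAST_LOGIN_KEY, isoDate);
}

/** Read the saved date back, or null if nothing has been stored. */
function readLastLogin(store: KeyValueStore): string | null {
  return store.getItem(LAST_LOGIN_KEY);
}
```

Passing `sessionStorage` keeps the value only for the current browsing session, while passing `localStorage` keeps it across browser restarts; the calling code is otherwise identical.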

Both of these web storage functions differ from traditional cookie-based tracking in a number of ways. Firstly, and perhaps most significantly, they do not come equipped with expiry dates. Normally, when a cookie is set, the developer specifies a date on which the cookie becomes invalid. This means that after, say, 30 days, your email provider may require you to log in once again because your cookie has expired. In the case of web storage, the data is held indefinitely (or, for sessionStorage, until the session ends), so although the web app can delete items stored locally, they won’t expire naturally like many cookies do. Secondly, this data can only be set and read back by JavaScript running in the user’s browser; unlike a cookie, it isn’t sent to the web server with each request, so a program running on the server can’t read it directly. While this may not be as significant for a user, it can introduce major challenges for developers planning to simply switch over from using cookies to web storage. Lastly, a website can store around 5MB of data in web storage in most browsers. A traditional cookie offers a paltry 4KB or so.
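
Since web storage has no built-in expiry, a developer who wants cookie-like behaviour has to add it by hand. The sketch below shows one possible approach, assuming the expiry timestamp is stored alongside the value as JSON and checked on every read. The function names, the `StringStore` interface and the in-memory store are hypothetical; in a browser, `localStorage` could be passed in directly.

```typescript
// Minimal storage shape; window.localStorage satisfies it.
interface StringStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical stand-in so the sketch runs outside a browser.
class InMemoryStringStore implements StringStore {
  private items: Record<string, string> = Object.create(null);

  getItem(key: string): string | null {
    const found = this.items[key];
    return found === undefined ? null : found;
  }

  setItem(key: string, value: string): void {
    this.items[key] = value;
  }

  removeItem(key: string): void {
    delete this.items[key];
  }
}

/** Store a value together with the time (ms since epoch) it expires. */
function setWithExpiry(
  store: StringStore,
  key: string,
  value: string,
  ttlMs: number,
  now: number = Date.now()
): void {
  store.setItem(key, JSON.stringify({ value: value, expiresAt: now + ttlMs }));
}

/** Return the value, or null (and clean up) if it is missing or expired. */
function getWithExpiry(
  store: StringStore,
  key: string,
  now: number = Date.now()
): string | null {
  const raw = store.getItem(key);
  if (raw === null) {
    return null;
  }
  const entry = JSON.parse(raw) as { value: string; expiresAt: number };
  if (now >= entry.expiresAt) {
    store.removeItem(key); // expired entries are only removed when read
    return null;
  }
  return entry.value;
}
```

Note the difference from a real cookie: nothing removes the entry until the page next reads it, so stale data can sit in the browser well past its nominal expiry.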

There are arguments currently taking place in the web community about the value of the web storage tools, with some developers criticising their performance issues and suggesting we move towards newer technology like IndexedDB or SQLite-backed Web SQL databases, but browser support for these kinds of tools is even less reliable than for the web storage ones. There are other, older methods, like falling back to browser sessions (which requires the rather ugly tactic of appending long text strings to URLs in order to track sessions without cookies) or using databases (more complex and perhaps overkill for storing small pieces of data). It looks like we might just have to keep using cookies, and do whatever the law says we have to do.

News websites' tracking usage

A cynic might suggest that most website owners aren’t keen on the new law because it will force them to reveal just how many tracking cookies they set. Indeed, the Guardian (my employer) has been accused of having too many cookies and tracking tools on its pages. Most of the data stored here, though, is pretty innocuous, and almost always unintelligible to the end user. Much of it is advertising related, so partners can see how many people have seen their ads, and increasingly the Guardian uses techniques like A/B testing, which allows us to show different things to different visitors to the site so we can measure how well each one does. We use cookies in order to track who saw what, and to make sure people’s browsing experience is consistent. There are also a couple used to track when users have logged into the site, whether they have any settings for what order to display comments in, and other assorted bits of functionality.
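
To show why a cookie is useful for A/B testing, here is a generic sketch (not the Guardian’s actual implementation) in which a visitor is randomly assigned to a variant once and the choice is then remembered, so they see the same version on every later page. The `ab_` cookie prefix, the function name and the plain object standing in for the visitor’s cookies are all assumptions made for illustration.

```typescript
type Variant = "A" | "B";

/**
 * Return the visitor's variant for an experiment, assigning one at random
 * the first time and remembering it in `cookies` afterwards.
 * `cookies` is a plain object standing in for the visitor's cookie jar.
 */
function assignVariant(
  cookies: Record<string, string>,
  experiment: string,
  random: () => number = Math.random
): Variant {
  const cookieKey = "ab_" + experiment; // hypothetical naming scheme
  const existing = cookies[cookieKey];
  if (existing === "A" || existing === "B") {
    return existing; // keep the experience consistent across visits
  }
  const chosen: Variant = random() < 0.5 ? "A" : "B";
  cookies[cookieKey] = chosen;
  return chosen;
}
```

Without the stored value, the same visitor could flip between variants from one page to the next, which would both confuse them and muddy the measurements.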

Large websites like Facebook, Twitter, Google and others use techniques like this constantly. Facebook and Google in particular are famed for their rigorous user testing process, where they roll out new features and designs to a small percentage of their userbase, measure the performance, then roll the best ones out to everyone. As the Guardian develops its online products further, we use more and more of these kinds of methods too. While critics can suggest there are too many tracking cookies or too much data being stored, the majority of it is used to improve the user experience and even enhance the site so you get new or experimental features before everybody else.

Cookies have their flaws, too. There’s no way to set a cookie across more than one device (or even more than one browser), so some webapps struggle to give users a consistent experience if they use a version of the app on their mobile, desktop and tablet, for example. While this new legislation is forcing developers to examine the alternatives, it does have the benefit of highlighting the limitations of the technology and perhaps showing the community which direction we should be moving in when it comes to tracking and analysing users.

Usage figures for the ICO's website when they introduced opt-in cookies

Analytics and advertising are two aspects of the web that are never going to go away. We can complain about invasions of privacy and ads following us from site to site, but the reality is that when the E-Privacy Directive takes effect, European users are going to have an irritating and degraded web experience. Being pragmatic, is this a price worth paying? Traffic to the ICO’s website dropped off hugely after they implemented their example of how permission-based cookies could work. It’s certainly worth highlighting to website owners that they should evaluate the amount of tracking they do and ensure it’s appropriate, but the reality is that this law isn’t going to magically improve users’ privacy and stop evildoers tracking people across the web. It’ll inconvenience casual web users, irritate web developers, and potentially cause European websites to drop in traffic as overseas visitors get sick of jumping through irritating hoops every time they want to visit a page.

Cookies aren’t perfect, but policing the cookie jar isn’t the solution to the problem either.