The Future of Open Information: Open Government Data and Beyond

January 26, 2011

‘Information is power’. With that saying in mind, at times it must seem like government holds all the cards. The public sector produces massive amounts of information. Budget and spending information for Whitehall alone could keep a data analyst busy for months. But governments don’t always share information, at least not willingly, and that’s something we should take seriously in a democracy. Take the MPs’ expenses scandal, for example – it originated with investigative journalism and some hard number crunching over five years against a largely unwilling institution, mainly through the efforts of one woman, Heather Brooke, and a pile of FOI requests. This one investigation led to several MPs standing down, at least one conviction, and a complete shift in the British conversation about transparency and accountability for our representative institutions.

It was an honour to participate in the panel on open information in the public sector as part of SCL’s Future of Open event. For the event, we put on our ‘future gazing hats’ to look at current trends in areas such as government information. Some of the most exciting areas of activity are around Open Government Data – initiatives under which government data is proactively published and made much more accessible to the public, without the need for journalists to dedicate five years to forcing it out.

Here are some examples. The London Data Store has opened up many datasets across the Greater London Authority at www.data.london.gov.uk. Openly Local has focussed on pulling multiple datasets from the local councils across the UK and collecting this information in a single site. Individual cities have started their own open data efforts, such as Open Manchester. Data.gov.uk has collected national and local data across the UK in a single site, similar to the data.gov initiative in the US. 

Further abroad, institutions such as the World Bank and the EU have open data initiatives. Even the UN and the foreign aid departments of many nations (including the UK) work together under the International Aid Transparency Initiative (IATI) banner to publish open aid information to improve aid spending. The Guardian has put together a data store for government data from across the world. These are just the proverbial tip of the iceberg, as new initiatives come online all the time.

From a lawyer’s perspective, Open Government Data is the application of open licensing techniques to government information, and this article tours the developments of this new open movement and the licences available.

Open Government Data Defined  

‘Open data’ often gets referred to in two different contexts: 

  • Open as in ‘accessible’ – the practical aspect of being able to access and play around with the data.
  • Open as in both legally and practically accessible – often as defined under established standards such as The Open Definition.

This article examines the legal side of openness. From a legal perspective, open data is the application to data and databases of many of the same principles and philosophies as Free and Open Source Software (FOSS) and open content licensing, such as certain Creative Commons licences. The Open Knowledge Foundation (OKF) uses the Open Definition, which is similar to other definitions such as Free Cultural Works and the OSI’s Open Source Definition. In general the Open Definition requires licences to allow use, reuse, and redistribution with very few restrictions – only attribution and share-alike (copyleft or reciprocal licensing) requirements meet the definition’s terms. Of course, where there are no IP restrictions at all – a public domain approach – this meets the definition of ‘open data’ because users are free to use it for anything.

Thus by Open Government Data we mean information that can be freely used, reused, and redistributed by anyone, and that is produced by public bodies at any level of government: from local councils through to the devolved governments, Westminster and central government, and the various EU institutions.

FOI and Open Government Data

You cannot discuss open government information without at least comparing it with Freedom of Information (FOI) legislation. The two are very different.

FOI = reactive. Government reacts to requests, which the public uses as a tool to force disclosure of information.

Open Government Data = proactive. Instead of waiting for requests by the public, government publishes core information that the public can access any time. 

What are the advantages of a proactive OGD approach over FOI approaches?  Some potential advantages include: 

  • Cost savings over FOI – Staff can spend time on their usual activities instead of responding to requests for information, sometimes even multiple requests for the same information.
  • Greater transparency – This is fundamental to a democracy. Greater transparency not only helps the public understand government, but can also help other government departments get information from each other more quickly and easily.
  • Greater civic engagement – Greater transparency and accountability can lead to greater engagement with the political process as it can make a complicated area much more accessible. 

These reasons just begin the list, and those more interested in this area should read through the UK’s Public Sector Transparency Board’s Public Data Principles at: http://data.gov.uk/wiki/Public_Data_Principles  to witness the UK’s changing attitude to open information. No matter what their reason(s), governments have been embracing the open approaches and mass collaboration pioneered in open source software as an aspect of government policy towards public sector information. 

Open Government Data Licensing  

Here is a quick summary of the key points on open government data licensing in general:

  1. legal rights can cover data & databases
  2. legal rights require legal solutions
  3. legal solutions for open data exist. 

I will address each of these points in turn. 

1. Legal rights do cover data and databases

This first point will be no surprise to C&L readers, but it corrects a misconception common among a significant portion of the public: data and databases are not a ‘rights free’ area where no intellectual property rights apply. TRIPS, for example, requires that WTO members provide legal protection for databases. The protections include copyright, database right, contract, and other rights such as trade secrets and unfair competition. In addition, just as in many other areas, patents (whether software patents or not) and trade marks (over the database name, for example) can be an issue.

2. Legal rights require legal solutions 

All of these legal rights produce a ‘rights thicket’ protecting databases and data and can form a significant obstacle for the use and re-use of data. They create a default setting for users of having to ask for permission to use data, which means a licence (or dedication, as noted below) will be necessary on the part of the rightsholder. If the goal of a rightsholder-government is to encourage use and reuse of their data (hence their interest in open government data), then addressing at least some of these legal issues in a licence will be necessary, especially for commercial users to feel comfortable. 

Thus Open Government Data requires a public domain dedication or an open licence to meet the definition above and explicitly allow for use and reuse. Compliant open licences must meet the terms of the Open Definition, which means that some available public licences, such as Creative Commons licences containing non-commercial (NC) or no-derivatives (ND) clauses, are not suitable for open government data.[1]
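For the more technically minded, the test described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it treats the familiar Creative Commons shorthand (BY, SA, NC, ND) as a list of conditions and accepts a licence only if every condition is attribution or share-alike. The function names and the ALLOWED_CONDITIONS set are hypothetical and are not part of the Open Definition itself, which is a much fuller text.

```python
# Conditions that, per the summary above, an open licence may impose:
# attribution (BY) and share-alike (SA). This set is an illustrative
# assumption, not the full text of the Open Definition.
ALLOWED_CONDITIONS = {"BY", "SA"}


def licence_conditions(cc_code):
    """Split a shorthand code such as 'CC-BY-NC-SA' into its conditions."""
    parts = cc_code.upper().split("-")
    if parts[0] != "CC":
        raise ValueError("expected a code starting with 'CC': " + cc_code)
    return set(parts[1:])


def is_open(cc_code):
    """True if the licence imposes only attribution and/or share-alike."""
    return licence_conditions(cc_code) <= ALLOWED_CONDITIONS


# The six main Creative Commons licences, in their usual shorthand.
SIX_MAIN = [
    "CC-BY", "CC-BY-SA", "CC-BY-NC",
    "CC-BY-ND", "CC-BY-NC-SA", "CC-BY-NC-ND",
]

open_licences = [code for code in SIX_MAIN if is_open(code)]
print(open_licences)  # ['CC-BY', 'CC-BY-SA']
```

The result matches the point made in note [1]: any NC or ND element takes a licence outside the definition.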

3. Legal solutions for open data exist: Open data licensing 

As with open source or open content, open data licensing grants users certain rights but imposes certain obligations.  

Creative Commons 

Creative Commons offer several tools that may be appropriate for open government data depending on the jurisdiction: 

CCZero – allows for dedicating a work to the public domain if you are the rightsholder 

Public Domain Mark – this tool allows users to assert that a third party work is in the public domain (such as ‘marking’ a Shakespeare text as public domain).

CC-BY and CC-BY-SA – these two out of the six main CC licences meet the open part of open government data. 

A full legal analysis of these tools is outside the scope of this article. I will note, however, that the CC-BY and CC-BY-SA licences take varying positions on the sui generis database rights and were not written with databases in mind, and so may not be appropriate for all licensors.[2]

Open Data Commons 

I helped start the Open Data Commons project in 2007 to provide legal tools for sharing data (at www.opendatacommons.org). This project is now hosted by the Open Knowledge Foundation, a not-for-profit organisation promoting open knowledge (full disclosure: I’m on the OKF’s board). Fast forward to today, and we have three main legal tools available: 

Open Database licence (ODbL) – this works solely on the database layer and requires both attribution and share alike. It is equivalent in many ways to the Creative Commons BY-SA licences or GPLv2.

Open Data Commons Attribution (ODC-BY) – this licence is much like the ODbL, except it requires only attribution. It is not a share-alike licence. 

Public Domain Dedication and licence (PDDL) – this gives up all copyright and database rights over a database and its contents, and so places a database into the public domain.  This tool works with our Community Norms document to provide for a set of non-binding expectations for how the data should be treated (such as norms of citation and attribution). 

All our tools were developed after long public comment periods and are available for use by the public. They are designed to be able to operate worldwide. 

UK’s Open Government Licence 

The UK government recently announced a new Open Government Licence for a broad range of public information. The OGL is effectively an ‘attribution-only’ style of licence and applies to a broad range of government produced information while specifically taking into account some of the unique aspects of open (government) data. The licence is explicitly compatible with both the CC-BY and Open Data Commons – Attribution licences. 

Which Approach Should Governments Take? 

Let me end this article with some final thoughts on which licensing approaches best fit Open Government Data. A slight disclaimer applies as these thoughts are my own. I firmly believe that, in the interest of democracy and transparency, the default setting for government data should be as open as possible unless the case is made for ‘closing’ that data. Thus I believe that public domain or at most Attribution-only licences should be used for Open Government Data.

Share-Alike clauses ringfence a work to produce a community around that work. In software in particular, these clauses are a way to avoid free-riders that could ultimately discourage contributors to a project to the point of killing it. Governments however have in-built communities (citizens and residents), have incentives to produce data (as a side-effect of their government work), and have funding to produce data (taxes). I struggle to see in many cases the need for Share Alike restrictions by governments on their data, as it provides few benefits and only further restricts what the public can do with the data. 
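To make the comparison concrete, the tools discussed in this article can be set side by side in a short Python sketch. The two flags are a deliberate simplification of the descriptions given above (attribution, share-alike), and the LegalTool class and fits_recommendation function are hypothetical names chosen for illustration. The sketch says nothing about database rights, jurisdiction, or licence versions, all of which matter in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LegalTool:
    name: str
    attribution: bool  # does the tool require attribution?
    share_alike: bool  # does the tool require share-alike?


# Flags follow the short descriptions in this article, nothing more.
TOOLS = [
    LegalTool("CCZero", attribution=False, share_alike=False),
    LegalTool("PDDL", attribution=False, share_alike=False),
    LegalTool("CC-BY", attribution=True, share_alike=False),
    LegalTool("ODC-BY", attribution=True, share_alike=False),
    LegalTool("OGL", attribution=True, share_alike=False),
    LegalTool("CC-BY-SA", attribution=True, share_alike=True),
    LegalTool("ODbL", attribution=True, share_alike=True),
]


def fits_recommendation(tool):
    """Public domain or attribution-only: no share-alike requirement."""
    return not tool.share_alike


recommended = [tool.name for tool in TOOLS if fits_recommendation(tool)]
print(recommended)  # ['CCZero', 'PDDL', 'CC-BY', 'ODC-BY', 'OGL']
```

The Public Domain Mark is left out because it labels third party works that are already in the public domain rather than licensing a rightsholder's own data.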

Finally, the problem of licence pollution should be considered. Drafting workable open licences is hard work, but the good licences make it look easy, which is perhaps why so many people take on the task of writing their own terms. These range from the practically public domain WTFPL and all the *ware licences (sisterware, catware, beerware, tacoware) all the way to much more restrictive and complicated homegrown licences.

Each tweaked term and added addendum effectively creates a new licence. Perfect examples are the BSD and MIT-style licences, which have seemingly limitless variants. Each change increases the chance that you are defeating legal interoperability – as opposed to technical interoperability. This means that you may have the perfect technical solution (the best data set or functioning code for the job) but the licence doesn’t allow its use. We sometimes build these ‘licence silos’, even within the open licensing community, that prevent use and reuse even between works under different open licences. This situation frustrates lawyers and techies alike. Using established public licences such as Creative Commons or Open Data Commons can help ensure legal interoperability, and can even increase user uptake by lowering the transaction costs of users having to learn new licences.

In preparing for my contribution to the SCL Forum, I was asked ‘Has the open information revolution run its course or just begun?’  After reviewing all the exciting developments and changes afoot with open government data, I hope you’ll agree with me on the answer: The revolution has just begun.  

More resources 

Interested readers may like to review: 

Open Knowledge Foundation www.okfn.org 

Open Data Commons www.opendatacommons.org  

Open Government Data www.OpenGovernmentData.org  

Where Does My Money Go? www.wheredoesmymoneygo.org/  

Panton Principles http://pantonprinciples.org/  

Science Commons Open Access Data Protocol http://sciencecommons.org/projects/publishing/open-access-data-protocol/

 

Jordan S. Hatcher, JD, LLM is a Texas lawyer living in the UK and working on IP and technology law issues here. Jordan sits on the Board of Directors of the Open Knowledge Foundation, a non-profit organisation promoting open knowledge in all forms, from genes to geo-data, and from sonnets to statistics. He’s also the co-founder and principal drafter behind the set of open data licences at OpenDataCommons.org: www.jordanhatcher.com



[1]    Out of the six main CC licences, only the Attribution (CC-BY) and Attribution Share-Alike (CC-BY-SA) licences are open licences.

[2]    Note that the six main licences have been versioned across time (the current being 3.0) and have been ‘ported’ to 50+ jurisdiction-specific adaptations, such as for Scotland, Germany and so on. Thus even for these ‘six’ the specific version and jurisdiction should be reviewed for appropriateness for the particular use.