Open, Shared, Closed — understanding data sharing

Open Data can be used by anyone for anything for free (e.g. Creative Commons, Open Government Licence)

Shared Data is data with a preemptive licence (e.g. ‘data as a service’ that can be used with certain restrictions)

Closed data requires a user-specific custom licence/contract for use (e.g. ‘bilateral contract’ for a specific project)

Framing data sharing around ‘how can I use it?’ & value exchange

Everyone thinks their data is valuable, and it is. But how we measure and exchange value is something we need to explore. The way we attribute value is embodied in how we license data: it’s how we codify value.

When thinking about data-sharing, rather than saying just ‘open up all the data’, we recommend starting with the question ‘how am I allowed to use it’. This links us to the ‘so what’ question—what problems are we trying to solve?

The phrase ‘open data’ can mean many different things to different people. We have spent many years as a community, as well as through organisations such as the Open Data Institute (of which I was CEO), the Open Knowledge Foundation (of which I was a non-executive director) and others, trying to ensure we all had one definition ≔ Open Data can be used by anyone for anything for free.

Examples of open data include public data such as the human genome, a bus timetable or any of the 55,000 data sets here (you may be surprised that it took quite a bit of effort to get bus timetables to be open in the UK). We also defined that personal data is not open data, for many reasons, and that data is now covered by regulations such as GDPR. We also decided that certain data funded by the taxpayer should be open: the ‘value exchange’ is that, as taxpayers, we’ve already paid for it.

Reciprocity = fair value exchange

Value comes in many forms. If we share data with you and you provide something back, we may choose to exchange it without a cash payment. Value is still exchanged, however. From an economic perspective, we can move the costs of certain value exchanges to marginal cost (a cost we absorb as part of our usual business operations).

If there is reciprocity, with value flowing in both directions—even if it’s not a direct 1:1 exchange—we can see the broader benefit to the market in which we are operating and that can benefit our business through operational efficiency, risk management or opportunity generation.

Open banking is an example of market-wide reciprocity

Open Banking is a perfect example of this. Firstly, it mandates that banks publish their product information as Open Data. This makes it easier to find and analyse products that might fit our needs. The value exchange (reciprocity) is that by making it easier for users to find products that suit their needs, banks will get a better fit of customers-to-products which can increase the likelihood of having a happy customer. This is a win-win.

Open Banking also mandates that personal financial data (e.g. bank statements) can be transferred between banks by the customer without a financial cost. But this data is not open and it’s not ‘free’. Firstly, it is either personal or commercially sensitive data, so it cannot be open. Secondly, it is not free, as there is a material cost to provide that scale of data management.

However, all the banks have agreed to this because (aside from it being regulated) there is a mutual benefit. The market as a whole benefits and the costs all ‘balance themselves out’: there is reciprocity. Further, while some banks used to feel that holding on to their customers’ data was paramount, it is not the customer-value point on which they should be competing. Furthermore, with GDPR, the data is controlled by the end customer.

The rules that govern this data exchange are encoded into the Open Banking Standard. It covers everything from the rights surrounding the data to the liability transfer as data flows. It is a commercially focussed framework that allows data-sharing. These rules are now both common and shared across the whole market. The Standard effectively defines the rules for sharing in advance.

If we frame this as a ‘licence’ (a set of rules) to share data, then we can define Shared Data ≔

Shared Data has a preemptive licence for a specific use

After this, what’s left? Either data that you don’t want to share outside of a specific group (e.g. people contracted to work for your company), or that is only shared using bilateral contracts, where each contract needs to be unique. We define Closed Data ≔

Closed Data requires a user-specific custom licence/contract for use

For example, a bilateral contract for a specific project, or access to information enabled via an employment contract. 
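The three definitions can be read as a small decision rule. The sketch below is only an illustration of that rule: the field names and the `classify` function are assumptions made for this example and are not taken from any standard or licence text.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    OPEN = "open"
    SHARED = "shared"
    CLOSED = "closed"


@dataclass(frozen=True)
class LicenceTerms:
    # Hypothetical fields chosen to mirror the three definitions above.
    agreed_in_advance: bool  # terms are published before anyone asks
    anyone_may_use: bool     # no restriction on who may use the data
    any_purpose: bool        # no restriction on what it is used for
    free_of_charge: bool     # no payment required


def classify(terms: LicenceTerms) -> Category:
    """Map licence terms onto Open, Shared or Closed."""
    if not terms.agreed_in_advance:
        # Each user needs a custom licence or contract.
        return Category.CLOSED
    if terms.anyone_may_use and terms.any_purpose and terms.free_of_charge:
        # Anyone, for anything, for free.
        return Category.OPEN
    # Rules are set in advance but carry restrictions.
    return Category.SHARED


bus_timetable = LicenceTerms(True, True, True, True)
bank_statement_transfer = LicenceTerms(True, False, False, False)
project_contract = LicenceTerms(False, False, False, False)

for terms in (bus_timetable, bank_statement_transfer, project_contract):
    print(classify(terms).value)
```

The order of the checks matters: whether the terms exist in advance is tested first, because that is what separates Closed Data from the other two.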

Data increases in usage & value the more it is connected

Creative Commons defined a step-change in thinking. It enabled us all to say “it’s okay to use this image for free” in advance. As of May 2018, there were an estimated 1.4 billion works licensed using a CC licence.

With Shared Data, if stakeholders publish their data descriptions and their licensing options per type of use (aka ‘preemptive licensing’), then other stakeholders can simply access the data, in compliance with the respective licensing requirements. This can enable people to create different types of value exchange, including granular payment structures for different types of use.

We must also have clear Open Data descriptions of the Shared Data and how it might be used (how it is licensed).

Publishing open data that describes the shared data will enable search engines (and therefore you) to find it. If the licensing is clear, then the friction between discovery and usage is reduced.
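As a rough illustration of what an open description of shared data might contain, the sketch below pairs a searchable summary with licence options per type of use. Everything in it is hypothetical: the dataset, the use types, the prices and the `find` helper are assumptions made for the example, not part of any existing catalogue or standard.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LicenceOption:
    use_type: str             # e.g. "research" or "commercial" (illustrative)
    price_per_request: float  # 0.0 means no cash payment
    conditions: str


@dataclass(frozen=True)
class DatasetDescription:
    # The description is open; the data it describes is shared.
    name: str
    summary: str
    options: List[LicenceOption]

    def option_for(self, use_type: str) -> Optional[LicenceOption]:
        """Return the pre-agreed terms for a type of use, if any exist."""
        for option in self.options:
            if option.use_type == use_type:
                return option
        return None


def find(catalogue: List[DatasetDescription], keyword: str) -> List[DatasetDescription]:
    """Very simple discovery: match a keyword against name and summary."""
    keyword = keyword.lower()
    return [
        d for d in catalogue
        if keyword in d.name.lower() or keyword in d.summary.lower()
    ]


# A made-up catalogue entry, for illustration only.
catalogue = [
    DatasetDescription(
        name="example-meter-readings",
        summary="Half-hourly energy meter readings (hypothetical)",
        options=[
            LicenceOption("research", 0.0, "attribution required"),
            LicenceOption("commercial", 0.01, "no onward resale"),
        ],
    )
]

for dataset in find(catalogue, "energy"):
    terms = dataset.option_for("research")
    print(dataset.name, terms.conditions if terms else "no pre-agreed terms")
```

A use type with no matching option returns `None`, which in the terms used above means the request falls back to a bilateral contract.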

Doing so will increase the size of the observable dataverse and help to unlock innovation while protecting the interests of individuals, organisations and countries to use it for both public and private good.

An interesting example is the UK Open Banking Standard, which preemptively defines and mandates ways to share personal and business data—it is now regulated with every UK high street bank engaged and over 300 fintech companies in its ecosystem.

You can read more about the evolution of the Data Spectrum here and more about the web of data here.