The Good Oil on ‘the New Oil’

It is often said that data is ‘the new oil’. However, this catchy epithet is misleading. The methods used to value most things, including scarce commodities such as oil, gold and clean water, do not capture many unusual characteristics of data. Taken together, these characteristics make data unique: different to other forms of property, both tangible (chairs, dogs and pencils) and intangible (software, books, trade marks and patents).

The greatest paradox of data is that it is the most important asset class of the 21st century, but it is not an asset class at all. Indeed, data cannot be ‘owned’ in the traditional sense of proprietary rights recognised in common law (e.g. my chair, dog and pencils) or conferred by statute (e.g. copyright).

And although data is not given a value under generally accepted accounting principles, the market capitalisations of both ‘unicorns’ and ‘data giants’ show that public financial markets and venture capitalists see value nonetheless.

There are many other odd features of data.

Most property derives its value through being closely held, an aspect of managed scarcity. But today data often derives value through sharing. At least some data sharing within multi-party data ecosystems is now required to deliver almost all online services (particularly internet of things applications) and most offline supply of products and services.

Data can be infinitely reproduced and shared at effectively zero cost. Data does not derive its value through scarcity. Data often derives value through ‘discoverability’: transforming raw data so that it is easier to find, to link to other data, to share, and to explore in combined data sets for correlations, such as attributes of the user to whom the data relates or links to that user’s other transactions. Often in data analytics projects 75% of the cost is cleansing and transforming raw data to make it discoverable: the high-end work of then analysing it is the smaller part of the program budget. Discoverability may be created within a privacy protected environment: in many cases, substantial data value can be created whether or not particular individuals are identifiable.

Increasingly, data derives value not through direct application but by informing the development of methodologies and algorithms for use on other data, such as the use of ‘training data’ to inform machine learning applications.

Data has little inherent value. Its value is derived not from what it is, but from what can be done to create value with it, and then from durably capturing that value by denying others the ability to do those things, while not exciting regulatory intervention that strips value. Possible value-depleting interventions include enforcement of competition (antitrust) laws, statutory creation of new ‘consumer rights’ over data, and enforcement by individuals of rights of access to, or portability of, personal information about them held by businesses.

A legally enforceable right to deny others the use of data that is controlled but not owned may be created by contract, through application of principles of equity as to protection of confidential information, or by exercise of statutory rights in trade secrets. However, these are rights to deny others access to or use of data, not proprietary rights.

So what is ‘data value’? Data derives value to the extent that a data custodian has both the technical capability and the legal right:

• to capture useful data points of sufficient granularity and number in a readily usable form;

• to bring together that data (i.e. make the data ‘discoverable’);

• to transform and then analyse the data, to derive meaningful insights and to enable actionable decisions to be made by the data custodian (or by another person with whom the custodian shares the information); and

• through first mover or other advantages, and operating under the cloak of confidentiality or trade secrecy safeguards buttressed by contractual provisions, to deny others the ability to do those things, while not exciting regulatory interventions.

To summarise the current state of data science as applied to customer data:

• The value of data is derived through increased availability of data capable of aggregation and merger with other data sets to provide information in a form readily assimilated and used by humans (such as visualisations and other value-added presentations).

• Through data analytics methods of increasing sophistication, data acquires additional value. For example, algorithmically generated customer segmentation analyses and tools enable service providers to improve their ability to define and then target increasingly granular customer segments, and to differentiate the price and other terms offered to those segments.

• Data is also more readily available for analysis because it is more ‘discoverable’ as data taxonomies are standardised and data extraction tools are refined. There is more data that can be discovered and used. There is a rapidly expanding and improving range of already developed and tested data analytics tools and methods.

• Data is more readily available because businesses interact with each other and with consumers through increasing flows of consumer data, and because each business is increasingly algorithmically driven in its own operations (therefore requiring better integration and availability of data across the business).

• Data value can be realised through the improved ability of businesses to understand characteristics (attributes, preferences and interests, inter-relationships) of their customers and to infer characteristics of ‘lookalike audiences’ of prospective customers.

• Tools and methods for analysis and presentation of actionable insights are now widely available, including powerful tools readily available to consumers in the form of apps on smart phones or access to service comparison engines.

As new competitors and intermediaries emerge, every service provider will need to make, and regularly review, a number of key strategic decisions about the customer data that it collects.

• To what extent should the provider share the value that it can derive through data: for example, by empowering its customers to make better, more informed decisions, and by meeting demands from customers for enhanced access to data so that they can analyse that data themselves?

• Should a provider respond only to regulatory compulsion, or should it strive to differentiate itself from its competitors by addressing demand by (some) customers for greater access to, and control over, data about them?

• Is it viable to keep customer data within a provider-controlled data ecosystem? To what extent will customer demand or regulatory intervention be such that certain customer data should be proactively made available into a more open data environment? If so, how much data, how made available, and to whom?

• How does a provider make data more available while adequately protecting its customers from security risks? In what circumstances should customers be taken to know and assume the consequences of such risks? More specifically, how does a provider verify that a customer properly understands the risks of conferring agency upon an intermediary to conveniently access sensitive financial data about that customer? How does a provider verify that an intermediary (such as a price comparison engine) purportedly authorised by a customer to access certain sensitive financial data about that customer is so authorised and will handle that data with due care?

• Where data is shared, and particularly where value is created through joint endeavour, is it jointly ‘owned’ (in the legally incorrect sense of control and ability to deny to others)? If so, what does this actually mean, given that business people, commercial lawyers within a single jurisdiction, and different legal jurisdictions exhibit radically different views as to the attributes of ‘joint ownership’?

Achieving good information management and negotiating fair and balanced
data deals is not easy.

That is why transparency of rights and use, together with good information management, will be a key differentiator of industry-leading data businesses of the future, regardless of the trajectory and pace of development of global privacy laws.

Operators with corner-cutting or slapdash data management processes will wither away through action by private litigants or regulators, mistrust among business partners or erosion of consumer trust.

The business stakes are too high not to do data deals and data management
really well.

Peter Leonard is a data, content and technology business consultant and lawyer and principal of Data Synergies. He was a founding partner of Gilbert + Tobin. Following his retirement as a partner in 2017, he continues to assist Gilbert + Tobin as a consultant.
