Data Protection: The New Technical and Political Environment

December 31, 2009

In 1965 Gordon Moore, who later co-founded Intel, first noted the continued doubling of the density of transistors on integrated circuits.[1] ‘Moore’s Law’ predicted that this doubling would continue roughly every 18 to 24 months. The prediction has proven remarkably accurate – in the four decades since, raw computer power has increased a million-fold. Storage capacity and Internet bandwidth are increasing at an even faster pace: disk information density is doubling annually.[2] One fibre cable can now carry 160 wavelengths, with photonic integrated circuits capable of carrying 1.6 terabits per second[3] – the equivalent of over 10,000 television channels.
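
A quick sanity check of that last figure, using a per-channel bitrate that is our own illustrative assumption (roughly 150 Mbit/s, in the region of uncompressed standard-definition video) rather than a number from the cited paper:

```python
# Back-of-the-envelope check of the 'over 10,000 television channels' claim.
fibre_capacity_bps = 1.6e12   # 1.6 terabits per second (figure from the text)
channel_bps = 150e6           # assumed ~150 Mbit/s per channel (illustrative)

print(fibre_capacity_bps / channel_bps)   # ~10,667 channels
```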

The plunging cost of storing, transmitting and processing personal data means that little technological incentive remains for system designers to minimise the collection of personal data. Information can be transferred around the world at low cost, and duplicated across databases and portable computing devices. Ongoing increases in processing power allow more information to be extracted about individuals from this mass of data, using data mining algorithms to discern and record patterns of behaviour that in turn affect how individuals are treated. Effective data protection now depends upon the robust application of principles such as purpose limitation and the minimisation of personal data collection, as required by the EU Data Protection Directive and many other nations’ privacy laws.

Surveillance technologies and ubiquitous computing

As computers and sensors shrink, they are proliferating rapidly. A good illustration of this trend is RFID tags: very low-cost markers (costing only cents in quantity) typically attached to shipping containers, consumer goods and access control cards. Similar tags are now included in many nations’ passports under an International Civil Aviation Organisation standard, and are also used for road toll payment systems, public transport ticketing such as London’s ‘Oyster’ card, library book management, and new contactless payment cards such as MasterCard’s ‘PayPass’ and Visa’s ‘Paywave’. There has been strong public concern about the use of tags to monitor individuals and their possessions.[4]

CCTV cameras have seen widespread deployment in countries such as the UK, where around £500m was spent on systems between 1996 and 2006, resulting in an estimated 4m devices.[5]  This deployment occurred despite criminological research showing that CCTV had little impact on crime levels outside specific contexts, such as closed car parks.[6]

Many digital technologies are designed to produce detailed logs of their usage by individuals, which are then accessible for surveillance and marketing purposes. Mobile phones send location information to network providers so that calls can be routed to the correct cell, and more recently to enable location-based services such as contextual advertising and mapping. Handsets can also track their owners’ movement using accelerometers, recognise objects and individuals through their cameras, and detect nearby devices emitting Bluetooth signals. Debit and credit card payment systems record amounts spent and stores visited. Store loyalty cards enable databases of purchases to be compiled, sometimes across multiple retailers as in the case of the UK Nectar scheme.

Online activities are particularly closely monitored. Even where users are not required to provide identifying information to services, logs can be linked to individuals through the Internet Protocol (IP) address of their computer, and often through ‘cookies’ – electronic identifiers left in their browser by Web sites.[7]
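
As a minimal sketch of the mechanism (hypothetical code, not any particular site’s implementation), a web server can assign each new browser a persistent identifier via a cookie, then log that identifier alongside the client’s IP address on every later request:

```python
# Minimal sketch of cookie-based visitor tracking (hypothetical example).
import uuid
from http.server import BaseHTTPRequestHandler, HTTPServer

class TrackingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Reuse the identifier the browser sends back, or mint a new one.
        cookies = self.headers.get('Cookie', '')
        if 'id=' in cookies:
            visitor_id = cookies.split('id=')[1].split(';')[0]
        else:
            visitor_id = str(uuid.uuid4())

        # Each request is now linkable to one browser over time, and to a
        # network location through the client's IP address.
        print(f'visitor={visitor_id} ip={self.client_address[0]} path={self.path}')

        self.send_response(200)
        self.send_header('Set-Cookie', f'id={visitor_id}; Max-Age=31536000')
        self.end_headers()
        self.wfile.write(b'Hello\n')

HTTPServer(('', 8000), TrackingHandler).serve_forever()
```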

Biometrics

Biometrics such as photographs and iris scans can be used to check an individual’s identity against records in a database (or on a chip), and are increasingly included on national identity cards for this purpose. They can also be used to identify individuals or samples within population-scale databases such as the UK National DNA Database (containing over 5m samples) and National Fingerprint Database (with records of 7.5m individuals).[8] Recent International Civil Aviation Organisation passport standards[9] require that fingerprint and facial images be included on chips within new ‘e-passports’.

The EU is building a central Visa Information System to store, for five years, fingerprints and photographs from the 20m people who apply each year for visas to enter the Schengen area. This data will be available for immigration purposes, but also for the investigation of serious criminal offences.[10] The US takes photographs and all ten fingerprints from each arriving visitor under its US-VISIT programme.

Facial recognition software has been used to match photographs and video footage of individuals against databases of criminal suspects, so far with limited success. It has proven difficult to replicate successful laboratory systems in real-world environments, not least because of the poor quality of much of the imagery recorded by deployed CCTV systems.[11]

Data mining and profiling

Profiling and data analysis algorithms are increasingly used on very large databases to spot patterns and identify individuals and behaviours ‘of interest’. E-commerce stores can see not just their customers’ purchasing behaviour, but every product customers consider, and for how long, before deciding whether or not to buy. Service providers can store all information provided by a user, such as search terms. Companies use this transactional data to target advertising and special offers at customers, and to find ways to provide slightly different products at different prices so as to maximise revenue. Using Customer Relationship Management software, firms also focus on identifying and retaining high-value customers while reducing service levels to less profitable individuals.[12]
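
A crude sketch of the kind of customer scoring this involves (hypothetical data and thresholds, not any vendor’s actual algorithm):

```python
# Toy customer-value segmentation (hypothetical data and thresholds).
customers = {
    'alice': [120.00, 80.50, 210.00],   # past transaction amounts
    'bob':   [9.99],
}

for name, transactions in customers.items():
    value = sum(transactions)
    tier = 'priority support' if value > 100 else 'standard queue'
    print(f'{name}: spent {value:.2f}, routed to {tier}')
```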

Companies such as Google and Phorm monitor users’ activities across a range of sites, displaying advertising based on behaviour observed across their whole portfolio. Many online businesses are almost entirely dependent on revenues from advertising, and so have a strong incentive to maximise its effectiveness.

Many countries (including the US and UK) require telecommunications companies to facilitate interception and to store data on their customers’ activities. The Bush administration worked with large US telecoms companies, including AT&T, to intercept communications and monitor the records of millions of Americans without the judicial approval required by US law.[13] Telephony and Internet Service Providers are required to retain communications data for periods of up to two years for law enforcement access under the EU Data Retention Directive.[14] In 2008 the European Court of Human Rights found that UK government powers to intercept international communications were overly broad and hence breached the right to privacy contained in the Convention.[15]

The EU Council ‘Future Group’, developing its justice and home affairs strategy for 2009-2014, looked forward to a ‘digital tsunami’ of personal data, with the Council Presidency stating:

Every object the individual uses, every transaction they make and almost everywhere they go will create a detailed digital record. This will generate a wealth of information for public security organisations, and create huge opportunities for more effective and productive public security efforts.[16]

However, the effectiveness of data mining in both commerce and law enforcement may have been overstated. Researchers have recently developed methods for privacy-preserving data mining[17] and for the analysis of aggregated data[18] which show that firms can obtain most of the information they need to target advertising and estimate factors such as customer lifetime value without collecting detailed transaction histories.
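
The core idea behind such value-distortion techniques can be sketched in a few lines (a toy version, not the cited authors’ actual reconstruction algorithm): each customer adds zero-mean random noise to their true figure before disclosure, so no individual record can be trusted, yet aggregate statistics remain accurate:

```python
# Toy value distortion for privacy-preserving aggregation (illustrative only).
import random

true_spend = [random.gauss(100, 30) for _ in range(100_000)]

# Each customer perturbs their own figure before reporting it,
# so any individual report reveals little...
reported = [x + random.uniform(-50, 50) for x in true_spend]

# ...but averages over many customers are nearly unchanged.
print(sum(true_spend) / len(true_spend))   # ~100
print(sum(reported) / len(reported))       # also ~100
```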

The US National Research Council recently published a report on counter-terrorism technologies that concluded: ‘there is not a consensus within the relevant scientific community nor on the committee regarding whether any behavioral surveillance or physiological monitoring techniques are ready for use at all in the counterterrorist context given the present state of the science.’[19] The difficulty is fundamental: searches for potential terrorists throw up extremely high numbers of false positives, and terrorists can easily adapt their behaviour to mask their intentions.
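
The false-positive problem can be made concrete with some entirely hypothetical numbers: even an implausibly accurate screening system, applied to a whole population containing very few actual terrorists, buries its true positives under millions of false alarms:

```python
# Hypothetical illustration of the base-rate problem in mass screening.
population = 300_000_000    # assumed population under surveillance
terrorists = 3_000          # assumed number of actual terrorists
hit_rate = 0.99             # assumed: 99% of terrorists are flagged
false_alarm_rate = 0.01     # assumed: 1% of innocents are wrongly flagged

true_positives = terrorists * hit_rate
false_positives = (population - terrorists) * false_alarm_rate

precision = true_positives / (true_positives + false_positives)
print(f'{false_positives:,.0f} innocents flagged')          # ~3,000,000
print(f'{precision:.2%} of flagged people are terrorists')  # ~0.10%
```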

There are also concerns that data mining can lead to automated discrimination, where individuals are treated unfairly on the basis of inferences drawn from their previous transactional data.[20]

Social networking and user-generated content

‘Web 2.0’ technologies allow users to share text, audio and video on blogs, photo and video sites such as Flickr and YouTube, and the now-ubiquitous social networks such as MySpace and Facebook. Combined with the still and video cameras present in most mobile phones, this has allowed individuals to share information about themselves and those around them to an unprecedented degree. Users can search for (possibly live) video feeds from specified locations or containing specific individuals. Supposedly ‘anonymous’ data can increasingly be linked back to individuals through the profiles and sets of ‘friends’ revealed by these services.[21]
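
A toy example of this linkage problem (hypothetical data and attributes; real attacks such as those in the cited work exploit richer signals, including friendship graphs): joining an ‘anonymised’ dataset to openly shared profile information on a few common attributes can be enough to re-identify a record:

```python
# Toy re-identification by linkage (hypothetical data and attributes).
anonymised = [   # names stripped, 'harmless' attributes retained
    {'age': 24, 'city': 'Oxford', 'hobby': 'cycling', 'salary': 31_000},
    {'age': 51, 'city': 'Leeds',  'hobby': 'sailing', 'salary': 78_000},
]

profiles = [     # information users share publicly on a social network
    {'name': 'Alice', 'age': 24, 'city': 'Oxford', 'hobby': 'cycling'},
    {'name': 'Bob',   'age': 51, 'city': 'Leeds',  'hobby': 'sailing'},
]

keys = ('age', 'city', 'hobby')
for record in anonymised:
    matches = [p for p in profiles if all(p[k] == record[k] for k in keys)]
    if len(matches) == 1:   # a unique match re-identifies the record
        print(matches[0]['name'], 'earns', record['salary'])
```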

Social networks now have hundreds of millions of members around the world, while high-profile blogs like Boing Boing have readerships to rival national newspapers. Universities and companies have admitted checking the social network profiles of potential students and employees. Solicitors have warned that insurers are now monitoring social network sites before paying claims.

In general, individuals (especially young people) are often privacy-unaware, or prefer the short-term gains from providing personal data over longer-term, less tangible concerns about the resulting privacy impact. While many social networking sites feature highly configurable privacy settings that allow information to be restricted to specific groups of friends, users rarely alter default settings that permit widespread sharing. It is likely to be many years before social norms fully adjust to these new possibilities for self-disclosure and the use of personal information by potential employers, educators and insurers.[22]

E-government and healthcare

E-government systems often contain large quantities of sensitive personal data on entire populations. ‘Back office’ systems focus on more effective processing of data and on enabling new services (including fraud detection and prevention related to benefit payments and tax returns), largely out of citizens’ view. ‘Portals’ enable citizens to interact with government online, supplying information such as tax returns and applying for services without the cost to either party of face-to-face or telephone conversations and manual form processing.

Healthcare is now a highly technologised field, with diagnostic equipment such as fMRI scanners producing gigabytes of information each time they are used. Proponents of healthcare informatics claim that the computerisation of clinical assessment and treatment will lead to significant improvements in both the quality and efficiency of care. Databases will facilitate the sharing of patient records for clinical and administrative purposes and also provide access to medical researchers without the consent of patients.[23]

Plummeting costs mean that the sequencing of patient genomes is likely to become routine, allowing individuals to check their susceptibility to diseases with genetic factors and clinicians to prescribe medication and other interventions tailored to patients. The availability of genetic data to third parties such as employers and insurers is likely to become a major policy debate.

The ageing of the baby boomer demographic in North America and Europe is also likely to produce strong cost pressures for the out-patient treatment of chronic health conditions in older citizens. Ubiquitous computing and telemedicine researchers have been particularly active in this field as a result, developing a range of health sensors that would allow individuals to live at home with remote support from carers and clinicians. We are likely to see much more detailed information automatically gathered on the physiological indicators and broader lifestyles of the elderly and unwell.[24]

Conclusion

Technical constraints on the processing of personal data have almost disappeared.  Surveillance of specific individuals used to be extremely resource-intensive; blanket surveillance was impossible.  In the near future, it will be so easy to put everyone under surveillance that there is a danger this will become the default position. It will be easier and cheaper to track everyone, in almost every respect – finance, health, communication, travel, information gathering and other activities – than to be selective.

Societal concerns over terrorism, child pornography and international organised crime have driven the development of large-scale surveillance systems. Exploding budgets for health care, education and social welfare also worry governments. In some countries such as the UK, politicians seem to believe that the more personal data officials can get, and share, the better they can tackle social ills – whether teenage pregnancy, obesity or terrorism.

It would be unfortunate if meaningful privacy simply disappeared into a technological void. Its continued existence will only be technically and politically feasible with sustained effort from data protection authorities, constitutional courts and, most importantly, citizens concerned for dignity and autonomy in the information age. Technology can be designed in a way that provides efficient personalised services and proportionate protection for society from serious criminals. It is up to us to make sure governments develop and enforce the data protection principles that can make this a reality.

Dr Ian Brown is Senior Research Fellow at the Oxford Internet Institute (www.oii.ox.ac.uk) and author of the Blogzilla blog (http://dooooooom.blogspot.com/).



[1] Gordon Moore (1965) Cramming more components onto integrated circuits, Electronics Magazine 38(8)

[2] E. Grochowski & R. D. Halem (2003) Technological impact of magnetic hard disk drives on storage systems, IBM Systems Journal 42(2)

[3] Fred A. Kish et al. (2007) Ultra High Capacity WDM Photonic Integrated Circuits, Optical Fiber Communication and the National Fiber Optic Engineers Conference

[4] European Commission (2008) Early Challenges regarding the ‘Internet of Things’, Commission staff working document

[5] Surveillance Studies Network (2006) A Report on the Surveillance Society, Information Commissioner’s Office p.19

[6] M. Gill & A. Spriggs (2002) Assessing the impact of CCTV, Home Office Research, Development and Statistics Directorate, 43, pp.60–61.

[7] Article 29 Working Party (2007) Opinion 4/2007 on the concept of personal data

[8] R. Anderson, I. Brown, T. Dowty, W. Heath, P. Inglesant & A. Sasse (2009) Database State, Joseph Rowntree Reform Trust p.24

[9] International Civil Aviation Organisation (2006) Machine Readable Travel Documents – Machine Readable Passports – Specifications for Electronically Enabled Passports with Biometric Identification Capability, 6th Edition

[10] OJ L 218, 13.8.2008, pp.60–81

[11] Mike Krause (2002) Is face recognition just high-tech snake oil? Enter Stage Right, 14 January; H.U. Keval & M. A. Sasse (2008) Can we ID from CCTV? Image quality in digital CCTV and face identification performance, SPIE Mobile Multimedia/Image Processing, Security, and Applications

[12] Carl Shapiro & Hal Varian (1998) Information Rules: A Strategic Guide to the Network Economy, Harvard Business School Press

[13] John Cary Sims (2006) What NSA is Doing… and Why it’s Illegal, Hastings Constitutional Law Quarterly 33(2) pp.101–136

[14] OJ L 105, 13.4.2006, pp.54–63

[15] Liberty and Others v. the United Kingdom, no. 58243/00, 1 July 2008

[16] Republic of Portugal (2007) Public Security, Privacy and Technology in Europe: Moving Forward, Concept paper on the European strategy to transform Public security organizations in a Connected World. Available at http://www.statewatch.org/news/2008/jul/eu-futures-dec-sec-privacy-2007.pdf (last visited 3 Mar 2009)

[17] R. Agrawal & R. Srikant (2000) Privacy-Preserving Data Mining, Proceedings of the SIGMOD International Conference on Management of Data, Dallas, Texas, May 15–18 pp. 439–450.

[18] P. Fader, B. Hardie & K. Jerath (2007) Estimating CLV using aggregated data: The Tuscan Lifestyles case revisited, Journal of Interactive Marketing 21(3) pp.55–71

[19] Committee on Technical and Privacy Dimensions of Information for Terrorism Prevention and Other National Goals, National Research Council (2008) Protecting Individual Privacy in the Struggle Against Terrorists: A Framework for Program Assessment, The National Academies Press p.4

[20] Oscar Gandy (2002) Data Mining, Discrimination and the Decline of the Public Sphere. Dixons Public Lecture given at the London School of Economics & Political Science, 7 November.

[21] Paul Ohm (forthcoming 2010) Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization, 57 UCLA Law Review

[22] Alessandro Acquisti & Ralph Gross (2006) Imagined Communities: Awareness, Information Sharing, and Privacy on the Facebook, Lecture Notes in Computer Science 4258 pp.36–58; Lilian Edwards & Ian Brown (2009) Data control and social networking: irreconcilable ideas? In A. Matwyshyn (ed.) Harboring Data: Information Security, Law and the Corporation, Stanford University Press

[23] House of Commons Health Committee (2007) The Electronic Patient Record, HC 422-I, Sixth Report of Session §§37–39

[24] Ian Brown & A. Adams (2007) The ethical challenges of ubiquitous healthcare, International Review of Information Ethics, 8, pp.53–60