I’ve been lucky enough to get to know the work of Jeni Tennison and The Open Data Institute a little during the past year or so. They are thinking very deeply about how to build trusted institutions and systems around data.
I believe this is vital work. A world where citizen-consumers understand the value and risks of data may get us to a higher energy level where we can use data to tackle causes of disease, unhappiness & inequality while maintaining a healthy balance of power between the individual, family & community and business. A world where we (and policymakers) don’t understand the range of ecosystems we can build around data (and the rules that govern them) leaves us lurching between ill-informed, knee-jerk responses to real or imagined data abuses. Equally, a more nuanced understanding of the economics of data will allow better investments by firms and, in economists' terms, result in welfare gains.
I had hoped Jeni would cover for me during my summer vacation but diaries didn’t work out. But I have learnt so much from the ODI that I thought the opportunity to hear from Jeni while I am actually in London and able to write Exponential View was too good to pass up :)
Enjoy her Exponential View,
I’m Jeni (@JeniT), CEO of the Open Data Institute (ODI), which is an independent, international non-profit founded in 2012 by Sir Nigel Shadbolt and Sir Tim Berners-Lee. We work with companies and governments to build an open, trustworthy data ecosystem. This includes research and development to create practical tools and training for organisations who work with data; running sector programmes such as OpenActive that coordinate organisations to tackle social and economic problems with data; and bringing together people in peer networks to learn from each other.
Like many people who work at this intersection between data and society, I have a mixed background in fields such as psychology, AI (back in the days when it was called knowledge engineering), software development (although I wouldn’t call myself a computer scientist), international standards and board game design. I got introduced to the public policy implications of data through working on legislation.gov.uk and data.gov.uk. Now my work, and that of ODI, really focuses on how people, organisations and communities can use data to make better decisions and be protected from any harmful impacts.
How do we enable people and organisations to access data while preserving trust, particularly but not only with personal data? How do the rights of the individual interact with the rights of organisations and societies? This is my focus for this week’s Exponential View.
Dept of the near future
🕳️ Training data for machine learning means AI can be sexist and racist — for example in AI-driven dermatology, maternal healthcare or falsely matching black members of Congress to criminal mugshots. “Biases in data can reflect deep and hidden imbalances in institutional infrastructures and social power relations.” Or sometimes deliberate fixing. Dataset creators can document how they sample data, but algorithm designers also have a responsibility to try to find ways to adjust. Want more? Read Made By Humans by (ODI alumnus) Ellen Broad.
📃 What if a society wants to embed bias in algorithms? Reportedly, Google plans to launch a censored search engine in China that will blacklist search terms about human rights, democracy, religion, and peaceful protest. Google employees object. Would a code of ethics for data scientists help formalise the responsibility of those who create algorithms to think and act ethically?
🔮 In the UK, the Treasury has issued a discussion paper on the economic value of data, leading some to be concerned about a retreat from the UK’s commitment to open public sector data when trading funds are making money off it. The relevant organisation here is Ordnance Survey, the UK’s mapping agency. There were baby steps to more open geospatial data earlier in the year and a current consultation on making the UK a world leader in geospatial. Want more on the economics of data? Bill Gates recently reviewed Capitalism without Capital by Jonathan Haskel and Stian Westlake, which touches on data as an example of an intangible good. (Jonathan was one of the speakers at a recent workshop exploring how both organisations and society should value data. We ran it at ODI with Diane Coyle from the Bennett Institute, who also curated a previous EV dedicated to productivity in the digital age.)
Dept of data ownership
Seems like no one is happy with big tech having (sole) access to lots of personal data. Data about us can and should be used for our (individual or collective) good, but what institutions and regulations do we need to make that happen?
Health data gives good examples of the challenges and experimental approaches. The genomic data of people who gave DNA testing company 23andMe permission to pass it on for research will be used by GlaxoSmithKline to design new drugs. Savvy is a co-operative where members share in the profits created from sharing data about their health. Some think it would help to give users easy access to data about them and let them port it to competing services. EFF thinks data portability and interoperability are anti-monopoly medicine. Others are concerned about the privacy impacts.
Should data about me be property that I can sell access to (managed by blockchains!)? Our only source of income when AI steals our jobs? A recent analysis by Charles Jones & Christopher Tonetti puts some economic theory and numbers behind the observable maximisation of data use by firms when they own it. It suggests individual data ownership will create a perfect market that balances economic benefit with privacy concerns. At ODI, we’re sceptical. Data is seldom just about us. Society should benefit from it too. And it’s close to impossible to understand the implications of the controls over data we’re already given; having data ownership won’t change that.
Mariana Mazzucato argues for a national(ised?) data repository that sells personal data to tech companies and uses the profits to fund the digital economy. Aral Balkan objects to state as well as capitalist surveillance, and calls for a combination of regulation and support for ethical alternative services. Evgeny Morozov thinks a middle path is city-level personal data stores. The UK’s AI review last year proposed data trusts. We’re trying to work out what that means.
These and more approaches have been discussed in Helsinki at the MyData conference this week.
Short morsels to appear smart at dinner parties
You’re a nerd, right? Recite the story of ‘the great calculator race’. See what fits (or doesn’t) in women’s jeans pockets, discuss your favourite statistics, and explain how Amazon’s pricing algorithms make high street prices change more rapidly too.
📈 Obligatory Bitcoin/blockchain facts: The Bitcoin network processes $1.3 trillion of transactions, more than PayPal. (But PayPal handled 7 billion transactions in 2017, compared to Bitcoin’s 104 million.) And Bitcoin now consumes about 1% of global energy generation, about the same amount of energy as Austria.
🕹️ Want to play a game? Experience both the frustrations of referenda and occasional reminders of humanity’s essentially cooperative nature by playing EmojiTetra or play through the post-summer Brexit options.
💤 And finally: a guaranteed way to fall asleep in 120 seconds.
Azeem's end note
I am spending a bit of time over the coming months thinking about data in the context of the new information age. I'm particularly curious about the full spectrum of issues, from what personal rights around data usage should be to how those rights should be expressed and protected, especially in the context of derived attributes or characteristics that emerge from aggregated or population-level data. I'm curious about collective data institutions like data exchanges, data trusts and data commons. I wonder about how business strategies might evolve beyond data network effects. I'm intrigued by how data might be used in political & deliberative processes. And I want to understand better how we describe and unleash the economic value of data.
Happy to have discussions and, naturally, if you are working in a startup or token network looking at this problem, delighted to talk to you.
Have a fantastic September!
P.S. I'll see you next week.