I spoke to Ansgar Koene, the Global AI Ethics and Regulatory Leader at EY, who has spent most of his life investigating what is now known as Explainable Artificial Intelligence (AI). We talked about incoming regulation across the world, generational differences in terms of privacy concerns, reward hacking AI systems, and the difference between instrumental and core principles.
Elliot Leavy: First of all, I wanted to get your opinion on what is driving regulation across the world today. Is it ethics or is it simply business? The cynic in me thinks that it might mostly be because a lot of regions want to play catch-up with big tech.
Ansgar Koene: Both play into this. And there is currently a political element which argues that the big tech companies have been allowed to operate in a minimally regulated space for quite a long time. These companies have in many ways been pursuing self-regulation – always insisting that they shouldn't be regulated because they needed space to innovate – but frankly, the regulators don't feel that this has worked. So people are understandably not too happy with developments such as the use of dark patterns to manipulate people and the ongoing failures around content moderation, which has only driven the demand for regulation further.
That's not to say that at least some of the tech companies haven't seriously been trying to deal with this. But these are very difficult challenges, and content moderation is hugely difficult. This is the first time we've seen a global discourse without a pipeline in which, for example, a journalist who intends to write something for the public first funnels it through a process involving an editor. Instead it's anyone writing anything anywhere. That is a new and huge challenge, and a difficult one at that.
Elliot Leavy: Can regulation work?
Ansgar Koene: The Digital Services Act is going to be more or less a test case to see how well outside regulation works. For instance, the European Commission has identified that one of the big problems with the implementation of GDPR has been inconsistency in how regulators in different countries have enforced it. So now, for the Digital Services Act, the Commission will actually be doing the regulatory activities itself. If it can show it can do this properly, that will likely have significant implications for the dynamics of Member State versus Commission-driven implementation of regulations. But if it fails, it will equally have a huge impact on the Commission's reputation as a potential lead in this space.
This is an area that has also been identified outside of the Commission. The ICO in the UK, for example, has recognised that it needs actual experts to understand this new space – meaning we are seeing a lot of governments reaching out to academia on questions such as Explainable AI.
But there is a huge skill shortage. So while the regulators now see that the public expects them to play an active role in making the online environment a positive one for people to engage in, they don't have the skilled people to really do the task at hand.
This is where another huge problem comes in: a dearth of data talent. We are seeing whole academic departments being bought up by big tech companies, meaning that it is increasingly difficult to find the right skills needed to tackle the problems arising in data protection and regulation.
Elliot Leavy: It's an age-old tale: technology outpacing regulation. But when we talk about Explainable AI, could you give an overview of what that is, and where it sprang from?
Ansgar Koene: When we're talking about an AI that is different from traditional data analytics, what we mean is a system where the underlying 'If This Then That' model is not defined by the human creator of the system; rather, the human puts in place a system that has the capacity to find patterns in the data and establish these kinds of input/output transformations itself. But because those patterns are learned from the statistics in the data, the creator of the system doesn't really have a clear insight into how a particular pattern was recognised.
We've seen challenges with this. For example, if you're just feeding in certain input and output data sets – let's use supervised learning, because it's the most easily understandable kind of learning – the algorithm is going to look for particular patterns: how can I reproduce this output from an input? What might be the multidimensional signals that could lead me to identify this as the right kind of output?
And the challenge is that we are using these tools precisely because the input is complicated and difficult to explain – otherwise we would have just written a traditional kind of program – and so, with such a rich pattern of input, it's difficult to know whether the program is really picking up on the things that we would have thought are the important signals. Perhaps instead it is picking up on something else entirely.
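A rough sketch of what that looks like in practice – the data and model below are invented purely for illustration – is a supervised learner being handed input/output pairs and left to find whatever statistical pattern reproduces the outputs:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic inputs X and labels y stand in for "the data"; the learner is only
# asked to reproduce y from X, nothing more.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0)
model.fit(X_train, y_train)   # the model, not the programmer, chooses the patterns

print("test accuracy:", model.score(X_test, y_test))
# Nothing in this score tells us *which* patterns the model latched onto.
```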
Elliot Leavy: What do you mean by that?
Ansgar Koene: Let's take image processing. Is it the foreground object that the AI is focussing on, or is it something else? We might think that the AI is focussing on the foreground because that is what we were taking a picture of, but the computer system just analyzes a bunch of pixels – it doesn't know that this is the foreground and this is the background. So how do we know that it's actually making decisions based on the object of interest rather than something else? The classic example is an AI asked to distinguish dogs from wolves in a dataset, which was actually identifying the images with snow in them – images that incidentally also contained the wolves.
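The wolf-and-snow failure is easy to reproduce in miniature. In the sketch below – the feature names and numbers are made up for illustration – a "snow" background feature happens to correlate with the label, and a standard explainability tool such as permutation importance shows the model leaning almost entirely on it:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 2000
is_wolf = rng.integers(0, 2, n)                             # label: wolf (1) or dog (0)
animal = rng.normal(size=(n, 5)) + 0.3 * is_wolf[:, None]   # weak, noisy signal about the animal
snow = is_wolf + rng.normal(scale=0.1, size=n)              # background feature, strongly correlated
X = np.column_stack([animal, snow])

model = RandomForestClassifier(random_state=0).fit(X, is_wolf)
result = permutation_importance(model, X, is_wolf, n_repeats=10, random_state=0)

names = [f"animal_{i}" for i in range(5)] + ["snow"]
for name, score in zip(names, result.importances_mean):
    print(f"{name}: {score:.3f}")
# The "snow" column dominates: the classifier is detecting snow, not wolves.
```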
This is why we need explainability: we need to know whether the system is actually doing the task that we set it, and make sure that it is doing what we wanted it to be doing – or whether it is actually doing something else, which is also referred to as reward hacking. In training, the system gets a reward when the outcome is correct, but sometimes a system can figure out a way of getting a correct outcome without actually doing the task that we wanted it to do; it does something else that is an efficient way of maximizing reward.
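As a toy illustration of reward hacking – the cleaning scenario and reward below are invented for the example – an agent rewarded for a proxy measure (dirt collected per step) can maximize the proxy indefinitely without ever finishing the intended task:

```python
def proxy_reward(dirt_collected):
    # What the designer measures: dirt picked up this step.
    return dirt_collected

def honest_cleaner(room_dirt):
    # Cleans the room once; reward stops when the dirt is gone.
    total = 0
    while room_dirt > 0:
        room_dirt -= 1
        total += proxy_reward(1)
    return total

def hacking_cleaner(room_dirt, steps=100):
    # Picks up dirt, then spills it again, so there is always more to "clean".
    total = 0
    for _ in range(steps):
        total += proxy_reward(1)   # collects the reward...
        room_dirt += 1             # ...and re-creates the mess
    return total

print("honest:", honest_cleaner(room_dirt=10))    # bounded by the real task
print("hacker:", hacking_cleaner(room_dirt=10))   # grows with time; the task is never done
```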
So explainability helps us to understand whether the thing is actually doing what we want it to do, which also helps us understand whether it is going to fail, perhaps in unexpected ways. This is one of the challenges with these types of systems: because we didn't create them in a way where we clearly understand the rule they are running, we also don't understand when the output isn't going to be what we wanted.
These systems can do very well on tasks that we would consider to be difficult – because the system is not actually looking at the task in the way we thought – and they can fail on things that we thought were easy.
Explainable AI is a way of mitigating this: it helps us to understand why a system is doing what it is doing and lets us identify potential weaknesses, so that we can ensure safeguards are in place.
Elliot Leavy: It all comes down to being able to trust these systems doesn’t it?
Ansgar Koene: Exactly. These systems are often built for use as a service for other people, and those people also need to understand how the systems work. Trust is key to every interaction we have. But trustworthiness has two components. One is the ability to deliver – I've said I will do this, and so you will get that. The other is value alignment – when I said I'm going to do this, was I lying to you? And this is why regulators are stepping in: there has been a huge breakdown in trust online, and people want to know what these services are getting in return for their data, because it is often unclear.
Explainability, then, is a key part of establishing the trust relationship with customers, be they businesses, governments or anyone else.
Elliot Leavy: One thing that has always stuck out in my mind is the generational difference in privacy concerns. Gen Z have privacy very, very low down their list of concerns for the future. Is this because of circumstance? Or a reflection of the wider economic climate, or even just their upbringing in a data-driven landscape? Will this even be an issue in 20 years' time?
Ansgar Koene: Throughout my work, and especially with regard to data privacy, there has always been this paradox between people claiming that they have huge concerns around privacy and how they actually behave and share their data. So perhaps one element is just reconciling those two things.
But a useful perspective – which comes from Cansu Canca, the founder and head of the AI Ethics Lab – is that when we are talking about AI principles there are two different types. First there are what we can call core principles, and then there are instrumental principles.
Instrumental principles are basically things that we're interested in not for their own sake, but because they help us achieve the core principles. And privacy is actually not a core principle but an instrumental one. We are interested in privacy not so much for its own sake but in order to make sure that we maintain our own agency and dignity.
It is agency and dignity that are the core principles, and privacy is a means to that end. So one possibility is that younger people see other ways of achieving agency, or that not giving up their privacy would itself cost them agency: try living your life with all of the privacy settings on your phone turned on – the chances are this will cut you off from most apps on your phone, and suddenly you won't even be able to use your GPS to navigate the world.
So there's trade-offs are constantly something that we are always engaging with, either with technology or be it with other people also of course, we've always been doing this. But even if people say they are they're more willing to do some data sharing, that doesn't mean that if they get a sense that you're collecting data for purposes that aren't actually going to lead to doing the service that you're going to do for them that that isn't going to degrade the trust relationship and their willingness to engage with you.
Of course, another element in this digital space, and one of the drivers for regulators to step in, has been the 'winner take all' dynamic. Once you become a leading platform in a certain space you attract more people, and a lot of these services work because there are people there already – you wouldn't use a social media platform with no one else on it. So you've got a reinforcing loop which leads to a 'winner take all' kind of dynamic. And that dynamic means traditional marketplace competition no longer works as a way of ensuring that people get the right kind of services: even if I realise that this online marketplace is collecting my data and doing things I don't actually agree with, I am unable to switch to a different marketplace, because that marketplace cannot provide any of the services I actually want.
So you are pushed into giving up more and more of the things that you actually didn't want to give up in return for a service you might not ethically agree with, and understandably this increases the mental friction of engaging with it – yet you still have to use it because there is no other option. This is why regulators are now saying that this is not to the benefit of our citizens, and increasingly businesses are having to prove that the data they are collecting is actually being used for the reasons they stipulate – and explainable AI is ultimately the only way to do that.