If this was applied in the private sector it would probably fail horribly. But in the public sector, what are its strengths and limitations?
As I’m going to spend a lot of time looking at difficulties, it’s only fair to start with positives:
- It’s simple – it doesn’t attempt to be unnecessarily complex
- It introduces the idea that people can look at impacts, rather than just code
The second of these looks like an interesting way forward. I might suggest an alternative, which would be an ability to evaluate the overall impact of full result sets – thus enabling you to investigate biases in the round, rather than focusing on individual results, some of which will always be wrong!
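To make "biases in the round" concrete, here is a minimal sketch of what evaluating a full result set might look like: comparing error rates across groups rather than inspecting any single decision. All the data and group labels are invented for illustration; a real audit would need real outcomes and a defensible grouping.

```python
# Sketch: evaluating a full result set for group-level bias, rather than
# inspecting individual decisions. All records below are invented.
from collections import defaultdict

# Each record: (group, predicted_positive, actually_positive)
results = [
    ("A", True, False), ("A", True, True), ("A", False, False),
    ("B", True, False), ("B", True, False), ("B", False, True),
]

# False positive rate per group: wrongly flagged / all who should not be flagged
flagged_wrongly = defaultdict(int)
negatives = defaultdict(int)
for group, predicted, actual in results:
    if not actual:
        negatives[group] += 1
        if predicted:
            flagged_wrongly[group] += 1

for group in sorted(negatives):
    rate = flagged_wrongly[group] / negatives[group]
    print(f"group {group}: false positive rate {rate:.0%}")
```

The point is that the disparity between the groups is only visible in aggregate; any one of the individual results could be defended on its own.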
Definition of an algorithm and data
This is probably the first challenge. When is something an algorithm? When does it use data? If I have a set of actions that I always take in a certain order (for example, I open the city’s parks in a specific order) is that an algorithm? Even that simple example impacts people, as park A is open 5 minutes before park B…
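The park example is worth lingering on, because it really is an algorithm in the formal sense: a fixed, ordered sequence of steps with observable impacts. A trivial sketch (park names and the five-minute gap are invented):

```python
# Sketch: a fixed manual routine is still an algorithm – an ordered sequence
# of steps whose order itself has an impact. Parks and timings invented.
parks = ["Park A", "Park B", "Park C"]  # a deliberate, fixed order

minutes_after_start = 0
schedule = {}
for park in parks:
    schedule[park] = minutes_after_start
    minutes_after_start += 5  # each opening takes 5 minutes

# Park A opens 5 minutes before Park B – the ordering alone affects people
print(schedule)
```

Nothing here needs a computer, which is exactly the definitional problem.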
And what is data? Does it have to be in a computer? How about if it’s written down?
Generally I’m in favour of opening decisions to scrutiny, but I firmly believe it should be all decisions, not just computer decisions!
What is source code?
A naive reading of this would suggest that source code could be as simple as pointing at the R code behind an approach. Or it could mean publishing the actual model.
The first is easy to do, the second isn’t. Trained models aren’t necessarily designed to be published in an interpretable way (a bit like the difference between compiled and uncompiled code) – so should we limit approaches to ones where a ‘raw’ or interpretable model could be generated? Even if we could, it might not mean much without the software used to run it. In essence it might be worse than useless.
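The compiled/uncompiled analogy can be made concrete with a toy example. Below, an ordinary least-squares fit is "published" as just its fitted parameters – the equivalent of shipping the compiled artefact without the source. The data is invented; the point is that the two bare numbers say almost nothing without the fitting procedure and the software around them.

```python
# Sketch: a 'published' trained model can be opaque even when tiny.
# Invented data; the fit is a plain least-squares line y = a*x + b.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 4.0, 6.2, 7.9]

# The 'source': the fitting procedure itself
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - a * mean_x

# The 'model' as it might be published: just the fitted parameters
model = {"a": round(a, 3), "b": round(b, 3)}
print(model)  # two bare numbers – near-meaningless without the pipeline
```

With a two-parameter model you can at least read the numbers; with millions of weights even that courtesy disappears.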
Another challenge: where does a model begin? A lot of time is spent preparing data for a model – commonly cited as up to 80% of the effort. If you just show the model, without describing all of the steps needed to transform the data, then you are going to severely mislead.
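A tiny sketch of why the preparation steps matter as much as the model. The "decision rule", income figure and scaling cap below are all invented; the point is that the same published rule gives opposite answers depending on an undocumented preprocessing step.

```python
# Sketch: the same 'model' gives very different answers depending on the
# data-preparation steps that precede it. All numbers invented.

def decision_rule(score: float) -> str:
    """A published decision rule: flag anything above 0.5."""
    return "flag" if score > 0.5 else "ok"

raw_income = 42_000.0  # raw input, in dollars

# The undocumented preparation step: scale income to [0, 1] against a cap
INCOME_CAP = 100_000.0
prepared = min(raw_income / INCOME_CAP, 1.0)

print(decision_rule(raw_income))  # raw data: 'flag' (42000 > 0.5)
print(decision_rule(prepared))    # prepared data: 'ok' (0.42 <= 0.5)
```

Publish only `decision_rule` and anyone inspecting it will draw the wrong conclusion about what happens to real inputs.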
But what about allowing users to submit data and see the impact? This is an interesting idea, but it has some awkward consequences.
What would you do if someone actively used this to game the system? Because they will. Yes, you could monitor use, but then you end up in another area of potential moral difficulty. And it’s one thing if someone is using it to understand how to maximise the benefits they receive (actually I kind of approve of this), but what if they are using it to understand how to avoid predictive policing? And can the police use this data to change their predictive policing approach?
Another interesting problem is that often a single result doesn’t tell you much. Yes, I can see that my zip code has a predictive policing score of 10. But unless I know the predictive scores of all other zip codes, plus a huge range of other things, that doesn’t tell me much.
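A minimal sketch of the point: the same score of 10 only acquires meaning once you can place it within the full distribution. The zip codes and scores below are invented for illustration.

```python
# Sketch: a single score means little without the distribution of all scores.
# Zip codes and scores invented for illustration.
scores = {"10001": 10, "10002": 3, "10003": 27, "10004": 10, "10005": 41}

my_zip = "10001"
my_score = scores[my_zip]

# How many areas score at or below mine?
at_or_below = sum(1 for s in scores.values() if s <= my_score)
percentile = 100 * at_or_below / len(scores)

print(f"{my_zip}: score {my_score}, at or below {percentile:.0f}% of areas")
```

A score of 10 could be alarming or negligible; only the comparison set tells you which.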
And how do you stop people from using it to spy on their neighbours? Entering in other people’s data to find out things about them?
Finally, some thoughts about unintended consequences. Will this discourage effectiveness and efficiency? After all, if I use a dumb human to make a bad decision, then I won’t be held accountable in the same way as if I use a smart algorithm to make a good decision. And this is important because there will always be mistakes.
Will there be any attempt to compare the effectiveness against other approaches, or will there be an assumption that you have to beat perfect scores in order to avoid legal consequences?
Will vendors still be willing (or even able) to sell to NYC on this basis?
I think this is an interesting approach, and certainly I’m not as negative about it as I was originally (sorry, shouldn’t tweet straight off a plane!). Thanks to @s010n and @ellgood for pointing me in this direction…