14:30 - 15:30
Contributed Paper Session
Room: JENK
Chair:
Per Nymand-Andersen, European Central Bank, Germany
Discussant:
Elise Coudin, INSEE, France
Organiser:
Jacopo GRAZZINI, European Commission - Eurostat
Transparency and reproducibility of models and algorithms: examples from the UN Global Platform
Joni Karanka, Joe Peskett
Office for National Statistics, UK, Newport
Public administrations, organisations and citizens depend on evidence from the statistical community to make informed decisions. The impact of the evidence and analytics that the community provides depends heavily on the public trust that we are able to generate, and competes directly with other sources of information. Currently we face three immediate threats to that trust: a) a media environment in which unchecked facts spread widely on social media (‘fake news’), b) an evolution in data science and technology towards large, unstructured datasets and less transparent algorithms and models, c) the ‘closedness’ of many statistical datasets due to personal or commercial concerns. Shortcomings in the transparency of algorithms and data led to the ‘replicability crisis’ in the social sciences (see the Reinhart & Rogoff error in economics), to which we argue the official statistics community is not immune. Two of the main drivers of the UN Global Platform are the provision of trusted methods and trusted data. The provision of trusted methods, algorithms and models depends on a number of factors, such as: a) openness of the code base, b) openness of the data (and training data), c) description of the logic of the algorithm, d) reproducibility of the algorithm by other researchers. We argue that a description of the methodology and availability of the code are the minimum to be expected from national statistical organisations, but that efforts should be made to provide fully reproducible algorithms.
To further enhance trust, we will provide examples of how UN Global Platform algorithms have been made more transparent by: a) provision of a public endpoint and execution for the algorithm, so that researchers can apply it to data they are familiar with, b) synthetic datasets to explore the workings of the algorithm / model, and c) notebooks with demonstrators of the algorithm in publicly available environments so that citizen users can understand the workings and implications of the algorithm.


Reference:
CPS10-001
Session:
Open, transparent and reproducible
Presenter/s:
Joe Peskett
Presentation type:
Oral presentation
Room:
JENK
Chair:
Per Nymand-Andersen, European Central Bank, Germany
Date:
Thursday, 14 March
Time:
14:30 - 15:30