From raw telco data to final statistics: a modular process with synthetic network event data
We present an end-to-end process that turns raw telecommunication data into final statistical outputs, chiefly population counts. The process is illustrated with synthetic network event data produced by a simulator developed in the ESSnet on Big Data. This work is part of a strategy to develop the ESS Reference Methodological Framework for Mobile Network Data even while access to real data remains blocked. The process covers the main stages of estimation: geolocation, deduplication, aggregation, and inference to the target population. Each stage is implemented as a proof of concept in its own R package. The stages are linked to one another through the total probability theorem.
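As a minimal sketch of how the total probability theorem can link two consecutive modules, the toy example below combines a geolocation output with an aggregation input. All names and numbers are hypothetical and chosen for illustration only; the code does not reproduce the R packages or the simulator mentioned above.

```python
# Hypothetical toy setting: one device, three tiles, two regions.

# Geolocation output (assumed): P(tile | observed network events) for the device.
p_tile_given_events = [0.5, 0.3, 0.2]

# Aggregation input (assumed): P(region | tile). Here each tile lies in one region.
p_region_given_tile = [
    [1.0, 0.0],  # tile 0 belongs to region A
    [1.0, 0.0],  # tile 1 belongs to region A
    [0.0, 1.0],  # tile 2 belongs to region B
]


def chain(p_tile, p_region_tile):
    """Total probability over tiles:
    P(region | events) = sum_t P(region | tile = t) * P(tile = t | events).
    """
    n_regions = len(p_region_tile[0])
    return [
        sum(p_region_tile[t][r] * p_tile[t] for t in range(len(p_tile)))
        for r in range(n_regions)
    ]


# Probability that the device is in each region, given its network events.
p_region_given_events = chain(p_tile_given_events, p_region_given_tile)
print(p_region_given_events)
```

Because each module only consumes the conditional distribution produced by the previous one, a module can in principle be replaced without changing the others, which is the sense in which the process is modular.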