This work investigates the potential of Federated Learning (FL) for official
statistics and assesses how closely FL models can match the performance of
centralized learning methods. At the same time, using FL can safeguard
the privacy of data holders, thus facilitating access to a broader range of
data and ultimately enhancing official statistics. By simulating three
different use cases, we gain important insights into the applicability of the
technology. The use cases are based on a medical insurance data set, a fine
dust pollution data set, and a mobile radio coverage data set, all of which are
from domains close to official statistics. We provide a detailed analysis of
the results, including a comparison of centralized and FL algorithm
performances for each simulation. In all three use cases, we were able to train
models via FL that reach performance very close to the centralized model
benchmarks. We summarize our key observations and their implications for
transferring the simulations into practice. We conclude that FL
has the potential to emerge as a pivotal technology in future use cases of
official statistics.