Statistics: Novel approach at LMU allows survey results to be transferred across contexts

20 Jan 2022

An LMU statistician, working with colleagues from the US, has developed a method for reliably adapting data from a given sample to a different context.

The gold standard for gleaning statistically valid conclusions from data is random sampling from the population. Collecting properly randomized data, however, is often a major challenge, so modern statistical methods aim to enable valid inferences when random sampling is not feasible.

In the world of politics in particular, it would be useful to know, for example, whether a measure that was successful in a certain metropolis would have similar results in another city. And in medicine, it would be a great advantage to know if a drug that proved itself in a study with selected participants would work just as well for the general public.

The question, therefore, is: How can scientific research findings be transferred from one context to another? With the statistical methods commonly used until now, this could not be done efficiently. Now a team led by Frauke Kreuter, Professor of Statistics and Data Science in the Social Sciences and the Humanities at LMU, in cooperation with computer scientists from Stanford and Berkeley (USA), has presented a solution to this problem.

Target-independent approach for correct inferences


The new research approach offers a major advantage: Inferences based on data from a specific source population can be reliably used for a different target population. “The new method allows the universal adaptation and correction of results from one context to another. To do this, we use a machine learning algorithm that can recognize systematic errors in an uncorrected source model,” explains Frauke Kreuter.

In this way, the method makes the model robust to changes in the underlying data. If the results are systematically distorted for certain groups, the scientists adjust the source model and correct it so that it works well for all possible subgroups.
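
The idea of correcting a model until it works for every subgroup can be illustrated with a minimal sketch. This is not the algorithm from the paper: it is a simplified, hypothetical illustration in which a fixed set of non-empty subgroups is known in advance and the correction is a plain shift by the average error. The function names, the toy data, and the tolerance are all assumptions made for the example.

```python
from statistics import mean


def correct_for_subgroups(predictions, outcomes, groups, tol=0.01, max_rounds=1000):
    """Shift predictions until the mean error in every subgroup is at most tol.

    groups maps a (hypothetical) subgroup name to the list of row indices in it;
    subgroups may overlap. Returns a new list and leaves the input untouched.
    If max_rounds is reached first, the best result so far is returned.
    """
    preds = list(predictions)
    for _ in range(max_rounds):
        # Find the subgroup whose predictions are most biased on average.
        worst_name, worst_bias = None, 0.0
        for name, idx in groups.items():
            bias = mean(outcomes[i] - preds[i] for i in idx)
            if abs(bias) > abs(worst_bias):
                worst_name, worst_bias = name, bias
        if worst_name is None or abs(worst_bias) <= tol:
            break  # every subgroup is within tolerance
        # Correct that subgroup by its average error, then re-check all groups.
        for i in groups[worst_name]:
            preds[i] += worst_bias
    return preds


def group_bias(predictions, outcomes, idx):
    """Average difference between outcomes and predictions within one subgroup."""
    return mean(outcomes[i] - predictions[i] for i in idx)


# Toy data, invented for illustration only.
outcomes = [1, 1, 0, 0]
source_predictions = [0.5, 0.5, 0.5, 0.5]
groups = {"group_a": [0, 1], "group_b": [1, 2, 3]}

corrected = correct_for_subgroups(source_predictions, outcomes, groups)
for name, idx in groups.items():
    before = group_bias(source_predictions, outcomes, idx)
    after = group_bias(corrected, outcomes, idx)
    print(name, round(before, 3), round(after, 3))
```

The intuition behind this sketch is that once the average error is small in every subgroup, an estimate for a target population with a different mix of those subgroups should inherit less of the source model's bias. A real application would learn the correction on labeled source data and then apply it to unlabeled target data, which this in-sample toy example does not do.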

“The method is an exciting example of how new methodology from computer science and statistics can complement and improve commonly used approaches in survey research for using imperfect data. It is designed to anticipate changes in the composition of data and therefore facilitates a much more robust use of prediction models,” adds Christoph Kern, postdoctoral researcher and co-project leader at the University of Mannheim, who played a leading role in developing the new approach.

For the paper, see: Universal Adaptability: Target-Independent Inference that Competes with Propensity Scoring. Michael P. Kim, Christoph Kern, Shafi Goldwasser, Frauke Kreuter, Omer Reingold. In: PNAS 2022