Artificial Intelligence and Big Data: Who decides what data counts?

9 Nov 2021

On 16 November, LMU statistician Frauke Kreuter will speak at the KI Lectures about how huge amounts of data are used for AI applications.

In 2020, around 64 trillion gigabytes of data were generated and processed worldwide. The information content: unimaginable. No wonder, then, that the business of Big Data has become a billion-dollar industry. Data is the basis for AI applications and the key to technologies that are revolutionizing everyday life, from GPS navigation to the automation of administrative and corporate processes.

In her KI Lecture, Frauke Kreuter outlines current developments in the use of AI and Big Data in business and social research. She explains the pitfalls that arise in applying them and how science can address questions of ethics and privacy without having to forgo reproducibility and the reuse of data.



Prof. Dr. Frauke Kreuter: "Who decides what counts? AI and Big Data: Applications in economic and social science research"

Tuesday, November 16, 2021, from 6:15 p.m. to 7:45 p.m.



Please note: The lecture will be held in German only. A recording with English subtitles will be available on YouTube shortly after the event.

Three questions for Prof. Dr. Frauke Kreuter



What areas of application do you see for artificial intelligence in your field?

Frauke Kreuter: Artificial intelligence is already being used in many administrative processes. The Federal Employment Agency, for example, uses it to check diplomas, resumes and other documents. In Austria, there has already been an initiative in which algorithms were to use historical data to indicate which job seekers have the best chances of being reintegrated into the labor market.

How do you assess this development: Should individual fates be placed in the hands of algorithms?

Frauke Kreuter: At this point, we should rather ask ourselves whether the status quo is actually better. People make mistakes and are influenced by the soft skills of their counterparts or by their own prejudices. There is interesting research on how decisions are made. Different factors play a role: whether the decision is made in the morning or in the evening, and how many decisions had to be made beforehand. All of this leads to different outcomes and sometimes to discrimination. The initial hope is that automating procedures can eliminate human error, which in turn will lead to more equality of opportunity.

But it must also be said: AI systems are not yet perfect in this respect. The historical data on which they are trained often contains outdated patterns that no longer fit a modern, changing world. For example, if an algorithm learns that certain positions in a company have been held by white, middle-aged men for decades, it will tend to keep assigning white men to these positions when selecting applicants. However, such problems can be solved. Algorithms can be trained to give more weight to more recent data, or a random component can be added in isolated cases.

AI applications feed on data. The more the better. Do we need to change our attitude toward data protection so that AI can develop further?

Frauke Kreuter: Incomplete data sets are a major threat to AI processes. The way the GDPR is currently designed, in most cases it is up to individuals to decide how they want to handle their data. However, this in itself can lead to imbalances in the data. Depending on their technical understanding, certain groups of people are more or less inclined to share their data. As a result, algorithms are optimized primarily for the people who contribute to the data stream.

In this context, we in Germany should rethink our basic attitude toward data. On the one hand, there needs to be an awareness in all places where data is produced that it must be reliable and of good quality. On the other hand, legislators need to recognize that data can be used and shared, so that a few technology companies no longer hold commercial monopolies. The reality is: data is a valuable resource, and we are at a point where entire administrative systems are built upon it. The question now is: how do you make them better?

Prof. Dr. Frauke Kreuter holds the Chair of Statistics and Data Science in the Social and Human Sciences and is co-director of the Data Science Centers at the University of Maryland and the University of Mannheim.
