Language technologies for digital inclusion

“Draw a balloon with three colors!” That is easily said and should present little difficulty even for young children. But getting a computer to do it is a challenge that requires a lot of work. “The computer has to understand what’s meant; it has to establish the relationships between the words before it can generate an image.”

Teaching computers to understand natural language is one of the research interests of Professor Barbara Plank. The computer scientist, who came to LMU last year from the IT University of Copenhagen, conducts research in natural language processing (NLP) at the Center for Information and Language Processing.

She has worked, for example, on improving algorithms for text search in job ads. The goal was to make the algorithms more robust, so that very specific job criteria or requirements in job notices would surface more quickly and precisely, and jobseekers would receive closely matched ads. For this research, she received data from a variety of sources, including the Danish employment agency.

She emphasizes: “There is a vast number of possible applications out there for NLP.” This includes many potential uses in cultural and social contexts, which brings us to another important area of research for Plank: so-called minor languages in modern language technologies.

Barbara Plank in the Senate corridor at LMU. She is smiling and wearing glasses and a green jacket.

Professor Barbara Plank

Focus on small languages and dialects

“No more than one percent of the 7,000 languages worldwide are covered by NLP. Integrating them is tremendously important for reasons of digital inclusion alone,” says the South Tyrolean, who herself comes from a distinctly multilingual region. People from different cultures should be able to benefit from the technological progress in language processing – for example, through the design and implementation of corresponding language assistance systems. She sees huge potential in relation to language preservation in particular. Moreover, the scientific work at the institute will incorporate non-standard languages and dialects.

Her recently launched “Natural Language Understanding for non-standard languages and dialects” project is supported by a Consolidator Grant from the European Research Council (ERC).

“This is very important to me also on a personal level, as I’m a dialect speaker myself,” says Plank. The goal is to train algorithms to handle the colossal body of data. As written texts are of little help in the case of dialects and are in any case not available in sufficient quantities, Barbara Plank wants to tap into the enormous expertise that is available at LMU.

Interaction between humans and machines

The abundant opportunities for networking at LMU, especially in the area of artificial intelligence and language research, were – alongside the proximity to her home region – the main reason behind her decision to come to Munich. “For me, it’s not about going home,” she says. “But moving a little closer is nice all the same.”

Barbara Plank studied computer science in an international master’s program jointly implemented by the Free University of Bozen-Bolzano and the University of Amsterdam. Having obtained her master’s degree, she completed a doctorate at the University of Groningen in the Netherlands. Before coming to Munich, she was a professor in Copenhagen and the Netherlands.

The practical relevance of her work is one thing, but she also wants to put humans at the heart of what she does. “I see the interaction between humans and machines as a big challenge. I think that people can solve problems much better in conjunction with machines and that machines also learn alongside humans.”

There are numerous possible approaches here, from the investigation of cognitive aspects of reading behavior through eye tracking to applications that analyze opinions on climate change or in the health sector.

However, the data on which systems are trained is crucial. In human-machine interaction in particular, researchers need not only raw data but also annotated data. “In response to the question, ‘Would you like to have dinner with me tonight?’ machine systems can only answer ‘yes’ or ‘no.’ We need to teach the computer that an answer like ‘I’m tired’ can also mean ‘no.’ And computers must also be able to deal with ambiguities when it comes to things like arguments and differences of opinion. Currently, computers work with just one opinion or answer.” Generating the data is the great human task upon which all projects in automatic language processing are founded, including the latest technologies such as ChatGPT. After all, without people, without researchers like Barbara Plank, these projects cannot succeed.
