Juri Opitz


Researcher, Ph.D.

Email
My GitHub Page

Blog


Hi there! I’m a researcher interested in machine learning, NLP (Natural Language Processing), and computational linguistics. I obtained my Ph.D. from Heidelberg University, where I was advised by Anette Frank.


Overview of some work and interests 🔍

Meaning representations, Explainability, and Decomposability 🧐

I like to study representations and their ability to meaningfully capture data (e.g., text, images, etc.). Representations of interest range from explicit formal semantic structures, to artificial neuron weights and activations (e.g., from large language models), to human mental representations. A goal is to improve their representational power or efficiency, and possibly to find structural interlinks between these diverse representations.

Example work: How can we capture who does what to whom? A meaning representation (MR) tries to express the answer to this question in a structured and explicit format, such as a graph. Neural networks provide us with useful embeddings and generations, but tend to blend and intermingle everything in ways that are hard to understand. MRs, on the other hand, describe meaning with a crisp representation that is explicit and decomposable. In this paper we refine neural sentence embeddings with meaning representations to decompose them into different interpretable aspects. This keeps all the efficiency and power of the neural sentence embeddings while gaining some of that cool explainability of the crisp meaning representations! Check out this repository for the code.
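To give a flavor of the idea, here is a minimal sketch (not the paper's actual method; the aspect names and dimension layout are invented for illustration): reserve sub-spaces of a sentence embedding for different meaning aspects, then compare two sentences both overall and aspect by aspect.

```python
import math

def cosine(u, v):
    # plain cosine similarity between two equal-length vectors
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical layout: which embedding dimensions encode which meaning aspect.
ASPECTS = {"agents": slice(0, 4), "negation": slice(4, 8), "residual": slice(8, 12)}

def aspect_similarities(emb_a, emb_b):
    # One overall score plus one interpretable score per aspect sub-embedding.
    scores = {"overall": cosine(emb_a, emb_b)}
    for name, sl in ASPECTS.items():
        scores[name] = cosine(emb_a[sl], emb_b[sl])
    return scores
```

The point of such a decomposition is that two sentences can then be similar overall while a single aspect score (say, negation) flags where they diverge.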

System evaluation 😵‍💫

Even the simplest of all evaluation tasks (classification evaluation) is not so easy: there are even confusing “double terms” like Macro F1 and macro F1 (no typo!). For an intuitive analytical overview and comparison of more classification metrics, check out this paper. Evaluation issues typically get compounded for tasks where we don’t predict class labels but generate artificial text or other structured predictions, such as semantic graphs. Here’s some work on generation evaluation and semantic parsing evaluation, introducing standardization and discussing other issues.
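To illustrate the “double term” (a sketch of one common reading of the ambiguity, not code from the paper): “macro F1” is often computed as the arithmetic mean of per-class F1 scores, but sometimes as the harmonic mean of macro-averaged precision and recall, and the two can disagree on the very same predictions.

```python
def per_class_prf(gold, pred, label):
    # precision, recall, F1 for one class in a multi-class setting
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1

def macro_f1_mean_of_f1(gold, pred):
    # variant 1: arithmetic mean of per-class F1 scores
    labels = sorted(set(gold) | set(pred))
    return sum(per_class_prf(gold, pred, l)[2] for l in labels) / len(labels)

def macro_f1_of_means(gold, pred):
    # variant 2: harmonic mean of macro precision and macro recall
    labels = sorted(set(gold) | set(pred))
    precs = [per_class_prf(gold, pred, l)[0] for l in labels]
    recs = [per_class_prf(gold, pred, l)[1] for l in labels]
    mp, mr = sum(precs) / len(precs), sum(recs) / len(recs)
    return 2 * mp * mr / (mp + mr) if mp + mr else 0.0

gold = ["a", "a", "b"]
pred = ["a", "b", "b"]
print(macro_f1_mean_of_f1(gold, pred))  # ≈ 0.667
print(macro_f1_of_means(gold, pred))    # 0.75
```

Here class “a” has precision 1.0 and recall 0.5 while class “b” has precision 0.5 and recall 1.0, so averaging F1 scores and taking the F1 of the averages give different numbers.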

Other interests ✨:

NLP for history / humanities: By now there are huge digitized historical data sets. How can computers help us make sense of such tremendous amounts of data? I have written some code for computing large-scale statistics in collections of historic texts, exploring the European Middle Ages along temporal and spatial axes, and extracting historical events. For instance, we have automatically reconstructed coordinates and movement patterns for thousands of medieval entities (🤴👸🧑‍🌾…), from the time of the Carolingian dynasty (ca. 750 CE) to Maximilian I. (ca. 1500 CE). Of course, “automatic” also means that there is much room for reducing the error of the reconstructions. If you’ve got a nice idea for reducing the error in such approximations, here’s all code and data.
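As a toy sketch of what such a reconstruction could look like (entirely hypothetical data and method, not the project's actual pipeline): group an entity's place mentions by time period and average the coordinates per period to obtain a rough movement pattern.

```python
from collections import defaultdict

# Toy (entity, year, latitude, longitude) mention records -- invented data.
mentions = [
    ("Maximilian I.", 1490, 48.21, 16.37),  # mentioned with Vienna
    ("Maximilian I.", 1490, 48.14, 11.58),  # mentioned with Munich
    ("Maximilian I.", 1495, 49.45, 11.08),  # mentioned with Nuremberg
]

def movement_pattern(records):
    # Average the coordinates of all places an entity co-occurs with per year.
    buckets = defaultdict(list)
    for entity, year, lat, lon in records:
        buckets[(entity, year)].append((lat, lon))
    pattern = {}
    for key, coords in buckets.items():
        lat = sum(c[0] for c in coords) / len(coords)
        lon = sum(c[1] for c in coords) / len(coords)
        pattern[key] = (lat, lon)
    return pattern
```

Averaging co-mentioned coordinates is of course a crude approximation, which is exactly where ideas for reducing the reconstruction error would come in.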

Publications 📜

See Google Scholar

Teaching

At Heidelberg University

At TU Darmstadt

Invited talks