Juri Opitz


Researcher, Ph.D.

Email (click)
My GitHub Page

💫Blog


Hi there! I’m a researcher interested in machine learning, statistics, NLP (Natural Language Processing), and computational linguistics. I obtained my Ph.D. from Heidelberg University, where I was advised by Anette Frank. Now I’m based in Switzerland, working at the University of Zurich’s CL department.

Also check out my blog, where I share some tidbits that I think are interesting, and other thoughts that crossed my mind.

Overview of some work and interests 🔍

Understanding AI usage and impact 🦙

“AI” (“KI” in German), “LLMs” (Large Language Models), and “ChatGPT” are terms that are by now familiar to millions, and oftentimes they’re used as synonyms! For better or worse, the associated technology is here to stay. In fact, it has an increasing impact on various parts of society and social interaction. Since this technology is rather new, it seems especially important to understand its impact on human society and the economy: leveraging its strengths, but also correcting misconceptions and learning strategies for fair and safe usage. Some thoughts about the impact of this technology on (computational) linguistics, and the relations emerging between them, are contained in this piece.

System evaluation 😵‍💫

Even one of the simplest of all evaluation tasks (classification evaluation) is far from trivial. For an intuitive analytical overview and comparison of classification metrics such as Macro F1, Weighted F1, Kappa, and Matthews Correlation Coefficient (MCC), check out this paper at MIT press or arxiv. Evaluation issues then get compounded in tasks where we don’t predict class labels but generate text or other structured outputs, such as semantic graphs. Here’s some work on generation evaluation (click) and semantic parsing evaluation, introducing standardized and fine-grained matching.
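To see why classification evaluation is non-trivial, here is a minimal sketch (my own toy example, not taken from the paper) using scikit-learn: on an imbalanced dataset, a degenerate majority-class predictor looks decent under accuracy and Weighted F1, yet scores poorly under Macro F1, Kappa, and MCC.

```python
# Toy illustration: different metrics tell very different stories about
# the same predictions on imbalanced data.
from sklearn.metrics import (
    accuracy_score, f1_score, cohen_kappa_score, matthews_corrcoef
)

# 8 negatives, 2 positives; the "classifier" always predicts the majority class
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10

print(f"Accuracy:    {accuracy_score(y_true, y_pred):.3f}")                # 0.800
print(f"Weighted F1: {f1_score(y_true, y_pred, average='weighted'):.3f}")  # 0.711
print(f"Macro F1:    {f1_score(y_true, y_pred, average='macro'):.3f}")     # 0.444
print(f"Kappa:       {cohen_kappa_score(y_true, y_pred):.3f}")             # 0.000
print(f"MCC:         {matthews_corrcoef(y_true, y_pred):.3f}")             # 0.000
```

Kappa and MCC correct for chance agreement, so a constant predictor gets zero, while accuracy and Weighted F1 are inflated by the majority class.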

Meaning representations, Explainability, and Decomposability 🧐

I like to study representations and their ability to meaningfully capture data (e.g., text, images, etc.), and to find ways to improve their representational power, efficiency, and interconnections.

Example: Who does what to whom? A meaning representation (MR) tries to express this in a structured and explicit format, such as a graph. In this paper we refine neural sentence embeddings with MRs, decomposing them into different interpretable aspects. This keeps the efficiency and power of the neural sentence embeddings while adding some valuable explainability! Check out this repository for the code.

Other interests ✨:

NLP for history / humanities: Nowadays we have huge digitized historic data sets at our fingertips. How can computers help us make sense of tremendous amounts of such data? In a project, we’ve tried automatically reconstructing coordinates and movement patterns for thousands of medieval entities (🤴👸🧑‍🌾…), from the time of the Carolingian dynasty (ca. 750 CE) to Maximilian I. (ca. 1500 CE). Of course, “automatic” also means that there’s much room for reducing the error of the reconstructions – if you’ve got an idea for improving such approximations, here’s code and data.

Selected works 📜

🍄 Natural Language Processing RELIES on Linguistics. Available at: MIT press

🍄 A Closer Look at Classification Evaluation Metrics and a Critical Reflection of Common Evaluation Practice. Available at: MIT press, arxiv

🍄 SBERT studies Meaning Representations: Decomposing Sentence Embeddings into Explainable Semantic Features. Available at: ACL anthology, arxiv

🍄 SMATCH++: Standardized and Extended Evaluation of Semantic Graphs. Available at: ACL anthology, arxiv

For other publications, see Google Scholar.

Teaching and Supervising

I have taught courses on diverse topics, and supervised Bachelor’s and Master’s theses. Please see this page for more information.

Invited talks

I have given invited talks on these topics: