PyData Global 2022

Is it possible to have entities within entities within entities?
12-03, 11:00–11:30 (UTC), Talk Track I

Named entity recognition models often struggle with a wide variety of spans, but Spancat can handle them! Within spaCy, our open-source library for NLP, we've created a span-labeling model that handles overlapping and arbitrary text spans. Dive into named entity recognition, its limitations, and how we've addressed them in a solution-focused talk with practical applications.


The standard approach to named entity extraction becomes problematic when dealing with a wide variety of spans, like long phrases, non-named entities, and overlapping annotations. Whereas named entities normally have clear boundaries and syntactic properties, spans can be completely arbitrary, posing a problem for some entity extraction applications.
I'll start by talking about NER models, how and why they're used, and some of their limitations. Then I'll introduce Spancat, our solution to the problem of arbitrary and overlapping spans, which we've implemented in spaCy, our open-source library for NLP. You'll leave with an understanding of what named entity recognition is, how a span-labeling model works, and how it applies to a real-world instance of these complex problems.
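To make the overlap problem concrete, here is a minimal sketch in plain Python. It does not use spaCy's API or show how Spancat works internally; the `to_bio` helper, the example tokens, and the labels are all hypothetical, chosen only to illustrate why a one-tag-per-token encoding cannot hold entities within entities, while a plain list of spans can.

```python
from typing import List, Tuple

# A span is (start token, end token exclusive, label). Hypothetical format for this sketch.
Span = Tuple[int, int, str]


def to_bio(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode spans as one BIO tag per token; raise if two spans share a token."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        # Each token can carry only one tag, so a second span on the same token fails.
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"span ({start}, {end}, {label!r}) overlaps an earlier span")
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


tokens = ["University", "of", "California", "San", "Diego"]

# A single flat entity fits the tag-per-token scheme.
flat = [(0, 5, "ORG")]
print(to_bio(len(tokens), flat))

# Nested spans: the whole phrase, plus two place names inside it.
nested = [(0, 5, "ORG"), (2, 3, "GPE"), (3, 5, "GPE")]
try:
    to_bio(len(tokens), nested)
except ValueError as err:
    print("BIO encoding failed:", err)

# A span list keeps all three annotations side by side.
for start, end, label in nested:
    print(label, " ".join(tokens[start:end]))
```

The span-list representation is the kind of output a span-labeling model can produce: each candidate span is labeled independently, so nothing forces spans to be disjoint.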


Prior Knowledge Expected

No previous knowledge expected

Victoria recently graduated from UC San Diego with a degree in linguistics and a passion for natural language processing. She got involved with coding and Python after creating several applied NLP projects, such as a playlist recommender based on a user-supplied quote.

She is just starting her career as a Developer Advocate for Explosion, the makers of spaCy! In this role, she takes care of the NLP-focused community around spaCy through example projects, videos, visuals, and posts.

She loves learning more about NLP and works to ensure that the open-source community has everything it needs to do the same. Besides running marathons, making fun projects, and challenging her understanding of the world, she devotes her passion and motivation to educating the community around Explosion.