12-02, 11:30–12:00 (UTC), Talk Track II
Transformer models have become the state of the art in natural language processing. The word representations learned by these models offer great flexibility for many downstream tasks, from classification to summarization. Nonetheless, these representations have properties that impair their effectiveness: researchers have demonstrated that BERT and GPT embeddings tend to cluster in a narrow cone of the embedding space, which leads to unwanted consequences such as spurious similarities between unrelated words. In this talk we'll introduce SimCSE, a contrastive learning method that helps regularize the embedding space and reduce the problem of anisotropy, and we'll demonstrate how SimCSE can be implemented in Python.
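To make the anisotropy problem concrete, here is a minimal sketch that checks how similar vanilla BERT embeddings of unrelated sentences look. The bert-base-uncased checkpoint, the example sentences, and the use of the [CLS] token as a sentence representation are assumptions for illustration only; the sketch relies on PyTorch and the Huggingface transformers library and is not part of the original abstract.

    import torch
    import torch.nn.functional as F
    from transformers import AutoTokenizer, AutoModel

    # Hypothetical, deliberately unrelated sentences.
    sentences = [
        "The cat sat on the mat.",
        "Quarterly revenue exceeded expectations.",
        "Photosynthesis occurs in chloroplasts.",
    ]

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")
    model.eval()

    batch = tokenizer(sentences, padding=True, return_tensors="pt")
    with torch.no_grad():
        # Use the [CLS] token embedding as a simple sentence representation.
        emb = model(**batch).last_hidden_state[:, 0]

    # Pairwise cosine similarities; anisotropic embeddings tend to give
    # surprisingly high values even for unrelated sentences.
    sim = F.cosine_similarity(emb.unsqueeze(1), emb.unsqueeze(0), dim=-1)
    off_diag = sim[~torch.eye(len(sentences), dtype=torch.bool)]
    print(f"mean cosine similarity of unrelated sentences: {off_diag.mean().item():.3f}")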
Brief Bullet Point Outline
• Introduction (1 min)
• A refresher on the Transformer model (3 min)
• What is anisotropy? (3 min)
• Contrastive learning – what and why? (5 min)
• Embeddings and SimCSE in Python (13 min) – see the sketch below this outline
• Q&A (5 min)
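As a preview of the "Embeddings and SimCSE in Python" part, below is a minimal sketch of the unsupervised SimCSE objective: the same batch is encoded twice with dropout active, the two passes form the positive pairs, and the other sentences in the batch serve as in-batch negatives. The bert-base-uncased checkpoint, the example sentences, the temperature of 0.05, and pooling via the [CLS] token are assumptions for illustration (the original SimCSE recipe additionally trains an MLP pooler); this is not the exact code shown in the talk.

    import torch
    import torch.nn.functional as F
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")
    model.train()  # keep dropout active: it acts as the "augmentation"

    # Hypothetical unlabeled sentences; any text corpus works.
    sentences = [
        "A man is playing the guitar.",
        "The stock market fell sharply today.",
        "She is reading a book in the garden.",
    ]
    batch = tokenizer(sentences, padding=True, return_tensors="pt")

    # Encode the same batch twice; different dropout masks produce two
    # slightly different views of each sentence (the positive pairs).
    z1 = model(**batch).last_hidden_state[:, 0]
    z2 = model(**batch).last_hidden_state[:, 0]

    # InfoNCE-style contrastive loss: matching views are positives,
    # all other sentences in the batch act as in-batch negatives.
    temperature = 0.05
    sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / temperature
    labels = torch.arange(len(sentences))
    loss = F.cross_entropy(sim, labels)
    loss.backward()  # in a real training loop: optimizer.step(), then repeat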
Prerequisites
People of all backgrounds and experience levels are invited to participate in the talk. However, to get the most out of the presentation, the following skills are recommended:
• Familiarity with Python and the Huggingface library
• A good understanding of basic NLP concepts, in particular embeddings
• A basic understanding of the Transformer architecture
• A solid grasp of the basics of supervised and unsupervised learning
Aleksander Molak
R&D Machine Learning Engineer and Researcher at Ironscales and independent Machine Learning Researcher at TensorCell. Before joining Ironscales, Alex built end-to-end machine learning systems for Fortune Global 100 and 500 companies.
An international speaker and blogger, he is currently working on a book on causality in Python. Interested in NLP, causality, probabilistic modeling, representation learning, and graph neural networks.
He loves traveling with his wife and is passionate about vegan food, languages, and running.
Website: https://alxndr.io