Aleksander Molak
R&D Machine Learning Engineer and Researcher at Ironscales and independent Machine Learning Researcher at TensorCell. Before joining Ironscales, Alex built end-to-end machine learning systems for Fortune Global 100 and Fortune 500 companies.
International speaker and blogger, currently working on a book on causality in Python. He is interested in NLP, causality, probabilistic modeling, representation learning, and graph neural networks.
He loves traveling with his wife and is passionate about vegan food, languages, and running.
Website: https://alxndr.io
Sessions
Transformer models have become the state of the art in natural language processing. The word representations these models learn offer great flexibility for many types of downstream tasks, from classification to summarization. Nonetheless, these representations suffer from properties that impair their effectiveness. Researchers have demonstrated that BERT and GPT embeddings tend to cluster in a narrow cone of the embedding space (a phenomenon known as anisotropy), which leads to unwanted consequences such as spurious similarities between unrelated words. During the talk we’ll introduce SimCSE – a contrastive learning method that helps to regularize the embeddings and reduce the problem of anisotropy. We will demonstrate how SimCSE can be implemented in Python.
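As a preview of the idea behind the talk, here is a minimal sketch of the unsupervised SimCSE training objective (InfoNCE with in-batch negatives). It uses NumPy, with random vectors standing in for two dropout-augmented encoder passes over the same sentences; in the real method these would come from a transformer encoder run twice with dropout enabled.

```python
import numpy as np

def simcse_loss(z1, z2, temperature=0.05):
    """Unsupervised SimCSE objective: each sentence's two dropout-augmented
    embeddings form the positive pair; all other sentences in the batch
    serve as in-batch negatives."""
    # L2-normalize so dot products equal cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / temperature  # (batch, batch) similarity matrix
    # Cross-entropy with the diagonal (matching pairs) as the targets.
    log_probs = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
base = rng.normal(size=(8, 32))  # stand-in "sentence embeddings"
# Two views of the same sentences, with independent dropout-like noise.
z1 = base + 0.1 * rng.normal(size=base.shape)
z2 = base + 0.1 * rng.normal(size=base.shape)
print(round(simcse_loss(z1, z2), 4))
```

Minimizing this loss pulls each sentence's two views together while pushing apart embeddings of different sentences, which spreads the representations more uniformly over the embedding space and counteracts the narrow-cone effect.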