PyData Global 2022

Quan Nguyen

Quan Nguyen is a Python programmer and machine learning enthusiast. He is interested in solving decision-making problems that involve uncertainty. Quan has authored several books on Python programming and scientific computing. He is currently pursuing a Ph.D. in computer science at Washington University in St. Louis, where he does research on Bayesian methods in machine learning.

Sessions

12-03
19:30
30min
Bayesian Optimization: Fundamentals, Implementation, and Practice
Quan Nguyen

How can we make smart decisions when optimizing a black-box function?
Expensive black-box optimization refers to situations where we need to maximize/minimize some input–output process, but we cannot look inside and see how the output is determined by the input.
The problem is made harder by the cost of each function evaluation, whether in money, time, or safety risk, which limits how much data we can collect.
Black-box optimization arises in many tasks such as hyperparameter tuning in machine learning, product recommendation, process optimization in physics, or scientific and drug discovery.

Bayesian optimization (BayesOpt) sets out to solve this black-box optimization problem by combining probabilistic machine learning (ML) and decision theory.
This technique gives us a way to intelligently design queries to the function to be optimized while balancing between exploration (looking at regions without observed data) and exploitation (zeroing in on good-performance regions).
While BayesOpt has proven effective at many real-world black-box optimization tasks, many ML practitioners still shy away from it, believing that they need a highly technical background to understand and use BayesOpt.
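
As a rough illustration of the loop described above (not code from the talk), the following minimal sketch fits a Gaussian process to the observations collected so far, then picks the next query by maximizing an upper confidence bound, which adds the predicted mean (exploitation) to a multiple of the predictive uncertainty (exploration). Everything here is an assumption made for the example: the names `toy_objective`, `rbf_kernel`, `gp_posterior`, and `bayes_opt`, the fixed kernel length scale, the one-dimensional grid search, and the choice of upper confidence bound as the decision rule. Dedicated BayesOpt libraries offer other models and rules.

```python
import numpy as np


def toy_objective(x):
    # Hypothetical stand-in for an expensive black-box function (maximum at x = 0.3).
    return -(x - 0.3) ** 2


def rbf_kernel(a, b, length_scale=0.2):
    # Squared-exponential covariance between two sets of 1-D points.
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    # Posterior mean and standard deviation of a zero-mean GP with unit prior variance.
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query)
    solved = np.linalg.solve(k_tt, k_tq)
    mean = solved.T @ y_train
    var = 1.0 - np.sum(k_tq * solved, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def bayes_opt(objective, n_iters=15, beta=2.0):
    # Candidate queries: a dense grid over the search interval [0, 1].
    grid = np.linspace(0.0, 1.0, 201)
    # Start from two observations at the ends of the interval.
    x_obs = np.array([0.0, 1.0])
    y_obs = objective(x_obs)
    for _ in range(n_iters):
        mean, std = gp_posterior(x_obs, y_obs, grid)
        # Upper confidence bound: high mean (exploit) or high uncertainty (explore).
        ucb = mean + beta * std
        x_next = grid[np.argmax(ucb)]
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))
    best = np.argmax(y_obs)
    return x_obs[best], y_obs[best], x_obs


best_x, best_y, x_all = bayes_opt(toy_objective)
print(f"best input found: {best_x:.3f}, objective value: {best_y:.4f}")
```

Raising `beta` in this sketch makes the search favor unexplored regions, while lowering it makes the search concentrate on regions that already look good.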

This talk aims to dispel that belief and offers a friendly introduction to BayesOpt, including its fundamentals, how to get it running in Python, and common practices.
Data scientists and ML practitioners who are interested in hyperparameter tuning, A/B testing, or more generally experimentation and decision making will benefit from this talk.
While most of the background knowledge needed to follow the talk will be covered, the audience should be familiar with common ML concepts such as training data, predictive models, and multivariate normal distributions.

Talk Track I