This workshop aims to present a broad overview of the feedback types being actively researched, highlight recent advances and provide a networking forum for researchers and practitioners.
While online learning has become one of the most successful and studied approaches in machine learning, in particular with reinforcement learning, online learning algorithms still interact with their environments in a very simple way. The complexity and diversity of the feedback coming from the environment in real applications is often reduced to the observation of a scalar reward. More and more researchers now seek to fully exploit the available feedback to allow faster and more human-like learning.
Online learning, in its broad sense, is the task of continuously learning from feedback gathered about an environment. Reinforcement learning (RL) and bandits are prominent examples which have attracted considerable attention in recent years. Learning online might be a necessity if the environment of the algorithm changes and the behavior to be learned changes with it. It is also a framework which has been used to sequentially learn to act in non-changing settings: learning to act optimally in games can be done by RL, as famously illustrated by AlphaGo.
The standard task abstraction in online learning is the maximization of reward, which is also the feedback to the algorithm: the learner performs an action, observes whether it got a high reward, and improves its behavior based on that feedback. However, this model oversimplifies the feedback available in complex real-world applications, where observables beyond the reward abound. Examples include the actions of other players in games. Feedback can further result from the interaction of several past actions, or be delayed. Moreover, the reward might not be observable: the algorithm could learn from indirect signals like preferences instead. The result of an action can be incompletely observed, as in auctions. The algorithm might want to learn from examples or guidance provided by humans.
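As a rough illustration of the standard abstraction described above, the following minimal sketch runs an epsilon-greedy learner on a Bernoulli bandit. The function name `run_epsilon_greedy`, the arm means, the horizon, and the value of `epsilon` are all hypothetical choices made for this example; it is one simple instance of the act, observe, update loop, not a method prescribed by the workshop.

```python
import random


def run_epsilon_greedy(arm_means, horizon, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit for `horizon` rounds with epsilon-greedy.

    `arm_means` are hypothetical success probabilities, unknown to the learner.
    Returns the per-arm pull counts and empirical mean rewards.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms

    for _ in range(horizon):
        # 1. The learner performs an action (explores with probability epsilon).
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # 2. The environment returns a scalar reward: the only feedback observed.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0

        # 3. The learner improves its behavior via an incremental mean update.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return counts, estimates


counts, estimates = run_epsilon_greedy([0.2, 0.5, 0.8], horizon=5000)
print(counts, [round(e, 2) for e in estimates])
```

Everything the learner knows here is the single number `reward`; richer signals such as delayed outcomes, preferences, or partially observed results would require changing step 2 and the update in step 3.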
Topics will cover a variety of unconventional feedback encountered in various real-world applications ranging from economics to music recommendation tools.
Andreea is a Ph.D. student at the University of California, Berkeley, working with Anca Dragan in the InterACT Lab. Her interests lie at the intersection of machine learning, robotics, and human-robot interaction, with a focus on robot learning with uncertainty.
Ciara is a lecturer at Imperial College London. Her research interests include multi-armed bandits, online learning, and reinforcement learning. In general, she is interested in sequential decision making under uncertainty and potentially limited feedback.
Nicolò is a professor at the University of Milan. His main research areas are: design and analysis of machine learning algorithms; algorithms for multi-armed bandit problems with applications to personalized recommendations and online auctions; graph analytics with applications to social networks and bioinformatics.
Thorsten is a Professor at Cornell University. His research interests include machine learning methods and theory, learning from human behavioral data and implicit feedback, and machine learning for search engines, recommendation, education, and other human-centered tasks.
Vianney has been a professor at the Centre de recherche en économie et statistique (CREST) at ENSAE since October 2019. His research focuses mainly on the interplay between machine learning and game theory, at the junction of mathematics, computer science and economics. He is also a part-time principal researcher at the Criteo AI Lab in Paris, working on efficient exploration in recommender systems.
Julien is a research scientist at DeepMind. His main research interests include game theory and reinforcement learning.
Alex is a Principal Researcher at MSR New York City. His research interests are in algorithms and theoretical computer science, spanning learning theory, algorithmic economics, and networks. He is particularly interested in online machine learning and the exploration-exploitation tradeoff, and their manifestations in socioeconomic environments.
The full schedule is available on the ICML conference website: Schedule.
In addition to the 7 presentations by our invited speakers, we selected 6 contributions for a talk of 15 minutes. These contributions are
All contributed works will be showcased during the poster session, from 3:00 PM to 4:30 PM (local time).
Our goal is to reach participants from across machine learning, focusing on the transverse theme of the feedback from which algorithms learn. We accept two forms of contributions:
Submissions can be of a theoretical or empirical nature, and should focus on the subject of feedback in sequential learning, which includes but is not limited to
Contributions from outside the online learning community will be very welcome, as long as they address interesting forms of feedback encountered in real-world problems.
Submit here (CMT website): https://cmt3.research.microsoft.com/CFOL2022/
|Event|Date|
|---|---|
|Submission site opens|May 6, 2022|
|Submission deadline|May 27, 2022 - 11:30 PM Pacific Time|
|Decisions announced|June 13, 2022|
|Video submission due|July 1, 2022 - 11:30 PM Pacific Time|
|Day of workshop|July 23, 2022|