
Event Series: Special Seminar Series

Universal Learning for Decision-Making | Moise Blanchard

January 26 @ 11:00 am - 12:30 pm

We provide general-use decision-making algorithms under provably minimal assumptions on the data, using the universal learning framework. Classically, learning guarantees typically require two types of assumptions: (1) restrictions on the target policies to be learned and (2) assumptions on the data-generating process. Instead, we show that we can provide consistent algorithms with vanishing regret compared to the best policy in hindsight, (1) irrespective of the optimal policy, a property known as universal consistency, and (2) well beyond standard i.i.d. or stationarity assumptions on the data. We present our results for the classical online regression problem as well as for the contextual bandit problem, in which the learner's rewards depend on their selected actions and an observable context. This generalizes the standard multi-armed bandit to the case where side information is available, e.g., patients' records or customers' history, which allows for personalized treatment. More precisely, we give necessary and sufficient conditions on the context-generating process for universal consistency to be possible. Surprisingly, for finite action spaces, the universally learnable processes are the same for contextual bandits as for the supervised learning setting, suggesting that going from full feedback (supervised learning) to partial feedback (contextual bandits) comes at no extra cost in terms of learnability. We then show that there always exists an algorithm that guarantees universal consistency whenever this is achievable. In particular, such an algorithm is universally consistent under provably minimal assumptions: if it fails to be universally consistent for some context-generating process, then no other algorithm can succeed either. In the case of finite action spaces, this algorithm balances a fine trade-off between generalization (similar to structural risk minimization) and personalization (tailoring actions to specific contexts).
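To make the contextual bandit protocol in the abstract concrete, the sketch below runs a toy epsilon-greedy learner over a finite action set: at each round it observes a context, picks an action, and sees the reward of only that action (partial feedback). This is purely illustrative and is not the universal-consistency algorithm from the talk; the `reward_fn` interface and the per-context value tables are assumptions of this example.

```python
import random

def run_contextual_bandit(contexts, reward_fn, n_actions, eps=0.1, seed=0):
    """Toy epsilon-greedy learner for a finite-action contextual bandit.

    contexts  : iterable of observed contexts (hashable), revealed one per round
    reward_fn : reward_fn(context, action) -> reward in [0, 1] (hypothetical
                stand-in for the environment; only the chosen arm's reward is seen)
    Returns the cumulative reward collected by the learner.
    """
    rng = random.Random(seed)
    counts = {}   # (context, action) -> number of pulls
    values = {}   # (context, action) -> running mean reward estimate
    total = 0.0
    for x in contexts:
        if rng.random() < eps:
            a = rng.randrange(n_actions)  # explore uniformly
        else:
            # exploit: best estimated action for this specific context
            a = max(range(n_actions), key=lambda b: values.get((x, b), 0.0))
        r = reward_fn(x, a)               # partial feedback: chosen arm only
        total += r
        c = counts.get((x, a), 0) + 1
        counts[(x, a)] = c
        # incremental running-mean update of the reward estimate
        old = values.get((x, a), 0.0)
        values[(x, a)] = old + (r - old) / c
    return total
```

Keeping one estimate per (context, action) pair is the crude "personalization" extreme of the generalization/personalization trade-off mentioned above: it tailors actions to each context but shares no information across contexts.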

Bio: Moïse Blanchard is a final-year PhD student at MIT, working with Prof. Patrick Jaillet. He obtained his MSc in applied mathematics from École Polytechnique, graduating as valedictorian. His research focuses on algorithms for decision-making and statistical learning. His work has been recognized with a best student paper runner-up award at COLT and a best student paper award from the INFORMS TSL Society.

Details

Date: January 26
Time: 11:00 am - 12:30 pm
Series: Special Seminar Series

Venue

3234 Matthews Ln
La Jolla, CA 92093 United States

Organizer

HDSI General

Other

Format: Hybrid
Speaker: Moise Blanchard
Event Recording Link: http://bit.ly/HDSI-Seminars