TILOS Seminar: Large Datasets and Models for Robots in the Real World
Abstract: Recent progress in AI can be attributed to the emergence of large models trained on large datasets. However, teaching AI agents to reliably interact with our physical world has proven challenging, in part because of a lack of large and sufficiently diverse robot datasets. In this talk, I will cover ongoing efforts of the Open X-Embodiment project (a collaboration between 279 researchers across 20+ institutions) to build a large, open dataset for real-world robotics, and discuss how this new paradigm is rapidly changing the field. Concretely, I will discuss why we need large datasets in robotics, what such datasets may look like, and how large models can be trained and evaluated effectively in a cross-embodiment, cross-environment setting. Finally, I will conclude the talk by sharing my perspective on the limitations of current embodied AI agents and on how we can move forward as a community.
