Abstract: "Modern database systems aim to support a large class of different use cases while simultaneously achieving high performance. However, as a result of their generality, databases often achieve adequate performance for the average use case but do not achieve the best performance for any individual use case. In this talk, I will describe my […]
Abstract: We reveal an intriguing and prevalent phenomenon of diffusion models, which we term "consistent model reproducibility": given the same starting noise input and a deterministic sampler, different diffusion models often yield remarkably similar outputs when generating new samples. We demonstrate this phenomenon through comprehensive experiments and theoretical studies, implying that different diffusion […]
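The protocol behind this claim is simple enough to sketch. The following toy example is our own illustration, not the paper's experiments: two small MLP denoisers with different widths and seeds are trained independently on the same 2-D data, then both are sampled from the same starting noise with a deterministic DDIM-style (eta = 0) update, and the distance between their outputs is measured.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
T = 50
betas = torch.linspace(1e-4, 0.2, T)
alpha_bar = torch.cumprod(1 - betas, dim=0)

def toy_data(n):
    # Two-component Gaussian mixture in 2-D.
    c = torch.randint(0, 2, (n, 1)).float() * 4 - 2
    return c + 0.3 * torch.randn(n, 2)

def make_model(width, seed):
    torch.manual_seed(seed)
    return nn.Sequential(nn.Linear(3, width), nn.SiLU(),
                         nn.Linear(width, width), nn.SiLU(),
                         nn.Linear(width, 2))

def train(model, steps=3000):
    # Standard denoising objective: predict the noise added at step t.
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        x0 = toy_data(256)
        t = torch.randint(0, T, (256,))
        eps = torch.randn_like(x0)
        ab = alpha_bar[t].unsqueeze(1)
        xt = ab.sqrt() * x0 + (1 - ab).sqrt() * eps
        pred = model(torch.cat([xt, t.float().unsqueeze(1) / T], dim=1))
        loss = ((pred - eps) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    return model

@torch.no_grad()
def ddim_sample(model, x):
    # Deterministic DDIM update (eta = 0): no fresh noise is injected.
    for t in reversed(range(1, T)):
        tt = torch.full((x.shape[0], 1), t / T)
        eps = model(torch.cat([x, tt], dim=1))
        x0_hat = (x - (1 - alpha_bar[t]).sqrt() * eps) / alpha_bar[t].sqrt()
        x = alpha_bar[t - 1].sqrt() * x0_hat + (1 - alpha_bar[t - 1]).sqrt() * eps
    return x

# Two models differing in width and initialization, trained independently.
m1 = train(make_model(64, seed=1))
m2 = train(make_model(128, seed=2))

noise = torch.randn(512, 2)   # the SAME starting noise for both models
s1 = ddim_sample(m1, noise.clone())
s2 = ddim_sample(m2, noise.clone())
print("mean distance between the two models' outputs:",
      (s1 - s2).norm(dim=1).mean().item())
```

If the reproducibility phenomenon carries over to this toy setting, the printed distance should be small relative to the data's scale.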
Abstract: Large generative models such as ChatGPT have led to amazing results and revolutionized artificial intelligence. In this talk, I will discuss my research on advancing the foundation of these models, centered around addressing the architectural bottlenecks of learning from everything. First, I will describe our efforts to remove context size limitations of the transformer […]
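The abstract is truncated here, but the transformer context-size bottleneck it mentions usually refers to self-attention's quadratic cost in sequence length. A minimal sketch of that scaling (our illustration with naive single-head attention, not the speaker's architecture):

```python
import torch

# Naive single-head self-attention: the n x n score matrix is what makes
# memory and compute grow quadratically with context length n.
def attention(q, k, v):
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5   # shape (n, n)
    return torch.softmax(scores, dim=-1) @ v

d = 64
for n in [1_000, 4_000, 8_000]:
    q = k = v = torch.randn(n, d)
    attention(q, k, v)
    print(f"n={n:>5}: score matrix has {n * n:,} entries "
          f"({n * n * 4 / 1e9:.2f} GB in fp32)")
```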
Abstract: "Despite ML models' impressive performance, training and deploying them is currently a somewhat messy endeavor. But does it have to be? In this talk, I overview my work on making ML “predictably reliable”---enabling developers to know when their models will work, when they will fail, and why.
To begin, we use a case study of adversarial inputs to show that human intuition can be a poor predictor of how ML models operate. Motivated by this, we present a line of work that aims to develop a precise understanding of the ML pipeline, combining statistical tools with large-scale experiments to characterize the role of each individual design choice: from how to collect data, to what dataset to train on, to what learning algorithm to use."
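For readers unfamiliar with that case study, here is a minimal sketch (our own toy setup, not the speaker's experiments) of the classic fast-gradient-sign construction: in high dimension, a perturbation of 0.25 per coordinate, small next to the data's per-coordinate noise (std 1), is typically enough to flip a trained classifier's prediction.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
d = 200
# Two high-dimensional Gaussian classes: well separated in aggregate, but
# only 0.2 apart per coordinate while the per-coordinate noise has std 1.
x = torch.cat([torch.randn(500, d) + 0.2, torch.randn(500, d) - 0.2])
y = torch.cat([torch.ones(500), torch.zeros(500)]).long()

model = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()
for _ in range(300):
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()

# FGSM: move each coordinate by eps in the direction that increases the loss.
x0, y0 = x[:1].clone().requires_grad_(True), y[:1]
loss_fn(model(x0), y0).backward()
x_adv = (x0 + 0.25 * x0.grad.sign()).detach()

print("clean prediction:    ", model(x0).argmax(1).item())
print("perturbed prediction:", model(x_adv).argmax(1).item())
```

The perturbation looks negligible coordinate by coordinate, yet its aggregate effect across 200 dimensions is large, which is one way human intuition misjudges model behavior.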
Abstract: Deep learning is used across a broad spectrum of applications. However, behind its remarkable performance lies an increasing gap between the demand for and supply of computation. On the demand side, the computational costs of deep learning models have surged dramatically, driven by ever-larger input and model sizes. On the supply side, as Moore's […]
Abstract: How can models with more parameters than training examples generalize well, and generalize even better as we add more parameters, without any explicit complexity control? In recent years, it has become increasingly clear that much, or perhaps all, of the complexity control and generalization ability of deep learning comes from the optimization bias, […]
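One classical instance of this optimization bias can be checked numerically. A minimal sketch under assumptions of our own (linear least squares, zero initialization): gradient descent on an underdetermined problem, with no explicit regularization, converges to the minimum-norm interpolating solution.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                       # far fewer examples than parameters
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

w = np.zeros(d)                      # zero initialization pins down which
lr = 0.01                            # interpolant gradient descent reaches
for _ in range(5000):
    w -= lr * X.T @ (X @ w - y) / n  # plain gradient descent on squared loss

w_min_norm = np.linalg.pinv(X) @ y   # the minimum-norm interpolating solution
print("training residual:        ", np.linalg.norm(X @ w - y))
print("distance to min-norm sol.:", np.linalg.norm(w - w_min_norm))
```

Gradient descent started at zero stays in the row space of X, so among the infinitely many interpolants it can only reach the minimum-norm one; overparameterized deep networks exhibit analogous, if subtler, implicit biases.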
Roy Fox, Assistant Professor and Director of the Intelligent Dynamics Lab at UC Irvine. HDSI 123 and Zoom (Link below). Abstract: If learning from data is valuable, can learning from big data be very valuable? So far, it has been so in vision and language, for which foundation models can be trained on web-scale data to […]