Roy Fox, Assistant Professor and Director of the Intelligent Dynamics Lab at UC Irvine. HDSI 123 and Zoom (link below). Abstract: If learning from data is valuable, can learning from big data be very valuable? So far, it has been so in vision and language, for which foundation models can be trained on web-scale data to […]
Abstract: How can models with more parameters than training examples generalize well, and generalize even better when we add even more parameters, even without explicit complexity control? In recent years, it has become increasingly clear that much, or perhaps all, of the complexity control and generalization ability of deep learning comes from the optimization bias, […]
Abstract: Deep learning is used across a broad spectrum of applications. However, behind its remarkable performance lies an increasing gap between the demand for and supply of computation. On the demand side, the computational costs of deep learning models have surged dramatically, driven by ever-larger input and model sizes. On the supply side, as Moore's […]
Abstract: Despite ML models' impressive performance, training and deploying them is currently a somewhat messy endeavor. But does it have to be? In this talk, I give an overview of my work on making ML “predictably reliable”: enabling developers to know when their models will work, when they will fail, and why.
To begin, we use a case study of adversarial inputs to show that human intuition can be a poor predictor of how ML models operate. Motivated by this, we present a line of work that aims to develop a precise understanding of the ML pipeline, combining statistical tools with large-scale experiments to characterize the role of each individual design choice: from how to collect data, to what dataset to train on, to what learning algorithm to use.
Abstract: Large generative models such as ChatGPT have led to amazing results and revolutionized artificial intelligence. In this talk, I will discuss my research on advancing the foundation of these models, centered around addressing the architectural bottlenecks of learning from everything. First, I will describe our efforts to remove context size limitations of the transformer […]
Abstract: We reveal an intriguing and prevalent phenomenon of diffusion models, which we term "consistent model reproducibility": given the same starting noise input and a deterministic sampler, different diffusion models often yield remarkably similar outputs while they generate new samples. We demonstrate this phenomenon through comprehensive experiments and theoretical studies, implying that different diffusion models consistently reach the same data distribution and scoring function regardless of frameworks, model architectures, or training procedures. More strikingly, our further investigation implies that diffusion models are learning distinct distributions affected by the training data size and model capacity, so that model reproducibility manifests in two distinct training regimes with a phase transition: (i) a "memorization regime", where the diffusion model overfits to the training data distribution, and (ii) a "generalization regime", where the model learns the underlying data distribution and generates new samples with finite training data. Finally, our results have strong practical implications regarding training efficiency, model privacy, and controllable generation of diffusion models, and our work raises numerous intriguing theoretical questions for future investigation.
Abstract: As the intelligence of everyday smart devices continues to evolve, they can already monitor basic health behaviors such as physical activities and heart rates. The vision of an intelligent behavior change intervention pipeline for health -- combining behavior modeling & interaction design -- seems to be within reach. How do we get there? In […]
