Special Seminar | Zhuang Liu
Halıcıoğlu Data Science Institute (HDSI), Room 123, 3234 Matthews Ln, La Jolla, CA, United States
Talk info: to be provided
Adam is a postdoctoral scholar at Stanford University, working with Leonidas Guibas. He received a Ph.D. in robotics from Carnegie Mellon University, where he worked with Katerina Fragkiadaki. He received his M.S. in Computer Science at Toronto Metropolitan University, working with Kosta Derpanis. Adam is a recipient of the NSERC PGS-D scholarship, and the Toronto Metropolitan University Gold Medal. His research interests lie in Computer Vision and Machine Learning, particularly for 3D understanding and fine-grained tracking.
Talk Abstract: Although incorporating causal concepts into deep learning shows promise for increasing explainability, fairness, and robustness, existing methods require unrealistic assumptions and aim to recover the full latent causal model. This talk proposes an alternative, domain counterfactuals, which ask a more concrete question: "What would a sample look like if it had been […]
Roy Fox, Assistant Professor and Director of the Intelligent Dynamics Lab at UC Irvine
HDSI 123 and Zoom (Link below)
Abstract: If learning from data is valuable, can learning from big data be very valuable? So far, it has been so in vision and language, for which foundation models can be trained on web-scale data to […]
Abstract: How can models with more parameters than training examples generalize well, and generalize even better when we add even more parameters, even without explicit complexity control? In recent years, it has become increasingly clear that much, or perhaps all, of the complexity control and generalization ability of deep learning comes from the optimization bias, […]
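This implicit bias is easiest to see in a linear toy problem: on an underdetermined least-squares system, gradient descent initialized at zero converges to the minimum-norm interpolating solution, with no explicit regularizer in sight. A minimal numpy sketch of that effect (an illustration of the general phenomenon, not an example from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                         # far more parameters than examples
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Plain gradient descent on the squared loss, initialized at zero.
w = np.zeros(d)
lr = 0.01
for _ in range(50_000):
    w -= lr * X.T @ (X @ w - y) / n

# The minimum-norm interpolating solution, computed via the pseudoinverse.
w_star = np.linalg.pinv(X) @ y

print(np.linalg.norm(X @ w - y))   # ~0: gradient descent interpolates
print(np.linalg.norm(w - w_star))  # ~0: and it picked the min-norm solution
```

Here the choice of optimizer, not any penalty term, determines which of the infinitely many interpolating solutions is reached.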
Abstract: Deep learning is used across a broad spectrum of applications. However, behind its remarkable performance lies an increasing gap between the demand for and supply of computation. On the demand side, the computational costs of deep learning models have surged dramatically, driven by ever-larger input and model sizes. On the supply side, as Moore's […]
Abstract: "Despite ML models' impressive performance, training and deploying them is currently a somewhat messy endeavor. But does it have to be? In this talk, I overview my work on making ML “predictably reliable”---enabling developers to know when their models will work, when they will fail, and why.
To begin, we use a case study of adversarial inputs to show that human intuition can be a poor predictor of how ML models operate. Motivated by this, we present a line of work that aims to develop a precise understanding of the ML pipeline, combining statistical tools with large-scale experiments to characterize the role of each individual design choice: from how to collect data, to what dataset to train on, to what learning algorithm to use."
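Adversarial inputs make that intuition gap concrete: a pixel-level perturbation too small to notice can change a model's prediction. A minimal sketch of the classic fast gradient sign method (FGSM) in PyTorch, with an untrained stand-in model and random data rather than anything from the talk:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Untrained stand-in classifier; any differentiable model behaves the same way.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x = torch.rand(1, 1, 28, 28)   # placeholder "image" in [0, 1]
y = torch.tensor([3])          # placeholder label

# FGSM: take one step along the sign of the input gradient of the loss.
x.requires_grad_(True)
loss = F.cross_entropy(model(x), y)
loss.backward()
eps = 0.1                      # per-pixel perturbation budget
x_adv = (x + eps * x.grad.sign()).clamp(0, 1).detach()

# The perturbation is bounded by eps per pixel, yet it is chosen to increase
# the loss as fast as possible, so it can flip the predicted class.
print(model(x).argmax(dim=1).item(), model(x_adv).argmax(dim=1).item())
```

The perturbation stays within a tiny per-pixel budget but moves the input in exactly the direction that hurts the model most, which is what makes such inputs so unintuitive to humans.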
Abstract: Large generative models such as ChatGPT have led to amazing results and revolutionized artificial intelligence. In this talk, I will discuss my research on advancing the foundation of these models, centered around addressing the architectural bottlenecks of learning from everything. First, I will describe our efforts to remove context size limitations of the transformer […]
Abstract: We reveal an intriguing and prevalent phenomenon of diffusion models, which we term "consistent model reproducibility": given the same starting noise input and a deterministic sampler, different diffusion models often yield remarkably similar outputs when generating new samples. We demonstrate this phenomenon through comprehensive experiments and theoretical studies, implying that different diffusion models consistently reach the same data distribution and score function regardless of framework, model architecture, or training procedure. More strikingly, our further investigation implies that diffusion models learn distinct distributions depending on training data size and model capacity, so that model reproducibility manifests in two distinct training regimes separated by a phase transition: (i) a "memorization regime", where the diffusion model overfits to the training data distribution, and (ii) a "generalization regime", where the model learns the underlying data distribution and generates new samples with finite training data. Finally, our results have strong practical implications for training efficiency, model privacy, and controllable generation of diffusion models, and our work raises numerous intriguing theoretical questions for future investigation.
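The reproducibility claim suggests a direct empirical check: train two diffusion models independently, fix the initial noise, run a deterministic (eta = 0) DDIM sampler under each, and compare outputs. Below is a self-contained toy sketch of that protocol, with 2-D data, a 100-step noise schedule, and a small MLP standing in for an image diffusion model (my illustration of the experimental setup, not the authors' code):

```python
import torch
import torch.nn as nn

# Toy data: a 2-D Gaussian mixture standing in for an image dataset.
torch.manual_seed(0)
data = torch.cat([torch.randn(4096, 2) * 0.2 + 2.0,
                  torch.randn(4096, 2) * 0.2 - 2.0])

T = 100
betas = torch.linspace(1e-4, 0.2, T)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

def make_model(seed):
    """A small noise-prediction MLP taking (x_t, t/T) as input."""
    torch.manual_seed(seed)
    return nn.Sequential(nn.Linear(3, 128), nn.SiLU(),
                         nn.Linear(128, 128), nn.SiLU(),
                         nn.Linear(128, 2))

def train(model, steps=3000):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        x0 = data[torch.randint(len(data), (256,))]
        t = torch.randint(T, (256,))
        eps = torch.randn_like(x0)
        ab = alpha_bar[t].unsqueeze(1)
        xt = ab.sqrt() * x0 + (1 - ab).sqrt() * eps   # forward noising
        pred = model(torch.cat([xt, t.unsqueeze(1) / T], dim=1))
        loss = ((pred - eps) ** 2).mean()             # standard eps-prediction loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model

@torch.no_grad()
def ddim_sample(model, x):
    """Deterministic DDIM sampling (eta = 0): no fresh noise is injected."""
    for t in reversed(range(T)):
        ab_t = alpha_bar[t]
        ab_prev = alpha_bar[t - 1] if t > 0 else torch.tensor(1.0)
        eps = model(torch.cat([x, torch.full((len(x), 1), t / T)], dim=1))
        x0 = (x - (1 - ab_t).sqrt() * eps) / ab_t.sqrt()
        x = ab_prev.sqrt() * x0 + (1 - ab_prev).sqrt() * eps
    return x

# Two models with independent initializations, trained on the same data.
model_a = train(make_model(1))
model_b = train(make_model(2))

# Same starting noise, deterministic sampler, two different models.
torch.manual_seed(42)
x_T = torch.randn(8, 2)
out_a = ddim_sample(model_a, x_T)
out_b = ddim_sample(model_b, x_T)
print((out_a - out_b).abs().mean())   # small if the reproducibility claim holds
```

On this toy distribution the two independently trained models should map the same starting noise to nearby points, a miniature version of the noise-to-sample mapping consistency the abstract describes at scale.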