HDSI Open House 2022
Join us for our annual Open House on August 31st at 3PM PDT (held virtually). The event will provide an in-depth look at our undergraduate and graduate data science talent and opportunities […]
What are the optimal algorithms for learning from data? Have we found them already, or are better ones out there to be discovered? Making these questions precise, and answering them, requires taking on the mathematically deep interplay between statistical and computational constraints. It also requires reconciling our theoretical toolbox with surprising new phenomena arising from practice, which seem to violate conventional rules of thumb regarding algorithm and model design. I will discuss progress along these lines: in terms of designing new algorithms for basic learning problems, controlling generalization in large statistical models, and understanding key statistical questions for generative modeling.
In this lecture we present some recent results on the interplay between control and Machine Learning, and more precisely, Supervised Learning and Universal Approximation. We adopt the perspective of the simultaneous or ensemble control of systems of Residual Neural Networks (ResNets). Roughly, each item to be classified corresponds to a different initial datum for the Cauchy problem of the ResNets, leading to an ensemble of solutions to be driven to the corresponding targets, associated with the labels, by means of the same control. We present a genuinely nonlinear and constructive method that shows such an ambitious goal can be achieved and that estimates the complexity of the control strategies. This property is rarely fulfilled by the classical dynamical systems in Mechanics, and the very nonlinear nature of the activation function governing the ResNet dynamics plays a decisive role: it allows deforming half of the phase space while the other half remains invariant. The turnpike property is also analyzed in this context, showing that a suitable choice of the cost functional used to train the ResNet leads to more stable and robust dynamics. This lecture is inspired by joint work with, among others, Borjan Geshkovski (MIT), Carlos Esteve (Cambridge), Domènec Ruiz-Balet (IC, London) and Dario Pighin (Sherpa.ai).
A key challenge in cryptography is to ensure that a protocol resists all computationally feasible attacks, even when an adversary decides to follow a completely arbitrary and unpredictable strategy. This […]
This talk will cover current VA ocular telehealth programs and future directions, including our research and collaborations for AI, predictive analytics, and very early preliminary results from the Eye911 trial that I am running right now.
Reinforcement learning (RL) is a pillar of modern artificial intelligence. Compared to classical statistical learning, several new statistical and computational phenomena arise from RL problems, leading to different trade-offs in the choice of estimators, the tuning of their parameters, and the design of efficient algorithms. In many settings, asymptotic and/or worst-case theory fails to provide the relevant guidance.
In this talk, I present recent advances that involve a more refined approach to RL, one that leads to non-asymptotic and instance-optimal guarantees. The bulk of this talk focuses on function approximation methods for policy evaluation. I establish a novel class of optimal and instance-dependent oracle inequalities for projected Bellman equations, as well as efficient computational algorithms achieving them. Among other results, I will highlight how the instance-optimal guarantees guide the selection of tuning parameters in temporal difference methods, and tackle the instability issue with general function classes. Drawing on this perspective, I will also discuss a novel class of stochastic approximation methods that yield optimal statistical guarantees for policy optimization problems.
Recent progress in Artificial Intelligence (AI) and Machine Learning (ML) has provided groundbreaking methods for processing large data sets. These new techniques are particularly powerful when dealing with scientific data with complex structures, non-linear relationships, and unknown uncertainties that are challenging to model and analyze with traditional tools. This has triggered a flurry of activity in science and engineering, developing new methods to tackle problems which used to be impossible or extremely hard to deal with.
The goal of this symposium is to bring together researchers and practitioners at the intersection of AI and Science, to discuss opportunities to use AI to accelerate scientific discovery, and to explore the potential of scientific knowledge to guide AI development. The symposium will provide a platform to nurture the research community, to fertilize interdisciplinary ideas, and shape the vision of future developments in the rapidly growing field of AI + Science.
We plan to use the symposium as the launching event for the AI + Science event series, co-hosted by Computer Science and Engineering (CSE), Halıcıoğlu Data Science Institute (HDSI), and Scripps Institution of Oceanography (SIO) at UC San Diego. The symposium will include a combination of invited talks, posters, panel discussions, and social and networking events. The first event will put a particular emphasis on AI + physical sciences. We will invite contributions and participation from physics, engineering, and oceanography, among others. Part of the program will highlight research from climate science, as a result of our DOE-funded scientific ML project for tackling climate extremes.
In many scientific settings we use a statistical model to describe a high-dimensional distribution over many variables. Such models are often represented as a weighted graph encoding the dependencies between different variables and are known as graphical models. Graphical models arise in a wide variety of fields throughout science and engineering.
Human language is extraordinarily complex. Nevertheless, we readily acquire language as children, when we are most cognitively limited, and we comprehend language as adults with striking efficiency. My research seeks to understand the mental algorithms that allow us to accomplish this feat, with particular focus on how memory and prediction mechanisms are recruited to overcome the bottlenecks of real-time language processing. In this talk, I will review results from three of my lines of inquiry into this question. First, using diverse naturalistic reading datasets, I will show evidence that prediction is a central concern of the human language processing system. Second, using fMRI measures of naturalistic story listening, I will show evidence that memory and prediction processes are dissociable in the brain's response to language, that syntactic structure building plays a major role in ordinary language comprehension, and that the neural resources that are responsible for structure building are largely specialized for language. Third, I will show evidence from computational modeling that memory and prediction pressures independently encourage discovery of phonological regularities from natural speech. Together, these results support an intricate coordination of memory and prediction abilities for language learning and comprehension. I will conclude by outlining planned directions for my future lab, integrating neuroimaging, behavioral methods, natural language processing, and computational modeling to study language learning and processing.
The field of natural language processing has recently unlocked a wide range of new capabilities through the use of large language models, such as GPT-4. The growing application of these models motivates developing a more thorough understanding of how and why they work, as well as further improvements in both quality and efficiency.
In this talk, I will present my work on analyzing and improving the Transformer architecture underlying today's language models through the study of how information is routed between multiple words in an input. I will show that such models can predict the syntactic structure of text in a variety of languages, and discuss how syntax can inform our understanding of how the networks operate. I will also present my work on structuring information flow to build radically more efficient models, including models that can process text of up to one million words, which enables new possibilities for NLP with book-length text.
