BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Halıcıoğlu Data Science Institute - UC San Diego - ECPv6.16.2//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-ORIGINAL-URL:https://datascience.ucsd.edu
X-WR-CALDESC:Events for Halıcıoğlu Data Science Institute - UC San Diego
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20220313T100000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20221106T090000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20230312T100000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20231105T090000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20240310T100000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20241103T090000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20250309T100000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20251102T090000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240404T080000
DTEND;TZID=America/Los_Angeles:20240405T120000
DTSTAMP:20260528T112927
CREATED:20240226T234322Z
LAST-MODIFIED:20240313T192254Z
UID:10000418-1712217600-1712318400@datascience.ucsd.edu
SUMMARY:Causality Workshop
DESCRIPTION:
URL:https://www.eventbrite.com/e/ucsd-hdsi-causality-workshop-tickets-817326594847?aff=oddtdtcreator
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Workshops
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI_Causality_Wrkshp_Eventbrite.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230626T120000
DTEND;TZID=America/Los_Angeles:20230626T130000
DTSTAMP:20260528T112927
CREATED:20230629T160446Z
LAST-MODIFIED:20230629T160646Z
UID:10000396-1687780800-1687784400@datascience.ucsd.edu
SUMMARY:The Emergence of General AI for Medicine | Dr. Peter Lee
DESCRIPTION:Dr. Peter Lee is Corporate Vice President of Research and Incubations at Microsoft\, where he leads Microsoft Research and incubates new research-powered products and lines of business in areas such as artificial intelligence\, computing foundations\, health\, and life sciences. He speaks and writes widely on science and technology trends\, including the attached NEJM article “Benefits\, Limits\, and Risks of GPT-4 as an AI Chatbot for Medicine” and recently published a book with Dr. Isaac Kohane\, “The AI Revolution in Medicine: GPT-4 and Beyond.” \nBefore joining Microsoft in 2010\, he was at DARPA\, where he established a new technology office that created operational capabilities in machine learning\, data science\, and computational social science. Prior to that\, he was a professor and the head of the computer science department at Carnegie Mellon University. Dr. Lee is a member of the National Academy of Medicine and serves on the Boards of Directors of the Allen Institute for Artificial Intelligence\, the Brotman Baty Institute for Precision Medicine\, and the Kaiser Permanente Bernard J. Tyson School of Medicine. He served on President Obama’s Commission on Enhancing National Cybersecurity and led studies for PCAST and the National Academies. He has testified before both the US House Science and Technology Committee and the US Senate Commerce Committee.
URL:https://datascience.ucsd.edu/event/the-emergence-of-general-ai-for-medicine-dr-peter-lee/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Guest Lecture
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230531T140000
DTEND;TZID=America/Los_Angeles:20230531T153000
DTSTAMP:20260528T112927
CREATED:20230530T153213Z
LAST-MODIFIED:20230530T153213Z
UID:10000390-1685541600-1685547000@datascience.ucsd.edu
SUMMARY:On the complexity of Frank-Wolfe methods
DESCRIPTION:Abstract: Frank-Wolfe methods are popular for optimization over a polytope. One reason is that they do not need projection onto the polytope\, only linear optimization over it. This talk has two parts. \nThe first part will be about the complexity of Wolfe’s method\, an algorithm closely related to Frank-Wolfe methods. In 1974 Philip Wolfe proposed a method to find the minimum Euclidean-norm point in a convex polyhedron. The method is essentially the same as the Lawson-Hanson algorithm for non-negative least squares. The complexity of Wolfe’s method has remained unknown since he proposed it. The method is important because it is used as a subroutine for one of the most practical algorithms for submodular function minimization. We present the first example on which Wolfe’s method takes exponential time. Additionally\, we improve previous results to show that linear programming reduces in strongly-polynomial time to the minimum norm point problem over a simplex. \nThe second part will be about the smoothed complexity of Frank-Wolfe methods. To understand their complexity\, a fruitful approach in many works has been the use of condition measures of polytopes. Lacoste-Julien and Jaggi introduced a condition number for polytopes and showed linear convergence for several variations of the method. The actual running time can still be exponential in the worst case (when the condition number is exponential). We study the smoothed complexity of the condition number\, namely the condition number of small random perturbations of the input polytope\, and show that it is polynomial for any simplex and exponential for general polytopes. Our argument for polytopes is a refinement of an argument that we develop to study the conditioning of random matrices. The basic argument shows that for c > 1\, a d-by-n random Gaussian matrix with n >= cd has a d-by-d submatrix with minimum singular value that is exponentially small with high probability. This also has consequences for known results about the robust uniqueness of tensor decompositions\, the complexity of the simplex method\, and the diameter of polytopes.
URL:https://datascience.ucsd.edu/event/on-the-complexity-of-frank-wolfe-methods/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230530T140000
DTEND;TZID=America/Los_Angeles:20230530T153000
DTSTAMP:20260528T112927
CREATED:20230530T153018Z
LAST-MODIFIED:20230530T153018Z
UID:10000389-1685455200-1685460600@datascience.ucsd.edu
SUMMARY:Representation Learning: A Causal Perspective
DESCRIPTION:Abstract: Representation learning constructs low-dimensional representations to summarize essential features of high-dimensional data like images and texts. Ideally\, such a representation should efficiently capture non-spurious features of the data. It should also be disentangled so that we can interpret what feature each of its dimensions captures. However\, these desiderata are often intuitively defined and challenging to quantify or enforce. \nIn this talk\, we take a causal perspective on representation learning. We show how desiderata of representation learning can be formalized using counterfactual notions\, enabling metrics and algorithms that target efficient\, non-spurious\, and disentangled representations of data. We discuss the theoretical underpinnings of the algorithm and illustrate its empirical performance in both supervised and unsupervised representation learning.
URL:https://datascience.ucsd.edu/event/representation-learning-a-causal-perspective/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Colloquium
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230518T140000
DTEND;TZID=America/Los_Angeles:20230518T150000
DTSTAMP:20260528T112927
CREATED:20230323T181111Z
LAST-MODIFIED:20230513T000759Z
UID:10000364-1684418400-1684422000@datascience.ucsd.edu
SUMMARY:Scaling and Generalizing Approximate Bayesian Inference | David Blei
DESCRIPTION:Abstract: A core problem in statistics and machine learning is to approximate difficult-to-compute probability distributions. This problem is especially important in Bayesian statistics\, which frames all inference about unknown quantities as a calculation about a conditional distribution. In this talk I review and discuss innovations in variational inference (VI)\, a method that approximates probability distributions through optimization. VI has been used in myriad applications in machine learning and Bayesian statistics. It tends to be faster than more traditional methods\, such as Markov chain Monte Carlo sampling. \nAfter quickly reviewing the basics\, I will discuss two lines of research in VI. I first describe stochastic variational inference\, an approximate inference algorithm for handling massive datasets\, and demonstrate its application to probabilistic topic models of millions of articles. Then I discuss black box variational inference\, a generic algorithm for approximating the posterior. Black box inference easily applies to many models but requires minimal mathematical work to implement. I will demonstrate black box inference on deep exponential families\, a method for Bayesian deep learning\, and describe how it enables powerful tools for probabilistic programming. \n  \nBio: David Blei is a Professor of Statistics and Computer Science at Columbia University\, and a member of the Columbia Data Science Institute. He studies probabilistic machine learning\, including its theory\, algorithms\, and application. David has received several awards for his research. He received a Sloan Fellowship (2010)\, Office of Naval Research Young Investigator Award (2011)\, Presidential Early Career Award for Scientists and Engineers (2011)\, Blavatnik Faculty Award (2013)\, ACM-Infosys Foundation Award (2013)\, a Guggenheim fellowship (2017)\, and a Simons Investigator Award (2019). He is the co-editor-in-chief of the Journal of Machine Learning Research. He is a fellow of the ACM and the IMS. \nWebsite: http://www.cs.columbia.edu/~blei/ \n  \nZoom Link: http://bit.ly/HDSI-Seminars
URL:https://datascience.ucsd.edu/event/david-blei/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Colloquium,Guest Lecture
ATTACH;FMTTYPE=image/jpeg:https://datascience.ucsd.edu/wp-content/uploads/2023/03/professordavisblei_headshot-scaled.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230510T140000
DTEND;TZID=America/Los_Angeles:20230510T153000
DTSTAMP:20260528T112927
CREATED:20230509T195847Z
LAST-MODIFIED:20230510T170811Z
UID:10000385-1683727200-1683732600@datascience.ucsd.edu
SUMMARY:Proximal MCMC for Bayesian Inference of Constrained and Regularized Estimation
DESCRIPTION:Abstract: In this talk I will introduce some extensions to proximal Markov chain Monte Carlo (Proximal MCMC)\, a flexible and general Bayesian inference framework for constrained or regularized parametric estimation. The basic idea of Proximal MCMC is to approximate nonsmooth regularization terms via the Moreau-Yosida envelope. Initial proximal MCMC strategies\, however\, fixed nuisance and regularization parameters as constants\, and relied on the Langevin algorithm for the posterior sampling. We extend Proximal MCMC to the full Bayesian framework with modeling and data-adaptive estimation of all parameters including regularization parameters. More efficient sampling algorithms such as Hamiltonian Monte Carlo are employed to scale Proximal MCMC to high-dimensional problems. Our proposed Proximal MCMC offers a versatile and modularized procedure for the inference of constrained and non-smooth problems that is mostly free of tuning parameters. We illustrate its utility on various statistical estimation and machine learning tasks.
URL:https://datascience.ucsd.edu/event/proximal-mcmc-for-bayesian-inference-of-constrained-and-regularized-estimation/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230503T140000
DTEND;TZID=America/Los_Angeles:20230503T153000
DTSTAMP:20260528T112927
CREATED:20230501T161933Z
LAST-MODIFIED:20230501T161933Z
UID:10000382-1683122400-1683127800@datascience.ucsd.edu
SUMMARY:Security and Privacy in an Everchanging System Landscape
DESCRIPTION:Abstract: From AI and IoT to AR/VR and Web 3.0\, computer systems are evolving at an unprecedented rate. While this evolution has given rise to exciting applications and opportunities\, it has also brought about novel security and privacy challenges within these systems and across their interactions with existing platforms. In this talk\, I will discuss how system security researchers can keep up with this everchanging landscape and showcase some of my lab’s recent work on understanding and detecting malicious web bots. I will explore how we can build and roll out research infrastructure to measure web bot activities and later use our newfound understanding to develop practical solutions to counter them. I will highlight how we can apply similar research principles to areas such as AI and IoT. Finally\, I will conclude my talk by previewing some of my ongoing work and outlining my research roadmap toward achieving “security at inception” for emerging systems. \nBio: Amir Rahmati is an Assistant Professor in the Department of Computer Science at Stony Brook University\, where he leads the Ethos Security & Privacy lab. He received his Ph.D. in Computer Science & Engineering from the University of Michigan in 2017. His research focuses on understanding emerging threats in computer systems and building practical solutions that can tackle their security and privacy challenges. His work has resulted in tens of publications and patents\, as well as thousands of citations. Rahmati’s research is supported by the Air Force Office of Scientific Research (AFOSR)\, Office of Naval Research (ONR)\, Meta\, and IBM. His research has received frequent attention from media outlets\, including MIT Technology Review\, Washington Post\, and Bloomberg. His work on the security of autonomous driving systems is part of the permanent display at the London Science Museum.
URL:https://datascience.ucsd.edu/event/security-and-privacy-in-an-everchanging-system-landscape/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230425T140000
DTEND;TZID=America/Los_Angeles:20230425T140000
DTSTAMP:20260528T112927
CREATED:20230424T213937Z
LAST-MODIFIED:20230424T214101Z
UID:10000381-1682431200-1682431200@datascience.ucsd.edu
SUMMARY:Leveraging Simulators for ML Inference in Particle Physics
DESCRIPTION:Abstract: The field of research investigating machine-learning (ML) methods that can exploit a physical model of the world through simulators is rapidly growing\, particularly for applications in particle physics. While these methods have shown considerable promise in phenomenological studies\, they are also known to be susceptible to inaccuracies in the simulators used to train them. In this work\, we design a novel analysis strategy that uses the concept of simulation-based inference for a crucial Higgs Boson measurement\, where traditional methods are rendered sub-optimal due to quantum interference between Higgs and non-Higgs processes. Our work develops uncertainty quantification methods that account for the impact of inaccuracies in the simulators\, uncertainties in the ML predictions themselves\, and novel strategies to test the coverage of these quoted uncertainties. These new ML methods leverage the vast computational resources that have recently become available to perform scientific measurements in a way that was not feasible before. In addition\, this talk briefly discusses certain ML-bias-mitigation methods developed in particle physics and their potential wider applications.\nBio: Dr. Aishik Ghosh is a postdoctoral scholar at UC Irvine and Berkeley National Lab where he develops innovative machine learning solutions for particle physics\, and is part of the ATLAS collaboration. He earned his Ph.D. from the University of Paris-Saclay where he developed the first deep generative models for fast calorimeter simulation in the ATLAS experiment. Since then he has worked on several topics at the intersection of ML and uncertainty quantification and uncertainty mitigation\, including applications in astrophysics\, as well as generative models for physics simulation. Recently\, he has been working on reinforcement learning methods for particle physics. Dr. Ghosh has fostered interdisciplinary collaborations within academia and with industry. He has contributed to a book on Artificial Intelligence for High Energy Physics and organises ML training schools for graduate students. Dr. Ghosh consults on AI policy with international organisations like the OECD\, with whom he has published writings on Trustworthy AI and AI for Science\, and has given interviews to organisations like The Royal Society\,
URL:https://datascience.ucsd.edu/event/leveraging-simulators-for-ml-inference-in-particle-physics/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230419T100000
DTEND;TZID=America/Los_Angeles:20230419T130000
DTSTAMP:20260528T112927
CREATED:20230403T185835Z
LAST-MODIFIED:20230407T172914Z
UID:10000371-1681898400-1681909200@datascience.ucsd.edu
SUMMARY:Chatting GPT
DESCRIPTION:Artificial Intelligence (AI) systems have made astonishing progress in the last year. In particular\, Large Language Models (LLMs)\, AI systems trained on massive amounts of text\, have reached a surprising level of capability\, with the most recent iterations able to write essays\, poems\, and computer code\, and score near the 90th percentile on standardized tests such as the LSAT and the Math SAT. The most popular interface to this technology\, ChatGPT\, made the power of LLMs readily available to the general public for the first time\, and in doing so became the fastest-growing consumer application in history. It is clear that ChatGPT and other LLMs will have major impacts on how we work\, learn\, and live\, and there is a sense that we have only seen the tip of the iceberg in terms of what these technologies can do. \nIn this series of talks and panels\, targeted to the campus community and open to the general public\, UCSD experts will discuss ChatGPT and other generative artificial intelligence: What is it? How does it work? What are its ethical implications? And what impacts will it have on fields such as medicine\, business\, and education?
URL:https://www.sdsc.edu/event_items/202304-ChatGPT.html
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Symposium,Webinar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2023/04/UCSD-Lecture-Template_Chatting-GPT-e1680886379636.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230417T140000
DTEND;TZID=America/Los_Angeles:20230417T153000
DTSTAMP:20260528T112927
CREATED:20230413T234714Z
LAST-MODIFIED:20230413T234959Z
UID:10000380-1681740000-1681745400@datascience.ucsd.edu
SUMMARY:Beyond classification: using Machine Learning to probe new physics with the ATLAS experiment in “impossible” final states
DESCRIPTION:Abstract: Although the discovery of the Higgs Boson is often referred to as the completion of the Standard Model of Particle Physics\, the many outstanding mysteries of our universe indicate that some unknown new physics is awaiting discovery. Machine learning has played an increasingly critical role in searching for this new physics\, typically by better separating a physical process of interest (signal) from other Standard Model processes producing similar detector signatures (background). However\, we can also cleverly utilize machine learning to better understand these background processes\, opening up “impossible” regions of data for analysis. In this talk\, I will present two examples of analyses from the ATLAS experiment utilizing machine learning to tackle especially challenging backgrounds. I will also discuss how future advances in machine learning in both data analysis and particle detector hardware will continue to open new avenues for probing for new physics.\n\n\nBio: Dr. Rachel Hyneman is currently a postdoctoral researcher working with Dr. Michael Kagan at SLAC National Accelerator Laboratory\, where she studies physics at the smallest scales as part of the ATLAS Experiment at CERN. Her research has focused on taking advantage of machine learning techniques to search for evidence of new physics hiding in the behavior of the Higgs Boson\, as well as developing the construction procedure and readout of the upgraded ATLAS Inner Tracker detector for the High-Luminosity LHC program. She earned her PhD in physics from the University of Michigan\, Ann Arbor\, under the supervision of Dr. Tom Schwarz. Prior to her graduate studies\, she earned her bachelor’s degree in physics with a minor in music from the College of William and Mary in Virginia. Outside of physics\, Rachel enjoys playing double bass and venturing to mountains for hiking and skiing.\n\nZoom Info: http://bit.ly/HDSI-Seminars
URL:https://datascience.ucsd.edu/event/beyond-classification-using-machine-learning-to-probe-new-physics-with-the-atlas-experiment-in-impossible-final-states/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230411T140000
DTEND;TZID=America/Los_Angeles:20230411T153000
DTSTAMP:20260528T112927
CREATED:20230302T000628Z
LAST-MODIFIED:20240402T224727Z
UID:10000351-1681221600-1681227000@datascience.ucsd.edu
SUMMARY:Responsible AI: Privacy and Fairness in Decision and Learning Systems
DESCRIPTION:Differential Privacy has become the go-to approach for protecting sensitive information in data releases and learning tasks that are used for critical decision processes. For example\, census data is used to allocate funds and distribute benefits\, while several corporations use machine learning systems for criminal assessments\, hiring decisions\, and more. While this privacy notion provides strong guarantees\, we will show that it may also induce biases and fairness issues in downstream decision processes. These issues may adversely affect many individuals’ health\, well-being\, and sense of belonging\, and are currently poorly understood. \nIn this talk\, we delve into the intersection of privacy\, fairness\, and decision processes\, with a focus on understanding and addressing these fairness issues. We first provide an overview of Differential Privacy and its applications in data release and learning tasks. Next\, we examine the societal impacts of privacy through a fairness lens and present a framework to illustrate what aspects of the private algorithms and/or data may be responsible for exacerbating unfairness. We then show how to extend this framework to assess the disparate impacts arising in Machine Learning tasks. Finally\, we propose a path to partially mitigate these fairness issues and discuss grand challenges that require further exploration. \nBio: Ferdinando Fioretto is an assistant professor at Syracuse University. He works at the juncture of Machine Learning\, optimization\, privacy\, and ethics focusing on two themes: (1) Responsible AI: it analyzes the equity of AI systems in support of decision-making and learning tasks and designs algorithms that better align with societal values and (2) ML for Science and Engineering: it develops the foundation to blend deep learning with mathematical optimization to enable the integration of knowledge\, constraints\, and physical principles into learning models. \nHe is a recipient of the 2022 NSF CAREER award\, the 2022 Amazon Research Award\, the 2022 Google Research Scholar Award\, the 2022 Caspar Bowden PET award\, the 2021 ISSNAF Mario Gerla Young Investigator Award\, the 2021 ACP Early Career Researcher Award\, the 2017 AI*IA Best AI dissertation award\, and several best paper awards. He is also actively involved in the organization of several events\, including the Privacy-Preserving Artificial Intelligence workshop at AAAI\, the Algorithmic Fairness through the lens of Causality and Privacy at NeurIPS\, and the Optimization and Learning in multiagent systems workshop at AAMAS.
URL:https://datascience.ucsd.edu/event/ferdinando-nando-fioretto/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2023/03/Ferdinando-Fioretto-e1680886080944.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230410T120000
DTEND;TZID=America/Los_Angeles:20230410T130000
DTSTAMP:20260528T112927
CREATED:20230406T192237Z
LAST-MODIFIED:20230407T165127Z
UID:10000375-1681128000-1681131600@datascience.ucsd.edu
SUMMARY:Language Models and Human Language Acquisition
DESCRIPTION:Abstract: Children have a remarkable ability to acquire language. This propensity has been an object of fascination in science for millennia\, but in just the last few years\, neural language models (LMs) have also proven to be incredibly adept at learning human language. In this talk\, I discuss scientific progress that uses recent developments in natural language processing to advance linguistics—and vice-versa. My research explores this intersection from three angles: evaluation\, experimentation\, and engineering. Using linguistically motivated benchmarks\, I provide evidence that LMs share many aspects of human grammatical knowledge and probe how this knowledge varies across training regimes. I further argue that—under the right circumstances—we can use LMs to test hypotheses that have been difficult or impossible to evaluate with human subjects. Such experiments have the potential to transform debates about the roles of nature and nurture in human language learning. As a proof of concept\, I describe a controlled experiment examining how the distribution of linguistic phenomena in the input affects syntactic generalization. While the results suggest that the linguistic stimulus may be richer than often thought\, there is no avoiding the fact that current LMs and humans learn language in vastly different ways. I describe ongoing work to engineer learning environments and objectives for LM pretraining inspired by human development\, with the goal of making LMs more data efficient and more plausible models of human learning.\n \nBio: Alex Warstadt is a postdoc in the computer science department at ETH Zürich working with Ryan Cotterell. In 2022\, he completed a PhD in linguistics at New York University supervised by Sam Bowman. Alex works on a variety of topics at the intersection of natural language processing and linguistics\, including language model pretraining\, evaluation and interpretability\, language acquisition\, and pragmatics.
URL:https://datascience.ucsd.edu/event/language-models-and-human-language-acquisition/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2023/04/Alex-Warstadt-e1680886271585.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230406T140000
DTEND;TZID=America/Los_Angeles:20230406T150000
DTSTAMP:20260528T112927
CREATED:20230302T000631Z
LAST-MODIFIED:20240402T224726Z
UID:10000354-1680789600-1680793200@datascience.ucsd.edu
SUMMARY:Intelligent mobile systems for equitable healthcare
DESCRIPTION:Access to even basic medical resources is greatly influenced by factors like an individual’s birth country and zip code. In this talk\, I will present my work on designing AI-based mobile systems for equitable healthcare. I will showcase three systems that are not only interesting from an AI standpoint but are also having real-world medical impact. The first system can detect ear infections using only a smartphone and a paper cone. The second system enables low-cost newborn hearing screening using inexpensive earphones. Lastly\, I will present an ambient sensing system that employs smart devices to detect emergent and life-threatening medical events such as cardiac arrest. Through these examples\, I will demonstrate how new applied machine learning and sensing approaches that generalize across hardware and work in real-world environments can help to address pressing societal problems. \nBio: Justin Chan is a Ph.D. candidate at the Paul G. Allen School of Computer Science and Engineering at the University of Washington. His work on smartphone-based ear infection detection is now FDA-listed and is available to select early access healthcare systems. His work on newborn hearing screening has led to an international effort called TUNE with the goal of bringing universal newborn hearing screening across Kenya as well as collaborations with NGOs such as the Global Foundation for Children with Hearing Loss to deploy this technology in Nepal and Mongolia. His work on contactless cardiac arrest detection has been licensed to a startup which has recently been acquired by Google. He was also a lead contributor for CovidSafe (now WA Notify)\, a COVID-19 contact tracing and symptom tracking app\, which became part of official efforts by the WA Department of Health to manage the pandemic. He has authored publications in interdisciplinary journals like Nature Biomedical Engineering\, Science Translational Medicine\, and Nature Communications\, as well as Computer Science and Engineering venues like MobiSys\, MobiCom\, SIGCOMM\, SIGGRAPH Asia\, and UIST.
URL:https://datascience.ucsd.edu/event/justin-chan/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Guest Lecture,Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230406T110000
DTEND;TZID=America/Los_Angeles:20230406T123000
DTSTAMP:20260528T112927
CREATED:20230323T182059Z
LAST-MODIFIED:20240402T224726Z
UID:10000369-1680778800-1680784200@datascience.ucsd.edu
SUMMARY:Acceleration in Optimization\, Sampling\, and Machine Learning
DESCRIPTION:Optimization\, sampling\, and machine learning are essential components of data science. In this talk\, I will cover my work on accelerated methods in these fields and highlight some connections between them. \nIn optimization\, I will present optimization as a two-player zero-sum game\, which is a modular approach for designing and analyzing convex optimization algorithms by pitting a pair of no-regret learning strategies against each other. This approach not only recovers several existing algorithms but also gives rise to new ones. I will also discuss the use of Heavy Ball\, a popular momentum method in deep learning\, in non-convex optimization. Despite its success in practice\, Heavy Ball currently lacks theoretical evidence for its acceleration in non-convex optimization. To bridge this gap\, I will present some non-convex problems where Heavy Ball exhibits provable acceleration guarantees. \nIn sampling\, I will describe how to accelerate a classical sampling method called Hamiltonian Monte Carlo by setting its integration time appropriately\, which builds on a connection between sampling and optimization. In machine learning\, I will talk about Gradient Descent with pseudo-labels for fast test-time adaptation in the context of tackling distribution shifts. \nBio: Jun-Kun Wang is a postdoctoral researcher in the Department of Computer Science at Yale University\, working with Dr. Andre Wibisono. He received his Ph.D. in Computer Science from the Georgia Institute of Technology in 2021\, advised by Dr. Jacob Abernethy. He earned an MS in Communication Engineering and a BS in Electrical Engineering from National Taiwan University. His research interests are in the theoretical and algorithmic foundations of optimization\, sampling\, and machine learning.
URL:https://datascience.ucsd.edu/event/jun-kun-wang/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230405T140000
DTEND;TZID=America/Los_Angeles:20230405T153000
DTSTAMP:20260528T112927
CREATED:20230323T181933Z
LAST-MODIFIED:20230403T190338Z
UID:10000368-1680703200-1680708600@datascience.ucsd.edu
SUMMARY:Constrained\, Causal\, and Logical Reasoning for Neural Language Generation
DESCRIPTION:Today’s language models (LMs) can produce human-like fluent text. However\, they generate words with no grounding in the world and cannot flexibly reason about everyday situations and events\, such as counterfactual (“what if?”) and abductive (“what might explain these observations?”) reasoning\, which are important forms of human cognition. In this talk\, I will present my research on connecting reasoning with language generation. Reasoning for language generation poses several key challenges\, including incorporating diverse contextual constraints on the fly\, understanding cause and effect when events unfold\, and grounding on logic structures for consistent reasoning. I will first discuss COLD decoding\, a unified energy-based framework for any off-the-shelf LM to reason with arbitrary constraints. It also introduces differentiable reasoning over discrete symbolic text for improved efficiency. Secondly\, I will focus on a particularly important form of reasoning\, counterfactual reasoning\, including its first formulation in language generation and our algorithm\, DeLorean\, that enables off-the-shelf LMs to capture causal invariance. Thirdly\, I will present Maieutic prompting\, which improves the logical consistency of neural reasoning by integrating with logic structures. I will conclude with future research toward more general\, grounded\, and trustworthy reasoning with language. \nBio: Lianhui Qin is a final-year PhD student in the Paul G. Allen School of Computer Science & Engineering at the University of Washington\, advised by Prof. Yejin Choi. Her research interests lie in natural language processing\, artificial intelligence\, and machine learning\, with a particular focus on natural language reasoning and generation. Her research has been recognized with the Best Paper Award at NAACL 2022\, the Best Paper Award at WeCNLP 2020\, a Best Demo Paper Nomination at ACL 2019\, as well as a Microsoft Research PhD Fellowship.
URL:https://datascience.ucsd.edu/event/lianhui-qin/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230404T140000
DTEND;TZID=America/Los_Angeles:20230404T153000
DTSTAMP:20260528T112927
CREATED:20230323T181718Z
LAST-MODIFIED:20230403T190229Z
UID:10000367-1680616800-1680622200@datascience.ucsd.edu
SUMMARY:Lijun Ding
DESCRIPTION:
URL:https://datascience.ucsd.edu/event/lijun-ding/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230403T140000
DTEND;TZID=America/Los_Angeles:20230403T153000
DTSTAMP:20260528T112927
CREATED:20230323T181509Z
LAST-MODIFIED:20240402T224726Z
UID:10000366-1680530400-1680535800@datascience.ucsd.edu
SUMMARY:Frameworks for High Dimensional Optimization
DESCRIPTION:I present frameworks for solving extremely large\, prohibitively massive optimization problems. Today\, practical applications require optimization solvers to work at extreme scales\, but existing solvers often do not scale as desired. I present black-box acceleration algorithms for speeding up optimization solvers\, in both distributed and parallel settings. Given a huge problem\, I develop dimension reduction techniques that allow the problem to be solved in a fraction of the original time\, and simultaneously make the computation amenable to distributed computation. Efficient\, dependable and secure distributed computing is increasingly fundamental to a wide range of core applications including distributed data centers\, decentralized power grids\, coordination of autonomous devices\, and scheduling and routing problems. \nIn particular\, I consider two optimization settings of interest. First\, I consider packing linear programming (LP). LP solvers are fundamental to many problems in supply chain management\, routing\, learning\, and inference. I present a framework that speeds up linear programming solvers such as Cplex and Gurobi by an order of magnitude\, while maintaining provably nearly optimal solutions. Secondly\, I present a distributed algorithm that achieves an exponential reduction in message complexity compared to existing distributed methods. I present both empirical demonstrations and theoretical guarantees on the quality of the solution and the speedup provided by my methods.
URL:https://datascience.ucsd.edu/event/palma-london/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230329T140000
DTEND;TZID=America/Los_Angeles:20230329T153000
DTSTAMP:20260528T112927
CREATED:20230302T000631Z
LAST-MODIFIED:20230323T175902Z
UID:10000353-1680098400-1680103800@datascience.ucsd.edu
SUMMARY:Distance-Estimation in Modern Graphs: Algorithms and Impossibility | Nicole Wein
DESCRIPTION:The size and complexity of today’s graphs present challenges that necessitate the discovery of new algorithms. One central area of research in this endeavor is computing and estimating distances in graphs. In this talk I will discuss two fundamental families of distance problems in the context of modern graphs: Diameter/Radius/Eccentricities and Hopsets/Shortcut Sets. The best-known algorithm for computing the diameter (largest distance) of a graph is the naive algorithm of computing all-pairs shortest paths and returning the largest distance. Unfortunately\, this can be prohibitively slow for massive graphs. Thus\, it is important to understand how fast and how accurately the diameter of a graph can be approximated. I will present tight bounds for this problem via conditional lower bounds from fine-grained complexity. Secondly\, for a number of settings relevant to modern graphs (e.g. parallel algorithms\, streaming algorithms\, dynamic algorithms)\, distance computation is more efficient when the input graph has low hop-diameter. Thus\, a useful preprocessing step is to add a set of edges (a hopset) to the graph that reduces the hop-diameter of the graph\, while preserving important distance information. I will present progress on upper and lower bounds for hopsets. \n  \nBio: Nicole Wein is a Simons Postdoctoral Leader at DIMACS at Rutgers University. Previously\, she obtained her Ph.D. from MIT advised by Virginia Vassilevska Williams. She is a theoretical computer scientist and her research interests include graph algorithms and lower bounds including in the areas of distance-estimation algorithms\, dynamic algorithms\, and fine-grained complexity.
URL:https://datascience.ucsd.edu/event/nicole-wein/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;VALUE=DATE:20230323
DTEND;VALUE=DATE:20230325
DTSTAMP:20260528T112928
CREATED:20230323T180908Z
LAST-MODIFIED:20230712T092822Z
UID:10000363-1679529600-1679702399@datascience.ucsd.edu
SUMMARY:PhD Open House
DESCRIPTION:HDSI will host its first-ever in-person DSC Open House for Prospective PhD Students (Mar 23 + 24).  The event will include a Graduate Student Poster Session and will showcase HDSI's diverse research activities. Prospective PhD students are invited. \n  \nPlease contact Laura Horton (lkhorton@ucsd.edu) for event details.
URL:https://datascience.ucsd.edu/event/phd-open-house/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Student Event
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230320T140000
DTEND;TZID=America/Los_Angeles:20230320T140000
DTSTAMP:20260528T112928
CREATED:20230302T000631Z
LAST-MODIFIED:20230317T154618Z
UID:10000352-1679320800-1679320800@datascience.ucsd.edu
SUMMARY:Sampling from Graphical Models via Spectral Independence | Zongchen Chen
DESCRIPTION:Abstract: In many scientific settings we use a statistical model to describe a high-dimensional distribution over many variables. Such models are often represented as a weighted graph encoding the dependencies between different variables and are known as graphical models. Graphical models arise in a wide variety of fields throughout science and engineering.\nOne fundamental task for graphical models is to generate random samples from the associated distribution. The Markov chain Monte Carlo (MCMC) method is one of the simplest and most popular approaches to tackle such problems. Despite the popularity of graphical models and MCMC algorithms\, theoretical guarantees of their performance are not known even for some simple models. I will describe a new tool called “spectral independence” to analyze MCMC algorithms and more importantly to reveal the underlying structure behind such models. I will also discuss how these structural properties can be applied to sampling when MCMC fails and to other statistical problems like parameter learning or model fitting. \n\nBio: Zongchen Chen is an instructor (postdoc) in Mathematics at MIT. He received his PhD degree in Algorithms\, Combinatorics and Optimization (ACO) at Georgia Tech in 2021\, advised by Eric Vigoda. His thesis received the 2021 Georgia Tech College of Computing Outstanding Doctoral Dissertation Award. He received his BS degree in Mathematics & Applied Mathematics from Zhiyuan College at Shanghai Jiao Tong University in 2016. He is broadly interested in randomized algorithms\, discrete probability\, and machine learning. His current research interests include Markov chain Monte Carlo (MCMC) methods\, approximate counting and sampling\, and learning and testing for high-dimensional distributions.
URL:https://datascience.ucsd.edu/event/zongchen-chen/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230316T140000
DTEND;TZID=America/Los_Angeles:20230316T153000
DTSTAMP:20260528T112928
CREATED:20230315T154024Z
LAST-MODIFIED:20230315T154024Z
UID:10000360-1678975200-1678980600@datascience.ucsd.edu
SUMMARY:Optimal methods for reinforcement learning: Efficient algorithms with instance-dependent guarantees | Wenlong Mou
DESCRIPTION:Abstract: Reinforcement learning (RL) is a pillar of modern artificial intelligence. Compared to classical statistical learning\, several new statistical and computational phenomena arise from RL problems\, leading to different trade-offs in the choice of the estimators\, tuning of their parameters\, and the design of efficient algorithms. In many settings\, asymptotic and/or worst-case theory fails to provide the relevant guidance.\nIn this talk\, I present recent advances that involve a more refined approach to RL\, one that leads to non-asymptotic and instance-optimal guarantees. The bulk of this talk focuses on function approximation methods for policy evaluation. I establish a novel class of optimal and instance-dependent oracle inequalities for projected Bellman equations\, as well as efficient computational algorithms achieving them. Among other results\, I will highlight how the instance-optimal guarantees guide the selection of tuning parameters in temporal difference methods\, and tackle the instability issue with general function classes. Drawing on this perspective\, I will also discuss a novel class of stochastic approximation methods that yield optimal statistical guarantees for policy optimization problems. \nBio: Wenlong Mou is a Ph.D. candidate at the Department of EECS\, UC Berkeley\, advised by Martin Wainwright and Peter Bartlett. Prior to Berkeley\, he received his B.Sc. degree in Computer Science from Peking University. Wenlong’s research interests include statistics\, machine learning theory\, dynamic programming and optimization\, and applied probability. He is particularly interested in designing optimal statistical methods that enable optimal data-driven decision making\, powered by efficient computational algorithms.
URL:https://datascience.ucsd.edu/event/optimal-methods-for-reinforcement-learning-efficient-algorithms-with-instance-dependent-guarantees-wenlong-mou/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230306T150000
DTEND;TZID=America/Los_Angeles:20230306T150000
DTSTAMP:20260528T112928
CREATED:20230302T000631Z
LAST-MODIFIED:20230302T183320Z
UID:10000355-1678114800-1678114800@datascience.ucsd.edu
SUMMARY:Enrique Zuazua | Control and Machine Learning
DESCRIPTION:In this lecture we shall present some recent results on the interplay between control and Machine Learning\, and more precisely\, Supervised Learning and Universal Approximation. We adopt the perspective of the simultaneous or ensemble control of systems of Residual Neural Networks (ResNets). Roughly\, each item to be classified corresponds to a different initial datum for the Cauchy problem of the ResNets\, leading to an ensemble of solutions to be driven to the corresponding targets\, associated with the labels\, by means of the same control. We present a genuinely nonlinear and constructive method\, allowing us to show that such an ambitious goal can be achieved\, estimating the complexity of the control strategies. This property is rarely fulfilled by the classical dynamical systems in Mechanics and the very nonlinear nature of the activation function governing the ResNet dynamics plays a determinant role. It allows deforming half of the phase space while the other half remains invariant\, a property that classical models in mechanics do not fulfill. The turnpike property is also analyzed in this context\, showing that a suitable choice of the cost functional used to train the ResNet leads to more stable and robust dynamics. This lecture is inspired by joint work\, among others\, with Borjan Geshkovski (MIT)\, Carlos Esteve (Cambridge)\, Domènec Ruiz-Balet (IC\, London) and Dario Pighin (Sherpa.ai). \nBio \nEnrique Zuazua Iriondo (Eibar\, Basque Country – Spain\, 1961) holds a Chair of Dynamics\, Control and Numerics – Alexander von Humboldt Professorship at FAU – Friedrich–Alexander University\, Erlangen–Nürnberg (Germany). He also holds secondary appointments as Professor of Applied Mathematics (UAM) and Director of CCM – Chair of Computational Mathematics (Deusto). \nHis research in the area of Applied Mathematics covers topics in Partial Differential Equations\, Systems Control\, Numerical Analysis and Machine Learning\, and led to fruitful collaborations in different industrial sectors such as the optimal shape design in aeronautics\, the management of electrical and water distribution networks and the design of recommendation systems. His research has had a high impact (h-index 46) and he has mentored a significant number of postdoctoral researchers and coached a wide network of Science managers. \nHe holds a degree in Mathematics from the University of the Basque Country\, and a dual PhD degree from the same university (1987) and the Université Pierre et Marie Curie\, Paris (1988). In 1990 he became Professor of Applied Mathematics at the Complutense University of Madrid\, to later move to UAM in 2001. He has been awarded the Euskadi (Basque Country) Prize for Science and Technology 2006 and the Spanish National Julio Rey Pastor Prize 2007 in Mathematics and Information and Communication Technology\, the Advanced Grants NUMERIWAVES in 2010 and DyCon in 2016 of the European Research Council (ERC) and the SIAM W.T. and Idalia Reid Prize 2022. \nHe is an Honorary member of Academia Europaea and Jakiunde\, the Basque Academy of Sciences\, Letters and Humanities\, Doctor Honoris Causa from the Université de Lorraine in France and Ambassador of the Friedrich-Alexander University in Erlangen-Nürnberg\, Germany. He was an invited speaker at ICM2006 in the section on Control and Optimization. \nFrom 1999 to 2002 he was the first Scientific Manager of the Panel for Mathematics within the Spanish National Research Plan and the Founding Scientific Director of the BCAM – Basque Center for Applied Mathematics from 2008 to 2012. He is also a member of the Scientific Council of a number of international research institutions such as the INSMI-CNRS and CERFACS in France and member of the Editorial Board in some of the leading journals in Applied Mathematics and Control Theory.
URL:https://datascience.ucsd.edu/event/enrique-zuazua/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/jpeg:https://datascience.ucsd.edu/wp-content/uploads/2023/03/enriquezuazuairiondo_headshot.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20230306T100000
DTEND;TZID=America/Los_Angeles:20230306T100000
DTSTAMP:20260528T112928
CREATED:20230302T000627Z
LAST-MODIFIED:20230303T163910Z
UID:10000347-1678096800-1678096800@datascience.ucsd.edu
SUMMARY:Towards the Statistically Principled Design of ML Algorithms | Frederic Koehler
DESCRIPTION:What are the optimal algorithms for learning from data? Have we found them already\, or are better ones out there to be discovered? Making these questions precise\, and answering them\, requires taking on the mathematically deep interplay between statistical and computational constraints. It also requires reconciling our theoretical toolbox with surprising new phenomena arising from practice\, which seem to violate conventional rules of thumb regarding algorithm and model design. I will discuss progress along these lines: in terms of designing new algorithms for basic learning problems\, controlling generalization in large statistical models\, and understanding key statistical questions for generative modeling. \nBio: Frederic is currently a Motwani Postdoctoral Fellow in the Department of Computer Science at Stanford University. He was previously a research fellow at the Simons Institute\, and before that received his PhD in Mathematics and Statistics.
URL:https://datascience.ucsd.edu/event/frederic-koehler/
LOCATION:SDSC\, The Auditorium\, 9836 Hopkins Dr\, La Jolla\, San Diego\, CA\, United States
CATEGORIES:Guest Lecture
END:VEVENT
END:VCALENDAR