BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Halıcıoğlu Data Science Institute - UC San Diego - ECPv6.16.2//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-ORIGINAL-URL:https://datascience.ucsd.edu
X-WR-CALDESC:Events for Halıcıoğlu Data Science Institute - UC San Diego
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20230312T100000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20231105T090000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20240310T100000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20241103T090000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20250309T100000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20251102T090000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240405T123000
DTEND;TZID=America/Los_Angeles:20240405T140000
DTSTAMP:20260603T122002
CREATED:20240329T001842Z
LAST-MODIFIED:20240329T001950Z
UID:10000468-1712320200-1712325600@datascience.ucsd.edu
SUMMARY:"Advancing NLP for Timely and Actionable Feedback in Healthcare Conversations" | Veronica Perez-Rosas
DESCRIPTION:Abstract: “Effective communication is crucial in healthcare for ensuring successful clinical interactions\, as it affects how patients respond\, the decisions  being made by both patients and clinicians\, and the outcomes of treatments. Recent developments in Natural Language Processing (NLP) aim to improve and support these interactions within clinical settings. In this talk\, I will discuss my research on offering timely and actionable evaluative feedback for mental healthcare interactions\, addressing a crucial bottleneck in effective mental healthcare delivery. I will specifically focus on computational approaches for building conversational systems to aid in psychotherapy training\, and present two NLP tasks to generate language-based feedback: (1) generating counselor responses following established counseling strategies\, and (2) offering alternative rewrites to counseling trainees’ responses to refine their counseling skills. I will conclude the talk by outlining future directions towards my long-term agenda of building computational approaches that understand\, model\, and predict health behaviors while also being human-centric and scalable” \nBio: “Veronica Perez-Rosas is an Assistant Research Scientist at the University of Michigan. She received her Ph.D. in Computer Science and Engineering from the University of North Texas in 2014\, and was a postdoctoral fellow at the University of Michigan until 2016. Her research interests include Natural Language Processing\, Machine Learning\,  Affect Recognition\, and Multimodal Processing of Human Behavior. Her research focuses on developing computational methods to analyze\, recognize\, and predict human behaviors during social interactions. 
She has authored papers in leading conferences and journals in Natural Language Processing and Multimodal Processing\, has mentored numerous students in these research areas\, and has served as workshop chair or area chair for multiple international conferences in the field.”
URL:https://datascience.ucsd.edu/event/advancing-nlp-for-timely-and-actionable-feedback-in-healthcare-conversations-veronica-perez-rosas/
LOCATION:Halıcıoğlu Data Science Institute (HDSI)\, Room 123\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240403T140000
DTEND;TZID=America/Los_Angeles:20240403T153000
DTSTAMP:20260603T122002
CREATED:20240326T221709Z
LAST-MODIFIED:20240329T001345Z
UID:10000462-1712152800-1712158200@datascience.ucsd.edu
SUMMARY:"Contextualized learning for adaptive yet persistent AI in biomedicine" | Ben Lengerich
DESCRIPTION:Abstract: “In biomedical data analysis\, an emerging trend focuses on contextualizing observations within biological and real-world processes. This approach facilitates high-resolution\, context-specific insights by integrating information across datasets\, but it is difficult to design systems which both share information and dynamically adapt to context. Toward this aim\, this presentation will examine “contextualized learning”\, a meta-learning paradigm which learns relationships between dataset context and statistical parameters. Using contextualized network inference as an illustrative example\, I will show how we can estimate context-specific graphical models\, offering insights such as personalized gene expression analysis for SOTA cancer subtyping. The talk will also discuss trends towards “contextualized understanding”\, bridging statistical and foundation models to standardize interpretability. The primary aim is to illustrate how contextualized learning and understanding contribute to creating learning systems that are both adaptive and persistent\, facilitating cross-context information sharing and detailed analysis.” \nBio: “Ben Lengerich is a Postdoctoral Associate and Alana Fellow at MIT’s Computer Science and Artificial Intelligence Lab (CSAIL) and the Broad Institute of MIT and Harvard\, where he is advised by Manolis Kellis. His research in machine learning and computational biology emphasizes the use of context-adaptive models to understand complex diseases and advance precision medicine. Through his work\, Ben aims to bridge the gap between data-driven insights and actionable medical interventions. He holds a PhD in Computer Science and MS in Machine Learning from Carnegie Mellon University\, where he was advised by Eric Xing. 
His work has been recognized with spotlight presentations at conferences including NeurIPS\, ISMB\, AMIA\, and SMFM\, financial support from the Alana Foundation\, selection as a “Rising Star in Data Science” by the University of Chicago and UC San Diego\, and “Next Generation in Biomedicine” by the Broad Institute.”
URL:https://datascience.ucsd.edu/event/special-seminar-ben-lengerich/
LOCATION:Computer Science & Engineering Building (CSE)\, Room 1202
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image_Dark-blue-e1710178042629.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240403T120000
DTEND;TZID=America/Los_Angeles:20240403T133000
DTSTAMP:20260603T122002
CREATED:20240327T215044Z
LAST-MODIFIED:20240401T225533Z
UID:10000466-1712145600-1712151000@datascience.ucsd.edu
SUMMARY:MathWorks & HDSI AI Seminar | Esperanza Linares
DESCRIPTION:HDSI! Come and join MathWorks engineers for a technical seminar on AI (and lunch!) on Wednesday\, April 3! Come learn why data scientists should learn MATLAB – we will highlight the tools that will serve your role as data scientists and data science students. You can also learn about our engineer’s journey\, roles available at MathWorks\, and the use of our tools in industry! \nMathWorks UCSD Technical Seminar Series \nLow-Code AI in MATLAB \nLearn how you can apply AI in your field without extensive knowledge in programming. This hands-on session includes a quick recap on the fundamentals of AI and three exercises where you will learn how to classify human activities using MATLAB® interactive tools and apps: \n1. Accessing and preprocessing data acquired from a mobile device\n2. Applying clustering to the unlabeled data using the Cluster Data Live Editor Task\n3. Classifying the labeled data using two apps: Classification Learner app and the Deep Network Designer app \nAt the end of the seminar\, you will be able to design and train different machine learning and deep learning models without extensive programming knowledge. You will also learn how to automatically generate code from the interactive workflow. This will not only help you to reuse the models without manually going through all the steps but also to learn programming or advance your coding skills. \nAbout the Speaker: \nEsperanza Linares is a Senior Customer Success Engineer at MathWorks. She is part of a global team that partners with academic and research institutions worldwide\, focusing on student and research success. Before joining MathWorks\, she did her postdoctoral work in the pharmaceutical industry\, where she developed a discrete element method model to simulate the compaction of granular materials. She holds a BS in Mechanical Engineering from UNAM (Mexico) and a Ph.D. in Mechanical Engineering from Caltech. 
\nRegistration Link: https://forms.office.com/Pages/ResponsePage.aspx?id=ETrdmUhDaESb3eUHKx3B5tTIy0i-nn1KjKWuEYZzK09UNVNXNFM4NTA3Q045REVJWUNHNjcxUkZSTi4u \n*Lunch will be provided
URL:https://datascience.ucsd.edu/event/mathworks-hdsi-ai-seminar-esperanza-linares/
LOCATION:Halıcıoğlu Data Science Institute (HDSI)\, Room 123\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2023/03/mathworks_logo.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240402T140000
DTEND;TZID=America/Los_Angeles:20240402T153000
DTSTAMP:20260603T122002
CREATED:20240313T191528Z
LAST-MODIFIED:20240313T191528Z
UID:10000459-1712066400-1712071800@datascience.ucsd.edu
SUMMARY:Special Seminar | Xuhai Xu
DESCRIPTION:
URL:https://datascience.ucsd.edu/event/special-seminar-xuhai-xu/
LOCATION:Halıcıoğlu Data Science Institute (HDSI)\, Room 123\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240401T140000
DTEND;TZID=America/Los_Angeles:20240401T153000
DTSTAMP:20260603T122002
CREATED:20240304T172031Z
LAST-MODIFIED:20240329T000153Z
UID:10000453-1711980000-1711985400@datascience.ucsd.edu
SUMMARY:"Instance-Optimization: Rethinking Database Design for the Next 1000X" | Jialin Ding
DESCRIPTION:Abstract: “Modern database systems aim to support a large class of different use cases while simultaneously achieving high performance. However\, as a result of their generality\, databases often achieve adequate performance for the average use case but do not achieve the best performance for any individual use case. In this talk\, I will describe my work on designing databases that use machine learning and optimization techniques to automatically achieve performance much closer to the optimal for each individual use case. In particular\, I will present my work on instance-optimized database storage layouts\, in which the co-design of data structures and optimization policies improves query performance in analytic databases by orders of magnitude. I will highlight how these instance-optimized data layouts address various challenges posed by real-world database workloads and how I implemented and deployed them in production within Amazon Redshift\, a widely-used commercial database system.” \nBio: “Jialin Ding is an Applied Scientist at AWS. Before that\, he received his PhD in computer science from MIT\, advised by Tim Kraska. He works broadly on applying machine learning and optimization techniques to improve data management systems\, with a focus on building databases that automatically self-optimize to achieve high performance for any specific application. His work has appeared in top conferences such as SIGMOD\, VLDB\, and CIDR\, and has been recognized by a Meta Research PhD Fellowship. To learn more about Jialin’s work\, please visit https://jialinding.github.io/.”
URL:https://datascience.ucsd.edu/event/special-seminar-jialin-ding/
LOCATION:Halıcıoğlu Data Science Institute (HDSI)\, Room 123\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image_Dark-blue-e1710178042629.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240401T110000
DTEND;TZID=America/Los_Angeles:20240401T123000
DTSTAMP:20260603T122002
CREATED:20240328T234406Z
LAST-MODIFIED:20240328T234737Z
UID:10000467-1711969200-1711974600@datascience.ucsd.edu
SUMMARY:How Do We Get There?: Toward Intelligent Behavior Intervention | Xuhai Xu
DESCRIPTION:Abstract: As the intelligence of everyday smart devices continues to evolve\, they can already monitor basic health behaviors such as physical activities and heart rates. The vision of an intelligent behavior change intervention pipeline for health — combining behavior modeling & interaction design — seems to be within reach. How do we get there? \nIn this talk\, I will introduce a comprehensive intervention pipeline that bridges behavior science theory-driven designs and generalizable behavior models. I will also introduce my efforts on passive sensing datasets\, human-centered algorithms\, and a benchmark platform that drives the community toward more robust and deployable intervention systems for health and well-being. \nBio: Xuhai “Orson” Xu is a postdoc at MIT EECS. He received his PhD at the University of Washington. Specializing in human-computer interaction\, applied machine learning\, and health\, Xu develops intelligent behavior intervention systems to promote human health and well-being. His research covers two aspects — 1) building deployable human-centered behavior models and 2) designing interactive user experiences — to establish a complete system to improve end-users’ well-being. Moreover\, his research also goes beyond end-users and supports health experts by designing new human-AI collaboration paradigms in clinical settings. Xu has earned several awards\, including 9 Best Paper\, Best Paper Honorable Mention\, and Best Artifact awards. His research has been covered by media outlets such as the Washington Post and ACM News. He was recognized as the Outstanding Student Award Winner at UbiComp 2022\, the 2023 UW Distinguished Dissertation Award\, and the 2024 Innovation and Technology Award at the Western Association of Graduate Schools.  \nZoom:  https://ucsd.zoom.us/j/92792843021\nPassword: 741675
URL:https://datascience.ucsd.edu/event/how-do-we-get-there-toward-intelligent-behavior-intervention-xuhai-xu/
LOCATION:Computer Science & Engineering Building (CSE)\, Room 1242\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image_Dark-blue-e1710178042629.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240328T140000
DTEND;TZID=America/Los_Angeles:20240328T153000
DTSTAMP:20260603T122002
CREATED:20240326T031112Z
LAST-MODIFIED:20240326T031112Z
UID:10000464-1711634400-1711639800@datascience.ucsd.edu
SUMMARY:The Emergence of Reproducibility and Generalizability in Diffusion Models | Qing Qu
DESCRIPTION:Abstract: We reveal an intriguing and prevalent phenomenon of diffusion models which we term “consistent model reproducibility”: given the same starting noise input and a deterministic sampler\, different diffusion models often yield remarkably similar outputs while they generate new samples. We demonstrate this phenomenon through comprehensive experiments and theoretical studies\, implying that different diffusion models consistently reach the same data distribution and scoring function regardless of frameworks\, model architectures\, or training procedures. More strikingly\, our further investigation implies that diffusion models are learning distinct distributions affected by the training data size and model capacity\, so that the model reproducibility manifests in two distinct training regimes with phase transition: (i) “memorization regime”\, where the diffusion model overfits to the training data distribution\, and (ii) “generalization regime”\, where the model learns the underlying data distribution and generates new samples with finite training data. Finally\, our results have strong practical implications regarding training efficiency\, model privacy\, and controllable generation of diffusion models\, and our work raises numerous intriguing theoretical questions for future investigation. \nSpeaker Bio: Qing Qu is an assistant professor in the EECS department at the University of Michigan. Prior to that\, he was a Moore-Sloan data science fellow at the Center for Data Science\, New York University\, from 2018 to 2020. He received his Ph.D. from Columbia University in Electrical Engineering in Oct. 2018. He received his B.Eng. from Tsinghua University in Jul. 2011\, and an M.Sc. from the Johns Hopkins University in Dec. 2012\, both in Electrical and Computer Engineering. 
His research interest lies at the intersection of foundations of data science\, machine learning\, numerical optimization\, and signal/image processing\, with a focus on developing efficient nonconvex methods and global optimality guarantees for solving representation learning and nonlinear inverse problems in engineering and imaging sciences. He is the recipient of the Best Student Paper Award at SPARS’15\, a Microsoft PhD Fellowship in machine learning in 2016\, and best paper awards at the NeurIPS Diffusion Model Workshop in 2023. He received the NSF CAREER Award in 2022\, and an Amazon Research Award (AWS AI) in 2023. He is the program chair of the new Conference on Parsimony & Learning.
URL:https://datascience.ucsd.edu/event/the-emergence-of-reproducibility-and-generalizability-in-diffusion-models-qing-qu/
LOCATION:Halıcıoğlu Data Science Institute (HDSI)\, Room 123\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image_Dark-blue-e1710178042629.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240328T140000
DTEND;TZID=America/Los_Angeles:20240328T153000
DTSTAMP:20260603T122002
CREATED:20240304T171827Z
LAST-MODIFIED:20240323T082150Z
UID:10000452-1711634400-1711639800@datascience.ucsd.edu
SUMMARY:The Emergence of Reproducibility and Generalizability in Diffusion Models | Qing Qu
DESCRIPTION:Abstract: We reveal an intriguing and prevalent phenomenon of diffusion models which we term “consistent model reproducibility”: given the same starting noise input and a deterministic sampler\, different diffusion models often yield remarkably similar outputs while they generate new samples. We demonstrate this phenomenon through comprehensive experiments and theoretical studies\, implying that different diffusion models consistently reach the same data distribution and scoring function regardless of frameworks\, model architectures\, or training procedures. More strikingly\, our further investigation implies that diffusion models are learning distinct distributions affected by the training data size and model capacity\, so that the model reproducibility manifests in two distinct training regimes with phase transition: (i) “memorization regime”\, where the diffusion model overfits to the training data distribution\, and (ii) “generalization regime”\, where the model learns the underlying data distribution and generates new samples with finite training data. Finally\, our results have strong practical implications regarding training efficiency\, model privacy\, and controllable generation of diffusion models\, and our work raises numerous intriguing theoretical questions for future investigation. \nBio: “Qing Qu is an assistant professor in the EECS department at the University of Michigan. Prior to that\, he was a Moore-Sloan data science fellow at the Center for Data Science\, New York University\, from 2018 to 2020. He received his Ph.D. from Columbia University in Electrical Engineering in Oct. 2018. He received his B.Eng. from Tsinghua University in Jul. 2011\, and an M.Sc. from the Johns Hopkins University in Dec. 2012\, both in Electrical and Computer Engineering. 
His research interest lies at the intersection of foundations of data science\, machine learning\, numerical optimization\, and signal/image processing\, with a focus on developing efficient nonconvex methods and global optimality guarantees for solving representation learning and nonlinear inverse problems in engineering and imaging sciences.\nHe is the recipient of the Best Student Paper Award at SPARS’15\, a Microsoft PhD Fellowship in machine learning in 2016\, and best paper awards at the NeurIPS Diffusion Model Workshop in 2023. He received the NSF CAREER Award in 2022\, and an Amazon Research Award (AWS AI) in 2023. He is the program chair of the new Conference on Parsimony & Learning.”
URL:https://datascience.ucsd.edu/event/special-seminar-qing-qu/
LOCATION:Halıcıoğlu Data Science Institute (HDSI)\, Room 123\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image_Dark-blue-e1710178042629.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240327T140000
DTEND;TZID=America/Los_Angeles:20240327T153000
DTSTAMP:20260603T122002
CREATED:20240313T191359Z
LAST-MODIFIED:20240323T081955Z
UID:10000458-1711548000-1711553400@datascience.ucsd.edu
SUMMARY:Towards a Machine Capable of Learning Everything | Hao Liu
DESCRIPTION:Abstract: Large generative models such as ChatGPT have led to amazing results and revolutionized artificial intelligence. In this talk\, I will discuss my research on advancing the foundation of these models\, centered around addressing the architectural bottlenecks of learning from everything. First\, I will describe our efforts to remove context size limitations of the transformer architecture. Our new model architecture and training method allow for nearly infinitely large context sizes without approximations. Our proposed technique has been used for building state-of-the-art open-source and proprietary models. I will then discuss the applications of large context in world model learning and in reinforcement learning\, including Large World Model\, the world’s first multimodal model of million-length scale\, and the required training methodologies. Next\, I will introduce my research on unsupervised exploration that pioneered learning beyond existing knowledge\, allowing unsupervised pretrained models to outperform human experts in gameplay and paving the road for learning beyond imitating existing knowledge. Finally\, I will envision the modeling and training paradigms for the next generation of large generative models we should build\, focusing on advances in neural net architecture\, efficient scaling\, large context reasoning\, and discovery. \nBio: Hao Liu is a final-year Ph.D. candidate in the Department of Electrical Engineering and Computer Sciences at UC Berkeley\, where he is advised by Pieter Abbeel. During his PhD\, he has also spent two years part-time at Google Brain and DeepMind. His research interests focus on the foundations of generative models\, including machine learning and neural networks\, with the goal of developing computationally scalable solutions for generalization. He recently developed Large World Model (LWM) and architectural advances (BlockwiseTransformers and RingAttention) for scaling transformers. 
Earlier\, he pioneered general and scalable unsupervised exploration (APT and APS). His work on million-length contexts has been influential at Google\, Meta\, and the broader industry. Several of his papers have been presented as spotlight and oral presentations at top-tier machine learning conferences\, and have also been featured in popular media\, including MarkTechPost\, Business Insider\, and ZDNet.
URL:https://datascience.ucsd.edu/event/special-seminar-hao-liu/
LOCATION:Halıcıoğlu Data Science Institute (HDSI)\, Room 123\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240326T140000
DTEND;TZID=America/Los_Angeles:20240326T153000
DTSTAMP:20260603T122002
CREATED:20240304T171618Z
LAST-MODIFIED:20240326T030740Z
UID:10000451-1711461600-1711467000@datascience.ucsd.edu
SUMMARY:Making machine learning predictably reliable | Andrew Ilyas
DESCRIPTION:Abstract: “Despite ML models’ impressive performance\, training and deploying them is currently a somewhat messy endeavor. But does it have to be? In this talk\, I give an overview of my work on making ML “predictably reliable”—enabling developers to know when their models will work\, when they will fail\, and why. \nTo begin\, we use a case study of adversarial inputs to show that human intuition can be a poor predictor of how ML models operate. Motivated by this\, we present a line of work that aims to develop a precise understanding of the ML pipeline\, combining statistical tools with large-scale experiments to characterize the role of each individual design choice: from how to collect data\, to what dataset to train on\, to what learning algorithm to use.” \n\nBio: “Andrew Ilyas is a PhD student in Computer Science at MIT\, where he is advised by Aleksander Madry and Constantinos Daskalakis. His research aims to improve the reliability and predictability of machine learning systems. He was previously supported by an Open Philanthropy AI Fellowship.”
URL:https://datascience.ucsd.edu/event/special-seminar-andrew-ilyas/
LOCATION:Halıcıoğlu Data Science Institute (HDSI)\, Room 123\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image_Dark-blue-e1710178042629.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240322T140000
DTEND;TZID=America/Los_Angeles:20240322T153000
DTSTAMP:20260603T122002
CREATED:20240313T190501Z
LAST-MODIFIED:20240321T174552Z
UID:10000457-1711116000-1711121400@datascience.ucsd.edu
SUMMARY:Efficient Deep Learning with Sparsity: Algorithms\, Systems\, and Applications | Zhijian Liu
DESCRIPTION:Abstract: Deep learning is used across a broad spectrum of applications. However\, behind its remarkable performance lies an increasing gap between the demand for and supply of computation. On the demand side\, the computational costs of deep learning models have surged dramatically\, driven by ever-larger input and model sizes. On the supply side\, as Moore’s Law slows down\, hardware no longer delivers increasing performance within the same power budget. \nIn this talk\, I will discuss my research efforts to bridge this demand-supply gap through the lens of sparsity. I will begin with my research on input sparsity. First\, I will introduce algorithms that systematically eliminate the least important patches/tokens from dense input data\, such as images\, enabling up to 60% sparsity without any loss in accuracy. Then\, I will present the system library that we have developed to effectively translate the theoretical savings from sparsity to practical speedups on hardware. Our system is up to 3 times faster than the leading industry solution from NVIDIA. Following this\, I will touch on my research on model sparsity\, highlighting a family of automated\, hardware-aware model compression frameworks that surpass manual solutions in accuracy and reduce the design cycle from weeks of human efforts to mere hours of GPU computation. Finally\, I will demonstrate the use of sparsity to accelerate a wide range of computation-intensive AI applications\, such as autonomous driving\, language modeling\, and high-energy physics. I will conclude this talk with my vision towards building more efficient and accessible AI. \nBio: Zhijian Liu is a Ph.D. candidate at MIT\, advised by Song Han. His research focuses on efficient machine learning and systems. He has developed efficient ML algorithms and provided them with effective system support. 
He has also contributed to accelerating computation-intensive AI applications in computer vision\, natural language processing\, and scientific discovery. His work has been featured as oral and spotlight presentations at conferences such as NeurIPS\, ICLR\, and CVPR. He was selected as the recipient of the Qualcomm Innovation Fellowship and the NVIDIA Graduate Fellowship. He was also recognized as a Rising Star in ML and Systems by MLCommons and a Rising Star in Data Science by UChicago and UCSD. Previously\, he was the founding research scientist at OmniML\, which was acquired by NVIDIA.
URL:https://datascience.ucsd.edu/event/special-seminar-zhijian-liu/
LOCATION:Halıcıoğlu Data Science Institute (HDSI)\, Room 123\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image_Dark-blue-e1710178042629.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240320T140000
DTEND;TZID=America/Los_Angeles:20240320T153000
DTSTAMP:20260603T122002
CREATED:20240313T184800Z
LAST-MODIFIED:20240318T230405Z
UID:10000456-1710943200-1710948600@datascience.ucsd.edu
SUMMARY:Understanding Deep Learning through Optimization Geometry | Nati (Nathan) Srebro
DESCRIPTION:Abstract: How can models with more parameters than training examples generalize well\, and generalize even better when we add even more parameters\, even without explicit complexity control? In recent years\, it has become increasingly clear that much\, or perhaps all\, of the complexity control and generalization ability of deep learning comes from the optimization bias\, or implicit bias\, of the training procedures. In this talk\, I will survey our work from the past several years on highlighting the role of optimization geometry in determining such implicit bias\, and understanding deep learning through it\, and how this view influences the study of further deep learning phenomena. \nBio: Nati (Nathan) Srebro is a professor at the Toyota Technological Institute at Chicago\, with cross-appointments at the University of Chicago’s Department of Computer Science\, and Committee on Computational and Applied Mathematics. He obtained his PhD from the Massachusetts Institute of Technology in 2004\, and previously was a postdoctoral fellow at the University of Toronto\, a visiting scientist at IBM\, and an associate professor at the Technion\, and held visiting positions at the Weizmann Institute and at École Polytechnique Fédérale de Lausanne. \nDr. Srebro’s research encompasses methodological\, statistical and computational aspects of machine learning\, as well as related problems in optimization. Some of Srebro’s significant contributions include work on learning “wider” Markov networks\, introducing the use of the nuclear norm for machine learning\, introducing the “equalized odds” fairness notion for non-discrimination\, work on fast optimization techniques for machine learning\, and on the relationship between learning and optimization. \nWebsite: https://nati.ttic.edu/
URL:https://datascience.ucsd.edu/event/special-seminar-nathan-srebro/
LOCATION:Halıcıoğlu Data Science Institute (HDSI)\, Room 123\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image_Dark-blue-e1710178042629.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240320T100000
DTEND;TZID=America/Los_Angeles:20240320T110000
DTSTAMP:20260603T122002
CREATED:20240318T224758Z
LAST-MODIFIED:20240318T224925Z
UID:10000461-1710928800-1710932400@datascience.ucsd.edu
SUMMARY:TILOS Seminar: How Large Models of Language and Vision Help Agents to Learn to Behave
DESCRIPTION:Roy Fox\, Assistant Professor and Director of the Intelligent Dynamics Lab at UC Irvine\nHDSI 123 and Zoom (Link below) \nAbstract: If learning from data is valuable\, can learning from big data be very valuable? So far\, it has been so in vision and language\, for which foundation models can be trained on web-scale data to support a plethora of downstream tasks; not so much in control\, for which scalable learning remains elusive. Can information encoded in vision and language models guide reinforcement learning of control policies? In this talk\, I will discuss several ways for foundation models to help agents to learn to behave. Language models can provide better context for decision-making: we will see how they can succinctly describe the world state to focus the agent on relevant features; and how they can form generalizable skills that identify key subgoals. Vision and vision–language models can help the agent to model the world: we will see how they can block visual distractions to keep state representations task-relevant; and how they can hypothesize about abstract world models that guide exploration and planning. \nBio: Roy Fox is an Assistant Professor of Computer Science at the University of California\, Irvine. His research interests include theory and applications of control learning: reinforcement learning (RL)\, control theory\, information theory\, and robotics. His current research focuses on structured and model-based RL\, language for RL and RL for language\, and optimization in deep control learning of virtual and physical agents.
URL:https://datascience.ucsd.edu/event/tilos-seminar-how-large-models-of-language-and-vision-help-agents-to-learn-to-behave/
LOCATION:Halıcıoğlu Data Science Institute (HDSI)\, Room 123\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2023/10/TILOS-Square_HDSI-Website-e1712854679822.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240313T130000
DTEND;TZID=America/Los_Angeles:20240313T140000
DTSTAMP:20260603T122002
CREATED:20240313T195909Z
LAST-MODIFIED:20240313T195909Z
UID:10000460-1710334800-1710338400@datascience.ucsd.edu
SUMMARY:Domain Counterfactuals for Trustworthy ML via Sparse Interventions | David I. Inouye
DESCRIPTION:Talk Abstract: \nAlthough incorporating causal concepts into deep learning shows promise for increasing explainability\, fairness\, and robustness\, existing methods require unrealistic assumptions and aim to recover the full latent causal model. This talk proposes an alternative: domain counterfactuals. Domain counterfactuals ask a more concrete question: “What would a sample look like if it had been generated in a different domain (or environment)?”   This avoids the challenges of full causal recovery while answering an important causal query. I will theoretically analyze the domain counterfactual problem for invertible causal models and prove an estimation bound that depends on the sparsity of intervention\, i.e.\, the number of intervened causal variables.  Leveraging this theory\, I will introduce a practical counterfactual estimation algorithm that outperforms baselines. Additionally\, I will showcase the potential of domain counterfactuals for counterfactual fairness and domain generalization through preliminary results. Finally\, I will connect this work to my broader research focus on distribution matching\, highlighting its potential as a foundational tool for building trustworthy machine learning systems. \nBio: \nProf. David I. Inouye is an assistant professor in the Elmore Family School of Electrical and Computer Engineering at Purdue University. His lab focuses on trustworthy machine learning (ML)\, which aims to make ML systems more robust\, causal and explainable. Currently\, he is interested in advancing distribution matching algorithms and applications such as causality\, domain generalization\, and distribution shift explanations. He is also interested in highly robust distributed learning algorithms on a network of devices\, called Internet Learning. His research is funded by ARL\, ONR\, and NSF. Previously\, he was a postdoc at Carnegie Mellon University working with Prof. Pradeep Ravikumar. 
 He completed his Computer Science PhD at The University of Texas at Austin in 2017\, advised by Prof. Inderjit Dhillon and Prof. Pradeep Ravikumar. He was awarded the NSF Graduate Research Fellowship (NSF GRFP).
URL:https://datascience.ucsd.edu/event/domain-counterfactuals-for-trustworthy-ml-via-sparse-interventions-david-i-inouye/
LOCATION:Halıcıoğlu Data Science Institute (HDSI)\, Room 404\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240312T140000
DTEND;TZID=America/Los_Angeles:20240312T153000
DTSTAMP:20260603T122002
CREATED:20240304T171426Z
LAST-MODIFIED:20240311T181154Z
UID:10000450-1710252000-1710257400@datascience.ucsd.edu
SUMMARY:From Pixels to Measurements: Understanding the Dynamic World ~ Adam Harley
DESCRIPTION:In computer vision\, “video understanding” typically concerns summarization: tracking the main objects\, or describing the main actions. While progress here has been impressive\, many practical applications require extracting information which is much more fine-grained. For example\, biologists are highly interested in tracking specific key points of organisms in long video recordings. Algorithms for such tasks require the generality and precision of low-level vision methods (e.g.\, optical flow)\, but benefit from knowledge about the physical world (e.g.\, things continue to exist while they are occluded). In this talk\, I will present our progress on this crucial space of problems. Our central contribution is to widen the window of “temporal context” used for inference: instead of tracking entities from one frame to the next\, we inspect dozens of frames simultaneously\, and return an answer that makes sense for the full clip. I will discuss the methods and datasets that we have created to drive progress along these lines\, and highlight natural science applications of the work. Finally\, I will introduce our ongoing effort to produce a “foundation model” of motion\, aiming to deliver arbitrary-granularity tracking for a huge variety of real-world situations. \nAdam is a postdoctoral scholar at Stanford University\, working with Leonidas Guibas. He received a Ph.D. in robotics from Carnegie Mellon University\, where he worked with Katerina Fragkiadaki. He received his M.S. in Computer Science at Toronto Metropolitan University\, working with Kosta Derpanis. Adam is a recipient of the NSERC PGS-D scholarship\, and the Toronto Metropolitan University Gold Medal. His research interests lie in Computer Vision and Machine Learning\, particularly for 3D understanding and fine-grained tracking.
URL:https://datascience.ucsd.edu/event/special-seminar-adam-harley/
LOCATION:Halıcıoğlu Data Science Institute (HDSI)\, Room 123\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image_Dark-blue-e1710178042629.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240311T140000
DTEND;TZID=America/Los_Angeles:20240311T153000
DTSTAMP:20260603T122002
CREATED:20240304T171239Z
LAST-MODIFIED:20240304T171239Z
UID:10000449-1710165600-1710171000@datascience.ucsd.edu
SUMMARY:Special Seminar | Zhuang Liu
DESCRIPTION:Talk info: to be provided
URL:https://datascience.ucsd.edu/event/special-seminar-zhuang-liu/
LOCATION:Halıcıoğlu Data Science Institute (HDSI)\, Room 123\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image_Dark-blue-e1710178042629.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240306T170000
DTEND;TZID=America/Los_Angeles:20240306T183000
DTSTAMP:20260603T122002
CREATED:20240209T162735Z
LAST-MODIFIED:20240209T163214Z
UID:10000438-1709744400-1709749800@datascience.ucsd.edu
SUMMARY:The Ethical and Policy Implications of Artificial Intelligence
DESCRIPTION:The Institute for Practical Ethics welcomes David Danks as the 2024 keynote speaker. \nDanks\, a UC San Diego professor in the Department of Philosophy and Halıcıoğlu Data Science Institute\, is an expert researcher at the intersection of philosophy\, cognitive science and machine learning. He serves on multiple boards\, including the United States National AI Advisory Committee. \nArtificial intelligence is seemingly everywhere today\, both in public perception and in our everyday lives. This growth has led to many stories about the widespread harms that can result from AI done poorly. As a result\, there are now numerous demands for ‘ethical AI\,’ but relatively little understanding of what that might involve. \nIn this keynote\, David Danks will explore the nature of responsible AI\, arguing that it involves much more than code or data. He will critically assess current approaches to producing more responsible AI\, then suggest key policy and practical approaches that would likely be more effective. It is critical we create more responsible AI\, but that will require rethinking many of our current practices in academia\, government and industry.
URL:https://www.eventbrite.com/e/the-ethical-and-policy-implications-of-artificial-intelligence-tickets-817599541237?aff=ipewebsite
LOCATION:Sanford Consortium
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/jpeg:https://datascience.ucsd.edu/wp-content/uploads/2024/02/IPE_David-Danks.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240226T123000
DTEND;TZID=America/Los_Angeles:20240226T140000
DTSTAMP:20260603T122002
CREATED:20240213T221414Z
LAST-MODIFIED:20240220T235645Z
UID:10000443-1708950600-1708956000@datascience.ucsd.edu
SUMMARY:Building Human-AI Alignment: Specifying\, Inspecting\, and Modeling AI Behaviors | Serena Booth
DESCRIPTION:Abstract: The learned behaviors of AI and robot agents should align with the intentions of their human designers. Alignment is necessary for AI systems to be used in many sectors of the economy\, and so the process of aligning AI systems becomes critical to study for defining effective AI policy. Toward this goal\, people must be able to easily specify\, inspect\, and model agent behaviors. For specifications\, we will consider expert-written reward functions for reinforcement learning (RL) and non-expert preferences for reinforcement learning from human feedback (RLHF). I will show evidence that experts are bad at writing reward functions: even in a trivial setting\, experts write specifications that are overfit to a particular RL algorithm\, and they often write erroneous specifications for agents that fail to encode their true intent. I will also show that the common approach to learning a reward function from non-experts in RLHF uses an inductive bias that fails to encode how humans express preferences\, and that our proposed bias better encodes human preferences both theoretically and empirically. I will discuss the policy implications: namely\, that engineers’ design processes and embedded assumptions in building AI must be considered. For inspection\, humans must be able to assess the behaviors an agent learns from a given specification. I will discuss a method to find settings that exhibit particular behaviors\, like out-of-distribution failures. I will discuss the policy implications for testing AI systems\, for example through red teaming. Lastly\, cognitive science theories attempt to show how people build conceptual models that explain agent behaviors. I will show evidence that some of these theories are used in research to support humans\, but that we can still build better curricula for modeling. I will discuss the policy need for careful onboarding to AI systems. I will end by discussing my current work in the U.S. 
 Senate on responding to the proliferation of AI. Collectively\, my research provides evidence that\, even with the best of intentions\, current human-AI systems often fail to induce alignment\, and it proposes promising directions for building better-aligned human-AI systems.
URL:https://datascience.ucsd.edu/event/special-seminar-serena-booth/
LOCATION:GPS\, Robinson Building Complex (RBC)\, 3106
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240222T140000
DTEND;TZID=America/Los_Angeles:20240222T150000
DTSTAMP:20260603T122002
CREATED:20240126T183316Z
LAST-MODIFIED:20240202T225717Z
UID:10000431-1708610400-1708614000@datascience.ucsd.edu
SUMMARY:The continuum of gene regulation at single cell resolution\, from Drosophila development to human complex traits | Diego Calderon
DESCRIPTION:Single-cell technologies have emerged as powerful tools for studying development\, enabling comprehensive surveys of cellular diversity at profiled timepoints. They shed light on the dynamics of regulatory element activity and gene expression changes during the emergence of each cell type. Despite their potential\, nearly all atlases of embryogenesis are constrained by sampling density\, i.e.\, the number of discrete time points at which individual embryos are harvested. This limitation affects the resolution at which regulatory transitions can be characterized. In this talk\, I present a novel cell collection approach capable of constructing a continuous representation of dynamic regulatory processes. I applied this approach to generate a continuous\, single-cell atlas of chromatin accessibility and gene expression spanning Drosophila embryogenesis. Additionally\, I will discuss my past and future research\, applying new genomic technologies to characterize gene regulation important for human diseases.
URL:https://datascience.ucsd.edu/event/special-seminar-diego-calderon/
LOCATION:Powell-Focht Bioengineering Hall (PFBH)\, FUNG Auditorium
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image_Dark-blue-e1710178042629.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240221T123000
DTEND;TZID=America/Los_Angeles:20240221T140000
DTSTAMP:20260603T122002
CREATED:20240220T163858Z
LAST-MODIFIED:20240220T163858Z
UID:10000444-1708518600-1708524000@datascience.ucsd.edu
SUMMARY:Computational approaches for uncovering implicit strategies in political discourse | Julia Mendelsohn
DESCRIPTION:When discussing politics\, people often use subtle linguistic strategies to influence how their audience thinks about issues\, which can then impact public opinion and policy. For example\, anti-immigration activists may frame immigration as a threat to native born citizens’ jobs\, describe immigrants with dehumanizing vermin-related metaphors\, or even use coded expressions to covertly connect immigration with antisemitic conspiracy theories. This talk will focus on the development of computational approaches to analyze three strategies: framing\, dehumanization\, and dogwhistle communication. I will discuss how I draw from multiple social science disciplines to develop typologies and curate data resources\, as well as how I build and evaluate natural language processing models for detecting these strategies. I further analyze the use of these strategies in political discourse across several domains\, and assess the implications of such nuanced rhetoric for both society and technology.
URL:https://datascience.ucsd.edu/event/computational-approaches-for-uncovering-implicit-strategies-in-political-discourse-julia-mendelsohn-2/
LOCATION:GPS\, Robinson Building Complex (RBC)\, 3106
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image_Dark-blue-e1710178042629.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240220T130000
DTEND;TZID=America/Los_Angeles:20240220T143000
DTSTAMP:20260603T122002
CREATED:20240220T182547Z
LAST-MODIFIED:20240220T182547Z
UID:10000446-1708434000-1708439400@datascience.ucsd.edu
SUMMARY:Learning Inductive Representations for Reasoning over Knowledge Graphs | Zhaocheng Zhu
DESCRIPTION:Abstract: Reasoning\, the ability to logically draw conclusions from existing knowledge\, has long been pursued as a goal of artificial intelligence. Although numerous learning algorithms have been developed for reasoning\, most of them are limited to the domain they are trained on. By contrast\, humans often derive high-level rules or principles from experience and apply them to new domains\, an ability referred to as inductive generalization. In this talk\, we present a series of works that learn inductive representations for reasoning over knowledge graphs. First\, we introduce Neural Bellman-Ford Networks (NBFNet)\, which capture paths between entities and can generalize to graphs of new entities. Then we discuss Graph Neural Network Query Executor (GNN-QE)\, an extension of NBFNet that answers multi-hop logical queries and generalizes well on our inductive benchmark. Finally\, by learning inductive representations for both entities and relations\, we demonstrate that a model can generalize to any graph with arbitrary entity and relation vocabularies\, paving the way for foundation models for knowledge graph reasoning. \n \nBio: Zhaocheng Zhu is a final-year Ph.D. candidate advised by Prof. Jian Tang at Mila – Quebec AI Institute\, University of Montreal. His research interests include reasoning\, knowledge graphs and large language models. His works\, among the first to study inductive generalization across structures\, have led to a paradigm shift away from traditional knowledge graph embedding methods that have been used for years. He gave a tutorial on knowledge graph reasoning at AAAI 2022. He is also an active developer of machine learning systems\, and led the development of two open-source libraries\, GraphVite for large-scale embedding training and TorchDrug for drug discovery research.
URL:https://datascience.ucsd.edu/event/learning-inductive-representations-for-reasoning-over-knowledge-graphs-zhaocheng-zhu/
LOCATION:Computer Science & Engineering Building (CSE)\, Room 1242\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240215T123000
DTEND;TZID=America/Los_Angeles:20240215T140000
DTSTAMP:20260603T122002
CREATED:20240213T220919Z
LAST-MODIFIED:20240213T221018Z
UID:10000441-1708000200-1708005600@datascience.ucsd.edu
SUMMARY:Targeting humanitarian aid with machine learning and digital data | Emily Aiken
DESCRIPTION:Abstract: The majority of humanitarian aid and social protection programs globally are targeted\, providing assistance to individuals or communities identified to be poorest or most in need. In low- and middle-income countries\, the targeting of aid programs is often limited by low-quality\, out-of-date\, or missing data on poverty and vulnerability. Novel “big” digital data sources\, such as those captured by satellites\, mobile phones\, and financial services providers — when combined with advances in machine learning — can improve the accuracy of aid program targeting. In this talk\, I will cover empirical results on the accuracy of these new data-driven and algorithmic approaches to aid allocation\, and will discuss emergent implications for fairness\, privacy\, transparency\, and community dynamics.
URL:https://datascience.ucsd.edu/event/targeting-humanitarian-aid-with-machine-learning-and-digital-data-emily-aiken/
LOCATION:GPS\, Robinson Building Complex (RBC)\, 3203
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240214T140000
DTEND;TZID=America/Los_Angeles:20240214T153000
DTSTAMP:20260603T122002
CREATED:20240201T193829Z
LAST-MODIFIED:20240220T172238Z
UID:10000434-1707919200-1707924600@datascience.ucsd.edu
SUMMARY:Enabling Performant and Trustworthy Learning-enabled CPS-IoT Systems | Mani Srivastava
DESCRIPTION:Abstract: “The previously discrete technologies of IoT and AI have now entered a tight virtuous embrace. IoT allows sensing and actuation in our physical\, social\, and urban spaces with unimaginable ubiquity. AI allows sophisticated inferences and decisions to be made algorithmically using deep neural networks\, even from unstructured and high-dimensional data\, with uncanny performance. Together they seek to perform sophisticated perception-cognition-communication-action loops in diverse applications. However\, designers of learning-enabled IoT systems face the challenge of extremely resource-constrained edge platforms operating in uncertain environments while assuring performance and trustworthiness. Moreover\, in many applications\, the systems go beyond taking actions based on rich inferences about the world state to perform long-term reasoning about complex events and obey the underlying physics\, rules\, and constraints. Based on our experience in designing such systems in applications including mHealth\, ocean animal health\, agriculture robotics\, and military\, this talk explores meeting these challenges through a combination of (i) neurosymbolic architectures that allow the incorporation of physics awareness and human knowledge while enhancing user trust\, (ii) automatic platform-aware architecture search and code generation\, and (iii) techniques to efficiently adapt to the deployment environment.” \n \nBio: “Mani Srivastava is Distinguished Professor and Vice Chair at UCLA’s ECE Department with a joint appointment in the CS Department. His research is broadly in human-cyber-physical and IoT systems that are learning-enabled\, resource-constrained\, and trustworthy. It spans problems across the entire spectrum of applications\, architectures\, algorithms\, and technologies in the context of systems and applications for mHealth\, sustainable buildings\, smart environments\, etc. He is a Fellow of the ACM and the IEEE.”
URL:https://datascience.ucsd.edu/event/special-seminar-mani-srivastava/
LOCATION:Halıcıoğlu Data Science Institute (HDSI)\, Room 123\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image_Dark-blue-e1710178042629.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240213T130000
DTEND;TZID=America/Los_Angeles:20240213T140000
DTSTAMP:20260603T122002
CREATED:20240209T164305Z
LAST-MODIFIED:20240209T164305Z
UID:10000440-1707829200-1707832800@datascience.ucsd.edu
SUMMARY:On Data Ecology\, Data Markets\, the Value of Data\, and Dataflow Governance | Raul Castro Fernandez
DESCRIPTION:Abstract:\nData shapes our social\, economic\, cultural\, and technological environments. Data is valuable\, so people seek it\, inducing data to flow. The resulting dataflows distribute data and thus value. For example\, large Internet companies profit from accessing data from their users\, and engineers of large language models seek large and diverse data sources to train powerful models. It is possible to judge the impact of data in an environment by analyzing how the dataflows in that environment impact the participating agents. My research hypothesizes that it is also possible to design (better) data environments by controlling what dataflows materialize; not only can we analyze environments but also synthesize them. In this talk\, I present the research agenda on “data ecology\,” which seeks to build the principles\, theory\, algorithms\, and systems to design beneficial data environments. I will also present examples of data environments my group has designed\, including data markets for machine learning\, data-sharing\, and data integration. I will conclude by discussing the impact of dataflows in data governance and how the ideas are interwoven with the concepts of trust\, privacy\, and the elusive notion of “data value.” As part of the technical discussion\, I will complement the data market designs with the design of a data escrow system that permits controlling dataflows.\nBio:\nIn my research\, I ask what is the value of data and explore the potential of data markets to unlock that value. My group collaborates with economists\, legal scholars\, statisticians\, and domain scientists. We build systems to share\, discover\, prepare\, integrate\, and process data. I have traditionally worked on distributed query processing systems and continue to do so. I have received a SIGMOD’23 Test-of-time-Award. I am an assistant professor in the Department of Computer Science and on the Committee of Data Science at The University of Chicago. 
 Before UChicago\, I did a postdoc at MIT with Sam Madden and Mike Stonebraker. Before that\, I completed a PhD at Imperial College London with Peter Pietzuch.
URL:https://datascience.ucsd.edu/event/on-data-ecology-data-markets-the-value-of-data-and-dataflow-governance-raul-castro-fernandez/
CATEGORIES:Colloquium,Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image-e1712856546428.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240213T080000
DTEND;TZID=America/Los_Angeles:20240213T170000
DTSTAMP:20260603T122002
CREATED:20240213T221205Z
LAST-MODIFIED:20240213T221205Z
UID:10000442-1707811200-1707843600@datascience.ucsd.edu
SUMMARY:Computational approaches for uncovering implicit strategies in political discourse | Julia Mendelsohn
DESCRIPTION:Abstract: When discussing politics\, people often use subtle linguistic strategies to influence how their audience thinks about issues\, which can then impact public opinion and policy. For example\, anti-immigration activists may frame immigration as a threat to native born citizens’ jobs\, describe immigrants with dehumanizing vermin-related metaphors\, or even use coded expressions to covertly connect immigration with antisemitic conspiracy theories. This talk will focus on the development of computational approaches to analyze three strategies: framing\, dehumanization\, and dogwhistle communication. I will discuss how I draw from multiple social science disciplines to develop typologies and curate data resources\, as well as how I build and evaluate natural language processing models for detecting these strategies. I further analyze the use of these strategies in political discourse across several domains\, and assess the implications of such nuanced rhetoric for both society and technology.
URL:https://datascience.ucsd.edu/event/computational-approaches-for-uncovering-implicit-strategies-in-political-discourse-julia-mendelsohn/
LOCATION:GPS\, Robinson Building Complex (RBC)\, 3106
CATEGORIES:Seminar
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240213T023000
DTEND;TZID=America/Los_Angeles:20240213T160000
DTSTAMP:20260603T122002
CREATED:20240207T223735Z
LAST-MODIFIED:20240220T172208Z
UID:10000437-1707791400-1707840000@datascience.ucsd.edu
SUMMARY:Principled Approaches for Trustworthy Algorithms\, Statistics\, and Machine Learning | Gautam Kamath
DESCRIPTION:Abstract: Despite impressive recent advances\, machine learning models exhibit a number of critical deficiencies. They are prone to leaking sensitive information about their training data. They remain alarmingly brittle to attacks by malicious parties. Troublingly\, these issues stem from more fundamental statistical vulnerabilities\, which remain unresolved even decades later\, highlighting significant gaps in our understanding of how to deal with these important considerations. As long as these problems remain\, our models will not be appropriate for use beyond deployment in toy settings. In this talk\, I will discuss recent advances on a number of these problems\, which give key new algorithmic insights into how to address these considerations\, and enable real-world deployments that were previously thought infeasible. In a first vignette\, we will explore how to guarantee individual privacy in machine learning models\, with a particular focus on large language models and the important role played by public data in the training pipeline. In a second vignette\, we focus on how to robustly perform mean estimation\, giving the first efficient and accurate algorithms for multivariate settings. We will go on to discuss connections to robustness against data poisoning attacks\, robust exploratory data analysis\, and surprising conceptual and technical connections with privacy. \nBio: Gautam Kamath is an Assistant Professor at the University of Waterloo\, and a Faculty Member and Canada CIFAR AI Chair at the Vector Institute for Artificial Intelligence. His research interests are in trustworthy algorithms\, statistics\, and machine learning\, particularly focusing on considerations like data privacy and robustness. He has a B.S. from Cornell University and a Ph.D. from MIT. He is the recipient of the 2023 Golden Jubilee Research Excellence Award\, recognizing him as the most outstanding junior researcher in the University of Waterloo’s Faculty of Math. 
 Beyond research\, he is celebrated for his teaching. His course on differential privacy is the most popular resource for learning the topic\, with his lecture videos having over 100\,000 views. He has also given invited tutorials on the topic in multiple countries. He is further well known for his passion and commitment to service and improving the community. Besides organizing and chairing several workshops and conferences\, he is an Editor-in-Chief of Transactions on Machine Learning Research\, and on the Executive Committee of the Learning Theory Alliance.
URL:https://datascience.ucsd.edu/event/principled-approaches-for-trustworthy-algorithms-statistics-and-machine-learning-gautam-kamath/
LOCATION:Halıcıoğlu Data Science Institute (HDSI)\, Room 123\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image_Dark-blue-e1710178042629.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240212T140000
DTEND;TZID=America/Los_Angeles:20240212T153000
DTSTAMP:20260603T122002
CREATED:20240126T182854Z
LAST-MODIFIED:20240220T172343Z
UID:10000430-1707746400-1707751800@datascience.ucsd.edu
SUMMARY:Integrating Longitudinal Multimodal Data To Realize Precision Medicine | Samantha Piekos
DESCRIPTION:Abstract: The interplay of biology\, environment\, and lifestyle directs the development and progression of complex diseases and other health outcomes. Therefore\, integration of longitudinal multimodal data is needed to understand the mechanisms underpinning major molecular transitions. Previously\, during my doctoral work at Stanford\, I integrated multiomics data to elucidate the epigenetic mechanism of human surface ectoderm differentiation. I also built a pipeline to investigate the role of polymorphism\, particularly non-coding genetic variants\, in complex diseases. To address the common pain point of data silos limiting the interpretation of multimodal data integration\, I formed a collaboration with Google Data Commons to build a free\, open-source biomedical knowledge graph with a common schema and API. Currently it is composed of approximately 130 million nodes and 1.7 trillion triples (node-edge-node) from 22 publicly available biomedical datasets. Knowledge graphs are a key tool for hypothesis generation\, data interpretation\, and dimensionality reduction required for systems medicine research. Upon starting my postdoctoral work at the Institute for Systems Biology\, I identified pregnancy as an excellent model system for prototyping precision medicine approaches. I used electronic healthcare records (EHR) from Providence St. Joseph Healthcare to investigate the impact of COVID-19 maternal infection and vaccination on maternal-fetal outcomes. In addition\, I integrated multiomics placental data to investigate molecular network changes (interomics and intraomics) in common obstetric disorders. In a follow-up study (enrollment complete) we have longitudinal deep-phenotyping data of 435 people throughout pregnancy\, 80 of whom have pregnancy complications. This includes multiomics\, survey\, EHR\, and air quality data collected from first prenatal visit through delivery. 
 My lab will use this data to define major molecular transition states throughout pregnancy. I will also investigate the disease mechanisms of common obstetric disorders\, including identifying\, for each individual\, the earliest possible point of deviation from a healthy trajectory. This interdisciplinary approach will identify potential drug targets\, biomarker panels\, and individualized clinical interventions. \n  \nBio: Samantha completed her PhD in Stem Cell Biology and Regenerative Medicine with a PhD minor in Biomedical Informatics at Stanford University under the advisement of Dr. Anthony Oro. Using a multiomics approach\, Samantha demonstrated how transcription factors direct keratinocyte differentiation by changing the epigenetic landscape\, including chromatin looping\, thereby affecting the cell transcriptional program. Samantha has also been collaborating with Google since June 2019 to build Biomedical Data Commons\, a knowledge graph that integrates biomedical data from a wide array of sources into a single searchable database\, thereby increasing data accessibility. Upon completion of her PhD in 2020\, she began her postdoctoral fellowship at the Institute for Systems Biology under the advisement of Drs. Lee Hood and Nathan Price. Using electronic healthcare records (EHR)\, she has provided insight into the impact of maternal COVID-19 and vaccination on maternal-fetal outcomes. In addition to her EHR research\, Samantha is using multidimensional omics placental data to understand the molecular mechanism of common obstetric disorders. Upon transitioning to Assistant Professor\, she intends to perform multimodal data integration of longitudinal deep-phenotyping data to evaluate changes in molecular networks in complex diseases.
URL:https://datascience.ucsd.edu/event/special-seminar-samantha-piekos/
LOCATION:Halıcıoğlu Data Science Institute (HDSI)\, Room 123\, 3234 Matthews Ln\, La Jolla\, CA\, 92093\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image_Dark-blue-e1710178042629.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240209T130000
DTEND;TZID=America/Los_Angeles:20240209T140000
DTSTAMP:20260603T122002
CREATED:20240209T164509Z
LAST-MODIFIED:20240209T164509Z
UID:10000439-1707483600-1707487200@datascience.ucsd.edu
SUMMARY:EnCORE : Theoretical Exploration of Foundation Model Adaptation\, Kangwook Lee\, UW Madison\, Feb 9th\, 1-2pm
DESCRIPTION:Abstract: Due to the enormous size of foundation models\, various new methods for efficient model adaptation have been developed. Parameter-efficient fine-tuning (PEFT) is an adaptation method that updates only a tiny fraction of the model parameters\, leaving the remainder unchanged. In-context Learning (ICL) is a test-time adaptation method\, which repurposes foundation models by providing them with labeled samples as part of the input context. Given the growing importance of this emerging paradigm\, developing theoretical foundations for it is of utmost importance. \nIn this talk\, I will introduce two preliminary results toward this goal. In the first part\, I will present a theoretical analysis of Low-Rank Adaptation (also known as LoRA)\, one of the most popular PEFT methods today. Our analysis of the expressive power of LoRA not only helps us better understand the high adaptivity of LoRA observed in practice but also provides insights to practitioners. In the second part\, I will introduce our probabilistic framework for a better understanding of ICL. With our framework\, one can analyze the transition between two distinct modes of ICL: task retrieval and learning. We also discuss how our framework can help explain and predict various phenomena that are observed with large language models in practice yet are not fully explained. \nBio: Kangwook Lee is an Assistant Professor in the Electrical and Computer Engineering Department and the Computer Sciences Department (by courtesy) at the University of Wisconsin-Madison. Previously\, he was a Research Assistant Professor at the Information and Electronics Research Institute of KAIST and was a postdoctoral scholar at the same institute. He received his PhD in 2016 from the Electrical Engineering and Computer Science department at UC Berkeley. 
He is the recipient of the IEEE Joint Communications Society/Information Theory Society Paper Award (2020) and the KSEA Young Investigator Grant Award (2022).
URL:https://datascience.ucsd.edu/event/encore-theoretical-exploration-of-foundation-model-adaptation-kangwook-lee-uw-madison-feb-9th-1-2pm/
LOCATION:Atkinson Hall\, Fourth Floor
CATEGORIES:Colloquium,Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2023/10/Encore-logo_HDSI-Website.png
ORGANIZER;CN="k1omerry@ucsd.edu":MAILTO:k1omerry@ucsd.edu
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240208T130000
DTEND;TZID=America/Los_Angeles:20240208T143000
DTSTAMP:20260603T122002
CREATED:20240121T235126Z
LAST-MODIFIED:20240205T165821Z
UID:10000427-1707397200-1707402600@datascience.ucsd.edu
SUMMARY:Inference in context: Statistical theory and thinking | Jeffrey Bye
DESCRIPTION:Abstract: The likelihood function plays a foundational role in statistical theory. I will demonstrate my teaching philosophy and approach through a lesson on maximum likelihood estimation and its connection to Neyman-Pearson\, Bayesian\, and other approaches to statistical and scientific inference. I will then expand on the role of context in statistical thinking\, particularly how it informs my scholarship on how people learn about data\, math\, statistics\, and programming. \nBio: Jeffrey K. Bye is an interdisciplinary teacher and researcher who received his Ph.D. from UCLA in Cognitive Psychology with a specialization in computational modeling and a minor in quantitative methods. He has years of experience teaching undergraduate- and graduate-level statistics and programming at UCLA and the University of Minnesota. His research blends cognitive and learning science approaches to understand how people learn and think about data\, math\, statistics\, and programming. He is passionate about making science more collaborative\, inclusive\, and understandable to the public.
URL:https://datascience.ucsd.edu/event/special-seminar-jeffrey-bye/
LOCATION:3234 Matthews Ln\, La Jolla\, 92093\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image_Dark-blue-e1710178042629.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20240205T140000
DTEND;TZID=America/Los_Angeles:20240205T153000
DTSTAMP:20260603T122002
CREATED:20240121T234846Z
LAST-MODIFIED:20240202T204856Z
UID:10000426-1707141600-1707147000@datascience.ucsd.edu
SUMMARY:Algebraic vision: A gentle introduction | Jessie Loucks-Tavitas
DESCRIPTION:Abstract:\n\nMy talk will be broken into three parts:\nPart I: Meet Jessie.\nPart II: Assessing Deep Learning Models. A short lesson on assessment criteria for deep learning models\, such as LLMs and image segmentation models.\nPart III: Algebraic Vision\, a Gentle Introduction. Algebraic vision\, lying in the intersection of computer vision and projective geometry\, is the study of 3D objects being photographed by multiple cameras\, using techniques found in computational algebraic geometry. Two natural questions arise: (1) Given a 3D object and multiple images of it\, can we determine the relative camera positions? And\, (2) given multiple images as well as relative camera locations\, can we reconstruct the object being photographed? Carlsson and Weinshall showed in 1998 that the algorithms to solve these problems are intrinsically connected. A beneficial corollary of recent joint work with Erin Connelly and Timothy Duff is a formalization of this “duality” mechanism. We will discuss this formalization\, along with some future directions that we hope to venture down.\n\n \nBio: Jessie Loucks-Tavitas is currently a 6th-year PhD candidate in mathematics at the University of Washington. She received her MS in mathematics in 2022\, following her BA in mathematics in 2018 from California State University\, Sacramento. Jessie’s commitment to higher education and supporting underrepresented groups has been acknowledged with the Gloria Hewitt Endowed Fellowship and the Excellence in Teaching Award from the UW mathematics department in 2022. Outside of academic pursuits\, Jessie finds joy in drinking black coffee\, cozying up with a book and her two cats\, and adventuring with her friends and family in her newfound love for skiing.
URL:https://datascience.ucsd.edu/event/special-seminar-jessie-loucks-tavitas/
LOCATION:3234 Matthews Ln\, La Jolla\, 92093\, United States
CATEGORIES:Seminar
ATTACH;FMTTYPE=image/png:https://datascience.ucsd.edu/wp-content/uploads/2024/01/HDSI-UCSD-Image_Dark-blue-e1710178042629.png
END:VEVENT
END:VCALENDAR