  • AI in the Enterprise

    Virtual

    Are you interested in the exciting world of AI startups and industry? Are you interested in helping strengthen bridges between industry and academia? UCSD HDSI and RapidFire AI are delighted to announce a virtual panel discussion event titled "AI in the Enterprise." The goal is to bridge industry and academia at the cutting edge of practical AI […]

  • 2024 HDSI Virtual Industry Open House

    Virtual

    The Halıcıoğlu Data Science Institute (HDSI) at UC San Diego is excited to welcome all employers, partners, campus colleagues, and the broader community to our 2024 Virtual Industry Open House. Join us to learn about the latest developments at HDSI, including our innovative programs, engagement opportunities, and how we are preparing the next generation of […]

  • TILOS Seminar: Transformers learn in-context by (functional) gradient descent

    TILOS Seminar Series
    Virtual

    Transformers learn in-context by (functional) gradient descent
    Xiang Cheng, TILOS Postdoctoral Scholar at MIT
    HDSI 123 and Zoom: https://ucsd.zoom.us/j/99334315002
    Abstract: Motivated by the in-context learning phenomenon, we investigate how the Transformer neural network can implement learning algorithms in its forward pass. We show that a linear Transformer naturally learns to implement gradient descent, which enables it to […]

  • TILOS Webinar: AI Ethics in Research

    TILOS Seminar Series
    Virtual

    The Ethics and Early Career Committee would like to invite you to our upcoming webinar on AI Ethics in Research. This will take place virtually through Zoom on Friday, March 8th at noon Pacific, 2pm Central, 3pm Eastern (https://nu.zoom.us/j/2183621123). Please join Dr. Nisheeth Vishnoi from Yale and Dr. David Danks from UC San Diego who will discuss their Research […]

  • EnCORE Workshop | Old Questions and New Directions in Theory of Clustering

    EnCORE Series
    Virtual

    We are hosting an EnCORE workshop on Old Questions and New Directions in Theory of Clustering at UCSD from March 4th to 6th, 2024. While in-person registration is closed due to limited seating, you can register to attend the workshop virtually at https://sites.google.com/view/clusteringinsandiego. We have a stellar lineup of speakers, and we hope […]

  • Scaling Data-Constrained Language Models

    EnCORE Series
    Virtual

    Extrapolating current scaling trends suggests that the training dataset size for LLMs may soon be limited by the amount of text data available on the internet. In this talk, we investigate scaling language models in data-constrained regimes. Specifically, we run a set of empirical experiments that vary the extent of data repetition and the compute budget. From these experiments, we propose and empirically validate a scaling law for compute optimality that accounts for the decreasing value of repeated tokens and excess parameters. Finally, we discuss and experiment with approaches for mitigating data scarcity.