Speaker: Zhijian Liu
Date & Time: Wednesday, February 25th, 2:00 pm
Location: HDSI Multipurpose Room 123
Talk Title:
Efficient AI in the Era of Large Models
Abstract:
Large foundation models now deliver remarkable capabilities in understanding, reasoning, and acting across digital and physical domains. Yet their computational cost has become the primary bottleneck to scaling and real-world deployment. As models expand in size, context length, and modality, efficiency is no longer optional. It is a first-order challenge in algorithms and systems design.
In this talk, I present advances along three complementary directions: parallel decoding, quantized reasoning, and structured sparsity. I show how diffusion-based parallel drafting reduces autoregressive latency through speculative decoding; how 4-bit quantization remains near-lossless even under demanding reasoning workloads; and how structured sparsity accelerates long-context inference and long-form reasoning. I conclude with case studies demonstrating how these principles enable efficient multimodal and physical AI systems. Together, these results show that advances in algorithms and systems design are essential to making large models faster, more affordable, and deployable at scale.
Speaker Bio: Zhijian Liu is an assistant professor at UCSD. Previously, he received his Ph.D. and S.M. from MIT and his B.Eng. from Shanghai Jiao Tong University. His research focuses on efficient machine learning and systems. He is a recipient of the Qualcomm Innovation Fellowship and has been recognized as a Rising Star in ML and Systems by MLCommons and as a Rising Star in Data Science by UChicago and UCSD.