Event Series: Special Seminar Series

Some new results for streaming principal component analysis

October 2 @ 2:00 pm - 3:00 pm

Abstract: While streaming PCA (also known as Oja’s algorithm) was proposed about four decades ago and has roots going back to 1949, a theoretical resolution in the form of optimal convergence rates has been obtained only in the last decade. However, we are not aware of any available distributional guarantees, which can help provide confidence intervals on the quality of the solution. In this talk, I will present the problem of quantifying uncertainty for the estimation error of the leading eigenvector using Oja’s algorithm for streaming PCA, where the data are generated IID from some unknown distribution. Combining classical tools from the U-statistics literature with recent results on high-dimensional central limit theorems for quadratic forms of random vectors and concentration of matrix products, we establish a distributional approximation result for the error between the population eigenvector and the output of Oja’s algorithm. We also propose an online multiplier bootstrap algorithm and establish conditions under which the bootstrap distribution is close to the corresponding sampling distribution with high probability. While there are optimal rates for the streaming PCA problem, they typically apply to the IID setting, whereas in many applications like distributed optimization, the data are generated from a Markov chain and the goal is to infer parameters of the limiting stationary distribution. If time permits, I will also present our near-optimal finite-sample guarantees, which remove the logarithmic dependence on the sample size in previous work, where Markovian data is downsampled to get a nearly independent data stream.

Bio: Purnamrita Sarkar is an associate professor of Statistics at the University of Texas at Austin. Their interests lie at the intersection of asymptotic statistics, scalable algorithms, and networks, and more recently in uncertainty estimation for streaming algorithms and resampling methods for networks. Dr. Sarkar is affiliated with the AI institute and EnCORE: Institute for Emerging CORE Methods of Data Science. They were a postdoctoral scholar at the University of California, Berkeley, working on asymptotic theory for network models and the nonparametric bootstrap for big data. Dr. Sarkar earned their PhD from the Machine Learning Department at Carnegie Mellon University.


Event Category: HDSI General


Purnamrita Sarkar


HDSI Building, Room 123
3234 Matthews Ln
La Jolla, CA 92093 United States