Calendar of Events
Detection and recovery of low-rank signals under heteroskedastic noise
A fundamental task in data analysis is to detect and recover a low-rank signal in a noisy data matrix. Typically, this task is addressed by inspecting and manipulating the spectrum of the observed data, e.g., thresholding the singular values of the data matrix at a certain critical level. This approach is well-established in the case of homoskedastic noise, where the noise variance is identical across the entries. However, in numerous applications, such as single-cell RNA sequencing (scRNA-seq), the noise can be heteroskedastic, where the noise characteristics vary considerably across the rows and columns of the data. In such scenarios, the noise spectrum can differ significantly from the homoskedastic case, posing various challenges for signal detection and recovery. In this talk, I will present a procedure for standardizing the noise spectrum by judiciously scaling the rows and columns of the data. Importantly, this procedure can provably enforce the standard spectral behavior of homoskedastic noise -- the Marchenko-Pastur law. I will describe methods for estimating the required scaling factors directly from the observed data with suitable theoretical justification, and demonstrate the advantages of the proposed approach for signal detection and recovery in simulations and on real scRNA-seq data.
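The spirit of the approach can be illustrated with a small simulation. The sketch below is not the estimation procedure from the talk: it assumes a separable noise-variance profile, uses the true row and column variance factors for the rescaling (the talk is about estimating suitable factors from the data itself), and picks the matrix sizes, signal strength, and a 5% buffer over the asymptotic edge purely for illustration. It only shows how standardizing a heteroskedastic noise matrix restores the Marchenko-Pastur edge, so that signal singular values can be detected by thresholding.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, rank = 300, 1000, 3

# Planted low-rank signal plus heteroskedastic noise with a separable
# variance profile var(E_ij) = r_i * c_j (an assumption made for this sketch).
U = rng.standard_normal((m, rank))
V = rng.standard_normal((n, rank))
X = 8.0 * (U @ V.T) / np.sqrt(m)              # low-rank signal
r = rng.uniform(0.5, 4.0, size=m)             # row noise variances
c = rng.uniform(0.5, 4.0, size=n)             # column noise variances
Y = X + rng.standard_normal((m, n)) * np.sqrt(np.outer(r, c))

# Standardize the noise by rescaling rows and columns.  The true variance
# factors are used here for simplicity; estimating suitable scaling factors
# directly from the observed data is the subject of the talk.
Y_std = Y / np.sqrt(np.outer(r, c))

# With unit-variance i.i.d. noise, the noise singular values concentrate below
# the Marchenko-Pastur bulk edge sqrt(m) + sqrt(n); singular values of Y_std
# clearly above the edge are attributed to signal.  A 5% buffer absorbs
# finite-sample fluctuations around the asymptotic edge.
edge = np.sqrt(m) + np.sqrt(n)
sv = np.linalg.svd(Y_std, compute_uv=False)
print("singular values above the noise edge:", int(np.sum(sv > 1.05 * edge)))
# expected: 3, the planted rank
```

With these parameters the reported count should match the planted rank; without the standardization step, the heteroskedastic noise spectrum no longer follows the Marchenko-Pastur law and a single threshold of this kind is not justified.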
Flat minima and generalization in deep learning: a case study in low rank matrix recovery
Abstract: Recent advances in machine learning and artificial intelligence have relied on fitting highly overparameterized models, notably deep neural networks, to observed data. In such settings, the number of parameters of the model is much greater than the number of data samples, thereby resulting in a continuum of models with near-zero training error. Understanding which of these models generalize well and which do not is the central open question in deep learning. Recent empirical evidence suggests one mechanism for generalization: the shape of the training loss around a local minimizer seems to strongly impact the model’s performance. In particular, flat minima -- those around which the loss grows slowly -- appear to generalize well. Clarifying this phenomenon can shed new light on generalization in deep learning, which still largely remains a mystery.
I will describe our recent work that takes a step towards this goal by focusing on the simplest class of overparameterized nonlinear models: those arising in low-rank matrix recovery. We analyze overparameterized matrix and bilinear sensing, robust PCA, covariance matrix estimation, and single hidden layer neural networks with quadratic activation functions. In all cases, we show that flat minima, measured by the trace of the Hessian, exactly recover the ground truth under standard statistical assumptions. These results suggest (i) a theoretical basis for favoring methods that bias iterates towards flat solutions and (ii) that the trace of the Hessian can serve as a good regularizer for some learning tasks. We end by discussing the impact of depth on the generalization properties of flat solutions, which, surprisingly, is not always beneficial.
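As a toy illustration of the flatness claim in the bilinear sensing setting, the sketch below compares two zero-training-error solutions of an overparameterized bilinear sensing problem: a balanced factorization of the ground truth and a balanced factorization of a generic matrix that also interpolates the measurements. The problem sizes, the construction of the alternative solution, and the trace computation (which uses the fact that the residuals are bilinear in the factors, so only the Gauss-Newton term contributes to the trace) are choices made for this sketch, not taken from the talk or the underlying paper.

```python
import numpy as np

rng = np.random.default_rng(1)
d, r, m = 8, 1, 30                      # ambient dimension, true rank, measurements

# Bilinear sensing: observe b_i = <A_i, M*> for a rank-r ground truth M*, and
# parameterize candidates as M = L @ R.T with full d x d factors, so that many
# factor pairs attain zero training loss.
A = rng.standard_normal((m, d, d))
u, v = rng.standard_normal((d, r)), rng.standard_normal((d, r))
M_star = u @ v.T
b = np.einsum('kij,ij->k', A, M_star)

def balanced_factors(M):
    """Balanced factorization M = L @ R.T built from the SVD of M."""
    P, s, Qt = np.linalg.svd(M)
    return P * np.sqrt(s), Qt.T * np.sqrt(s)

def hessian_trace(L, R):
    """Trace of the Hessian of f(L, R) = sum_i (<A_i, L R^T> - b_i)^2.
    The residuals are bilinear in (L, R), so their own Hessians have zero
    diagonal blocks and only the Gauss-Newton term contributes to the trace."""
    return 2 * sum(np.sum((Ai @ R) ** 2) + np.sum((Ai.T @ L) ** 2) for Ai in A)

# A second zero-loss solution: perturb M* by a random element of the
# measurement operator's null space, so the data are still fit exactly.
null_basis = np.linalg.svd(A.reshape(m, d * d))[2][m:]
N = (rng.standard_normal(d * d - m) @ null_basis).reshape(d, d)
M_alt = M_star + 2.0 * np.linalg.norm(M_star) * N / np.linalg.norm(N)

for name, M in [("ground truth", M_star), ("interpolating alternative", M_alt)]:
    L, R = balanced_factors(M)
    loss = np.sum((np.einsum('kij,ij->k', A, L @ R.T) - b) ** 2)
    print(f"{name:26s}  loss={loss:.1e}  tr(Hessian)={hessian_trace(L, R):10.1f}"
          f"  ||M - M*||_F={np.linalg.norm(M - M_star):.2f}")
```

Both candidates fit the measurements to numerical precision, but in this construction the ground-truth solution should typically come out markedly flatter (smaller Hessian trace) than the generic interpolating alternative, which is in the spirit of the talk's message that flatness, measured by the trace of the Hessian, singles out the ground truth.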