Gaussian kernelized graph Laplacian: Bi-stochastic normalization and eigen-convergence
Mathematics of Data & Decisions

Speaker: Xiuyuan Cheng, Duke University
Location: Zoom
Start time: Tue, May 30 2023, 12:10PM
Eigen-data of graph Laplacian matrices are widely used in data analysis and machine learning, for example, for dimension reduction by spectral embedding. A kernelized graph affinity matrix is constructed from $N$ high-dimensional data points, and the classical manifold data setting assumes i.i.d. samples lying on a general unknown $d$-dimensional manifold embedded in the ambient space. When clean manifold data are corrupted by high-dimensional noise, the noise can degrade the performance of graph Laplacian methods. In this talk, we first introduce the use of bi-stochastic normalization to improve the robustness of the graph Laplacian to high-dimensional outlier noise, possibly heteroskedastic, with a proven convergence guarantee. The analysis suggests an approximate matrix scaling problem that can be solved by Sinkhorn iterations with early termination. Returning to the clean manifold data setting, we address the important question of eigen-convergence, namely the convergence of eigenvalues and eigenvectors to the spectrum of the Laplace-Beltrami operator, and show that choosing a smooth kernel function improves the convergence rates compared to prior results. We prove eigen-convergence rates as $N$ increases and the kernel bandwidth $\epsilon$ decreases accordingly, by analyzing the convergence of the Dirichlet form and constructing candidate approximate eigenfunctions via convolution with the manifold heat kernel. When the data density is non-uniform on the manifold, we prove the same rates for the density-corrected graph Laplacian. The theory is supported by numerical results. Joint work with Boris Landa and Nan Wu.
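The pipeline described in the abstract (Gaussian kernel affinity from high-dimensional points, Sinkhorn-type bi-stochastic normalization with early termination, then eigen-decomposition of the graph Laplacian) can be illustrated with a minimal numerical sketch. The code below is not the speaker's implementation; the kernel bandwidth, the symmetric Sinkhorn update, the stopping tolerance, and the synthetic noisy-circle data are all illustrative assumptions.

```python
# Minimal sketch (illustrative only): Gaussian kernel affinity, symmetric
# Sinkhorn bi-stochastic normalization with early termination, and
# eigen-decomposition of the resulting graph Laplacian.
import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import eigh

def gaussian_affinity(X, eps):
    """Gaussian kernel affinity K_ij = exp(-||x_i - x_j||^2 / eps)."""
    D2 = cdist(X, X, metric="sqeuclidean")
    return np.exp(-D2 / eps)

def sinkhorn_bistochastic(K, n_iter=100, tol=1e-6):
    """Approximate symmetric matrix scaling: find d > 0 so that
    diag(d) K diag(d) is (nearly) doubly stochastic, stopping early."""
    d = np.ones(K.shape[0])
    for _ in range(n_iter):
        d = np.sqrt(d / (K @ d))                      # damped symmetric update
        if np.max(np.abs(d * (K @ d) - 1.0)) < tol:   # early termination
            break
    return d[:, None] * K * d[None, :]

def laplacian_eigs(W, k=10):
    """Smallest eigenpairs of the graph Laplacian L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    vals, vecs = eigh(L)
    return vals[:k], vecs[:, :k]

# Usage on synthetic manifold data: a circle embedded in high dimension,
# corrupted by high-dimensional noise (parameters are placeholders).
rng = np.random.default_rng(0)
N, ambient_dim = 500, 50
theta = rng.uniform(0, 2 * np.pi, N)
X = np.zeros((N, ambient_dim))
X[:, 0], X[:, 1] = np.cos(theta), np.sin(theta)
X += 0.05 * rng.standard_normal((N, ambient_dim))

K = gaussian_affinity(X, eps=0.5)
W = sinkhorn_bistochastic(K)
vals, vecs = laplacian_eigs(W)
```

The early-termination check stops the scaling once the row sums of the rescaled matrix are within a tolerance of one, reflecting the abstract's point that an approximate solution of the matrix scaling problem suffices.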