Many modern data analysis tasks require one to efficiently
handle and analyze large matrix-form datasets such as term-document matrices
and spatiotemporal measurements made via sensor networks. Since the rows and
columns of such matrices are often shuffled and scrambled, the matrices lack the
spatial coherency and smoothness that ordinary images and photographs possess,
and consequently, conventional wavelets and their relatives cannot be used in practice.
Instead, we propose to use our multiscale basis dictionaries for graphs, i.e.,
the Generalized Haar-Walsh Transform.
In particular, we build such dictionaries for columns and rows separately,
extract the column best basis and the row best basis from the basis dictionaries,
and construct the tensor product of these best bases, which turns out to
reveal hidden dependencies and the underlying geometric structure in the given matrix
data. Finally, we demonstrate the effectiveness of our approach using the
Science News database.
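
As a rough illustration of the tensor-product step only, the following Python sketch expands a matrix in the tensor product of two orthonormal bases, one acting on the row index and one on the column index. It assumes NumPy and matrix dimensions that are powers of 2, and it uses the classical Haar basis as a stand-in for the best bases selected from the graph basis dictionaries; the function names `haar_basis`, `tensor_analysis`, and `tensor_synthesis` are hypothetical and are not taken from our implementation.

```python
import numpy as np


def haar_basis(n):
    """Return an n x n orthonormal classical Haar basis (vectors as columns).

    Stand-in only: a best basis chosen from a graph basis dictionary would
    take the place of this matrix. n must be a power of 2.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of 2")
    H = np.array([[1.0]])
    while H.shape[0] < n:
        k = H.shape[0]
        # Coarser-scale vectors, stretched to twice the length.
        top = np.kron(H, np.array([[1.0, 1.0]]))
        # Finest-scale difference vectors on adjacent pairs.
        bottom = np.kron(np.eye(k), np.array([[1.0, -1.0]]))
        H = np.vstack([top, bottom]) / np.sqrt(2.0)
    return H.T  # rows of H are the basis vectors; return them as columns


def tensor_analysis(A, B_left, B_right):
    """Coefficients of A in the tensor-product basis: B_left^T A B_right."""
    return B_left.T @ A @ B_right


def tensor_synthesis(C, B_left, B_right):
    """Reconstruct the matrix from its tensor-product coefficients."""
    return B_left @ C @ B_right.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 8))  # synthetic data, for illustration only
    B_left = haar_basis(A.shape[0])   # acts on the row index
    B_right = haar_basis(A.shape[1])  # acts on the column index
    C = tensor_analysis(A, B_left, B_right)
    print("reconstruction error:",
          np.linalg.norm(tensor_synthesis(C, B_left, B_right) - A))
```

Because both bases are orthonormal, the expansion is invertible and preserves the Frobenius norm of the matrix, so any two orthonormal bases of matching sizes can be substituted for the Haar stand-ins.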
Keywords:
Multiscale basis dictionaries on graphs, Haar-Walsh wavelet packets,
adaptive best basis algorithm, spectral co-clustering, term-document matrices