This talk introduces data clustering techniques, beginning with some heuristic algorithms for partitioning a dataset into a given number of bins. Two methods will be discussed: k-Means, a geometric algorithm based on a least-squares-type minimization problem, and Spectral Clustering, which may be viewed as a graph cut problem once a graph is associated with the data. We will also discuss the Subspace Clustering Problem, in which the data are assumed to be well modeled as coming from a union of low-dimensional subspaces of a high-dimensional ambient space. For this problem, a general framework for designing similarity matrices for the data will be given in terms of a strange yet beautiful matrix factorization called the CUR (or skeleton) decomposition. This framework encompasses many known methods used in practice for subspace clustering, and is tied to other methods based on compressed sensing rather than matrix factorization. Its effectiveness on real data will also be discussed. This is joint work with Akram Aldroubi, Ahmet Bugra Koku, and Ali Sekmen.
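
For readers unfamiliar with the CUR (skeleton) decomposition mentioned above, the following is a minimal sketch of the underlying identity, not the similarity-matrix framework of the talk itself. It assumes NumPy, a synthetic data matrix drawn from a union of two 2-dimensional subspaces of R^10, and hand-picked row and column indices; all variable names and the toy data are illustrative assumptions. The fact being illustrated is that if C consists of some columns of A, R of some rows of A, and U is their intersection, then A = C U^+ R whenever rank(U) = rank(A), where U^+ is the pseudoinverse.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 20 points from each of two 2-dimensional
# subspaces of R^10, stored as the columns of A (rank 4 generically).
basis1 = rng.standard_normal((10, 2))
basis2 = rng.standard_normal((10, 2))
A = np.hstack([
    basis1 @ rng.standard_normal((2, 20)),
    basis2 @ rng.standard_normal((2, 20)),
])

# Illustrative index choices (an assumption of this sketch): four rows,
# and two columns from each subspace, so that U is 4 x 4 with rank 4.
rows = np.arange(4)
cols = np.array([0, 1, 20, 21])

C = A[:, cols]              # selected columns of A, shape (10, 4)
R = A[rows, :]              # selected rows of A, shape (4, 40)
U = A[np.ix_(rows, cols)]   # their intersection, shape (4, 4)

# CUR identity: A = C U^+ R when rank(U) == rank(A).
A_cur = C @ np.linalg.pinv(U) @ R

print("max reconstruction error:", np.abs(A - A_cur).max())
```

In this sketch the index sets are fixed by hand so that U captures the full rank of A; how rows and columns should be chosen for real, noisy data is a separate question that this example does not address.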