Theses/Dissertations - Computer Science
Permanent URI for this collection: https://hdl.handle.net/2104/4810
Browsing Theses/Dissertations - Computer Science by Subject "Computer network architecture."
PG-means: learning the number of clusters in data.
(2007-03-19T14:52:48Z) Feng, Yu.; Hamerly, Gregory James, 1977-; Computer Science.; Baylor University. Dept. of Computer Science.

We present a novel algorithm called PG-means in this thesis. This algorithm is able to determine the number of clusters in a classical Gaussian mixture model automatically. PG-means uses efficient statistical hypothesis tests on one-dimensional projections of the data and model to determine if the examples are well represented by the model. In so doing, we apply a statistical test to the entire model at once, not just on a per-cluster basis. We show that this method works well in difficult cases such as overlapping clusters, eccentric clusters and high dimensional clusters. PG-means also works well on non-Gaussian clusters and many true clusters. Further, the new approach provides a much more stable estimate of the number of clusters than current methods.
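
The core idea in the abstract, testing one-dimensional projections of both the data and the whole mixture model, can be illustrated with a minimal sketch. It is not the thesis implementation and rests on several assumptions: the abstract does not name the statistical test, so a Kolmogorov-Smirnov statistic is used here; the components are taken to be spherical Gaussians so that a projected component is simply a one-dimensional Gaussian; the model parameters are hard-coded rather than fitted; and all function names and the choice of 12 projections are hypothetical.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a random direction on the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def project(point, direction):
    """Dot product: the 1-D coordinate of a point along a direction."""
    return sum(p * d for p, d in zip(point, direction))


def normal_cdf(x, mean, std):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def projected_mixture_cdf(x, model, direction):
    """CDF of the whole mixture after projection onto a unit direction.

    model is a list of (weight, mean_vector, variance) triples. Assumption:
    spherical components, so each projects to N(mean . direction, variance).
    """
    return sum(
        w * normal_cdf(x, project(mean, direction), math.sqrt(var))
        for w, mean, var in model
    )


def ks_statistic(samples, cdf):
    """Largest gap between the empirical CDF of samples and a model CDF."""
    xs = sorted(samples)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        worst = max(worst, (i + 1) / n - f, f - i / n)
    return worst


def worst_projection_ks(data, model, num_projections, rng):
    """Largest KS statistic over several random 1-D projections."""
    dim = len(data[0])
    worst = 0.0
    for _ in range(num_projections):
        d = random_unit_vector(dim, rng)
        projected = [project(p, d) for p in data]
        stat = ks_statistic(projected, lambda x: projected_mixture_cdf(x, model, d))
        worst = max(worst, stat)
    return worst


# Synthetic data: two well-separated spherical clusters in 2-D.
rng = random.Random(0)
data = [[rng.gauss(-4.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
data += [[rng.gauss(4.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]

# Hard-coded candidate models (no fitting step in this sketch).
one_cluster = [(1.0, [0.0, 0.0], 9.0)]
two_clusters = [(0.5, [-4.0, 0.0], 1.0), (0.5, [4.0, 0.0], 1.0)]

# Same seed for both, so both models are tested on the same directions.
ks_one = worst_projection_ks(data, one_cluster, 12, random.Random(1))
ks_two = worst_projection_ks(data, two_clusters, 12, random.Random(1))
print("one-cluster model:", ks_one)
print("two-cluster model:", ks_two)
```

In a hypothesis-testing loop, a statistic above the chosen critical value on any projection would reject the current model, presumably prompting a model with more clusters to be tried. Because each test compares the projected data against the projected mixture as a whole, it evaluates the entire model at once, as the abstract describes, and not one cluster at a time.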