Hamerly, Gregory James, 1977-
Johnston, Joshua Benjamin.
Baylor University. Dept. of Computer Science.
2007-12-03
2007
http://hdl.handle.net/2104/5067
Includes bibliographical references (p. 113-115).
In computer architecture, researchers compare new processor designs by simulating them in software. Because simulation is slow, researchers simulate small parts of a workload to save time. The widely successful SimPoint approach identifies these key parts with k-means clustering. The extremely high-dimensional nature of these workloads causes difficulties for k-means, so SimPoint must reduce the dimension before clustering. We propose clustering workload data with the exponential Dirichlet compound multinomial (EDCM), a new relative of the multinomial probability distribution and the first model that has been used to cluster workload data without the need for dimension reduction. The EDCM mixture produces good models which have far fewer clusters than models generated by k-means, significantly reducing the amount of time spent in simulation. The EDCM mixture converges quickly and is a good model for "bursty" traits which appear in workloads. We discuss model selection and choosing cluster representatives for the EDCM mixture.
viii, 115 p. : ill.
153530 bytes
743823 bytes
application/pdf
en-US
Baylor University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. Contact librarywebmaster@baylor.edu for inquiries about permission.
Computer simulation.
Computer architecture.
System design.
Clustering in high dimension and choosing cluster representatives for SimPoint.
Thesis
Worldwide access