Clustering in high dimension and choosing cluster representatives for SimPoint.

dc.contributor.advisor: Hamerly, Gregory James, 1977-
dc.contributor.author: Johnston, Joshua Benjamin.
dc.contributor.department: Computer Science. [en]
dc.contributor.other: Baylor University. Dept. of Computer Science. [en]
dc.date.accessioned: 2007-12-03T18:49:58Z
dc.date.available: 2007-12-03T18:49:58Z
dc.date.copyright: 2007
dc.date.issued: 2007-12-03T18:49:58Z
dc.description: Includes bibliographical references (p. 113-115). [en]
dc.description.abstract: In computer architecture, researchers compare new processor designs by simulating them in software. Because simulation is slow, researchers simulate small parts of a workload to save time. The widely successful SimPoint approach identifies these key parts with k-means clustering. The extremely high-dimensional nature of these workloads causes difficulties for k-means, so SimPoint must reduce the dimension before clustering. We propose clustering workload data with the exponential Dirichlet compound multinomial (EDCM), a new relative of the multinomial probability distribution and the first model that has been used to cluster workload data without the need for dimension reduction. The EDCM mixture produces good models which have far fewer clusters than models generated by k-means, significantly reducing the amount of time spent in simulation. The EDCM mixture converges quickly and is a good model for "bursty" traits which appear in workloads. We discuss model selection and choosing cluster representatives for the EDCM mixture. [en]
dc.description.degree: M.S. [en]
dc.description.statementofresponsibility: by Joshua Benjamin Johnston. [en]
dc.format.extent: viii, 115 p. : ill. [en]
dc.format.extent: 153530 bytes
dc.format.extent: 743823 bytes
dc.format.mimetype: application/pdf
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/2104/5067
dc.language.iso: en_US [en]
dc.rights: Baylor University theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. Contact librarywebmaster@baylor.edu for inquiries about permission. [en]
dc.rights.accessrights: Worldwide access [en]
dc.subject: Computer simulation. [en]
dc.subject: Computer architecture. [en]
dc.subject: System design. [en]
dc.title: Clustering in high dimension and choosing cluster representatives for SimPoint. [en]
dc.type: Thesis [en]

Files

Original bundle

Name: Josh_Johnston_masters.pdf
Size: 726.39 KB
Format: Adobe Portable Document Format
Description: Thesis

Name: Josh_Johnston_permissions.pdf
Size: 149.93 KB
Format: Adobe Portable Document Format
Description: Permissions form

License bundle

Name: license.txt
Size: 1.95 KB
Format: Item-specific license agreed upon to submission
Description: