Friday 21 April 2017

Sekhar Tatikonda
Wed Apr 19 08:47:49 EDT 2017

Hi Everyone,

This Friday in the YPNG seminar, John Hartigan will discuss some of his
recent work on k-means clustering -- namely, when is there a mode in
each of the k-means?

For k-means with 2 clusters in 1 dimension, the between-cluster sum of
squares D^2 and the cluster size F are used in a test for the presence
of bimodality. The test uses a measure of bimodality constructed from
mixtures of Gaussian densities pN(0,1) + (1-p)N(d,1); the region in the
(p,d) space where the density is bimodal corresponds to a bimodal
region M in the (D,F) space. The test uses the posterior probability
that the population (D,F) lies in the bimodal region given the sampled
(D,F). This can be computed from the distribution of the sample (D,F)
given the population (D,F), which is shown to be asymptotically normal
with explicit parameters. The test has asymptotically correct frequency
properties. The method is extended to any number of clusters in any
number of dimensions by looking at all pairs of clusters.
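
For anyone who wants to experiment before the talk, here is a rough
sketch of the two ingredients named in the abstract. It is not the
speaker's test: it does not compute the posterior probability or the
asymptotic normal approximation. The function names are hypothetical,
and two definitions are assumptions on my part: D^2 is taken as the
between-cluster sum of squares divided by the sample size, and F as the
fraction of points in the lower cluster. The second function simply
counts local maxima of the density pN(0,1) + (1-p)N(d,1) on a grid.

```python
import math


def two_means_1d(sample):
    """Exact 2-means in 1D by trying every split of the sorted sample.

    Returns (d2, f). Assumed definitions (not from the abstract):
    d2 = between-cluster sum of squares / n, f = lower cluster size / n.
    """
    xs = sorted(float(v) for v in sample)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    total = sum(xs)
    grand_mean = total / n
    best_between, best_k = -1.0, 1
    left_sum = 0.0
    for k in range(1, n):  # k = number of points in the lower cluster
        left_sum += xs[k - 1]
        left_mean = left_sum / k
        right_mean = (total - left_sum) / (n - k)
        # Maximizing the between-cluster sum of squares is equivalent to
        # minimizing the within-cluster sum of squares.
        between = (k * (left_mean - grand_mean) ** 2
                   + (n - k) * (right_mean - grand_mean) ** 2)
        if between > best_between:
            best_between, best_k = between, k
    return best_between / n, best_k / n


def mixture_mode_count(p, d, grid_size=20001):
    """Count local maxima of p*N(0,1) + (1-p)*N(d,1) on a grid.

    The common factor 1/sqrt(2*pi) is omitted; it does not move the modes.
    """
    lo = min(0.0, d) - 5.0
    hi = max(0.0, d) + 5.0
    step = (hi - lo) / (grid_size - 1)
    dens = []
    for i in range(grid_size):
        t = lo + i * step
        dens.append(p * math.exp(-0.5 * t * t)
                    + (1 - p) * math.exp(-0.5 * (t - d) ** 2))
    return sum(
        1
        for i in range(1, grid_size - 1)
        if dens[i] > dens[i - 1] and dens[i] >= dens[i + 1]
    )
```

For example, two_means_1d([0, 0, 0, 10, 10, 10]) splits the sample into
the three zeros and the three tens. With p = 0.5, the mixture has two
modes when d = 4 and a single mode when d = 1.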

See you Friday at 11am in the Stats classroom.

