ECAI 2004 Conference Paper


A model-based approach to sequence clustering

Henri Binsztok, Thierry Artières, Patrick Gallinari

We present a Hidden Markov Model (HMM) based approach to sequence clustering. The problem is addressed in terms of learning HMM structure from data, with constraints on topology. Using a top-down approach, we iteratively simplify an initial HMM that consists of a mixture of as many left-right HMMs as there are training sequences. This simplification is performed by merging HMM components using a similarity measure specifically designed for left-right HMMs. Our approach makes it possible to learn, in an unsupervised manner, the number of clusters and the cluster models that best represent the training data. The approach is generic, and we provide experimental results in two different application fields. First, we apply our system to automatically identify the number and nature of allographs in on-line handwriting signals. Second, we apply it to hypermedia navigation patterns in order to identify user typologies, a key component of user modelling.

Keywords: Machine Learning, Sequence clustering, Hidden Markov Models

Citation: Henri Binsztok, Thierry Artières, Patrick Gallinari: A model-based approach to sequence clustering. In R. López de Mántaras and L. Saitta (eds.): ECAI 2004, Proceedings of the 16th European Conference on Artificial Intelligence, IOS Press, Amsterdam, 2004, pp. 420-424.


ECAI-2004 is organised by the European Coordinating Committee for Artificial Intelligence (ECCAI) and hosted by the Universitat Politècnica de València on behalf of the Asociación Española de Inteligencia Artificial (AEPIA) and the Associació Catalana d'Intel·ligència Artificial (ACIA).