ECAI 2004 Conference Paper

Pushing "Underfitting" to the Limit: Learning in Bidimensional Text Categorization

Giorgio Maria Di Nunzio, Alessandro Micarelli

This paper analyzes two heuristic supervised learning algorithms for text categorization in two dimensions. The graphical properties of the bidimensional representation allow one to tailor a geometrical heuristic approach that exploits the peculiar distribution of text documents. In particular, we investigate the theoretical linear cost of the algorithms and try to push their performance to the limit. Experiments on the standard Reuters-21578 benchmark confirm that this approach is an alternative to standard linear learning models, such as support vector machines, for text classification. Moreover, because training is fast, this approach may also support text categorization systems in fast graphical investigations of large collections of documents.

Keywords: Text Categorization, Machine Learning, Information Models, Text Representation

Citation: Giorgio Maria Di Nunzio, Alessandro Micarelli: Pushing "Underfitting" to the Limit: Learning in Bidimensional Text Categorization. In R. López de Mántaras and L. Saitta (eds.): ECAI 2004, Proceedings of the 16th European Conference on Artificial Intelligence, IOS Press, Amsterdam, 2004, pp. 465-469.

ECAI-2004 is organised by the European Coordinating Committee for Artificial Intelligence (ECCAI) and hosted by the Universitat Politècnica de València on behalf of the Asociación Española de Inteligencia Artificial (AEPIA) and the Associació Catalana d'Intel·ligència Artificial (ACIA).