ECAI 2004 Conference Paper


Stacked generalization for information extraction

Georgios Sigletos, Georgios Paliouras, Constantine Spyropoulos, Takis Stamatopoulos

This paper defines a new stacked generalization framework in the context of information extraction (IE) from online sources. The proposed setting removes the constraint of applying classifiers at the base level. Instead, a set of IE systems is trained to recognize relevant fragments within text documents, a task that differs significantly from classifying candidate text fragments as relevant or not. The predictions of the base-level IE systems are stacked to form a set of feature vectors for training a meta-level classifier. IE is thereby transformed into a common classification task at the meta level. The proposed framework was evaluated on three Web domains, using well-known IE approaches at the base level and a variety of classifiers at the meta level. The results demonstrate the added value obtained by combining the base-level IE systems in the new framework.
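The stacking step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the feature vectors are laid out, so everything here is an assumption made for illustration: the tuple format of a base-level prediction, the one-column-per-system-and-field layout, the "none" label for fragments that match no gold annotation, and the names `stack_predictions` and `label_fragments`. It should not be read as the authors' implementation.

```python
from collections import defaultdict


def stack_predictions(base_outputs, fields):
    """Build one meta-level feature vector per distinct text fragment.

    base_outputs: one list per (hypothetical) base-level IE system, each
                  holding (start, end, field, confidence) tuples.
    fields:       ordered list of field names.
    Returns a dict mapping (start, end) to a vector with
    len(base_outputs) * len(fields) entries.
    """
    width = len(base_outputs) * len(fields)
    vectors = defaultdict(lambda: [0.0] * width)
    for sys_idx, predictions in enumerate(base_outputs):
        for start, end, field, conf in predictions:
            # Assumed layout: one column per (system, field) pair.
            col = sys_idx * len(fields) + fields.index(field)
            vectors[(start, end)][col] = conf
    return dict(vectors)


def label_fragments(vectors, gold):
    """Pair each vector with its gold field, or "none" if the fragment is spurious."""
    return [(vec, gold.get(span, "none")) for span, vec in sorted(vectors.items())]


# Toy data: two base-level systems run on the same document.
fields = ["title", "price"]
system_a = [(0, 12, "title", 0.9), (20, 25, "price", 0.6)]
system_b = [(0, 12, "title", 0.7), (30, 34, "price", 0.4)]

vectors = stack_predictions([system_a, system_b], fields)
training_set = label_fragments(vectors, {(0, 12): "title", (20, 25): "price"})
```

In this sketch, `training_set` is an ordinary labelled dataset, so any off-the-shelf classifier could be trained on it at the meta level. Fragments proposed by only one system keep zeros in the other system's columns, which lets the meta-level learner weigh agreement between systems.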

Keywords: Information Extraction, Machine Learning, Text Mining

Citation: Georgios Sigletos, Georgios Paliouras, Constantine Spyropoulos, Takis Stamatopoulos: Stacked generalization for information extraction. In R. López de Mántaras and L. Saitta (eds.): ECAI 2004, Proceedings of the 16th European Conference on Artificial Intelligence, IOS Press, Amsterdam, 2004, pp. 549-553.


ECAI-2004 is organised by the European Coordinating Committee for Artificial Intelligence (ECCAI) and hosted by the Universitat Politècnica de València on behalf of Asociación Española de Inteligencia Artificial (AEPIA) and Associació Catalana d'Intel·ligència Artificial (ACIA).