Georgios Sigletos, Georgios Paliouras, Constantine Spyropoulos, Takis Stamatopoulos
This paper defines a new stacked generalization framework in the context of information extraction (IE) from online sources. The proposed setting removes the constraint of applying classifiers at the base level. Instead, a set of IE systems is trained to recognize relevant fragments within text documents, a task that differs significantly from classifying candidate text fragments as relevant or not. The predictions of the base-level IE systems are stacked to form a set of feature vectors for training a meta-level classifier. IE is thereby transformed into a common classification task at the meta level. The proposed framework was evaluated on three Web domains, using well-known IE approaches at the base level and a variety of classifiers at the meta level. The results demonstrate the added value obtained by combining the base-level IE systems in the new framework.
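The stacking step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the prediction format `(start, end, field, confidence)`, the function `stack_predictions`, the lookup-table learner `TableMetaClassifier`, and the toy data are all assumptions introduced here for illustration, and the abstract does not specify how feature vectors are actually built.

```python
from collections import Counter, defaultdict


def stack_predictions(system_outputs):
    """Merge per-system predictions into one feature vector per fragment.

    system_outputs holds one list per (hypothetical) base-level IE system,
    each containing (start, end, field, confidence) tuples.
    Returns {(start, end): [field per system] + [confidence per system]}.
    A system that did not extract a fragment contributes None and 0.0.
    """
    n = len(system_outputs)
    vectors = {}
    for i, predictions in enumerate(system_outputs):
        for start, end, field, confidence in predictions:
            vec = vectors.setdefault((start, end), [None] * n + [0.0] * n)
            vec[i] = field
            vec[n + i] = confidence
    return vectors


class TableMetaClassifier:
    """Toy meta-level learner: majority label per combination of base fields.

    It ignores the confidence features; any standard classifier could be
    substituted at this level.
    """

    def fit(self, X, y):
        counts = defaultdict(Counter)
        for vec, label in zip(X, y):
            counts[self._key(vec)][label] += 1
        self.table = {k: c.most_common(1)[0][0] for k, c in counts.items()}
        return self

    def predict(self, X):
        # Unseen combinations fall back to "none" (fragment rejected).
        return [self.table.get(self._key(vec), "none") for vec in X]

    @staticmethod
    def _key(vec):
        return tuple(vec[: len(vec) // 2])


# Toy training document: outputs of two hypothetical base-level systems.
train_outputs = [
    [(0, 5, "price", 0.9), (10, 14, "name", 0.6)],   # system A
    [(0, 5, "price", 0.8), (20, 25, "price", 0.4)],  # system B
]
# Hand-labelled fragments; (20, 25) is a spurious extraction.
gold = {(0, 5): "price", (10, 14): "name"}

vectors = stack_predictions(train_outputs)
fragments = sorted(vectors)
X_train = [vectors[f] for f in fragments]
y_train = [gold.get(f, "none") for f in fragments]
meta = TableMetaClassifier().fit(X_train, y_train)

# New document: both systems agree on one fragment, only B proposes another.
test_vectors = stack_predictions([
    [(3, 8, "price", 0.7)],
    [(3, 8, "price", 0.5), (30, 33, "price", 0.3)],
])
test_fragments = sorted(test_vectors)
predicted = meta.predict([test_vectors[f] for f in test_fragments])
```

In this sketch the meta-level decision is an ordinary classification over stacked base-level outputs, which is the transformation the abstract describes. The label "none" marks fragments that the meta-level rejects.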
Keywords: Information Extraction, Machine Learning, Text Mining
Citation: Georgios Sigletos, Georgios Paliouras, Constantine Spyropoulos, Takis Stamatopoulos: Stacked generalization for information extraction. In R. López de Mántaras and L. Saitta (eds.): ECAI 2004, Proceedings of the 16th European Conference on Artificial Intelligence, IOS Press, Amsterdam, 2004, pp. 549-553.