Olivier Ferret, Brigitte Grau
Thematic analysis is essential for many Natural Language Processing (NLP) applications, such as text summarization and information extraction. It is a two-dimensional process: it must both delimit the thematic segments of a text and identify the topic of each segment. The system we present here does both. Relying on semantic domains, it structures narrative texts into adjacent thematic segments, with segmentation operating at the paragraph level, and identifies the topic of each segment. Moreover, semantic domains, which are topic representations made of words, are learned automatically, which allows us to apply our system to a wide range of texts covering varied domains.
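The two tasks named above, delimiting segments at the paragraph level and labelling each with a topic, can be illustrated with a minimal sketch. It is not the system described in the paper: the domains here are hand-written (the paper learns them automatically), the overlap score is a simple weighted word count chosen for illustration, and the names `DOMAINS`, `best_domain`, and `segment` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, hand-written semantic domains: topic name -> weighted words.
# In the paper these domains are learned automatically; the weights here are invented.
DOMAINS = {
    "cooking": {"oven": 1.0, "flour": 0.8, "bake": 0.9, "recipe": 1.0},
    "sailing": {"boat": 1.0, "sail": 0.9, "wind": 0.6, "harbour": 0.8},
}


def best_domain(paragraph, domains):
    """Return the name of the domain that best matches the paragraph, or None."""
    words = Counter(re.findall(r"[a-z]+", paragraph.lower()))
    # Assumed scoring: sum of domain-word weights times their counts in the paragraph.
    scores = {
        name: sum(weight * words[term] for term, weight in domain.items())
        for name, domain in domains.items()
    }
    name = max(scores, key=scores.get)
    return name if scores[name] > 0 else None


def segment(paragraphs, domains):
    """Group adjacent paragraphs that share a topic into (topic, [indices]) segments."""
    segments = []
    for index, paragraph in enumerate(paragraphs):
        topic = best_domain(paragraph, domains)
        if segments and segments[-1][0] == topic:
            segments[-1][1].append(index)  # same topic: extend the current segment
        else:
            segments.append((topic, [index]))  # topic shift: open a new segment
    return segments
```

Calling `segment` on a list of paragraph strings returns one entry per thematic segment, each carrying its topic label and the indices of the paragraphs it spans. Paragraphs that match no domain are labelled `None`.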
Keywords: Natural Language Processing, Text analysis, Topic segmentation
Citation: Olivier Ferret, Brigitte Grau: A Topic Segmentation of Texts Based on Semantic Domains. In W. Horn (ed.): ECAI2000, Proceedings of the 14th European Conference on Artificial Intelligence, IOS Press, Amsterdam, 2000, pp. 426-430.