
ECAI-2000 Conference Paper


A Topic Segmentation of Texts Based on Semantic Domains

Olivier Ferret, Brigitte Grau

Thematic analysis is essential for many Natural Language Processing (NLP) applications, such as text summarization and information extraction. It is a twofold process: it must both delimit the thematic segments of a text and identify the topic of each segment. The system we present here performs both tasks. Using semantic domains, it structures narrative texts into adjacent thematic segments, with segmentation operating at the paragraph level, and identifies the topic of each segment. Moreover, semantic domains, which are topic representations made of words, are learned automatically, which allows us to apply our system to a wide range of texts covering varied domains.
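The two tasks named above, delimiting segments at the paragraph level and labelling each with a topic, can be illustrated with a minimal sketch. It is not the system described in the paper: it assumes each semantic domain is a plain set of words written by hand (in the paper the domains are learned automatically), it scores a paragraph by simple word overlap, and the names `best_domain` and `segment` are hypothetical.

```python
import re
from typing import Dict, List, Optional, Set, Tuple

# Assumption: a semantic domain is modelled as a bare set of words.
Domains = Dict[str, Set[str]]


def best_domain(paragraph: str, domains: Domains) -> Optional[str]:
    """Return the domain sharing the most word occurrences with the paragraph.

    Returns None when no domain word occurs in the paragraph.
    Ties go to the domain listed first in `domains`.
    """
    words = re.findall(r"[a-z]+", paragraph.lower())
    best, best_score = None, 0
    for name, vocabulary in domains.items():
        score = sum(1 for word in words if word in vocabulary)
        if score > best_score:
            best, best_score = name, score
    return best


def segment(paragraphs: List[str], domains: Domains) -> List[Tuple[Optional[str], int, int]]:
    """Group adjacent paragraphs with the same topic into segments.

    Each segment is (topic, first_paragraph_index, last_paragraph_index).
    """
    segments: List[Tuple[Optional[str], int, int]] = []
    for index, paragraph in enumerate(paragraphs):
        topic = best_domain(paragraph, domains)
        if segments and segments[-1][0] == topic:
            # Same topic as the previous paragraph: extend the open segment.
            previous_topic, start, _ = segments[-1]
            segments[-1] = (previous_topic, start, index)
        else:
            # Topic change: open a new segment at this paragraph.
            segments.append((topic, index, index))
    return segments
```

Under these assumptions, a text whose first two paragraphs mention cooking words and whose third mentions travel words would yield two segments, one spanning paragraphs 0 to 1 and one covering paragraph 2.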

Keywords: Natural Language Processing, Text analysis, Topic segmentation

Citation: Olivier Ferret, Brigitte Grau: A Topic Segmentation of Texts Based on Semantic Domains. In W. Horn (ed.): ECAI2000, Proceedings of the 14th European Conference on Artificial Intelligence, IOS Press, Amsterdam, 2000, pp. 426-430.




ECAI-2000 is organised by the European Coordinating Committee for Artificial Intelligence (ECCAI) and hosted by the Humboldt University on behalf of Gesellschaft für Informatik.