Jesús Tomás, Francisco Casacuberta
This paper describes the development of a Spanish-Catalan statistical machine translation system. The approach is purely inductive and uses no linguistic knowledge. To obtain the translator we follow these steps. First, we obtain a bilingual corpus from the Internet. Second, we segment the corpus into units (sentences and tokens). Third, we align the sentences of the two languages. Then, we use the aligned corpus to train statistical models. Finally, we use these models to translate: given a source sentence, we search for the most probable target sentence. We have compared our translator with the most widely used Spanish-Catalan translators and obtained translation results similar to those of the commercial systems. It is accessible at http://ttt.gan.upv.es/~jtomas/trad.
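The final step, searching for the most probable target sentence, can be illustrated with a minimal sketch. The abstract does not state the form of the statistical models, so this example assumes the noisy-channel decomposition common in statistical machine translation, where a candidate t is scored by P(t) · P(s | t). All names (`language_model`, `translation_model`, `score`, `translate`) and all probabilities are hypothetical and are not taken from the paper.

```python
import math

# Toy, invented probabilities (not taken from the paper).
# P(t): how plausible each Catalan candidate is as a target sentence.
language_model = {"el gat": 0.60, "la gat": 0.05, "el gos": 0.35}

# P(s | t): how likely the Spanish source is, given each candidate.
translation_model = {
    ("el gato", "el gat"): 0.70,
    ("el gato", "la gat"): 0.25,
    ("el gato", "el gos"): 0.05,
}


def score(source, target):
    """Log-probability of a candidate: log P(t) + log P(s | t)."""
    p_t = language_model.get(target, 0.0)
    p_s_given_t = translation_model.get((source, target), 0.0)
    if p_t == 0.0 or p_s_given_t == 0.0:
        # Unseen events get the lowest possible score in this sketch.
        return float("-inf")
    return math.log(p_t) + math.log(p_s_given_t)


def translate(source, candidates):
    """Return the highest-scoring candidate (exhaustive search)."""
    return max(candidates, key=lambda t: score(source, t))


best = translate("el gato", list(language_model))
print(best)
```

A real system cannot enumerate candidates exhaustively as this sketch does; it would need a search procedure over a very large space of target sentences, and models estimated from the aligned corpus rather than written by hand.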
Keywords: machine translation, statistical pattern recognition, human language technology
Citation: Jesús Tomás, Francisco Casacuberta: A Spanish-Catalan Translator Using Statistical Methods. In R. López de Mántaras and L. Saitta (eds.): ECAI2004, Proceedings of the 16th European Conference on Artificial Intelligence, IOS Press, Amsterdam, 2004, pp. 1099-1100.