February 21, 2014

Building Bilingual Multiword Lexicons from Parallel Text

  • October 1, 2022 till October 1, 2022

Multiword expression resources are important for both rule-based and statistical machine translation (SMT). We present a method for constructing bilingual multiword lexicons from SMT phrase tables. The lexicons are developed in the grammar formalism GF (Grammatical Framework), which ensures syntactic correctness and generates all the inflected forms of the entries. The resources created in this manner can be used to enrich either GF grammars or SMT phrase tables. A special case of this approach, which we will discuss further, is a resource for German compounds from the biomedical domain, extracted from a corpus of English-German biomedical patents.
