Building Bilingual Multiword Lexicons from Parallel Text


Multiword expression resources are important for both rule-based and statistical machine translation (SMT). We present a method for constructing bilingual multiword lexicons from SMT phrase tables. The lexicons are developed in the grammar formalism GF (Grammatical Framework), which ensures syntactic correctness and generates all the inflected forms of the entries. The resources created in this manner can be used to enrich either GF grammars or SMT phrase tables. A special case of this approach, which we will discuss further, is a resource for German compounds from the biomedical domain, extracted from a corpus of English-German biomedical patents.
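As a rough illustration of the first step only (selecting candidate entries from a phrase table), the following Python sketch filters multiword pairs by translation probability. It is not the speakers' implementation. It assumes a Moses-style text layout (`source ||| target ||| scores ...`), and the function name `extract_multiword_pairs`, the threshold `min_prob`, and the choice of `score_index` are hypothetical. The GF stage that checks syntax and generates inflected forms is not covered.

```python
from typing import Iterable, List, Tuple


def extract_multiword_pairs(
    lines: Iterable[str],
    min_prob: float = 0.5,
    score_index: int = 2,
) -> List[Tuple[str, str, float]]:
    """Keep phrase pairs where at least one side has two or more tokens.

    Assumption: each line looks like 'source ||| target ||| s1 s2 s3 s4 ...'
    and scores[score_index] is the probability to threshold on.
    """
    pairs = []
    for line in lines:
        fields = [field.strip() for field in line.split("|||")]
        if len(fields) < 3:
            continue  # skip malformed lines
        source, target = fields[0], fields[1]
        try:
            scores = [float(s) for s in fields[2].split()]
        except ValueError:
            continue  # skip lines with non-numeric scores
        if score_index >= len(scores):
            continue
        # A German compound is a single token, so a multiword English
        # source with a one-token target still counts as a candidate.
        is_multiword = len(source.split()) > 1 or len(target.split()) > 1
        if is_multiword and scores[score_index] >= min_prob:
            pairs.append((source, target, scores[score_index]))
    return pairs


# Made-up sample entries, for illustration only.
sample = [
    "blood pressure ||| Blutdruck ||| 0.8 0.7 0.9 0.6",
    "the ||| die ||| 0.5 0.4 0.6 0.5",
    "heart rate ||| Herzfrequenz ||| 0.3 0.2 0.1 0.2",
    "bone marrow ||| Knochenmark ||| 0.7 0.6 0.85 0.5 ||| 0-0 1-0 ||| 10 12 9",
]

for source, target, prob in extract_multiword_pairs(sample):
    print(f"{source}\t{target}\t{prob}")
```

Thresholding on a single score is a deliberately simple choice; a real pipeline would likely combine several scores and add linguistic filtering before any entry reaches the lexicon.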

Fri, 07/03/2014 - 14:00 - 16:00

© 2019 - Institute of Informatics and Telecommunications | National Centre for Scientific Research "Demokritos"
