WHO:
Dr. Eugene Charniak, Brown Laboratory for Linguistic Information Processing and Department of Computer Science
TOPIC:
Syntax-Based Language Modeling for Machine Translation
ABSTRACT:
Formally, a language model is a probability distribution over all strings in a language. In practice, language models are used to improve the output of speech recognition systems and, more recently, language-translation programs. Language models can be very simple, as in the tried-and-true trigram model, where one estimates the probability of the next word as a function of just the two previous words. However, more recent research has investigated language models based upon statistical parsing algorithms. In this talk, I describe some experiments in which such a syntax-based model has been added to an already existing language-translation system. The resulting system exhibits a dramatically improved capability of returning grammatical sentences rather than syntactic fruit salad: the percentage of translations that are both meaning-preserving and syntactically correct is up by 47%. I also stress the elegance of the resulting system, as the two programs (language and translation models) are quite tightly and naturally integrated. In some otherwise unwarranted speculation, I suggest the system as a possible model for language generation, and discuss how a single syntactic system can inform both parsing and generation. Joint work with Kevin Knight and Kenji Yamada (ISI). Refreshments served at 10:45 a.m. in CSB 209.
WHEN:
11/3/2003 11:00:00 AM
WHERE:
Computer Studies Building 209
Copyright © Brain & Cognitive Sciences, University of Rochester