Resource title

Incremental Re-training for Post-editing SMT

Resource description

Presented at the Ninth Conference of the Association for Machine Translation in the Americas, 2010. A method is presented for incremental re-training of an SMT system, in which a local phrase table is created and incrementally updated as a file is translated and post-edited. It is shown that translation data from within the same file has higher value than other domain-specific data. In two technical domains, within-file data increases BLEU score by several full points. Furthermore, a strong recency effect is documented; nearby data within the file has greater value than more distant data. It is also shown that the value of translation data is strongly correlated with a metric defined over new occurrences of n-grams. Finally, it is argued that the incremental re-training prototype could serve as the basis for a practical system which could be interactively updated in real time in a post-editing setting. Based on the results here, such an interactive system has the potential to dramatically improve translation quality.

Resource author

Daniel Hardt, Jakob Elming

Resource publisher

Resource publish date

Resource language

eng

Resource content type

application/pdf

Resource URL

http://hdl.handle.net/10398/8272

Resource license

Check the applicable license before adapting this resource. When adapting, give credit to the original author.