French to English finance machine translation beats human translation

December 23, 2020

Case study

In this case study, we look at a machine translation (MT) engine training project with a French language service provider (LSP). We trained a set of MT engines and evaluated their performance both with automatic metrics and through human review by the client’s pool of linguists.

Language combination: French to English 

Domain: Financial documentation

Training dataset: 607k parallel segments (295k after cleanup)

BLEU scores attained: 44 and 43

There was a “Kasparov vs Deep Blue” moment when the machine beat a specialist human translation 5 out of 5. In our test, five specialist translators ran a blind evaluation of a group of engines, with the human reference translation hidden among them as if it were another MT output. Without knowing which was which, the translators scored 96 segments for each engine on a scale from 5 (Perfect) to 1 (Useless). Ranked by the number of segments that needed no editing, the human translation placed only third.
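As a rough illustration of the ranking criterion described above, here is a minimal Python sketch that counts how many segments each system received the top score for and sorts the systems by that count. The system names, the scores, and the function name are made up for the example; this is not the evaluation tooling used in the project. It also assumes that a score of 5 is what "needed no editing" means, and it flattens the scores of all evaluators into one list per system.

```python
from typing import Dict, List, Tuple


def rank_by_perfect_segments(
    scores_by_system: Dict[str, List[int]], perfect: int = 5
) -> List[Tuple[str, int]]:
    """Rank systems by how many segments received the top score.

    Assumption: a segment scored `perfect` needed no editing.
    Ties keep the order in which the systems were supplied.
    """
    counts = {
        name: sum(1 for score in scores if score == perfect)
        for name, scores in scores_by_system.items()
    }
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores on the 5 (Perfect) to 1 (Useless) scale.
# These are NOT the figures from the case study.
toy_scores = {
    "engine_a": [5, 5, 5, 4],
    "engine_b": [5, 5, 3, 4],
    "human_reference": [5, 4, 4, 2],
}

ranking = rank_by_perfect_segments(toy_scores)
for position, (system, perfect_count) in enumerate(ranking, start=1):
    print(position, system, perfect_count)
```

In a blind setup like the one in this study, the human reference is simply one more key in the dictionary, so it is ranked with exactly the same rule as the engines.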

The results

The trained engine improved substantially on every metric compared with the very strong stock engine from DeepL that the client had used before:

  • +30% in the human evaluation score
  • 42% less time to edit
  • 62.5% less editing effort (WER)
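For readers unfamiliar with WER (word error rate), it is commonly computed as the word-level edit distance between two texts divided by the number of words in the reference. The sketch below assumes the raw MT output is compared against its post-edited version, which is one common way to approximate editing effort; the case study does not spell out the exact setup. The function names, the example sentences, and the before/after values are all hypothetical.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    ref: List[str] = reference.split()
    hyp: List[str] = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_change(before: float, after: float) -> float:
    """Relative change from `before` to `after` (negative means a reduction)."""
    return (after - before) / before


# Hypothetical post-edited text (reference) and raw MT output (hypothesis).
post_edited = "the bank approved the loan"
mt_output = "the bank approved a loan"
print(word_error_rate(post_edited, mt_output))

# Hypothetical WER values for a stock and a trained engine,
# not the figures from this case study.
print(relative_change(before=0.50, after=0.25))
```

A lower WER means the linguist had to change fewer words, so a reduction in WER between two engines reads as a reduction in editing effort.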

The company is now implementing a new compensation scheme, while we’re analyzing mistakes still made by the machine to retrain it a couple of months down the road.

Error types

We estimate that the savings from upgrading to a trained engine will improve the LSP’s gross margins by more than 10% in 2021.

The machine does not always perform so well. In another evaluation, this time for English to Russian, humans beat every engine we trained. Russian is a harder nut to crack for MT because it is a highly inflected language. Even so, training still paid off: the gap between the stock engine and the best-performing trained engine was large, and the client gained 60% in MT performance after training.
