Localizing a GenAI feature
Building a GenAI feature requires not only expertise but considerable effort. One of the major challenges is evaluating its effectiveness and identifying rare edge cases. Expanding these features to other languages intensifies the complexity, since much of the feature must effectively be rebuilt: you need to acquire local datasets – a challenging feat in itself – and find experts fluent in the target languages for testing and red teaming.
It's important to note that foundational models often perform better in English. This is hardly surprising given that a significant portion of the internet's content is in English, and many major LLM companies are based in English-speaking countries. Therefore, achieving similar performance levels in other languages may require more data for tuning.
Securing high-quality data in other languages can be challenging, so consider this strategy: Utilize a translation service to translate your existing data (both input and output) and use this translated data for model fine-tuning and evaluation. While this approach is cost-effective, it comes with downsides. Firstly, the data distribution might not accurately reflect real user-generated content. Secondly, it might propagate errors from the translation service.
It's something to get you started. However, you should at least seek real early testers – internal or external – who are familiar with the language and can give you genuine feedback.
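The translation strategy above can be sketched in a few lines. This is a minimal illustration, not a production pipeline: `translate` stands in for whatever translation service you use (here it is stubbed with a tiny lookup table so the example is self-contained), and the dataset schema with `input`/`output` keys is an assumption for the sake of the example.

```python
def translate(text: str, target_lang: str) -> str:
    # Placeholder: in practice, call a machine-translation service here.
    # Stubbed with a tiny lookup table purely for illustration.
    stub = {
        ("Summarize this email.", "de"): "Fasse diese E-Mail zusammen.",
        ("Here is a short summary.", "de"): "Hier ist eine kurze Zusammenfassung.",
    }
    return stub.get((text, target_lang), text)


def localize_dataset(examples: list[dict], target_lang: str) -> list[dict]:
    """Translate both the input and the expected output of each example,
    producing a dataset usable for fine-tuning and evaluation."""
    return [
        {
            "input": translate(ex["input"], target_lang),
            "output": translate(ex["output"], target_lang),
        }
        for ex in examples
    ]


english_data = [
    {"input": "Summarize this email.", "output": "Here is a short summary."}
]
german_data = localize_dataset(english_data, "de")
```

Because the translated pairs keep the original alignment between inputs and outputs, the same split can double as an evaluation set – with the caveats noted above about distribution shift and propagated translation errors.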
This strategy is reminiscent of back translation, traditionally used to validate translation quality. Before the advent of large language models (LLMs), back translation also served to augment datasets by effectively paraphrasing existing content.
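Back translation as a paraphrasing trick can be sketched similarly: round-trip a sentence through a pivot language and keep the (usually slightly different) result as an extra training example. Again, `translate` is a stand-in for a real translation service, stubbed here so the sketch runs on its own; the pivot language choice is arbitrary.

```python
def translate(text: str, target_lang: str) -> str:
    # Placeholder lookup; a real system would call a translation service.
    stub = {
        ("The model performs well.", "fr"): "Le modèle fonctionne bien.",
        ("Le modèle fonctionne bien.", "en"): "The model works well.",
    }
    return stub.get((text, target_lang), text)


def back_translate(text: str, pivot_lang: str = "fr") -> str:
    """Round-trip a sentence through a pivot language to obtain a paraphrase."""
    return translate(translate(text, pivot_lang), "en")


original = "The model performs well."
paraphrase = back_translate(original)
# The round trip yields a paraphrase ("The model works well.") that can
# be added to the dataset alongside the original.
```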

