Providing gender-specific translations in Google Translate

A step forward toward reducing gender bias in translations

Google Translate works best for translating individual words, and it is a good way to get a quick start on translating something. Over the years, Google Translate has made noteworthy improvements in translation quality.

But sometimes its translations have reflected gender bias. Languages differ a lot in how they represent gender. When a translation is ambiguous, the system tends to pick gender choices that reflect societal asymmetries, resulting in biased translations.

Recently, Google researchers have taken a step forward toward reducing gender bias in its translations. Google Translate now provides both feminine and masculine translations when translating single-word queries from English to four different languages and when translating phrases and sentences from Turkish to English.

Melvin Johnson, Senior Software Engineer, noted, “Supporting gender-specific translations for single-word queries involved enriching our underlying dictionary with gender attributes. Supporting gender-specific translations for longer queries (phrases and sentences) was particularly challenging and involved making significant changes to our translation framework.”

“For these longer queries, we focused initially on Turkish-to-English translation. We developed a three-step approach to solve the problem of providing a masculine and feminine translation in English for a gender-neutral query in Turkish.”

Focusing on Turkish, researchers started by detecting queries that are eligible for gender-specific translations. This was a difficult task because Turkish is morphologically complex.

Due to this complexity, researchers cannot rely on a simple list of gender-neutral pronouns to detect gender-neutral Turkish queries and instead need a machine-learned system.
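To see why a fixed pronoun list falls short, consider the minimal sketch below. It is not Google's system, only a naive baseline written for illustration; the pronoun list and the function name `contains_neutral_pronoun` are assumptions. Turkish often drops the subject pronoun and marks the person on the verb instead, so a query can refer to "he/she" without containing any pronoun at all.

```python
# Naive baseline (illustrative only): flag a query as gender-neutral
# if it contains a third-person pronoun from a fixed list.
# The list is a small hypothetical sample, not an exhaustive inventory.
NEUTRAL_PRONOUNS = {"o", "onu", "ona", "onun"}


def contains_neutral_pronoun(query: str) -> bool:
    """Return True if any whitespace-separated token is a listed pronoun."""
    tokens = query.lower().split()
    return any(token in NEUTRAL_PRONOUNS for token in tokens)


# "O bir doktor" ("he/she is a doctor") contains the pronoun "o".
print(contains_neutral_pronoun("O bir doktor"))  # True

# "Çalışıyor" ("he/she is working") has no pronoun; the subject is
# implied by the verb form, so the list-based check misses it.
print(contains_neutral_pronoun("Çalışıyor"))  # False
```

The second query is still ambiguous about gender when translated into English, which is the kind of case a learned classifier is meant to catch.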

To detect such queries, researchers applied state-of-the-art text classification algorithms and developed a system that can tell when a given query is gender-neutral.

Researchers tested the system on thousands of human-rated Turkish examples, where raters were asked to judge whether a given example is gender-neutral or not. The results suggest that the final classification system can accurately detect queries that require gender-specific translations.

Researchers also extended the translation system to produce feminine and masculine translations when requested. When no gender is requested, the system produces the default translation.

If a user’s query is determined to be gender-neutral, a gender prefix is added to the translation request. Such requests are then handled by the Neural Machine Translation model, which can reliably produce feminine and masculine translations 99% of the time.

Johnson explained, “Finally, we have a step that decides whether to display the gender-specific translations. Since the training data that produces the masculine translation is different from the training data that produces the feminine translation, there may be differences between the two translations unrelated to gender. If the gender-specific translations are determined to be low quality, we show only the single default translation.”

How does it work?

The input first goes through a classifier that detects whether it is eligible for gender-specific translations. If the classifier answers ‘Yes’, three requests are passed to the model: a feminine request, a masculine request, and an ungendered request.

In the final step, the results of all three requests are processed, and the system decides whether to display gender-specific translations or a single default translation.
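The flow described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Google's implementation: the prefix tokens, the function names, and the toy classifier, translator, and quality check are all hypothetical stand-ins for the real machine-learned components.

```python
from typing import Callable, List

# Hypothetical prefix tokens; the article does not say what the real ones are.
FEMININE_PREFIX = "<feminine>"
MASCULINE_PREFIX = "<masculine>"


def gender_specific_translations(
    query: str,
    is_gender_neutral: Callable[[str], bool],
    translate: Callable[[str], str],
    is_high_quality: Callable[[str, str], bool],
) -> List[str]:
    """Return [feminine, masculine] translations, or [default] as a fallback."""
    default = translate(query)  # ungendered request
    if not is_gender_neutral(query):  # step 1: classifier
        return [default]
    # Step 2: send gender-prefixed requests to the same translation model.
    feminine = translate(f"{FEMININE_PREFIX} {query}")
    masculine = translate(f"{MASCULINE_PREFIX} {query}")
    # Step 3: show both only if the pair passes the quality check.
    if is_high_quality(feminine, masculine):
        return [feminine, masculine]
    return [default]


# Toy stand-ins so the sketch runs end to end (all hypothetical).
def toy_classifier(query: str) -> bool:
    return query.lower().split()[0] == "o"


def toy_translate(request: str) -> str:
    table = {
        "o bir doktor": "he is a doctor",
        "<feminine> o bir doktor": "she is a doctor",
        "<masculine> o bir doktor": "he is a doctor",
    }
    return table.get(request, request)  # unknown input is echoed back


def toy_quality(feminine: str, masculine: str) -> bool:
    # Accept only pairs that differ in exactly one word position.
    f_words, m_words = feminine.split(), masculine.split()
    if len(f_words) != len(m_words):
        return False
    return sum(f != m for f, m in zip(f_words, m_words)) == 1


result = gender_specific_translations(
    "o bir doktor", toy_classifier, toy_translate, toy_quality
)
print(result)  # ['she is a doctor', 'he is a doctor']
```

Passing the three components in as functions keeps the control flow separate from the models, so each stand-in could be swapped for a real classifier, translator, or quality filter without changing the pipeline itself.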

Johnson said, “This is just the first step toward addressing gender bias in machine-translation systems and reiterates Google’s commitment to fairness in machine learning. In the future, we plan to extend gender-specific translations to more languages and to address non-binary gender in translations.”
