Custom.MT
  • Home
    • For Localization Teams
    • For LSP
    • For Product Managers
    • For Translators
  • Services
    • Machine Translation Model Fine-Tuning
    • Machine Translation Evaluation
    • On-Premise Machine Translation
    • Translation Memory (TMX) Cleaning
    • Language dataset acquisition
    • Workshops – Train Your Team in Language AI
  • Products
    • AI Translation Platform
    • Custom Translation Portals
    • For Trados
    • For Smartling
    • For memoQ
    • Shopware Translation Plugin
    • API
    • Documentation
  • Resources
    • Blog
    • Case Studies
    • Events and Webinars
      • GenAI in Localization
    • MT Leaders
  • About Us
    • About Us
    • Terms and Conditions
    • Privacy Policy
  • Book a Call
  • Sign in

Guide: How to Train a Google Translate AutoML v3 Model
  • Blog post
  • Guides

This guide walks you through training your own Google AutoML Translation model. For example, you can train a domain-specific MT model (medical, legal, video games, financial reporting) on the translation memory you have accumulated over the years. Alternatively, you can build an organization-specific model that knows your product and people names and follows your house style.

Unlike with products such as Globalese, ModernMT, and Systran, training Google models is a technically complex task that requires some developer skills. With this guide, however, a technically savvy project manager or solution architect on a language team can take on the ML operations.

If you’ve used the Google Cloud Platform before, expect each part to take roughly:

  • Creating an account: 10 to 15 minutes
  • Creating a billing account: 30+ minutes
  • Training the model: 20+ minutes for the setup; 6+ hours for training

CREATING AN ACCOUNT

  • Go to https://console.cloud.google.com/
  • You need a Google account (for example, a Gmail account) to use the Console, which hosts many services, including model training.

TIP: Make sure to pick the correct Google account when entering the Console. A new tab opens after you switch accounts.

Google Cloud Platform screenshot illustrating the surrounding text instructions. How to train your machine translation model Google AutoML.

After entering the Console, you’ll see the main dashboard. Select “Billing” from the dropdown menu on the left.

What it is: GCP (Google Cloud Platform) is a suite of cloud computing resources that runs on the same infrastructure Google uses internally for products like Google Search and YouTube. Its services include Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) options.

CREATING A BILLING ACCOUNT

Click the button “Add Billing Account”.

TIP: If you already have more than one billing account, you will see a different page, where you can manage the current account or choose another one.

  • Follow the instructions and provide all the needed details.

In the Billing section, you can find the Payment Overview tab. This shows all the main details of your account.

  • Once you have an active billing account, go to Billing > Payment Method and add your card as a payment method.


TIP: You need a billing account before you can create a project. Make sure you pick the correct billing account before adding the card. Enter all the legal information and confirm your contact details if Google asks. Then follow Google’s instructions.
TIME: It can take up to one day to verify your credentials.

Now that you have an account with active billing, it’s time to begin your first project!

TRAINING THE MODEL

Set Up Your Project

  1. Go to https://console.cloud.google.com/

TIP: As always, double-check the Google account you’re using to enter the system.

2. Select “New Project”.

3. Provide all the needed information and click “Create”.

What it is: To learn more about organizations, see:
https://cloud.google.com/resource-manager/docs/creating-managing-organization

TIP: Create a separate project for every language pair you have. This works around a limitation in Google’s console, which currently doesn’t break down detailed information by language.

4. Choose your project and find the “Translation” service through the search bar.

5. Click on “Enable API” and wait for it to activate.

What it is: Enabling the API lets you use your models from CAT tools.
TIME: It usually takes around 3 to 5 minutes to turn on the API.

Execute Training

6. Go to Datasets and click “Create Dataset”.

What it is: The Translation section is where you store and manage your training data and models and view detailed information about them.

7. Choose the language pair you need.

TIP: Give the dataset a name that mentions the language pair so it is easier to find later.
Check the language pair before training starts: it cannot be changed afterwards.

8. Click on “Browse” to start creating a bucket.

9. Click on the icon to create a new bucket.

What it is: The “Bucket” is the storage space that holds all your files, including training files and glossaries.

10. Name the bucket and choose the Region (Important).

TIP: Choose the region carefully. Otherwise, you may need to recreate the project. As a best practice, use a single region, as shown in the screenshot.

Google Cloud Platform screenshot illustrating the surrounding text instructions. How to train your machine translation model Google AutoML.

11. Leave the rest of the options unchanged.

12. Choose the new bucket and click the “Select” button.

13. Next, choose to upload the file and click “Select Files”.

TIP: Use “Upload files from your computer”. This option avoids issues and bugs. The usual file format for training is TMX (Translation Memory eXchange). Basic information can be found here: https://cloud.google.com/translate/automl/docs/prepare#translation_memory_exchange_tmx
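Because the language pair cannot be changed once training starts, it can help to inspect a TMX file before uploading it. The following is a minimal sketch using only the Python standard library. The function name `summarize_tmx` and the sample segments are made up for illustration, and the sketch assumes a standard TMX layout (`<tu>` elements containing `<tuv>` variants under `<body>`).

```python
import xml.etree.ElementTree as ET
from collections import Counter

# xml:lang is a namespaced attribute; ElementTree exposes it under this key.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def summarize_tmx(tmx_text: str):
    """Return (number of translation units, counter of language codes seen)."""
    root = ET.fromstring(tmx_text)
    units = root.findall("./body/tu")
    langs = Counter()
    for tu in units:
        for tuv in tu.findall("tuv"):
            # TMX 1.4 uses xml:lang; some older files use a plain "lang" attribute.
            code = tuv.get(XML_LANG) or tuv.get("lang") or "unknown"
            langs[code.lower()] += 1
    return len(units), langs


# Illustrative sample only; a real file would be read from disk.
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="en" datatype="plaintext" segtype="sentence"
          adminlang="en" creationtool="example" creationtoolversion="1.0"
          o-tmf="example"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Save your changes.</seg></tuv>
      <tuv xml:lang="de"><seg>Speichern Sie Ihre Änderungen.</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Cancel</seg></tuv>
      <tuv xml:lang="de"><seg>Abbrechen</seg></tuv>
    </tu>
  </body>
</tmx>
"""

count, langs = summarize_tmx(SAMPLE_TMX)
print(count, dict(langs))
```

If the counter shows more than two language codes, or codes that differ from the pair chosen for the dataset, the file probably needs cleaning before upload.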


TIP: Some providers have a limit of 100 MB per file. You can easily split large datasets into smaller files with a standalone application, for example the Heartsome app.
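If you prefer a script to a desktop application, splitting can also be sketched in a few lines of Python. This is an illustration, not the method used by any particular tool: the function name `split_tmx` is hypothetical, and it splits by number of translation units rather than by file size, so you would pick a unit count that keeps each part under the limit.

```python
import copy
import xml.etree.ElementTree as ET


def split_tmx(tmx_text: str, units_per_file: int):
    """Split one TMX document into parts with at most units_per_file <tu> each."""
    if units_per_file < 1:
        raise ValueError("units_per_file must be at least 1")
    root = ET.fromstring(tmx_text)
    header = root.find("header")
    units = root.findall("./body/tu")
    parts = []
    for start in range(0, len(units), units_per_file):
        # Each part repeats the original <tmx> attributes and <header>.
        part_root = ET.Element("tmx", root.attrib)
        if header is not None:
            part_root.append(copy.deepcopy(header))
        body = ET.SubElement(part_root, "body")
        body.extend(units[start:start + units_per_file])
        parts.append(ET.tostring(part_root, encoding="unicode"))
    return parts


# Build a small illustrative TMX with five translation units.
sample_units = "".join(
    f'<tu><tuv xml:lang="en"><seg>Line {i}</seg></tuv>'
    f'<tuv xml:lang="fr"><seg>Ligne {i}</seg></tuv></tu>'
    for i in range(5)
)
sample = f'<tmx version="1.4"><header srclang="en"/><body>{sample_units}</body></tmx>'

parts = split_tmx(sample, units_per_file=2)
sizes = [len(ET.fromstring(p).findall("./body/tu")) for p in parts]
print(sizes)  # [2, 2, 1]
```

Each string in `parts` can then be written to its own `.tmx` file in UTF-8 and uploaded separately.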

14. Choose the file and click “Continue”.

15. Wait until “processing sentence pairs” is complete.

TIME: Approximately 10 to 15 minutes, depending on the size of the dataset.

16. Go to the “Train” section and click “Start Training”.

17. Check everything and click “Start training” again.

TIP: Make sure all details are correct before starting the training. Check the language pair, billing, etc. Once training has started, its cost cannot be refunded.

HOW TO USE CREDENTIALS

  1. To use the model from a CAT tool, you will need credentials. Most tools ask for the Project ID, the Model ID, and a JSON file (service account key).

2. The Project ID can be found in the project section, where you can create or choose projects.

3. The Model ID can be found in the Translation section. The ID will appear below the name of the model itself.
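For readers who call the API directly rather than through a CAT tool, the Project ID and Model ID are combined into a single model resource name. The sketch below shows the format used by the Cloud Translation v3 API. The helper name and the example IDs are made up, and the default location `us-central1` is an assumption based on where custom translation models are typically hosted; check your own model's details page.

```python
def model_resource_name(project_id: str, model_id: str,
                        location: str = "us-central1") -> str:
    """Build the full resource name for a custom translation model."""
    for label, value in (("project_id", project_id),
                         ("model_id", model_id),
                         ("location", location)):
        # Reject empty values and stray slashes that would corrupt the path.
        if not value or "/" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return f"projects/{project_id}/locations/{location}/models/{model_id}"


# Hypothetical IDs, for illustration only.
name = model_resource_name("my-mt-project", "TRL1234567890")
print(name)  # projects/my-mt-project/locations/us-central1/models/TRL1234567890
```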

4. To create the JSON file, go to IAM & Admin > Service Accounts.

5. Go to the service account that has been created automatically.

TIP: You can add as many service accounts as you need for a variety of uses.

6. Go to the KEYS tab.

7. Click “ADD KEY” and create a new key.

8. You’ll have two options to choose from: JSON and P12.

What it is: P12 is an alternate extension for what is generally referred to as a “PFX file”. It’s the combined format that holds the private key and certificate. It’s also the format that most modern signing utilities use.

9. Choose JSON to download the key file to your computer. Store it securely; you will need it to connect your CAT tools.
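As a rough illustration of what a CAT tool does with these three credentials, here is a sketch that checks a downloaded key file and then sends one sentence to the custom model. It assumes the `google-cloud-translate` Python package is installed for the translation call; the function names, the placeholder key contents, and the `us-central1` location are assumptions made for this example. Only the key check runs in the demo at the bottom, so no request is sent.

```python
import json
import tempfile
from pathlib import Path

# Fields normally present in a service account JSON key.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def load_service_account_key(path: str) -> dict:
    """Read a key file and check that it looks like a service account key."""
    key = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [field for field in REQUIRED_FIELDS if field not in key]
    if missing:
        raise ValueError(f"key file is missing fields: {missing}")
    if key["type"] != "service_account":
        raise ValueError("this JSON file is not a service account key")
    return key


def translate_with_custom_model(key_path, model_id, text,
                                source_lang, target_lang,
                                location="us-central1"):
    """Translate one string with a custom model (requires network access)."""
    # Imported lazily so the key check works without the library installed.
    from google.cloud import translate_v3

    key = load_service_account_key(key_path)
    client = translate_v3.TranslationServiceClient.from_service_account_json(key_path)
    parent = f"projects/{key['project_id']}/locations/{location}"
    response = client.translate_text(
        request={
            "parent": parent,
            "contents": [text],
            "mime_type": "text/plain",
            "source_language_code": source_lang,
            "target_language_code": target_lang,
            "model": f"{parent}/models/{model_id}",
        }
    )
    return response.translations[0].translated_text


# Demo with a placeholder key file (not a real credential).
with tempfile.TemporaryDirectory() as tmp:
    fake_key = Path(tmp) / "key.json"
    fake_key.write_text(json.dumps({
        "type": "service_account",
        "project_id": "my-mt-project",
        "private_key": "placeholder, not a real key",
        "client_email": "mt-user@my-mt-project.iam.gserviceaccount.com",
    }), encoding="utf-8")
    checked = load_service_account_key(str(fake_key))
    print(checked["project_id"])
```

With a real key file, a call such as `translate_with_custom_model("key.json", "TRL1234567890", "Hello", "en", "de")` would return the model's translation; the model ID shown here is hypothetical.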

Tags: Google, machine translation, MT
Konstantin Dranch
Language Industry Researcher | Founder Custom.MT learn something new every week, create transparency in specialized markets

  • hello@custom.mt