Data Acquisition

While text is abundant on the web, high-quality in-domain datasets and datasets for low-resource languages are hard to come by.

Finding and licensing them requires expertise and a grassroots presence.

How it works

Custom.MT obtains new data with the following three techniques:

Purchasing in-domain data

Parsing web sources

Data manufacturing

Quality

We verify datasets before purchase to ensure they are new, non-repetitive, and of high quality, using our proprietary scanner and a network of language professionals.