Adding a Custom Domain Dictionary to Watson Language Translation Service
Machine translation has become nearly ubiquitous in applications that help users understand content in other languages. Although machine translation does a great job of conveying the gist of a text, specialized technical terms are sometimes translated inappropriately. To handle these cases, you can apply your own custom dictionary during machine translation. The Watson Language Translation service on Bluemix recently introduced domain customization to let you tune your machine-translated results.
In this post I am going to explain how you can quickly create your own TMX custom dictionary. If you are unfamiliar with translation memories and the TMX standard, you can find more information here. Before we get started, though, there are a few key points to remember:
To upload your translation memory TMX file, take the following steps:
# List the available translation models to find a base model ID (for example, en-fr)
curl -u "username":"password" \
"https://gateway.watsonplatform.net/language-translation/api/v2/models"
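# Create a custom model by uploading glossary.tmx as a forced glossary on top of the en-fr base model
curl -u "username":"password" \
-X POST \
-F base_model_id=en-fr \
-F name="custom_glossary" \
-F forced_glossary=@glossary.tmx \
"https://gateway.watsonplatform.net/language-translation/api/v2/models"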
# Check the status of the custom model; replace model-id with the ID of your custom model
curl -u "username":"password" \
"https://gateway.watsonplatform.net/language-translation/api/v2/models/model-id"
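# Translate text with the custom model; replace "model id" with the ID of your custom model
curl -u "username":"password" \
-X POST \
-H "Content-Type: application/json" \
-d '{
"model_id": "model id",
"text": [
"When you make a post about Cloud Computing be sure to use the correct hashtag"
]
}' \
"https://gateway.watsonplatform.net/language-translation/api/v2/translate"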
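For reference, the glossary.tmx file passed to the forced_glossary parameter could look like the minimal sketch below. It assumes an English-to-French glossary in TMX 1.4 format; the two term pairs and the header attribute values are illustrative placeholders, not entries from an actual dictionary, so substitute your own terminology.

```shell
# Write a minimal two-entry TMX glossary to glossary.tmx in the current directory.
# Assumption: the term pairs and header values below are illustrative only.
cat > glossary.tmx <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header creationtool="manual" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Cloud Computing</seg></tuv>
      <tuv xml:lang="fr"><seg>informatique en nuage</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>hashtag</seg></tuv>
      <tuv xml:lang="fr"><seg>mot-dièse</seg></tuv>
    </tu>
  </body>
</tmx>
EOF
```

Each tu element is one translation unit that pairs a source-language segment with its target-language segment, so adding a term to the dictionary means adding one more tu block.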
I hope this post helps you get started with building your own custom dictionaries, so that you can supplement the great results you already get from Watson Language Translation on Bluemix.