Introduction
Weglot is a set of libraries that lets you use the Weglot API quickly and easily, translate your website seamlessly, and deliver it in any language.
Our API follows REST guidelines and is HTTP-based.
The main feature of the Weglot library isn't the library itself but the Parser we provide with it. It gives you high-level processing of your HTML pages with ease.
Authentication
To authorize, use this code:
<?php
use Weglot\Client\Client;
$client = new Client('my_api_key');
# With shell, you can just pass the correct parameter with each request
curl "https://api.weglot.com/status?api_key=my_api_key"
Make sure to replace my_api_key with your Weglot API key.
Weglot uses API Keys to allow access to the API. You can register a new Weglot API Key at: Register.
Weglot expects the API key to be included in every API request to the server, as a URL parameter that looks like the following:
https://api.weglot.com/endpoint?api_key=my_api_key
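As a minimal sketch of the URL format above, the helper below appends the key as a query parameter. `weglotUrl()` is a hypothetical helper written for this illustration; it is not part of the Weglot library.

```php
<?php
// Hypothetical helper (not part of the Weglot library): builds an endpoint
// URL with the API key passed as the `api_key` query parameter.
function weglotUrl(string $endpoint, string $apiKey): string
{
    $base = 'https://api.weglot.com/';

    // http_build_query() takes care of URL-encoding the key.
    return $base . ltrim($endpoint, '/') . '?' . http_build_query(['api_key' => $apiKey]);
}

echo weglotUrl('status', 'my_api_key'), PHP_EOL;
// https://api.weglot.com/status?api_key=my_api_key
```

Building the query string with http_build_query() rather than string concatenation keeps the URL valid even if a key ever contains characters that need escaping.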
Endpoints
Translate
<?php
use Weglot\Client\Api\Enum\BotType;
use Weglot\Client\Api\TranslateEntry;
use Weglot\Client\Api\WordEntry;
use Weglot\Client\Api\Enum\WordType;
use Weglot\Client\Client;
use Weglot\Client\Endpoint\Translate;
// TranslateEntry parameters
$params = [
'language_from' => 'en',
'language_to' => 'de',
'title' => 'Lorem ipsum dolor sit amet',
'request_url' => 'https://foo.bar/',
'bot' => BotType::HUMAN
];
$translate = new TranslateEntry($params);
$translate->getInputWords()
->addOne(new WordEntry('This is a blue car', WordType::TEXT))
->addOne(new WordEntry('This is a black car', WordType::TEXT));
// Client
$client = new Client(getenv('WG_API_KEY'));
$translate = new Translate($translate, $client);
$translated = $translate->handle();
curl -X POST "https://api.weglot.com/translate?api_key=my_api_key" \
-H "Content-Type: application/json" \
-d \
'
{
"l_from": "en",
"l_to": "fr",
"title": "My awesome page",
"request_url": "https://www.website.com/",
"words": [
{
"t": "1",
"w": "This is a blue car"
},
{
"t": "1",
"w": "This is a black car"
}
]
}'
The above command returns JSON structured like this:
{
"l_from":"en",
"l_to":"fr",
"title":"My awesome page",
"request_url":"https:\/\/www.website.com\/",
"bot":0,
"from_words":[
"This is a blue car",
"This is a black car"
],
"to_words":[
"C'est une voiture bleue",
"C'est une voiture noire"
]
}
This endpoint retrieves all translations. It takes an array of sentences in an original language as input and outputs the same array of sentences translated into another language.
HTTP Request
POST https://api.weglot.com/translate
JSON Body Parameters
Parameter | Required | Description |
---|---|---|
l_from | true | ISO 639-1 code of the original language |
l_to | true | ISO 639-1 code of the destination language |
words | true | array of sentences in the original language. Each word contains 2 parameters |
words[t] | true | the type of the word: see the WordType resource for more details |
words[w] | true | the sentence to translate |
bot | false | type of user agent sending the request: see the BotType resource for more details |
request_url | true | the URL where the request is from |
title | false | the title of the page where these sentences come from |
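The body described in the table can be assembled from plain PHP values. This is only a sketch: `buildTranslateBody()` is a hypothetical helper, not a library function, and it sends `t` as the string "1" simply because the curl example above does.

```php
<?php
// Hypothetical helper (not part of the Weglot library): assembles the JSON
// body for the Translate endpoint from plain PHP values.
function buildTranslateBody(
    string $from,
    string $to,
    string $requestUrl,
    array $sentences,
    string $wordType = '1' // "1" mirrors the curl example (WordType TEXT)
): string {
    $words = [];
    foreach ($sentences as $sentence) {
        // Each entry carries a type (`t`) and the sentence itself (`w`).
        $words[] = ['t' => $wordType, 'w' => $sentence];
    }

    return json_encode([
        'l_from'      => $from,
        'l_to'        => $to,
        'request_url' => $requestUrl,
        'words'       => $words,
    ], JSON_UNESCAPED_SLASHES);
}

echo buildTranslateBody('en', 'fr', 'https://www.website.com/', ['This is a blue car']), PHP_EOL;
```

The optional `bot` and `title` fields are left out here; they could be added to the array in the same way.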
Returned Content
Parameter | Description |
---|---|
l_from | ISO 639-1 code of the original language |
l_to | ISO 639-1 code of the destination language |
title | the title of the page where these sentences come from |
request_url | the URL where the request is from |
bot | type of user agent that sent the request: see the BotType resource for more details |
from_words | array of sentences in the original language |
to_words | array of sentences in the destination language. This is the important part! |
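To read the response, the two arrays can be paired up. The sketch below assumes that `to_words` comes back in the same order as `from_words`, which the description of the endpoint suggests but does not state explicitly; `pairTranslations()` is a hypothetical helper.

```php
<?php
// Sample response, shortened from the example above.
$json = '{"l_from":"en","l_to":"fr",'
    . '"from_words":["This is a blue car","This is a black car"],'
    . '"to_words":["C\'est une voiture bleue","C\'est une voiture noire"]}';

// Hypothetical helper: maps each original sentence to its translation,
// assuming both arrays are aligned by index.
function pairTranslations(string $json): array
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['from_words'], $data['to_words'])) {
        throw new InvalidArgumentException('Unexpected response shape');
    }
    if (count($data['from_words']) !== count($data['to_words'])) {
        throw new InvalidArgumentException('from_words and to_words differ in length');
    }

    return array_combine($data['from_words'], $data['to_words']);
}

$pairs = pairTranslations($json);
echo $pairs['This is a blue car'], PHP_EOL; // C'est une voiture bleue
```

Because the result is keyed by the original sentence, a sentence that appears twice in `from_words` ends up as a single entry.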
Status
<?php
use Weglot\Client\Client;
use Weglot\Client\Endpoint\Status;
$client = new Client(getenv('WG_API_KEY'));
$status = new Status($client);
curl -I "https://api.weglot.com/status?api_key=my_api_key"
The above command returns a response that starts like this:
HTTP/1.1 200 OK
This endpoint is used as a check-alive. You can use it to check whether the Weglot API is up and running.
HTTP Request
GET https://api.weglot.com/status
Returned Content
The returned content is just an empty JSON array. The interesting part is checking that the status code is 200.
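Since only the status code matters, a check can be reduced to reading the first line of the response. The helper below is a hypothetical illustration working on a status line such as the one printed by curl -I; it performs no network request itself.

```php
<?php
// Hypothetical helper: returns true when an HTTP status line reports 200.
function isStatusOk(string $statusLine): bool
{
    // Matches e.g. "HTTP/1.1 200 OK" or "HTTP/2 200".
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', trim($statusLine), $m) !== 1) {
        return false;
    }

    return (int) $m[1] === 200;
}

var_dump(isStatusOk('HTTP/1.1 200 OK'));                  // bool(true)
var_dump(isStatusOk('HTTP/1.1 503 Service Unavailable')); // bool(false)
```

In PHP, the first element of the array returned by get_headers() is such a status line, so it could be passed to this helper directly.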
Parser
The Parser lets you take a whole HTML page, extract its sentences, and inject them back into your HTML once translated. It works on top of our library and manages every request for you, so you don't have to handle communication with the API yourself.
Config providers
<?php
use Weglot\Parser\ConfigProvider\ServerConfigProvider;
use Weglot\Parser\ConfigProvider\ManualConfigProvider;
use Weglot\Client\Api\Enum\BotType;
// Url to parse
$url = 'https://foo.bar/baz';
// Manual Config
$config = new ManualConfigProvider($url, BotType::HUMAN);
// Config with $_SERVER variables
$config = new ServerConfigProvider();
// Change title behavior from automatic to manual
$config->setTitle('Some title');
// Change back title behavior to automatic
$config->setTitle(null);
Some of the parameters we usually need in a [Translate query](#translate) can be skipped by using a config provider. Two of them are available:
- ManualConfigProvider: data is fetched from what you give to the provider.
- ServerConfigProvider: data is fetched inside the Parser (through the loadFromServer() function) from the $_SERVER variable. Be careful, it won't work if your environment doesn't fill it.
In addition to both providers, there are two ways to set the title that we send to the Weglot API.
The constructor of both providers takes a $title variable that is null by default.
- If you leave the default value, you're in automatic mode: the Parser looks for the title tag in the HTML page and uses its content as the title.
- If you set it manually, we just use what you've provided as the title.
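The two title modes can be pictured with the following sketch. `resolveTitle()` is a hypothetical function written for this illustration, and the regular expression is a simplification; the actual Parser logic may differ.

```php
<?php
// Hypothetical helper illustrating the automatic and manual title modes.
function resolveTitle(?string $manualTitle, string $html): ?string
{
    // Manual mode: a non-null title is used as-is.
    if ($manualTitle !== null) {
        return $manualTitle;
    }

    // Automatic mode: use the content of the first <title> tag, if any.
    if (preg_match('#<title[^>]*>(.*?)</title>#is', $html, $m) === 1) {
        return trim($m[1]);
    }

    return null;
}

$html = '<html><head><title>My awesome page</title></head><body></body></html>';
echo resolveTitle(null, $html), PHP_EOL;         // My awesome page
echo resolveTitle('Some title', $html), PHP_EOL; // Some title
```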
Detailed process
<?php
use Weglot\Client\Client;
use Weglot\Parser\Parser;
use Weglot\Parser\ConfigProvider\ServerConfigProvider;
// Url to parse
$url = 'https://foo.bar/baz';
// Config with $_SERVER variables
$config = new ServerConfigProvider();
// Fetching url content
$content = '...';
// Client
$client = new Client(getenv('WG_API_KEY'));
$parser = new Parser($client, $config);
// Run the Parser
$translatedContent = $parser->translate($content, 'en', 'de');
How it works:
Once you call the translate function, what does the Parser do? Let's go through it step by step:
- Receiving and setting both the original and destination languages.
- Depending on your API version, the content now passes through the ignoredNodes formatter. Basically, ignoredNodes mode skips some tags during translation by using HTML entities, so that fewer sentences are sent to the Weglot API.
- After that, we use a library called simple_html_dom (with some internal tweaks) to fetch all HTML content.
- Then we check whether there are blocks you don't want translated (through the $excludeBlocks array that you can set in the Parser constructor). We fetch them all and add an attribute to make sure we won't translate them (you can find this attribute in the Parser::ATTRIBUTE_NO_TRANSLATE constant).
- Now the real work starts: we run Checkers. Checkers are used to match elements in the HTML and retrieve the strings to translate.
- We send all strings, without duplicates, to the API through the Translate endpoint from the library.
- We get a response with all translated sentences and apply the changes to the HTML nodes collected by the Checkers, through Formatters.
- Finally, we return all the HTML as a string with everything translated!
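The collect, de-duplicate, translate and re-inject steps can be sketched on plain arrays. This is a deliberate simplification: the real Checkers and Formatters work on HTML nodes, and `translateNodes()`, the `text` and `no_translate` keys, and the `$translateBatch` callable are all hypothetical names used only here.

```php
<?php
// Simplified sketch of the Parser flow on plain arrays instead of HTML nodes.
// $translateBatch stands in for the call to the Translate endpoint.
function translateNodes(array $nodes, callable $translateBatch): array
{
    // Collect translatable strings, skipping nodes flagged as excluded.
    $strings = [];
    foreach ($nodes as $node) {
        if (empty($node['no_translate'])) {
            $strings[] = $node['text'];
        }
    }

    // Send each distinct string only once.
    $unique = array_values(array_unique($strings));
    $translated = $unique === [] ? [] : $translateBatch($unique);
    $map = array_combine($unique, $translated);

    // Apply the translations back to the nodes that were collected.
    foreach ($nodes as &$node) {
        if (empty($node['no_translate']) && isset($map[$node['text']])) {
            $node['text'] = $map[$node['text']];
        }
    }
    unset($node);

    return $nodes;
}

// Usage with a fake translator that just upper-cases its input.
$result = translateNodes(
    [['text' => 'blue car'], ['text' => 'blue car'], ['text' => 'logo', 'no_translate' => true]],
    function (array $batch) { return array_map('strtoupper', $batch); }
);
echo $result[0]['text'], ' / ', $result[2]['text'], PHP_EOL; // BLUE CAR / logo
```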
Caching
Depending on the library implementation, we added caching to the requests that we send to the Weglot API.
PHP
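<?php
use Cache\Adapter\Predis\PredisCachePool;
// Predis client, aliased to avoid a clash with Weglot's Client
use Predis\Client as Redis;
use Weglot\Client\Client;
// Redis init
$redis = new Redis([
'scheme' => getenv('REDIS_SCHEME'),
'host' => getenv('REDIS_HOST'),
'port' => getenv('REDIS_PORT'),
]);
// PSR-6 cache pool backed by the Redis connection
$redisPool = new PredisCachePool($redis);
// Client
$client = new Client(getenv('WG_API_KEY'));
$client->setCacheItemPool($redisPool);
// Setting the expiry to one day (in seconds)
$client->getCache()->setExpire(86400);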
For PHP, we follow the PSR-6 caching standard.
Basically, it can plug into any caching provider as long as you have a PSR-6 adapter. You can find many of these adapters in this list.
The example above uses a Redis server with the Predis library.
Expire
When you use the cache, we set an expiry timer on every item.
By default, this timer is set to 604800 seconds (7 days), but you can change it at any time with the setExpire(int $seconds) method.
Granularity
The cache has changed a bit since its first version. Originally, we put the whole request into the cache and used the cache parameters to generate a unique ID. We removed that behavior to focus on more specific caching for each endpoint. For example, on the Translate endpoint, we now cache each word individually so that our request (if there is one) is as small as possible.
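Per-word caching can be illustrated as follows. The sketch uses a plain array in place of a PSR-6 pool, and `translateWithWordCache()` and `$translateBatch` are hypothetical names; it only shows the idea of sending cache misses and is not the library's implementation.

```php
<?php
// Sketch of per-word caching: only words missing from the cache are sent.
// A plain array stands in for the PSR-6 pool; $translateBatch stands in for
// the Translate endpoint and must return translations in the same order.
function translateWithWordCache(array $words, array &$cache, callable $translateBatch): array
{
    // Find the distinct words that are not cached yet.
    $missing = [];
    foreach (array_unique($words) as $word) {
        if (!array_key_exists($word, $cache)) {
            $missing[] = $word;
        }
    }

    // Only hit the API when at least one word is missing.
    if ($missing !== []) {
        $fresh = $translateBatch($missing);
        foreach ($missing as $i => $word) {
            $cache[$word] = $fresh[$i];
        }
    }

    // Rebuild the result in the original order, entirely from the cache.
    $result = [];
    foreach ($words as $word) {
        $result[] = $cache[$word];
    }

    return $result;
}
```

With a cache that already holds one of two sentences, the callable receives a single-element array, and a second call with the same input does not reach it at all.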
Resources
BotType
<?php
use Weglot\Client\Api\Enum\BotType;
// You can find all these values through the BotType abstract class
$botType = BotType::HUMAN;
// ...
Used to define where a request comes from.
Short-name | Value | Description |
---|---|---|
HUMAN | 0 | Sent from human |
OTHER | 1 | Sent from unknown source |
GOOGLE | 2 | Sent from Google |
BING | 3 | Sent from Bing |
YAHOO | 4 | Sent from Yahoo |
BAIDU | 5 | Sent from Baidu |
YANDEX | 6 | Sent from Yandex |
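As an illustration of how these values might be chosen, the sketch below guesses a bot value from a User-Agent string. The returned integers come from the table above, but the substrings being matched and the rule for empty strings are assumptions made for this example, not rules taken from the Weglot library.

```php
<?php
// Hypothetical helper: guesses a BotType value from a User-Agent string.
// The integers come from the BotType table; the matching rules are assumed.
function guessBotType(string $userAgent): int
{
    $ua = strtolower(trim($userAgent));
    if ($ua === '') {
        return 1; // OTHER: no user agent, so the source is unknown
    }

    $needles = [
        'googlebot'   => 2, // GOOGLE
        'bingbot'     => 3, // BING
        'slurp'       => 4, // YAHOO
        'baiduspider' => 5, // BAIDU
        'yandex'      => 6, // YANDEX
    ];
    foreach ($needles as $needle => $value) {
        if (strpos($ua, $needle) !== false) {
            return $value;
        }
    }

    return 0; // HUMAN
}

echo guessBotType('Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)'), PHP_EOL; // 3
```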
WordType
<?php
use Weglot\Client\Api\Enum\WordType;
// You can find all these values through the WordType abstract class
$wordType = WordType::TEXT;
// ...
Used to define how a sentence is going to be used.
Short-name | Value | Description |
---|---|---|
OTHER | 0 | Word is none of the other types listed in this table |
TEXT | 1 | Word is simple text (default) |
VALUE | 2 | Word is an attribute value |
PLACEHOLDER | 3 | Word is a placeholder |
META_CONTENT | 4 | Word is from meta content header |
IFRAME_SRC | 5 | Word is an iframe source link |
IMG_SRC | 6 | Word is an image source link |
IMG_ALT | 7 | Word is an image alternative description |
PDF_HREF | 8 | Word is a PDF source link |