Translate text with quality checks
In this example, we will use Guardrails during the translation of a statement from another language to English. We will check whether the translated statement is likely high quality or not.
Objective
We want to translate a statement from another language to English and ensure that the translated statement accurately reflects the original content.
Step 0: Setup
To do the quality check, we can use the Critique library, which allows for simple calculation of various metrics over generated text, including translation quality estimation.
First, get an API key from the Inspired Cognition Dashboard and add the following line to the ".env" file in your top directory (as you do for your OpenAI API key).
Then install the library.
Step 1: Create the RAIL Spec
Ordinarily, we would create a RAIL spec in a separate file. For the purposes of this example, we will create the spec in this notebook as a string following the RAIL syntax. For more information on RAIL, see the RAIL documentation. We will also show the same RAIL spec in a code-first format using a Pydantic model.
In this RAIL spec, we:
- Create an output schema that returns a single key-value pair. The key should be 'translated_statement', and the value should be the English translation of the given statement. The translated statement should pass a translation quality check.
Our RAIL spec as an XML string:
rail_str = """
<rail version="0.1">
    <output>
        <string description="Translate the given statement into the English language" format="is-high-quality-translation" name="translated_statement" on-fail-is-high-quality-translation="fix"></string>
    </output>
    <prompt>
Translate the given statement into the English language:
${statement_to_be_translated}
${gr.complete_json_suffix}
    </prompt>
</rail>
"""
Or as a Pydantic model:
from pydantic import BaseModel, Field
from guardrails.validators import IsHighQualityTranslation

prompt = """
Translate the given statement into the English language:
${statement_to_be_translated}
${gr.complete_json_suffix}
"""

class Translation(BaseModel):
    translated_statement: str = Field(
        description="Translate the given statement into the English language",
        validators=[IsHighQualityTranslation(on_fail="fix")]
    )
Note
To ensure the translated statement is high quality, we use is-high-quality-translation as the validator. This validator uses the inspiredco package.
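Before building a guard, it can help to confirm that the RAIL string is well-formed XML and that the output field carries the attributes we expect. The following is a minimal sketch that uses only Python's standard library; it does not call Guardrails, and the variable names `root`, `field` and `prompt_text` are illustrative rather than part of any Guardrails API.

```python
import xml.etree.ElementTree as ET

# Same RAIL spec as above, repeated here so the snippet runs on its own.
rail_str = """
<rail version="0.1">
<output>
<string description="Translate the given statement into the English language" format="is-high-quality-translation" name="translated_statement" on-fail-is-high-quality-translation="fix"></string>
</output>
<prompt>
Translate the given statement into the English language:
${statement_to_be_translated}
${gr.complete_json_suffix}
</prompt>
</rail>
"""

# Parsing raises xml.etree.ElementTree.ParseError if the spec is malformed.
root = ET.fromstring(rail_str.strip())

# The single output field and the prompt text.
field = root.find("output/string")
prompt_text = root.find("prompt").text

print("field name:", field.get("name"))
print("validator:", field.get("format"))
print("on fail:", field.get("on-fail-is-high-quality-translation"))
```

This only checks the XML structure. Whether the validator name and the on-fail action are accepted is still decided by Guardrails when the guard is created.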
Step 2: Create a Guard object with the RAIL Spec
We create a gd.Guard object that will check, validate and correct the output of the LLM. This object:
- Enforces the quality criteria specified in the RAIL spec.
- Takes corrective action when the quality criteria are not met.
- Compiles the schema and type info from the RAIL spec and adds it to the prompt.
From our RAIL string:
Or from our Pydantic model:
We see the prompt that will be sent to the LLM:
Here, statement_to_be_translated is the statement to translate; the user provides it at runtime.
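To make the runtime substitution concrete, here is a rough sketch of how a `${...}` placeholder gets filled in. It uses Python's `string.Template` as a stand-in and is not Guardrails' own templating code; the helper name `render_prompt` is hypothetical. Placeholders that are not supplied, such as `${gr.complete_json_suffix}`, are left untouched in this sketch, whereas Guardrails expands them itself.

```python
from string import Template

prompt = """
Translate the given statement into the English language:
${statement_to_be_translated}
${gr.complete_json_suffix}
"""


def render_prompt(template: str, **params: str) -> str:
    """Fill in the supplied placeholders and leave all others as written.

    Hypothetical helper for illustration; not part of Guardrails.
    """
    # safe_substitute does not raise on missing or non-identifier placeholders.
    return Template(template).safe_substitute(**params)


rendered = render_prompt(
    prompt,
    statement_to_be_translated="これは簡単に翻訳できるかもしれない。",
)
print(rendered)
```

With Guardrails, the same value is passed through the `prompt_params` argument when the guard is called.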
Step 3: Wrap the LLM API call with Guard
First, let's try translating a statement that is relatively easy to translate.
import openai

statement = "これは簡単に翻訳できるかもしれない。"

raw_llm_response, validated_response = guard(
    openai.Completion.create,
    prompt_params={'statement_to_be_translated': statement},
    metadata={'translation_source': statement},
    engine='text-davinci-003',
    max_tokens=2048,
    temperature=0
)
print(f"Validated Output: {validated_response}")
We can look at the logs to see the quality check results:
The guard wrapper returns the raw_llm_response (which is a simple string), and the validated and corrected output (which is a dictionary). We can see that the output is a dictionary with the correct schema and types.
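The difference between the raw string and the validated dictionary can be sketched with the standard library alone. The raw response below is a made-up placeholder, not actual model output, and `check_schema` is a hypothetical helper that only mimics the schema and type check; it performs no quality validation.

```python
import json

# Hypothetical raw LLM output used only for illustration; real output will differ.
raw_llm_response = '{"translated_statement": "This may be easy to translate."}'


def check_schema(raw: str) -> dict:
    """Parse the raw string and verify the single expected key holds a string."""
    parsed = json.loads(raw)
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(parsed.get("translated_statement"), str):
        raise ValueError("'translated_statement' must be a string")
    return parsed


validated_response = check_schema(raw_llm_response)
print(type(raw_llm_response).__name__, "->", type(validated_response).__name__)
```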
Next, let's try translating a statement that is harder to translate because it contains difficult-to-translate slang. In this case, the translated statement is corrected to an empty string instead of being returned as is.
This time, we see that the quality check failed in the logs, and the translated statement is an empty string.
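The "fix" behavior seen in both runs can be summarized in a small sketch: score the translation, keep it if the score clears a threshold, and otherwise replace it with an empty string. Everything here is an assumption made for illustration: `apply_quality_fix`, the scoring callables and the threshold value are hypothetical, and the real validator obtains its quality estimate from the Critique library rather than from a local function.

```python
from typing import Callable


def apply_quality_fix(
    source: str,
    translation: str,
    score_fn: Callable[[str, str], float],
    threshold: float,
) -> str:
    """Return the translation if its score meets the threshold, else ''.

    Hypothetical stand-in for an on-fail "fix" action; not Guardrails code.
    """
    score = score_fn(source, translation)
    return translation if score >= threshold else ""


# Dummy scorers standing in for a real quality-estimation metric.
def high_scorer(source: str, translation: str) -> float:
    return 0.9


def low_scorer(source: str, translation: str) -> float:
    return 0.1


kept = apply_quality_fix("source text", "a good translation", high_scorer, threshold=0.5)
dropped = apply_quality_fix("source text", "a poor translation", low_scorer, threshold=0.5)
print(repr(kept), repr(dropped))
```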