Tutorials

Hello World: LLM-as-a-Judge

What you will learn

How to create an agentic guardrail using LLM-as-a-Judge.
How to structure a RAG-style LLM interaction as an Entailment Frame.
How to test a guardrail using the playground.
How to run inference on a guardrail using the Python client.
- For this you will need a terminal window with a Python interpreter installed.

Data Model (Intro to the Entailment Frame)

We are going to verify a completed RAG-LLM interaction, using an agentic guardrail to provide a "second opinion" on whether the original LLM got the answer right.

flowchart LR
    %% Main flow
    C[(RAG Documents)]
    Q[User Question]
    LLM[Application LLM]
    A[Unverified Output]
    Q -.-> LLM
    C -.-> LLM
    LLM -.-> A
    subgraph Jaxon[Jaxon]
        direction TB
        G[LLM-as-a-Judge Guardrail]
        JLLM[LLM]
        G <-.-> JLLM
    end
    subgraph Result
        direction TB
        Proof{{Proof}}
        FinalEval{{Eval}}
        Proof -.-> FinalEval
    end
    Q -->|Question| Jaxon
    C -->|Context| Jaxon
    A -->|Answer| Jaxon
    Jaxon --> Result

Guardrail Inputs

Context: Facts injected by the RAG retrieval.
Question: The specific query put to the LLM (with the context prepended).
Answer: The original answer produced by the LLM.

Guardrail Outputs

Eval: A YES/NO decision, along with a confidence score, on whether the answer is correct.
Proof: Not used for this guardrail.

This complete set of inputs and outputs (Context, Question, Answer, Eval, and Proof) comprises an entailment frame. This is the common data structure for interacting with all guardrails. Each guardrail type interacts with the entailment frame in its own way. LLM-as-a-Judge evaluates the consistency of Context+Question with Answer, capturing its assessment in Eval. Because the LLM's internal reasoning is a "black box", Proof is unused.
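As a rough mental model, the five parts of an entailment frame can be pictured as a small record. The sketch below is a hypothetical stand-in and not the EntailmentFrame class shipped with the Jaxon client; the class name FrameSketch, its field names, and the is_evaluated helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FrameSketch:
    """Hypothetical illustration of an entailment frame (not the Jaxon class)."""

    context: List[str]                 # Context: facts injected by RAG retrieval
    question: str                      # Question: the query put to the LLM
    answer: str                        # Answer: the original, unverified LLM output
    proof: list = field(default_factory=list)  # Proof: unused by LLM-as-a-Judge
    evaluation: Optional[dict] = None  # Eval: filled in by the guardrail

    def is_evaluated(self) -> bool:
        """True once a guardrail has recorded a decision."""
        return self.evaluation is not None


# The inputs are supplied by the caller; Eval starts out empty.
frame = FrameSketch(
    context=["All dogs go to heaven.", "Fido is a cat."],
    question="Does Fido go to heaven?",
    answer="Yes",
)
print(frame.is_evaluated())
```

The point of the sketch is the split: Context, Question, and Answer go in, while Eval (and, for other guardrail types, Proof) comes back.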

Creating the guardrail

Create an application. We'll name it hello.
- [Applications] -> [Add Application] -> Fill form with id: hello / name: hello -> [Create].
Create the guardrail.
- [Guardrails] -> [Add Guardrail] will display the Basic Information form for a new guardrail.
  - Application should already have hello selected; if not, do so.
  - Guardrail Name is a human readable label. Hello World is a good choice.
  - Guardrail Type should be set to LLM-as-a-Judge.
  - Click [Create].
- This will enable the Configuration section of the form. You do not need to change these default settings. Click [Update] to close.
- Your new guardrail will appear in the list of guardrails. It is now available for use.

Testing the guardrail in the playground

We can use the playground, available from the same guardrails screen, to send a test request. Enter the following, noting that the three fields following the guardrail selector mirror the entailment frame.

Playground Input

Click Submit, and the guardrail response will appear.

Playground Response

Testing the guardrail with the client

We can access this guardrail programmatically; this is the primary usage model for deployment.

In a terminal with Python installed, create a workspace:
- mkdir jaxon-hello
- cd jaxon-hello
- virtualenv .venv
- . .venv/bin/activate
Copy the Jaxon client wheel, jaxon-1.0.0-py3-none-any.whl, into that directory and install it:
- pip install jaxon-1.0.0-py3-none-any.whl
Start an interactive Python session:
- python
- Alternatively, create a file hello.py containing the program below and run it with python hello.py.

Now, you can execute the following program to repeat our previous test.

from jaxon import Client
import jaxon
from data_model import EntailmentFrame

# Suppress some connection messages that can occur with certain installations
# These are harmless, and will be patched in a future release.
import logging
logging.getLogger('kafka').setLevel(logging.CRITICAL)
logging.getLogger('kafka.client').setLevel(logging.CRITICAL)
logging.getLogger('kafka.consumer').setLevel(logging.CRITICAL)

# One client instance can be reused for multiple requests
client = Client(group="hello-world-client-test", # Unique to your application/process
                   bus_host="localhost", # Change to fit your installation
                   bus_port=9094, # Change to fit your installation
                   config_host="localhost", # Change to fit your installation
                   config_port=2379, # Change to fit your installation
                   verbose=False)

# Each request is packaged as an EntailmentFrame object
frame = EntailmentFrame(
                    C=["All dogs go to heaven.", "Fido is a cat."],
                    Q="Does Fido go to heaven?",
                    A="Yes",
                    P=[],
                    E=None # Placeholder for the Eval result, which the guardrail fills in
                )

# Send the frame to the guardrail
guardrail_id = "d79c0e0d" # Copy the Guardrail ID from the Guardrail list in the Web UI
response_topic = "hello-world-response" # You can create these to coordinate different application flows
_, trace_id = client.send_message(guardrail_id, response_topic, frame)

# Poll for the response
timeout = 120000 # in ms
response = client.get_response(trace_id, response_topic, timeout)
print(response) # This is a dict

You will see the response (a Python dict):

{'entailment_frame': 
    '{"C": ["All dogs go to heaven.", "Fido is a cat."], 
      "Q": "Does Fido go to heaven?",
      "A": "Yes", 
      "P": [], 
      "E": {"conclusion": "NO", "confidence": 1.0}, 
      "Error": null}', 
 'id': 'e42d8796-c1cf-488a-91d2-712174c8e512', 
 'trace_id': '88861a9c-dc70-4130-80d3-ba2cc8990417', 
 'response_topic': [], 
 'rail_id': 'd79c0e0d'}
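Note that in the response above, the value under 'entailment_frame' is a JSON string rather than a nested dict, so it has to be decoded before the Eval can be read. A minimal sketch of doing so, assuming the response always has the shape shown above; the helper name extract_eval is hypothetical and not part of the Jaxon client:

```python
import json


def extract_eval(response: dict) -> tuple:
    """Return (conclusion, confidence) from a guardrail response dict.

    Assumes the shape shown in this tutorial: 'entailment_frame' holds a
    JSON string whose "E" key carries the Eval and whose "Error" key is
    null on success.
    """
    frame = json.loads(response["entailment_frame"])
    if frame.get("Error"):
        raise RuntimeError(f"Guardrail reported an error: {frame['Error']}")
    evaluation = frame.get("E")
    if evaluation is None:
        raise ValueError("Response contains no Eval")
    return evaluation["conclusion"], evaluation["confidence"]


# The response from the Hello World run, reduced to the relevant key.
sample_response = {
    "entailment_frame": (
        '{"C": ["All dogs go to heaven.", "Fido is a cat."], '
        '"Q": "Does Fido go to heaven?", "A": "Yes", "P": [], '
        '"E": {"conclusion": "NO", "confidence": 1.0}, "Error": null}'
    ),
    "rail_id": "d79c0e0d",
}

conclusion, confidence = extract_eval(sample_response)
print(conclusion, confidence)  # prints: NO 1.0
```

Here the guardrail disagrees with the original answer: Fido is a cat, so the premise about dogs does not support "Yes".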

Fair Housing: Policy Rules Guardrail

What you will learn

How to create a Policy Rules Guardrail.
- Extracting rules from a policy document.
- Codifying rules as DSAIL.
- Crafting data extraction questions.
How to test a policy rules guardrail using the playground.

Scenario

A landlord/property manager has created a chatbot to manage new tenant applications and schedule apartment showings. While the chatbot performs well enough at its primary function, it exposes the landlord to the risk of fair housing law violations. To mitigate this risk, we will set up a Policy Rules Guardrail to detect and flag such violations. This guardrail will focus on Section 8 compliance; in a real application, several complementary guardrails may function together, verifying different aspects of the same overall policy. For example, Section 8 compliance addresses financial inequality but does not directly address the other types of discrimination outlined in different states' fair housing laws; these might be implemented as complementary guardrails.

flowchart LR
    U[("User")] --> C[["Chatbot"]]
    G <-.-> LLM
    C --> Jaxon
    Jaxon --> V{"Violation?"}
    V -- Yes --> Flag["Fallback to\nHuman Support"]
    %%V -- No --> U
    subgraph Jaxon["Jaxon"]
        direction TB
        G["Policy Rules Guardrail"]
        LLM["LLM"]
    end

Guardrail Inputs

Context: The chat transcript between user and chatbot.
Question: "Is this chat Section 8 compliant?"
Answer: N/A. The answer is reported in the guardrail outputs.

Guardrail Outputs

Eval: A YES/NO decision, along with a confidence score, on whether the context is compliant with the policy.
Proof: A breakdown of individual rules/clauses from the policy, so we know which are/aren't satisfied.

Creating the guardrail

Create an application. We'll name it section8.
- [Applications] -> [Add Application] -> Fill form with id: section8 / name: section8 -> [Create].

Create the guardrail.

[Guardrails] -> [Add Guardrail] will display the Basic Information form for a new guardrail.
- Application should already have section8 selected; if not, do so.
- Guardrail Name is a human readable label. Section 8 Tutorial is a good choice.
- Guardrail Type should be set to Policy Rules.
Basic Configuration
- Pick the best LLM and Utility LLM you have access to. Reasoning models such as o3-mini are a good choice, if available.
- Check Human Review Enabled

Policy Document - this is a prose capture/extraction of the policy you want to enforce. For this tutorial, copy/paste the following:

Fair housing law in Massachusetts, as governed by both federal and state regulations, prohibits landlords 
from discriminating against tenants based on race, color, religion, national origin, sex, disability, 
familial status, and source of income—including Section 8 housing vouchers.

Under the Massachusetts Fair Housing Law (M.G.L. c. 151B) and the federal Fair Housing Act, landlords 
cannot refuse to rent to an otherwise qualified tenant solely because they use a Section 8 voucher.

Additionally, landlords must follow reasonable accommodation requirements for tenants with disabilities, 
such as allowing service animals despite a no-pet policy or making necessary structural modifications.

To comply with Section 8 requirements, landlords must ensure that their rental units meet the Housing 
Quality Standards (HQS) set by the Department of Housing and Urban Development (HUD) and pass an 
inspection before a lease is approved.

They are also required to charge rent within the limits set by the local housing authority and cannot 
demand additional payments outside the terms of the Housing Assistance Payment (HAP) contract.

Furthermore, landlords must maintain habitable conditions, perform necessary repairs, and cannot 
retaliate against tenants for exercising their rights under fair housing laws. Violations of these laws 
can result in penalties, including fines, lawsuits, and potential loss of the ability to participate in 
housing programs.

We want this policy to specifically adhere to analyzing chat transcripts between a landlord and a prospective 
tenant to identify the rules that must be met in order to ensure that fair housing law was not violated. 
Rules should be specific violations that might be found in such a chat.

Click [Extract Rules]

Rules - The extraction will populate the Rules area with a bullet-point list of the specific rules that must hold in order to comply with the policy.
Note that you can start the process directly at this step if you already have "bullet point" rules; the policy document and the extraction step are not required.

You may continue with your generated rules, or you may copy/paste the below version into your Rules area in order to keep your system synchronized with this tutorial (as these steps use LLM-based transformations, the exact outputs will vary with each run.)

* Anti-Discrimination: Do not discriminate based on race, color, religion, national origin, sex, disability, familial status, or source of income (including Section 8 vouchers).
* Voucher Acceptance: Do not refuse a qualified tenant solely because they use a Section 8 voucher.
* Disability Accommodation: Provide reasonable accommodations for tenants with disabilities, including allowing service animals and necessary modifications.
* Housing Quality: Ensure rental units meet HUD’s Housing Quality Standards and pass required inspections before leasing.
* Rent Compliance: Charge rent within limits set by the local housing authority and do not require extra payments outside the HAP contract.
* Habitability Maintenance: Keep units in habitable condition and carry out necessary repairs.
* No Retaliation: Do not retaliate against tenants for exercising their rights under fair housing laws.

Prove Compliance vs Prove Violations describes the "orientation" of the DSAIL code that we're about to generate from these rules. Consider whether a rule (in text) or an assertion (in DSAIL) proves adherence to the policy or proves a violation of the policy. This matters when uncertainty is introduced because some questions (see later in this tutorial) go unanswered, which is a common case in practice. A satisfied assertion means that some choice of the unknown variables can make the assertion True; an unsatisfied assertion means that no choice of the unknown variables can do so. Consequently, a satisfied assertion may indicate the truth of the rule it describes but cannot guarantee it. An unsatisfied assertion, by contrast, is conclusive, which is why each orientation arranges for the outcome it wants to prove to show up as an unsatisfied result.
- Prove Compliance - Orient each assert so that proven compliance produces an unsatisfied result.
  assert noDiscrimination { landlordRefusedSection8 }; # When landlordRefusedSection8 is false (compliance), assert becomes UNSAT
- Prove Violations - Orient each assert so that a violation produces an unsatisfied result.
  assert noDiscrimination { Not(landlordRefusedSection8) }; # When landlordRefusedSection8 is true (a violation), assert becomes UNSAT
For this tutorial, we will select Prove Violations.
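The effect of orientation under unanswered questions can be illustrated with a small brute-force check. This is a sketch of the satisfiability idea described above, not of how the guardrail evaluates DSAIL internally; the function names are hypothetical, and the single variable landlordRefusedSection8 is taken from the two example asserts.

```python
from itertools import product


def is_satisfiable(assertion, variables, known):
    """True if some assignment of the unknown variables makes the assertion True."""
    unknown = [name for name in variables if name not in known]
    for values in product([False, True], repeat=len(unknown)):
        env = dict(known)
        env.update(zip(unknown, values))
        if assertion(env):
            return True
    return False


VARIABLES = ["landlordRefusedSection8"]


def prove_compliance(env):
    # Mirrors: assert noDiscrimination { landlordRefusedSection8 }
    return env["landlordRefusedSection8"]


def prove_violations(env):
    # Mirrors: assert noDiscrimination { Not(landlordRefusedSection8) }
    return not env["landlordRefusedSection8"]


cases = {
    "question unanswered": {},
    "landlord refused": {"landlordRefusedSection8": True},
    "landlord did not refuse": {"landlordRefusedSection8": False},
}
for label, known in cases.items():
    compliance = "SAT" if is_satisfiable(prove_compliance, VARIABLES, known) else "UNSAT"
    violations = "SAT" if is_satisfiable(prove_violations, VARIABLES, known) else "UNSAT"
    print(f"{label}: Prove Compliance={compliance}, Prove Violations={violations}")
```

With the question unanswered, both orientations come out SAT, so neither proves anything. Only a known answer can produce UNSAT, and the orientation determines whether that conclusive result signals compliance or a violation.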

Click [Generate DSAIL] or copy the below program into Code Generated

declare landlordDiscriminatedOnRace as boolean;
declare landlordDiscriminatedOnColor as boolean;
declare landlordDiscriminatedOnReligion as boolean;
declare landlordDiscriminatedOnNationalOrigin as boolean;
declare landlordDiscriminatedOnSex as boolean;
declare landlordDiscriminatedOnDisability as boolean;
declare landlordDiscriminatedOnFamilialStatus as boolean;
declare landlordDiscriminatedOnSourceOfIncome as boolean;
declare tenantRefusedDueToSection8Voucher as boolean;
declare disabilityAccommodationsNotProvided as boolean;
declare unitPassedInspection as boolean;
declare unitMeetsHousingQuality as boolean;
declare rentWithinLimits as boolean;
declare extraPaymentsOutsideHAP as boolean;
declare unitHabitable as boolean;
declare repairsCompleted as boolean;
declare tenantRetaliationOccurred as boolean;

assert antiDiscrimination { Not(Or(landlordDiscriminatedOnRace, landlordDiscriminatedOnColor, landlordDiscriminatedOnReligion, landlordDiscriminatedOnNationalOrigin, landlordDiscriminatedOnSex, landlordDiscriminatedOnDisability, landlordDiscriminatedOnFamilialStatus)) };
assert voucherAcceptance { Not(Or(tenantRefusedDueToSection8Voucher, landlordDiscriminatedOnSourceOfIncome)) };
assert disabilityAccommodation { Not(disabilityAccommodationsNotProvided) };
assert housingQuality { And(unitPassedInspection, unitMeetsHousingQuality) };
assert rentCompliance { And(rentWithinLimits, Not(extraPaymentsOutsideHAP)) };
assert habitabilityMaintenance { And(unitHabitable, repairsCompleted) };
assert noRetaliation { Not(tenantRetaliationOccurred) };

Note the declarations at the top of this program; these form the basis for the questions in the final configuration step.
Because this code is LLM-generated, always review it to ensure that it accurately reflects the semantics and logic of the rules. See best practices for additional advice.
- You can review the DSAIL Language Guide for DSAIL background and documentation.
As with the previous steps, the process can start with the code section, skipping the natural language steps above, but this is considered poor practice in most situations.
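Part of that review can be mechanized. The sketch below is a hypothetical helper, not a Jaxon tool and not a DSAIL parser: it uses regular expressions that cover only the declare/assert syntax visible in this tutorial, and it assumes And, Or, and Not are the only operator names. It reports variables that are declared but never asserted on, and names used in asserts but never declared.

```python
import re

# Patterns cover only the DSAIL forms shown in this tutorial (an assumption).
DECLARE_RE = re.compile(r"declare\s+(\w+)\s+as\s+\w+\s*;")
ASSERT_RE = re.compile(r"assert\s+(\w+)\s*\{(.*?)\}\s*;", re.DOTALL)
OPERATORS = {"And", "Or", "Not"}


def review_dsail(source: str) -> dict:
    """Cross-check declared variables against those referenced in asserts."""
    declared = set(DECLARE_RE.findall(source))
    used = set()
    for _name, body in ASSERT_RE.findall(source):
        tokens = re.findall(r"[A-Za-z_]\w*", body)
        used.update(token for token in tokens if token not in OPERATORS)
    return {
        "unused": sorted(declared - used),
        "undeclared": sorted(used - declared),
    }


# Excerpt of the tutorial program with the noRetaliation assert deliberately
# left out, to show what the check reports.
excerpt = """
declare tenantRefusedDueToSection8Voucher as boolean;
declare landlordDiscriminatedOnSourceOfIncome as boolean;
declare tenantRetaliationOccurred as boolean;

assert voucherAcceptance { Not(Or(tenantRefusedDueToSection8Voucher, landlordDiscriminatedOnSourceOfIncome)) };
"""

print(review_dsail(excerpt))
```

An unused variable would still generate a data extraction question in the next step, so catching it here avoids asking the LLM for answers that no rule consumes. A check like this does not replace reading the asserts against the rules.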

Code Generation

Click [Generate Questions] - This converts each declared variable into a data extraction question for use at inference time.
- You can click into any of these questions if the text should be altered, or if additional context (to an LLM performing the extraction) would be helpful.
Click [Create] and find your new guardrail at the top of the list, as with the Hello World Tutorial.

Testing the guardrail in the playground

The test works much as it did in the Hello World Tutorial, with one additional complexity: Human Review (the checkbox we selected early in the configuration process). Open the Human Review Dashboard (ctrl/cmd-click the Human Review tab in the navbar) in a new browser window, and place it side by side with the main working window.

Human Review Nav Link

Select the new guardrail (named Section 8 Tutorial) in the test input selector. Submit the following sample:

Context

APPLICANT: I would like to schedule a tour. 1 bedroom I have a section 8 voucher. Thanks

Landlord: We're excited to hear that you are interested in renting with us at Mosaic! I'm sorry but we cannot accept section 8 vouchers. To view all of our available listings, schedule a tour, or fill out an application, please visit {{URL}}

If you have any questions, you can reply to this email. We look forward to hearing back from you!
Hugs & Kisses,
Landlord

Question

Is there a fair housing violation?

Answer (Note: this is ignored by the guardrail, but it helps illustrate our expectation of compliance.)

No

Then click [Submit]. Note: don't dawdle before completing the following review step, as the playground has a timeout period.

This time, before you receive a playground response, you'll see the request appear in the dashboard. This is a user's opportunity to review the LLM extraction of answers (to the generated questions), and correct any mistakes.

Human Review Dashboard

For now, we'll simply assume correctness and click [Accept] (but feel free to correct a few answers if you like). This will return a final response to the playground. This response includes those answers in JSON format, along with the Eval and Proof sections. Each entry in Proof corresponds to one of our rule assertions.

Results

This guardrail can now be run in inference, just as described in the Hello World Tutorial.
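A minimal sketch of what that might look like, reusing the send_message and get_response calls from the Hello World program. The helper names, the "section8-response" topic, the choice to pass the transcript as a single Context entry, and the "N/A" answer are all assumptions; adapt them to your installation.

```python
def section8_frame_fields(transcript: str) -> dict:
    """Keyword arguments for an EntailmentFrame carrying one chat transcript.

    Assumption: the whole transcript is passed as a single Context entry.
    """
    return {
        "C": [transcript],
        "Q": "Is this chat Section 8 compliant?",
        "A": "N/A",  # Ignored by this guardrail
        "P": [],
        "E": None,
    }


def run_guardrail(client, frame, guardrail_id,
                  response_topic="section8-response", timeout_ms=120000):
    """Send a frame and wait for the response, as in the Hello World program."""
    _, trace_id = client.send_message(guardrail_id, response_topic, frame)
    return client.get_response(trace_id, response_topic, timeout_ms)


# Usage with the real client (requires the Jaxon wheel; not executed here):
#   from jaxon import Client
#   from data_model import EntailmentFrame
#   client = Client(group="section8-client-test", bus_host="localhost",
#                   bus_port=9094, config_host="localhost", config_port=2379,
#                   verbose=False)
#   frame = EntailmentFrame(**section8_frame_fields(transcript))
#   print(run_guardrail(client, frame, "<your guardrail id>"))
```

Since Human Review is enabled on this guardrail, a response is presumably returned only after the request has been accepted in the dashboard, so a timeout longer than the 120000 ms used in Hello World may be needed.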