LLM guards

Guards check AI agent interactions in real time. They catch inappropriate questions before the agent sees them and screen responses before users receive them. Unlike evaluations that test agents before deployment, guards run during live conversations.

How guards work

Guards check conversations at two points:

Input guards analyze user questions before the agent processes them. They filter out off-topic requests, block policy violations, or ask for clarification.

Output guards examine agent responses before delivery. They verify quality, redact sensitive information, and catch hallucinations or harmful content.
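The two checkpoints can be sketched as a small wrapper around an agent call. This is only an illustration of the flow described above, not the platform's implementation: `GuardResult`, `run_guarded`, `block_weather`, and `echo_agent` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class GuardResult:
    allowed: bool                   # False means the guard blocks the text
    text: str                       # possibly modified text (e.g. after redaction)
    message: Optional[str] = None   # explanation returned to the user when blocked


# Hypothetical guard type: any callable that inspects text and returns a GuardResult.
Guard = Callable[[str], GuardResult]


def run_guarded(question: str, agent: Callable[[str], str],
                input_guards: List[Guard], output_guards: List[Guard]) -> str:
    # Checkpoint 1: input guards run before the agent sees the question.
    for guard in input_guards:
        result = guard(question)
        if not result.allowed:
            return result.message or "Request blocked."
        question = result.text

    answer = agent(question)

    # Checkpoint 2: output guards run before the user sees the answer.
    for guard in output_guards:
        result = guard(answer)
        if not result.allowed:
            return result.message or "Response withheld."
        answer = result.text
    return answer


def block_weather(text: str) -> GuardResult:
    # Toy input guard: a stand-in for a real scope check.
    if "weather" in text.lower():
        return GuardResult(False, text, "I only handle financial questions.")
    return GuardResult(True, text)


def echo_agent(question: str) -> str:
    # Toy agent used only to make the example runnable.
    return f"Answer to: {question}"


print(run_guarded("What's the weather?", echo_agent, [block_weather], []))
```

In this sketch a blocked question never reaches the agent, and a blocked answer never reaches the user.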

Complete PII protection

Guards work at the agent level. For platform-level PII protection that catches sensitive information in user input before it reaches any agent, see Data Anonymization, which covers the Presidio integration. Use both layers for defense in depth.

Available guards

The Swiss AI Hub includes several guards that address specific risks. Which guards you can enable depends on how your agent was built.

Input guards

Agent description guard: Checks that questions match what the agent does. A financial compliance agent would block "What's the weather?" and explain it only handles financial questions.

Few-shot guard: Enforces custom policies through examples. If your company prohibits using work assistants for entertainment, you'd provide examples like "Recommend a movie" (blocked) and "Recommend a project management tool" (allowed). The guard learns to recognize similar patterns.
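One plausible way to picture the few-shot guard is as a classification prompt assembled from labeled examples. The sketch below assumes the guard sends such a prompt to a language model and reads back a label; the documentation does not describe the actual mechanism, and `EXAMPLES`, `build_few_shot_prompt`, and `parse_label` are hypothetical names.

```python
from typing import List, Tuple

# Labeled examples taken from the policy described above.
EXAMPLES: List[Tuple[str, str]] = [
    ("Recommend a movie", "blocked"),
    ("Recommend a project management tool", "allowed"),
]


def build_few_shot_prompt(examples: List[Tuple[str, str]], question: str) -> str:
    """Assemble a classification prompt that ends with the unlabeled question."""
    lines = ["Classify each request as 'allowed' or 'blocked'.", ""]
    for text, label in examples:
        lines.append(f"Request: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    lines.append(f"Request: {question}")
    lines.append("Label:")
    return "\n".join(lines)


def parse_label(model_reply: str) -> bool:
    """Return True when the model's reply says the request is allowed."""
    return model_reply.strip().lower().startswith("allowed")


prompt = build_few_shot_prompt(EXAMPLES, "Suggest a TV series")
print(prompt)
```

The model call itself is left out on purpose, since it depends on how the agent was built. Adding more examples on both sides of the policy boundary would typically make the classification more reliable.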

Output guards

Context sufficient guard: Checks whether the agent has enough information to answer accurately. Particularly useful for RAG agents that pull from knowledge bases. If a user asks a detailed technical question but the retrieved documents don't contain enough detail, the guard stops the response and tells the user the information isn't available.

Configuration note

Some agents (like the RAG Agent) can use the context sufficient guard automatically to prevent responses without adequate evidence.
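The idea behind the context sufficient guard can be illustrated with a deliberately crude stand-in: measure how many of the question's key terms appear in the retrieved documents and withhold the answer when coverage is low. A real guard would more likely use a model to judge sufficiency; the keyword-overlap heuristic, the `0.6` threshold, and all function names here are assumptions made for this example.

```python
import re
from typing import List, Set

# Small stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "for",
             "what", "how", "does", "do", "and", "on"}


def key_terms(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens with stopwords removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def context_is_sufficient(question: str, documents: List[str],
                          threshold: float = 0.6) -> bool:
    """Return True when enough of the question's key terms occur in the documents."""
    terms = key_terms(question)
    if not terms:
        return False
    context_terms: Set[str] = set()
    for doc in documents:
        context_terms |= key_terms(doc)
    coverage = len(terms & context_terms) / len(terms)
    return coverage >= threshold


def guarded_answer(question: str, documents: List[str], draft_answer: str) -> str:
    # Stop the response when the retrieved context does not support it.
    if context_is_sufficient(question, documents):
        return draft_answer
    return "The information needed to answer this isn't available."
```

For instance, `guarded_answer("How is the retry timeout configured?", ["Our office opens at nine."], "...")` falls through to the fallback message because none of the question's key terms appear in the retrieved text.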

Sensitive info guard: Detects and redacts confidential or personally identifiable information in agent responses. This catches PII that appears in retrieved documents. For example, if an agent pulls a document containing an employee email address, the guard redacts it before the user sees it, replacing it with [REDACTED].
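The email example above can be sketched with a regular expression. This is a minimal illustration covering email addresses only, with a simplified pattern that will not match every valid address; the real guard's detection method is not documented here, and `EMAIL_PATTERN` and `redact_emails` are hypothetical names.

```python
import re

# Simplified email pattern for illustration; not a full RFC 5322 matcher.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str) -> str:
    """Replace every email address found in the text with [REDACTED]."""
    return EMAIL_PATTERN.sub("[REDACTED]", text)


print(redact_emails("Contact jane.doe@example.com for access."))
# prints: Contact [REDACTED] for access.
```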

When to use guards

| Agent type | Recommended guards |
| --- | --- |
| Customer-facing agents | Agent description guard, Few-shot guard (for policies), Context sufficient guard |
| Compliance-critical domains (healthcare, finance, legal) | All guards + Presidio PII protection |
| Internal knowledge assistants | Agent description guard, Context sufficient guard |
| Narrow-scope specialized agents | Context sufficient guard (minimal guardrails needed) |
| Development/testing environments | Optional (prioritize speed over safety) |

Relationship with Presidio

Guards and Presidio anonymization operate at different layers to provide complete PII protection:

| Layer | Component | Purpose |
| --- | --- | --- |
| LiteLLM proxy (platform-level) | Presidio | Removes PII from user questions before they reach external LLM providers |
| Agent (application-level) | Input guards | Validate question appropriateness and scope |
| Agent (application-level) | Output guards (Sensitive info guard) | Detect PII in responses from retrieved documents |

Presidio protects user input from being sent to external providers. The sensitive info guard protects agent responses that might contain PII from your knowledge base documents. Both are needed for complete PII protection.

Configuration

Guards are built into agents during development, so how much control you have depends on the agent's design. Some agents ship with mandatory guards you can't disable. Others let you toggle specific guards through the configuration interface. Some don't support customization at all.
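The difference between mandatory and toggleable guards can be modeled as follows. This is a conceptual sketch only: the platform exposes these settings through its configuration interface, and the `AgentGuardConfig` class and the guard names used here are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class AgentGuardConfig:
    # Guards that are always on and cannot be disabled.
    mandatory: Set[str] = field(default_factory=set)
    # Guards the agent exposes for toggling: name -> currently enabled?
    toggleable: Dict[str, bool] = field(default_factory=dict)

    def set_enabled(self, guard: str, enabled: bool) -> None:
        if guard in self.mandatory:
            if not enabled:
                raise ValueError(f"{guard} is mandatory and cannot be disabled")
            return  # enabling a mandatory guard is a no-op
        if guard not in self.toggleable:
            raise KeyError(f"{guard} is not configurable for this agent")
        self.toggleable[guard] = enabled

    def active_guards(self) -> Set[str]:
        """Mandatory guards plus every toggleable guard that is switched on."""
        return self.mandatory | {g for g, on in self.toggleable.items() if on}


config = AgentGuardConfig(mandatory={"context_sufficient"},
                          toggleable={"few_shot": False})
config.set_enabled("few_shot", True)
print(sorted(config.active_guards()))
```

An agent that supports no customization would correspond to an empty `toggleable` mapping, so every `set_enabled` call on a non-mandatory guard fails.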
