The Teleon Stack: Helix, Cortex, and Sentinel

We talk about Helix, Cortex, and Sentinel because they describe the three problems every serious agent team runs into. If you’re building agents beyond the proof-of-concept stage, you’ll face all three. Here’s how each component works and why we built them the way we did.

Helix: the runtime

Helix is runtime: how your agent runs, scales, and stays available. It handles the infrastructure layer so you never have to think about Kubernetes, Docker networking, or load balancers.

When you run `teleon deploy`, Helix packages your agent, provisions compute, and configures auto-scaling. Your agent scales from zero replicas (when idle) to over a thousand (under load) automatically. You pay only for active compute.
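As a rough illustration of scale-to-zero behavior, the sketch below computes a replica count from current load. It is not Helix's scaling policy; `desired_replicas`, `per_replica_capacity`, and `max_replicas` are hypothetical names, and the sizing rule (one replica per batch of in-flight requests) is an assumption made for the example.

```python
import math


def desired_replicas(
    in_flight_requests: int,
    per_replica_capacity: int,
    max_replicas: int,
) -> int:
    """Return how many replicas a simple load-based autoscaler would run.

    Illustrative only: the inputs and the sizing rule are assumptions,
    not Helix's actual policy.
    """
    if per_replica_capacity <= 0:
        raise ValueError("per_replica_capacity must be positive")
    if in_flight_requests <= 0:
        return 0  # idle: scale to zero, so no compute is billed
    needed = math.ceil(in_flight_requests / per_replica_capacity)
    return min(max_replicas, needed)  # never exceed the configured ceiling
```

With a capacity of 10 requests per replica, 25 in-flight requests would call for 3 replicas, and an idle agent would call for none.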

Deployments are rolling, so the new version replaces the old one with zero downtime. If the new version fails health checks, Helix rolls back instantly. This happens automatically, so you don’t need to watch the deploy or write rollback scripts.
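The general idea behind a rolling deployment with rollback can be sketched in a few lines. This is a simplified model, not Helix's implementation: `rolling_deploy` and the `health_check` callback are hypothetical, and replicas are represented as plain version strings.

```python
from typing import Callable, List


def rolling_deploy(
    replicas: List[str],
    new_version: str,
    health_check: Callable[[int, str], bool],
) -> List[str]:
    """Replace replicas one at a time; restore the original set on failure.

    Illustrative only: this is not Helix's implementation or API.
    """
    original = list(replicas)  # snapshot used for rollback
    current = list(replicas)
    for index in range(len(current)):
        current[index] = new_version  # swap a single replica
        if not health_check(index, new_version):
            return original  # roll back: every replica returns to its old version
    return current
```

Because only one replica changes at a time, the remaining replicas keep serving traffic while each new one is checked.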

Key Helix capabilities: auto-scaling from 0 to 1000+ replicas, cold starts under 50ms, rolling deployments with instant rollback, built-in health monitoring with self-healing, and multi-region deployment with automatic failover.

Cortex: the memory system

Cortex is memory: what the agent learns, stores, and recalls. It provides persistent, searchable memory with just six methods: store, search, get, update, delete, and count.

What makes Cortex different from a generic database is scope enforcement. In a multi-tenant environment, you need absolute guarantees that one user’s data never leaks into another user’s agent interactions. Cortex enforces this at the infrastructure level through scope fields. When you set `scope: { user_id: "alice" }`, every operation is automatically filtered. There’s no way to accidentally query across users.
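To show what scope enforcement means in practice, here is a minimal in-memory sketch with the six methods named above. It is not the Cortex client: the `ScopedMemory` class, its constructor, and the method signatures are assumptions, and plain substring matching stands in for real search.

```python
from typing import Dict, List, Optional, Tuple

Backend = Dict[Tuple[str, str], str]  # (user_id, key) -> stored text


class ScopedMemory:
    """Hypothetical scope-bound view over a shared store (not the Cortex API)."""

    def __init__(self, backend: Backend, user_id: str) -> None:
        self._rows = backend
        self._user_id = user_id  # fixed at construction; callers cannot override it

    def store(self, key: str, text: str) -> None:
        self._rows[(self._user_id, key)] = text

    def get(self, key: str) -> Optional[str]:
        return self._rows.get((self._user_id, key))

    def update(self, key: str, text: str) -> bool:
        if (self._user_id, key) not in self._rows:
            return False
        self._rows[(self._user_id, key)] = text
        return True

    def delete(self, key: str) -> bool:
        return self._rows.pop((self._user_id, key), None) is not None

    def search(self, query: str) -> List[str]:
        # Substring match is a stand-in for semantic search.
        return [
            text
            for (owner, _), text in self._rows.items()
            if owner == self._user_id and query.lower() in text.lower()
        ]

    def count(self) -> int:
        return sum(1 for owner, _ in self._rows if owner == self._user_id)
```

The design point is that the scope lives in the object rather than in each call, so no individual query can forget the filter.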

Cortex also supports layered memory. The company layer holds shared knowledge accessible to all agents. The team layer holds group-specific context. The personal layer holds individual user data. Each layer has its own scope rules and retention policies.
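One way to picture the three layers is as a visibility filter over tagged memories. The sketch below is an assumption about how such a filter could look, not Cortex's data model: the `Memory` record and `visible_memories` function are hypothetical, and retention policies are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Memory:
    layer: str  # "company", "team", or "personal"
    owner: str  # company name, team name, or user id
    text: str


def visible_memories(
    memories: List[Memory], company: str, team: str, user_id: str
) -> List[str]:
    """Return the texts one user may see across all three layers.

    Illustrative only: layer names follow the article, the rest is assumed.
    """
    allowed = {("company", company), ("team", team), ("personal", user_id)}
    return [m.text for m in memories if (m.layer, m.owner) in allowed]
```

A user on the support team would see company-wide knowledge, the support team's context, and their own personal data, but nothing from other teams or users.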

Before your agent runs, Cortex automatically retrieves recent history and semantically relevant memories, then injects them into the agent’s context. Your agent starts every interaction with the right background, without any manual retrieval code.
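The retrieval-and-injection step can be approximated as follows. This is a sketch under stated assumptions, not how Cortex ranks or formats context: `build_context` and `relevance` are hypothetical, and counting shared words is a crude substitute for semantic similarity.

```python
from typing import List


def relevance(query: str, memory: str) -> int:
    """Count shared lowercase words; a crude stand-in for semantic similarity."""
    return len(set(query.lower().split()) & set(memory.lower().split()))


def build_context(
    query: str,
    history: List[str],
    memories: List[str],
    max_history: int = 3,
    max_memories: int = 2,
) -> str:
    """Assemble the text an agent would receive before it runs.

    Illustrative only: the layout and limits are assumptions.
    """
    recent = history[-max_history:]  # most recent turns only
    ranked = sorted(memories, key=lambda m: relevance(query, m), reverse=True)
    relevant = [m for m in ranked[:max_memories] if relevance(query, m) > 0]
    lines = ["Relevant memories:"] + [f"- {m}" for m in relevant]
    lines += ["Recent history:"] + [f"- {h}" for h in recent]
    lines += ["User message:", query]
    return "\n".join(lines)
```

The agent code itself never calls a retrieval function; it simply receives the assembled context as input.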

Sentinel: the safety layer

Sentinel is safety: the guardrails that keep tool use, data access, and behavior within bounds.

Sentinel operates at two points: input validation (before the agent processes the request) and output validation (before the response reaches the user). Both are configurable through a simple Python API.

Content filtering detects toxicity, hate speech, and profanity with configurable severity thresholds. PII detection identifies and redacts emails, phone numbers, SSNs, credit cards, and custom patterns in real time. Compliance enforcement automatically applies GDPR, HIPAA, SOC 2, and CCPA requirements to every interaction.
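PII redaction of the kind described here is often built on pattern matching. The sketch below uses Python's standard `re` module; it is not Sentinel's API, the `redact_pii` function and `PII_PATTERNS` table are hypothetical, and the patterns are deliberately simple (US-style phone numbers and SSNs only).

```python
import re
from typing import Dict, List, Tuple

# Simplified patterns for illustration; production detectors need far more care.
PII_PATTERNS: Dict[str, re.Pattern] = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact_pii(text: str) -> Tuple[str, List[str]]:
    """Replace each match with a [LABEL] placeholder and report what was found."""
    found: List[str] = []
    for label, pattern in PII_PATTERNS.items():
        text, replacements = pattern.subn(f"[{label}]", text)
        if replacements:
            found.append(label)
    return text, found
```

Adding a custom pattern in this sketch is a matter of adding one more entry to the table; the same function could run on both the input and the output side.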

Every enforcement action is logged to a full audit trail. You can query logs by agent, user, policy, or time window. For enterprise teams, this audit trail is often a compliance requirement in itself.
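Querying an audit trail by agent, user, policy, or time window amounts to filtering structured records. The following sketch assumes a simple record shape; `AuditEntry` and `query_audit` are hypothetical names and do not reflect Sentinel's actual log schema or query interface.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class AuditEntry:
    timestamp: datetime
    agent: str
    user: str
    policy: str
    action: str  # e.g. "redacted" or "blocked"


def query_audit(
    entries: List[AuditEntry],
    agent: Optional[str] = None,
    user: Optional[str] = None,
    policy: Optional[str] = None,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> List[AuditEntry]:
    """Return entries matching every filter that was supplied.

    The time window is half-open: start is inclusive, end is exclusive.
    """
    def matches(entry: AuditEntry) -> bool:
        return (
            (agent is None or entry.agent == agent)
            and (user is None or entry.user == user)
            and (policy is None or entry.policy == policy)
            and (start is None or entry.timestamp >= start)
            and (end is None or entry.timestamp < end)
        )

    return [entry for entry in entries if matches(entry)]
```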

How they work together

The three components are designed to compose. Helix runs your agent and provides the infrastructure. Cortex supplies the memory that makes each interaction context-aware. Sentinel enforces safety policies across all of it.

A typical request flows like this: the user sends a message, Sentinel validates the input, Cortex retrieves relevant context, the agent processes the request with full context and safety constraints, Sentinel validates the output, and Helix returns the response with full observability.
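The request flow above can be sketched as a small pipeline of four callables. This is a schematic, not Teleon code: `handle_request` and its parameter names are hypothetical, and the hosting and observability that Helix provides are outside the scope of the example.

```python
from typing import Callable, List


def handle_request(
    message: str,
    validate_input: Callable[[str], str],
    retrieve_context: Callable[[str], List[str]],
    run_agent: Callable[[str, List[str]], str],
    validate_output: Callable[[str], str],
) -> str:
    """Run one request through the stages in the order the article lists them.

    Illustrative only: each stage is an injected function, so any of them
    can be swapped or stubbed without touching the others.
    """
    safe_input = validate_input(message)    # safety check before processing
    context = retrieve_context(safe_input)  # memory lookup for this request
    draft = run_agent(safe_input, context)  # agent sees input plus context
    return validate_output(draft)           # safety check before replying
```

Keeping the stages as separate functions mirrors the way the three components compose: each one can be tested alone, and the order of operations is visible in one place.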

Build each part yourself, or use Teleon

You can build each part yourself. Many teams do. Teleon’s job is to make that path shorter, clearer, and less error-prone, especially as you move from one agent to many. Check out the features page for the complete platform overview, or visit our pricing page to see which plan fits your team.