in-path routing, enforcement, and provider selection for llm and agentic apps
once a request is identified (identifiabl), transformed (transformabl), and authorized (validatabl), it needs to be sent to the right place.
which provider should handle this request? which model? under what context?
proxyabl answers that question.
it is the in-path gateway that sits between your apps and llm providers, enforcing routing rules and injecting identity and governance context into every call.
proxyabl is the request routing and execution gateway for llm apps. it lets you choose the provider and model for each request, inject identity and governance context, and fail over between backends.
📦 implementation: ai-routing-gateway (roadmap)
as organizations adopt multiple llm providers and internal models, they need a single, governed path for llm traffic. proxyabl turns your llm access into a governed endpoint rather than a set of scattered direct api calls.
all gatewaystack modules operate on a shared RequestContext object.
proxyabl is responsible for:
- reads: identity, metadata, modelRequest, policyDecision, limitsDecision
- writes: routingDecision (provider, model, region, alias) and the normalized provider response

proxyabl receives policy-approved requests from validatabl and pre-flight constraints from limitabl, then executes the steps below. it becomes the central network and execution layer of gatewaystack.
1. selectProvider — choose which provider should receive the request
selection can be based on tenant, content classification, region, cost, latency, or internal policy.
2. selectModel — choose the appropriate model for the request
for example, different models for summarization vs retrieval vs reasoning, or for different sensitivity levels.
3. injectContext — attach identity and governance metadata
ensures downstream systems receive user_id, org_id, roles, scopes, and policy decisions as headers or metadata.
4. forward — execute the call against the selected provider/model
wraps the provider api, normalizes request and response shapes as needed.
5. fallback — handle provider or model failures
retry, fall back to secondary providers, or gracefully degrade behavior based on configuration.
6. preflightChecks — final checks before execution
ensures the request still satisfies any pre-execution requirements (for example, global flags, maintenance modes, additional filters).
7. loadBalance — distribute traffic across provider backends
supports weighted round-robin, least-latency, or failover strategies across multiple endpoints for a given provider.
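the step order above can be condensed into a single execution function. a minimal typescript sketch follows; every body here is an illustrative stub (the step names come from the doc, but the types and logic are hypothetical, not proxyabl's actual api):

```typescript
// sketch of the proxyabl step order; all bodies are illustrative stubs
type Ctx = {
  identity: { user_id: string; org_id: string; tier: string };
  modelRequest: { model: string; prompt: string };
  routingDecision?: { provider: string; model: string };
};

// 1. selectProvider — e.g. free tier stays on openai, paid traffic goes to azure
function selectProvider(ctx: Ctx): string {
  return ctx.identity.tier === "free" ? "openai" : "azure-openai";
}

// 2. selectModel — cheaper model for the free tier
function selectModel(ctx: Ctx): string {
  return ctx.identity.tier === "free" ? "gpt-3.5-turbo" : "gpt-4";
}

// 3. injectContext — identity travels downstream as headers
function injectContext(ctx: Ctx): Record<string, string> {
  return { "x-user-id": ctx.identity.user_id, "x-org-id": ctx.identity.org_id };
}

// 4. forward — a real implementation would call the provider api here
async function forward(
  provider: string,
  model: string,
  _headers: Record<string, string>,
  req: Ctx["modelRequest"]
) {
  return { content: `echo:${req.prompt}`, metadata: { provider, model } };
}

async function execute(ctx: Ctx) {
  const provider = selectProvider(ctx);
  const model = selectModel(ctx);
  ctx.routingDecision = { provider, model };
  const headers = injectContext(ctx);
  // 6. preflightChecks and 7. loadBalance would run here
  try {
    return await forward(provider, model, headers, ctx.modelRequest);
  } catch {
    // 5. fallback — retry against a secondary backend
    return forward("openai-secondary", model, headers, ctx.modelRequest);
  }
}
```

the point of the sketch is the ordering: routing and context injection happen before the network call, and the fallback wraps only the forwarding step.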
proxyabl records its output in the routingDecision field of RequestContext.

proxyabl does not replace:
- identifiabl to authenticate users
- transformabl to transform content
- validatabl to make policy decisions
- limitabl to enforce quotas or budgets
- explicabl as the primary audit log (though proxyabl contributes metadata)

proxyabl evaluates routing rules to select provider and model:
routing:
  - name: "sensitive-data-to-azure"
    priority: 1
    condition: content.metadata.classification == "sensitive"
    provider: "azure-openai"
    model: "gpt-4"
  - name: "eu-users-to-eu-region"
    priority: 2
    condition: identity.user.region == "eu"
    provider: "azure-openai-eu"
    model: "gpt-4"
  - name: "free-tier-to-cheap-model"
    priority: 3
    condition: identity.user.tier == "free"
    provider: "openai"
    model: "gpt-3.5-turbo"
  - name: "default"
    priority: 999
    provider: "openai"
    model: "gpt-4-turbo"
rules are evaluated in priority order, first match wins.
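that first-match evaluation can be sketched in a few lines, assuming the yaml condition strings have been compiled into predicate functions (the `Rule` type and `route` helper are illustrative names, not proxyabl's actual api):

```typescript
// minimal first-match evaluator over routing rules like the config above;
// condition strings are assumed to be compiled into predicates beforehand
type RoutingCtx = Record<string, any>;

type Rule = {
  name: string;
  priority: number;
  condition?: (ctx: RoutingCtx) => boolean; // no condition = always matches (default rule)
  provider: string;
  model: string;
};

function route(rules: Rule[], ctx: RoutingCtx): { provider: string; model: string; rule: string } {
  // evaluate in ascending priority order; the first matching rule wins
  const sorted = [...rules].sort((a, b) => a.priority - b.priority);
  for (const r of sorted) {
    if (!r.condition || r.condition(ctx)) {
      return { provider: r.provider, model: r.model, rule: r.name };
    }
  }
  throw new Error("no routing rule matched and no default rule is configured");
}

// mirrors a subset of the yaml config above
const rules: Rule[] = [
  {
    name: "sensitive-data-to-azure",
    priority: 1,
    condition: (c) => c.content?.metadata?.classification === "sensitive",
    provider: "azure-openai",
    model: "gpt-4",
  },
  {
    name: "free-tier-to-cheap-model",
    priority: 3,
    condition: (c) => c.identity?.user?.tier === "free",
    provider: "openai",
    model: "gpt-3.5-turbo",
  },
  { name: "default", priority: 999, provider: "openai", model: "gpt-4-turbo" },
];
```

the condition-less rule with priority 999 plays the role of the "default" entry: it always matches, but only after every more specific rule has been tried.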
proxyabl normalizes provider-specific responses to a unified format:
// unified response format (regardless of provider)
{
  content: "Response text",
  metadata: {
    provider: "openai",
    model: "gpt-4",
    tokens: { input: 100, output: 50 },
    latency_ms: 234,
    request_id: "req_abc123"
  }
}
this decouples app code from provider apis, enabling seamless provider switching.
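as an illustration, a normalizer for one concrete provider might look like this. the `normalizeOpenAI` helper is hypothetical; the provider-side field names follow the public openai chat completions response shape:

```typescript
// hypothetical normalizer: openai-style chat completion payload -> unified format
function normalizeOpenAI(raw: any, latencyMs: number) {
  return {
    content: raw.choices[0].message.content as string,
    metadata: {
      provider: "openai",
      model: raw.model as string,
      tokens: {
        input: raw.usage.prompt_tokens as number,
        output: raw.usage.completion_tokens as number,
      },
      latency_ms: latencyMs,
      request_id: raw.id as string,
    },
  };
}
```

each provider would get its own small adapter like this, so swapping providers changes only which adapter runs, not the shape the app sees.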
proxyabl implements multi-layered failover, combining a retry strategy, provider fallback, and health checks. an example fallback chain:
fallback_chain:
  - provider: "openai-primary"
    timeout: 10s
  - provider: "openai-secondary"
    timeout: 10s
  - provider: "anthropic"
    timeout: 15s
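walking such a chain amounts to trying each backend under its own timeout and falling through to the next on failure. a sketch (helper names are illustrative, and the yaml timeouts are assumed to be converted to milliseconds):

```typescript
// sketch: try each backend in the fallback chain with its configured timeout
type Backend = { provider: string; timeoutMs: number };

function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  let timer: any;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("timeout")), ms);
  });
  return Promise.race([p, timeout]).finally(() => clearTimeout(timer));
}

async function callWithFallback<T>(
  chain: Backend[],
  call: (provider: string) => Promise<T>
): Promise<T> {
  let lastErr: unknown;
  for (const b of chain) {
    try {
      return await withTimeout(call(b.provider), b.timeoutMs);
    } catch (err) {
      lastErr = err; // record the failure and fall through to the next backend
    }
  }
  throw lastErr; // every backend in the chain failed
}
```

a production version would typically add per-backend retries and skip backends that health checks have marked unhealthy, per the configuration.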
apps reference logical model names, not provider-specific models:
aliases:
  smart-model:
    default: "gpt-4"
    overrides:
      - condition: identity.org_id == "healthcare_org"
        model: "azure-openai:gpt-4"  # HIPAA compliant
      - condition: identity.user.tier == "free"
        model: "gpt-3.5-turbo"
  fast-model:
    default: "gpt-3.5-turbo"
  code-model:
    default: "gpt-4-code-interpreter"
app code remains unchanged when swapping models:
// app requests "smart-model", proxyabl resolves to actual model
await gatewaystack.chat({ model: "smart-model", ... })
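alias resolution itself is a small lookup: check overrides in order, otherwise use the default. a sketch, assuming the yaml condition strings are compiled to predicates (`Alias` and `resolveAlias` are illustrative names):

```typescript
// sketch of alias resolution matching the config above
type AliasCtx = Record<string, any>;

type Alias = {
  default: string;
  overrides?: { when: (ctx: AliasCtx) => boolean; model: string }[];
};

function resolveAlias(aliases: Record<string, Alias>, name: string, ctx: AliasCtx): string {
  const alias = aliases[name];
  if (!alias) return name; // not an alias: treat as a concrete model name
  for (const o of alias.overrides ?? []) {
    if (o.when(ctx)) return o.model; // first matching override wins
  }
  return alias.default;
}

// mirrors the "smart-model" alias above
const aliases: Record<string, Alias> = {
  "smart-model": {
    default: "gpt-4",
    overrides: [
      { when: (c) => c.identity?.org_id === "healthcare_org", model: "azure-openai:gpt-4" },
      { when: (c) => c.identity?.user?.tier === "free", model: "gpt-3.5-turbo" },
    ],
  },
  "fast-model": { default: "gpt-3.5-turbo" },
};
```

because resolution happens inside the gateway, retargeting "smart-model" to a new default is a config change, not an app deploy.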
proxyabl manages provider-side authentication and secrets:
example secrets configuration:
secrets:
  backend: "vault"
  vault_addr: "https://vault.company.com"
  providers:
    openai:
      path: "secret/gatewaystack/openai"
      key_field: "api_key"
    azure-openai:
      path: "secret/gatewaystack/azure"
      key_field: "api_key"
      tenant_specific: true  # separate keys per tenant
this keeps sensitive provider secrets out of application code and centralizes them in the gateway.
user
  ↓ identifiabl (who is calling?)
  ↓ transformabl (prepare, clean, classify, anonymize)
  ↓ validatabl (is this allowed?)
  ↓ limitabl (can they afford it? pre-flight constraints)
  ↓ proxyabl (where does it go? execute)
  ↓ llm provider (model call)
  ↓ [limitabl] (deduct actual usage, update quotas/budgets)
  ↓ explicabl (what happened?)
  ↓ response
proxyabl is where approved requests actually become executed calls to llm providers, with routing decisions driven by identity, content, policy, and spend constraints.
proxyabl plugs into gatewaystack and your existing llm stack without requiring application-level changes, exposing http middleware and sdk hooks.
for routing configuration examples:
→ routing patterns library
→ model aliasing guide
→ provider fallback strategies

for secrets and authentication:
→ secrets management guide

for implementation:
→ integration guide

want to explore the full gatewaystack architecture?
→ view the gatewaystack github repo

want to contact us for enterprise deployments?
→ reducibl applied ai studio
every request flows from your app through gatewaystack's modules before it reaches an llm provider โ identified, transformed, validated, constrained, routed, and audited.