Hugging Face
Overview
This guide explains how to integrate the Akto AI Agent Proxy with a Hugging Face Private Inference Endpoint that customers use to run private LLM inference. The proxy sits between the end user and the agent application (Option B), where it monitors model invocation traffic, enforces guardrails, and logs requests and responses without requiring changes to internal client code.
Akto AI Agent Proxy provides:
Guardrail enforcement on both requests and responses
Sensitive data redaction
Security threat detection
Hugging Face's Private Inference Endpoint provides a dedicated, managed model endpoint that is accessible only via AWS PrivateLink from within a VPC. Hugging Face does not log full prompt and response conversations the way AWS Bedrock's model invocation logging can, so Akto must capture this traffic at the proxy, before it reaches the endpoint.
Prerequisites
Before integrating Akto Proxy, make sure you have:
A working Hugging Face Private Inference Endpoint configured with PrivateLink.
AWS VPC where the endpoint service is reachable.
The AI agent and Akto Proxy deployed in the same VPC or with network access to the PrivateLink interfaces.
Access credentials for Hugging Face inference (API token).
Architecture Diagram
End user calls the AI agent API.
Akto Proxy intercepts requests (guardrail enforcement).
Proxy forwards to HF Private Inference Endpoint (via PrivateLink).
Responses pass back through Akto Proxy.
Akto logs, analyzes and optionally redacts or blocks results.
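The flow above can be sketched as a small, self-contained simulation. This is only an illustration of the ordering of the steps, not Akto's implementation: the `handle` function, the `ProxyResult` type, the event names, and the simple blocked-term check are all hypothetical stand-ins for the real guardrail engine, and `call_model` stands in for the call to the Hugging Face endpoint.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ProxyResult:
    allowed: bool
    body: str
    events: List[str] = field(default_factory=list)


def handle(prompt: str,
           call_model: Callable[[str], str],
           blocked_terms: List[str]) -> ProxyResult:
    """Hypothetical proxy flow: check request, forward, check response.

    blocked_terms are expected in lowercase; matching is case-insensitive.
    """
    events = ["request_received"]

    # Request guardrail: stop before anything reaches the model.
    if any(term in prompt.lower() for term in blocked_terms):
        events.append("request_blocked")
        return ProxyResult(False, "Request blocked by guardrail.", events)

    # Forward to the model (in a real deployment, the private endpoint).
    events.append("forwarded_to_model")
    reply = call_model(prompt)
    events.append("response_received")

    # Response guardrail: inspect the output before returning it.
    if any(term in reply.lower() for term in blocked_terms):
        events.append("response_blocked")
        return ProxyResult(False, "Response blocked by guardrail.", events)

    events.append("response_returned")
    return ProxyResult(True, reply, events)
```

The `events` list plays the role of the log trail here: each stage appends an entry, so a blocked request never produces a `forwarded_to_model` event.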
Setup Steps
Validate Integration
Verify end-to-end flow:
Send an inference request from the user
Akto Proxy receives and logs the call
Proxy enforces any guardrails
Proxy forwards to HF Private Endpoint
Response returns through Akto Proxy
Logs appear in Akto dashboard
Look for:
Request/response pairs in proxy logs
Guardrail hits (if configured)
Redaction results
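A test request for the first verification step might be built as follows. This is a sketch under assumptions: the proxy address `http://akto-proxy.internal:8080/` and the token value are placeholders, and the `{"inputs": ...}` body with a `Bearer` token follows the common Hugging Face inference request shape, which may differ from what your agent API actually expects. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_inference_request(base_url: str, hf_token: str,
                            prompt: str) -> urllib.request.Request:
    """Build a POST request carrying the prompt and the HF API token."""
    payload = json.dumps({"inputs": prompt}).encode("utf-8")
    return urllib.request.Request(
        base_url,
        data=payload,
        headers={
            "Authorization": f"Bearer {hf_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; replace with your proxy address and a real token.
req = build_inference_request(
    "http://akto-proxy.internal:8080/", "hf_xxx", "Hello")

# To actually send it from inside the VPC:
#   with urllib.request.urlopen(req, timeout=10) as resp:
#       print(resp.status, resp.read().decode("utf-8"))
```

After sending a request like this, the matching request/response pair should show up in the proxy logs and in the Akto dashboard.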
Security & Guardrails
Akto Proxy supports:
Request guardrails (input sanitization)
Response guardrails (filtering outputs)
Redaction of sensitive tokens or PII
Rate limiting and anomaly detection
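To make the redaction capability concrete, here is a minimal regex-based sketch. It is not how Akto performs redaction; the `PATTERNS` table, the placeholder format, and the two example patterns (email addresses and strings that look like `hf_`-prefixed tokens) are assumptions chosen for illustration.

```python
import re

# Illustrative patterns only; real PII detection needs far broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "HF_TOKEN": re.compile(r"\bhf_[A-Za-z0-9]{8,}\b"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a labeled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label}]", text)
    return text
```

The same function can be applied to both the outgoing prompt and the returned completion, which mirrors the request and response guardrails listed above.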
Use our policy packs or define custom rules based on:
Content patterns
Risk categories
Endpoint sensitivity
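One way to picture a custom rule that combines these three inputs is shown below. The `Rule` fields, the numeric sensitivity scale, and the `first_match` helper are hypothetical and do not reflect Akto's actual rule format; consult the policy pack documentation for the real schema.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: str          # content pattern (regular expression)
    risk_category: str    # e.g. "pii", "prompt_injection"
    min_sensitivity: int  # applies to endpoints at or above this level
    action: str           # e.g. "block", "redact", "log"


def first_match(rules: List[Rule], text: str,
                endpoint_sensitivity: int) -> Optional[Rule]:
    """Return the first rule that applies to this endpoint and matches the text."""
    for rule in rules:
        if (endpoint_sensitivity >= rule.min_sensitivity
                and re.search(rule.pattern, text, re.IGNORECASE)):
            return rule
    return None


rules = [
    Rule("ssn", r"\b\d{3}-\d{2}-\d{4}\b", "pii", 1, "redact"),
    Rule("prompt-injection", r"ignore (all )?previous instructions",
         "prompt_injection", 2, "block"),
]
```

In this sketch, rule order determines priority, and a rule is skipped entirely on endpoints whose sensitivity level is below its threshold.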
Logging & Monitoring
Hugging Face Private Endpoints offer:
Operational logs (status, errors)
Metrics (latency, throughput)
They do not log conversation content by default.
Akto Proxy will log:
Full request and response traces
Guardrail decision events
Alerts and incidents
Metadata for analytics
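As a rough picture of what such a trace could contain, the sketch below assembles one record as JSON. The field names and the `build_trace` helper are assumptions for illustration, not Akto's log schema.

```python
import json
import time
import uuid


def build_trace(request_body: dict, response_body: dict,
                guardrail_decision: str, latency_ms: float) -> dict:
    """Assemble one hypothetical trace record for a single model call."""
    return {
        "trace_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "request": request_body,
        "response": response_body,
        "guardrail_decision": guardrail_decision,
        "metadata": {"latency_ms": latency_ms},
    }


trace = build_trace({"inputs": "hi"}, {"generated_text": "hello"},
                    "allow", 12.5)
print(json.dumps(trace, indent=2))
```

Keeping the request, response, and guardrail decision in one record makes it possible to answer "what was asked, what came back, and what the proxy did about it" from a single log entry.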
Troubleshooting
Proxy cannot reach HF Endpoint: Check PrivateLink and VPC routing.
Auth failures: Verify that the proxy passes the Hugging Face API token header through to the endpoint.
No logs in Akto: Confirm AKTO_API_TOKEN and ingestion config.
Guardrail not triggering: Validate rule pack configuration.
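The first three checks can be partly automated with a small preflight script run from the host where the proxy is deployed. This is a sketch under assumptions: `AKTO_API_TOKEN` comes from this guide, but `HF_TOKEN` is an assumed variable name for the Hugging Face API token, and the host and port you pass to `can_connect` depend on your PrivateLink setup.

```python
import socket
from typing import List, Mapping, Sequence


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def missing_settings(
        env: Mapping[str, str],
        required: Sequence[str] = ("AKTO_API_TOKEN", "HF_TOKEN"),
) -> List[str]:
    """Return the names of required settings that are unset or empty."""
    return [name for name in required if not env.get(name)]
```

For example, `missing_settings(os.environ)` (after `import os`) lists any unset variables, and `can_connect("<your-endpoint-hostname>", 443)` gives a quick signal on whether PrivateLink DNS and VPC routing are working before you dig into proxy logs.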
Summary
By integrating Akto AI Agent Proxy in front of a Hugging Face Private Inference Endpoint:
You achieve guardrail enforcement without modifying the client code
You capture and monitor model invocation traffic
You gain observability into conversations that Hugging Face does not log by default
Akto Proxy becomes the enforcement and observability layer for private HF model usage.