
Announcing APISIX Integration with AI/ML API

Yilia Lin

We're thrilled to announce that AI/ML API is now a supported provider for the ai-proxy, ai-proxy-multi, and ai-request-rewrite plugins in Apache APISIX. Full AI/ML API support will ship in the next APISIX release.

Introduction

AI/ML API is a single endpoint that gives you access to more than 300 ready-to-use AI models—large language models, embeddings, image and audio tools—through one standard REST interface. It is used by over 150,000 developers and organizations as a centralized LLM API gateway.

AI/ML API provides a unified OpenAI-compatible API with access to 300+ LLMs such as GPT-4, Claude, Gemini, DeepSeek, and others. This integration bridges the gap between your API infrastructure and leading AI services, enabling you to deploy intelligent features—like chatbots, real-time translations, and data analysis—faster than ever.

Proxy to OpenAI via AI/ML API

Prerequisites

  1. Install APISIX.
  2. Generate your API key on the AI/ML API dashboard.

Configure the Route

Create a route and configure the ai-proxy plugin as follows:

# OPENAI_API_KEY holds the API key generated on the AI/ML API dashboard.
curl "http://127.0.0.1:9180/apisix/admin/routes" -X PUT \
  -H "X-API-KEY: ${ADMIN_API_KEY}" \
  -d '{
    "id": "ai-proxy-route",
    "uri": "/anything",
    "methods": ["POST"],
    "plugins": {
      "ai-proxy": {
        "provider": "aimlapi",
        "auth": {
          "header": {
            "Authorization": "Bearer '"$OPENAI_API_KEY"'"
          }
        },
        "options": {
          "model": "gpt-4"
        }
      }
    }
  }'
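
For teams that script their gateway setup, the same route can also be created from Python. The following is a minimal sketch using only the standard library. It assumes the Admin API address and the ADMIN_API_KEY and OPENAI_API_KEY environment variables from the curl command above; the helper names build_route and build_put_request are hypothetical and not part of APISIX.

```python
import json
import os
import urllib.request

# Same Admin API endpoint as in the curl example (assumed default address).
ADMIN_URL = "http://127.0.0.1:9180/apisix/admin/routes"


def build_route(aimlapi_key: str, model: str = "gpt-4") -> dict:
    """Return the route definition from the curl example as a Python dict."""
    return {
        "id": "ai-proxy-route",
        "uri": "/anything",
        "methods": ["POST"],
        "plugins": {
            "ai-proxy": {
                "provider": "aimlapi",
                "auth": {"header": {"Authorization": f"Bearer {aimlapi_key}"}},
                "options": {"model": model},
            }
        },
    }


def build_put_request(route: dict, admin_key: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request for the Admin API."""
    return urllib.request.Request(
        ADMIN_URL,
        data=json.dumps(route).encode("utf-8"),
        method="PUT",
        headers={"X-API-KEY": admin_key, "Content-Type": "application/json"},
    )


# To apply the route against a running APISIX instance:
# req = build_put_request(build_route(os.environ["OPENAI_API_KEY"]),
#                         os.environ["ADMIN_API_KEY"])
# print(urllib.request.urlopen(req).read().decode())
```

Building the payload with json.dumps avoids the shell-quoting pitfalls of embedding an environment variable inside a single-quoted JSON string.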

Test the Integration

Send a POST request to the route with a system prompt and a sample user question in the request body:

curl "http://127.0.0.1:9080/anything" -X POST \
  -H "Content-Type: application/json" \
  -H "Host: api.openai.com" \
  -d '{
    "messages": [
      { "role": "system", "content": "You are a mathematician" },
      { "role": "user", "content": "What is 1+1?" }
    ]
  }'
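
The same test request can be sent from application code. Here is a minimal Python sketch using only the standard library; the gateway address matches the curl command above, and the helper name build_chat_request is hypothetical. Note that the client body carries neither a model name nor an API key, on the assumption that the ai-proxy plugin supplies both from the route configuration.

```python
import json
import urllib.request

# Route URI from the configuration step (assumed default gateway address).
GATEWAY_URL = "http://127.0.0.1:9080/anything"


def build_chat_request(system_prompt: str, question: str) -> urllib.request.Request:
    """Build (but do not send) a chat request for the APISIX route."""
    body = {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ]
    }
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# To send it against a running APISIX instance:
# req = build_chat_request("You are a mathematician", "What is 1+1?")
# print(urllib.request.urlopen(req).read().decode())
```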

Verify Response

You should receive a response similar to the following:

{
  ...,
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "logprobs": null,
      "message": {
        "role": "assistant",
        "content": "1 + 1 equals 2.",
        "refusal": null,
        "annotations": []
      }
    }
  ],
  "created": 1753845968,
  "model": "gpt-4-0613",
  "usage": {
    "prompt_tokens": 1449,
    "completion_tokens": 1008,
    "total_tokens": 2457
  },
  ...
}
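
In application code you will usually want only the assistant's reply and the token usage from this response. The following Python sketch shows one way to extract them; it assumes the OpenAI-compatible response shape shown above, and the helper name summarize_completion is hypothetical.

```python
import json


def summarize_completion(response_text: str) -> dict:
    """Pull the reply text, finish reason, model, and token total out of a
    chat-completion response body (OpenAI-compatible shape assumed)."""
    data = json.loads(response_text)
    choice = data["choices"][0]      # first (and here only) completion
    usage = data.get("usage", {})    # may be absent in some responses
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "model": data.get("model"),
        "total_tokens": usage.get("total_tokens"),
    }


# Trimmed copy of the sample response above, used only for illustration.
SAMPLE = json.dumps({
    "choices": [{
        "index": 0,
        "finish_reason": "stop",
        "message": {"role": "assistant", "content": "1 + 1 equals 2."},
    }],
    "model": "gpt-4-0613",
    "usage": {"prompt_tokens": 1449, "completion_tokens": 1008, "total_tokens": 2457},
})

print(summarize_completion(SAMPLE))
```

Logging total_tokens per request is a simple starting point for the cost tracking discussed in the use cases.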

Core Use Cases

  1. Unified AI Service Management

    • Multi-Model Proxy and Load Balancing: Replace hardcoded vendor endpoints with a single APISIX interface, dynamically routing requests to models from OpenAI, Claude, DeepSeek, Gemini, Mistral, etc., based on cost, latency, or performance needs.
    • Vendor-Agnostic Workflows: Seamlessly switch between models (e.g., GPT-4 for creative tasks, Claude for document analysis) without code changes.
  2. Cost-Optimized Token Governance

    • Token-Based Budget Enforcement: Set per-team/monthly spending limits; auto-throttle requests when thresholds are exceeded.
    • Caching & Fallbacks: Cache frequent LLM responses (e.g., FAQ answers) or reroute to cheaper models during provider outages.
  3. Real-Time AI Application Scaling

    • Chatbots & Virtual Agents: Power low-latency conversational interfaces with streaming support for token-by-token responses.
    • Data Enrichment Pipelines: Augment APIs with AI, for example by auto-summarizing user reviews or translating product descriptions on the fly.
  4. Hybrid/Multi-Cloud AI Deployment

    • Unified Control Plane: Manage on-prem LLMs (e.g., Llama 3) alongside cloud APIs (OpenAI, Azure) with consistent policy enforcement.
    • High Availability & Fault Tolerance: Built-in health checks, automatic retries, and failover; if one LLM fails, traffic is rerouted within seconds to keep services alive.
  5. Enterprise AI Security & Compliance

    • Data Security and Compliance: Prompt Guard, content moderation, PII redaction and full audit logs in a single place.
    • One Auth Layer for 300+ LLMs: Unified authentication (JWT/OAuth2/OIDC) and authorization for 300+ LLM keys and policies.
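
As an illustration of the multi-model load balancing use case, the Python sketch below builds a route payload for the ai-proxy-multi plugin with two weighted instances and computes the traffic share each one would receive. Treat it as a sketch under assumptions: the instance fields (name, provider, weight, auth, options) follow our understanding of the plugin schema and should be checked against the plugin reference for your APISIX version, and the route id, instance names, and the "cheaper-model-id" model are placeholders.

```python
import json


def build_multi_route(api_key: str, instances: list[tuple[str, str, int]]) -> dict:
    """Build an ai-proxy-multi route payload.

    `instances` is a list of (name, model, weight) tuples. The field names
    are assumptions about the plugin schema; verify them before use.
    """
    return {
        "id": "ai-proxy-multi-route",  # hypothetical route id
        "uri": "/anything",
        "methods": ["POST"],
        "plugins": {
            "ai-proxy-multi": {
                "instances": [
                    {
                        "name": name,
                        "provider": "aimlapi",
                        "weight": weight,
                        "auth": {"header": {"Authorization": f"Bearer {api_key}"}},
                        "options": {"model": model},
                    }
                    for name, model, weight in instances
                ]
            }
        },
    }


def expected_shares(route: dict) -> dict[str, float]:
    """Return the fraction of traffic each instance gets under weighted balancing."""
    instances = route["plugins"]["ai-proxy-multi"]["instances"]
    total = sum(inst["weight"] for inst in instances)
    return {inst["name"]: inst["weight"] / total for inst in instances}


# Send roughly 90% of traffic to gpt-4 and 10% to a placeholder cheaper model.
route = build_multi_route(
    "YOUR_AIMLAPI_KEY",
    [("primary", "gpt-4", 9), ("cheaper", "cheaper-model-id", 1)],
)
print(json.dumps(expected_shares(route)))
```

The resulting dict can be sent to the Admin API in the same way as the single-provider route; changing a weight shifts traffic without touching client code.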

Conclusion

With AI/ML API now natively supported in Apache APISIX, you no longer have to choose between speed, security, or scale—you get all three.

  • A single ai-proxy plugin block on a route gives your gateway access to 300+ models.
  • Zero code changes let you hot-swap GPT-4 for Claude, or route 10% of traffic to a cheaper model for instant cost savings.
  • Built-in guardrails (PII redaction, token budgets, content moderation) keep compliance teams happy while your product team ships faster.
