
ai-proxy

Description#

The ai-proxy plugin simplifies access to LLM providers and models by defining a standard request format that allows key fields in plugin configuration to be embedded into the request.

Proxying requests to OpenAI is currently supported. Support for other LLM services is planned.

Request Format#

OpenAI#

  • Chat API
| Name             | Type   | Required | Description                                      |
| ---------------- | ------ | -------- | ------------------------------------------------ |
| messages         | Array  | Yes      | An array of message objects                      |
| messages.role    | String | Yes      | Role of the message (system, user, assistant)    |
| messages.content | String | Yes      | Content of the message                           |
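
For reference, a request body that conforms to this format looks like the following minimal sketch (the message contents are illustrative):

{
  "messages": [
    { "role": "system", "content": "You are a helpful assistant" },
    { "role": "user", "content": "Summarize keepalive in one sentence" }
  ]
}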

Plugin Attributes#

| Field             | Required | Type    | Description                                                                                 |
| ----------------- | -------- | ------- | ------------------------------------------------------------------------------------------- |
| auth              | Yes      | Object  | Authentication configuration                                                                 |
| auth.header       | No       | Object  | Authentication headers. Keys must match the pattern ^[a-zA-Z0-9._-]+$.                       |
| auth.query        | No       | Object  | Authentication query parameters. Keys must match the pattern ^[a-zA-Z0-9._-]+$.              |
| model.provider    | Yes      | String  | Name of the AI service provider (openai or openai-compatible).                               |
| model.name        | Yes      | String  | Name of the model to execute.                                                                |
| model.options     | No       | Object  | Key/value settings for the model.                                                            |
| override.endpoint | No       | String  | Override the default endpoint of the AI provider.                                            |
| timeout           | No       | Integer | Timeout in milliseconds for requests to the LLM. Range: 1-60000. Default: 30000.             |
| keepalive         | No       | Boolean | Enable keepalive for requests to the LLM. Default: true.                                     |
| keepalive_timeout | No       | Integer | Keepalive timeout in milliseconds for requests to the LLM. Minimum: 1000. Default: 60000.    |
| keepalive_pool    | No       | Integer | Keepalive pool size for requests to the LLM. Minimum: 1. Default: 30.                        |
| ssl_verify        | No       | Boolean | Enable SSL certificate verification for requests to the LLM. Default: true.                  |
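
Credentials can also be sent as query parameters instead of headers by using auth.query. Below is a minimal sketch of the auth block under the assumption that the provider accepts an API key as a query parameter; the api_key parameter name is hypothetical, so use whatever name your provider expects:

{
  "auth": {
    "query": {
      "api_key": "<some-token>"
    }
  }
}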

Example usage#

Create a route with the ai-proxy plugin like so:

curl "http://127.0.0.1:9180/apisix/admin/routes/1" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"uri": "/anything",
"plugins": {
"ai-proxy": {
"auth": {
"header": {
"Authorization": "Bearer <some-token>"
}
},
"model": {
"provider": "openai",
"name": "gpt-4",
"options": {
"max_tokens": 512,
"temperature": 1.0
}
}
}
},
"upstream": {
"type": "roundrobin",
"nodes": {
"somerandom.com:443": 1
},
"scheme": "https",
"pass_host": "node"
}
}'

The upstream node can be set to any arbitrary value because it is never contacted; the plugin sends the request to the configured LLM provider's endpoint instead.

Now send a request:

curl http://127.0.0.1:9080/anything -i -XPOST -H 'Content-Type: application/json' -d '{
  "messages": [
    { "role": "system", "content": "You are a mathematician" },
    { "role": "user", "content": "What is 1+1?" }
  ]
}'

You will receive a response like this:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "The sum of \\(1 + 1\\) is \\(2\\).",
        "role": "assistant"
      }
    }
  ],
  "created": 1723777034,
  "id": "chatcmpl-9whRKFodKl5sGhOgHIjWltdeB8sr7",
  "model": "gpt-4o-2024-05-13",
  "object": "chat.completion",
  "system_fingerprint": "fp_abc28019ad",
  "usage": { "completion_tokens": 15, "prompt_tokens": 23, "total_tokens": 38 }
}
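
If you only need the assistant's reply, you can extract it from the response with jq (assuming jq is installed); the .choices[0].message.content path follows the response structure shown above:

curl -s http://127.0.0.1:9080/anything -XPOST -H 'Content-Type: application/json' -d '{
  "messages": [
    { "role": "user", "content": "What is 1+1?" }
  ]
}' | jq -r '.choices[0].message.content'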

Send a request to an OpenAI-compatible LLM#

Create a route with the ai-proxy plugin, setting provider to openai-compatible and the model's endpoint in override.endpoint, like so:

curl "http://127.0.0.1:9180/apisix/admin/routes/1" -X PUT \
-H "X-API-KEY: ${ADMIN_API_KEY}" \
-d '{
"uri": "/anything",
"plugins": {
"ai-proxy": {
"auth": {
"header": {
"Authorization": "Bearer <some-token>"
}
},
"model": {
"provider": "openai-compatible",
"name": "qwen-plus"
},
"override": {
"endpoint": "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions"
}
}
},
"upstream": {
"type": "roundrobin",
"nodes": {
"somerandom.com:443": 1
},
"scheme": "https",
"pass_host": "node"
}
}'
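
Requests to this route use the same Chat API format as before. Assuming <some-token> has been replaced with a valid key for the configured provider, a test request looks like:

curl http://127.0.0.1:9080/anything -i -XPOST -H 'Content-Type: application/json' -d '{
  "messages": [
    { "role": "user", "content": "What is 1+1?" }
  ]
}'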