AI functions
AI functions are built-in ClickHouse functions that call an AI model or generate embeddings, so that you can work with your data directly in a query, for example to extract information or classify data.
AI functions can return unpredictable outputs. The result depends heavily on the quality of the prompt and on the model used.
All AI functions share a common infrastructure that provides:
- Quota enforcement: Per-query limits on tokens (ai_function_max_input_tokens_per_query, ai_function_max_output_tokens_per_query) and API calls (ai_function_max_api_calls_per_query).
- Retry with backoff: Transient failures are retried (ai_function_max_retries) with exponential backoff (ai_function_retry_initial_delay_ms).
Configuration
AI functions reference a named collection that stores provider credentials and configuration. The first argument to each function is the name of this collection.
Example statement to create a named collection with provider credentials:
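A minimal sketch of such a statement, using the parameters documented in the table below. The collection name ai_openai, the endpoint URL, and the key value are placeholders chosen for illustration, not values taken from this page; check the endpoint path your provider expects.

```sql
-- Hypothetical collection name; replace the endpoint and key with your own values.
CREATE NAMED COLLECTION ai_openai AS
    provider = 'openai',
    endpoint = 'https://api.openai.com/v1/chat/completions', -- assumed URL
    model = 'gpt-4o-mini',
    api_key = 'sk-REPLACE_ME',                               -- placeholder secret
    max_tokens = 1024;                                       -- documented default
```

The collection name is then passed as the first argument to each AI function.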
Named collection parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| provider | String | — | Model provider. Supported: 'openai', 'anthropic'. See note below. |
| endpoint | String | — | API endpoint URL. |
| model | String | — | Model name (e.g. 'gpt-4o-mini', 'text-embedding-3-small'). |
| api_key | String | — | Authentication key for the provider. |
| max_tokens | UInt64 | 1024 | Maximum number of output tokens per API call. |
| api_version | String | — | API version string. Used by Anthropic (e.g. '2023-06-01'). |
Any OpenAI-compatible API (e.g. vLLM, Ollama, LiteLLM) can be used by setting provider = 'openai' and pointing the endpoint to your service.
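As an illustration of the note above, a collection for a locally hosted OpenAI-compatible server might look like the following sketch. The collection name, the model name, and the URL (Ollama's default local port and OpenAI-compatible path) are assumptions; whether a dummy api_key is required depends on the service.

```sql
-- Hypothetical collection for a local OpenAI-compatible server.
CREATE NAMED COLLECTION ai_local AS
    provider = 'openai',                                        -- reuse the OpenAI protocol
    endpoint = 'http://localhost:11434/v1/chat/completions',    -- assumed local URL
    model = 'llama3.1',                                         -- assumed model name
    api_key = 'unused';                                         -- placeholder; many local servers ignore it
```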
Query-level settings
All AI-related settings are listed in Settings under the ai_function_ prefix.
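A short sketch of how these settings could be applied and inspected. The setting names come from this page; the numeric values are arbitrary examples, not recommended or default values.

```sql
-- Tighten quotas and retries for the current session (example values only).
SET ai_function_max_api_calls_per_query = 100;
SET ai_function_max_input_tokens_per_query = 200000;
SET ai_function_max_output_tokens_per_query = 50000;
SET ai_function_max_retries = 3;
SET ai_function_retry_initial_delay_ms = 500;

-- Skip failing rows instead of aborting the query.
SET ai_function_throw_on_error = 0;

-- List every AI-related setting with its current value.
SELECT name, value, changed
FROM system.settings
WHERE startsWith(name, 'ai_function_');
```

The same settings can also be attached to a single query with a SETTINGS clause instead of SET.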
Supported providers
| Provider | provider value | Chat functions | Notes |
|---|---|---|---|
| OpenAI | 'openai' | Yes | Default provider. |
| Anthropic | 'anthropic' | Yes | Uses /v1/messages endpoint. |
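For Anthropic, a collection could be sketched as follows. The collection name and model name are placeholders, and the full endpoint URL is an assumption built from the /v1/messages path mentioned in the table.

```sql
-- Hypothetical Anthropic collection; api_version follows the parameter table.
CREATE NAMED COLLECTION ai_anthropic AS
    provider = 'anthropic',
    endpoint = 'https://api.anthropic.com/v1/messages', -- assumed URL
    model = 'claude-model-name',                        -- placeholder, use a real model ID
    api_key = 'sk-ant-REPLACE_ME',                      -- placeholder secret
    api_version = '2023-06-01';
```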
Observability
AI function activity is tracked through ClickHouse ProfileEvents:
| ProfileEvent | Description |
|---|---|
| AIAPICalls | Number of HTTP requests made to the AI provider. |
| AIInputTokens | Total input tokens consumed. |
| AIOutputTokens | Total output tokens consumed. |
| AIRowsProcessed | Number of rows that received a result. |
| AIRowsSkipped | Number of rows skipped (quota exceeded, or error with ai_function_throw_on_error = 0). |
Query these events:
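One way to read these counters, sketched with the standard system.events and system.query_log tables. The event names are matched explicitly because other ProfileEvents also begin with "AI"; the one-day window and the row limit are arbitrary choices.

```sql
-- Server-wide totals since startup (events with a zero count are not listed).
SELECT event, value
FROM system.events
WHERE event IN ('AIAPICalls', 'AIInputTokens', 'AIOutputTokens',
                'AIRowsProcessed', 'AIRowsSkipped');

-- Per-query breakdown for recent queries that called an AI provider.
SELECT
    query_id,
    ProfileEvents['AIAPICalls']      AS api_calls,
    ProfileEvents['AIInputTokens']   AS input_tokens,
    ProfileEvents['AIOutputTokens']  AS output_tokens,
    ProfileEvents['AIRowsProcessed'] AS rows_processed,
    ProfileEvents['AIRowsSkipped']   AS rows_skipped
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_date >= today() - 1
  AND ProfileEvents['AIAPICalls'] > 0
ORDER BY event_time DESC
LIMIT 10;
```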