
AIMLAPIChatGenerator

AIMLAPIChatGenerator enables chat completion using AI models through the AIMLAPI.

  • Most common position in a pipeline: after a ChatPromptBuilder
  • Mandatory init variables: api_key (the AIMLAPI API key; can be set with the AIMLAPI_API_KEY env var)
  • Mandatory run variables: messages (a list of ChatMessage objects)
  • Output variables: replies (a list of ChatMessage objects) and meta (a list of dictionaries with the metadata associated with each reply, such as token count, finish reason, and so on)
  • API reference: AIMLAPI
  • GitHub link: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/aimlapi

Overview

AIMLAPIChatGenerator provides access to AI models through the AIMLAPI, a unified API gateway for models from various providers. You can use different models within a single pipeline with a consistent interface. The default model is openai/gpt-5-chat-latest.

AIMLAPI uses a single API key for all providers, which allows you to switch between or combine different models without managing multiple credentials.

For a complete list of available models, check the AIMLAPI documentation.

The component needs a list of ChatMessage objects to operate. ChatMessage is a data class that contains a message, a role (who generated the message, such as user, assistant, system, function), and optional metadata.

You can pass any chat completion parameters valid for the underlying model directly to AIMLAPIChatGenerator using the generation_kwargs parameter, both at initialization and to the run() method.
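
For example, you can set defaults at initialization and override them for a single call. The following is a minimal sketch; temperature and max_tokens are illustrative OpenAI-style parameters, and whether they apply depends on the chosen model.

python
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.aimlapi import AIMLAPIChatGenerator

# Defaults set at init apply to every call
client = AIMLAPIChatGenerator(generation_kwargs={"temperature": 0.2, "max_tokens": 512})

# Parameters passed to run() override the init-time defaults for that call
client.run(
    [ChatMessage.from_user("Summarize NLP in one sentence.")],
    generation_kwargs={"temperature": 0.9},
)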

Authentication

AIMLAPIChatGenerator needs an AIMLAPI API key to work. You can set this key in:

  • The api_key init parameter, using the Secret API (see the sketch below)
  • The AIMLAPI_API_KEY environment variable (recommended)
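
A minimal sketch of both options, using Haystack's Secret API. Secret.from_env_var resolves the key from the environment at runtime; Secret.from_token embeds it directly and is kept out of serialized pipeline configurations.

python
from haystack.utils import Secret
from haystack_integrations.components.generators.aimlapi import AIMLAPIChatGenerator

# Recommended: resolve the key from the AIMLAPI_API_KEY environment variable
client = AIMLAPIChatGenerator(api_key=Secret.from_env_var("AIMLAPI_API_KEY"))

# Alternative: pass a token directly (excluded from serialization)
client = AIMLAPIChatGenerator(api_key=Secret.from_token("your-api-key"))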

Structured Output

AIMLAPIChatGenerator supports structured output generation for compatible models, allowing you to receive responses in a predictable format. You can use Pydantic models or JSON schemas to define the structure of the output through the response_format parameter in generation_kwargs.

This is useful when you need to extract structured data from text or generate responses that match a specific format.

python
from pydantic import BaseModel
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.aimlapi import AIMLAPIChatGenerator

# Define the expected structure of the response
class CityInfo(BaseModel):
    city_name: str
    country: str
    population: int
    famous_for: str

client = AIMLAPIChatGenerator(
    model="openai/gpt-4o-2024-08-06",
    generation_kwargs={"response_format": CityInfo}
)

response = client.run(messages=[
    ChatMessage.from_user(
        "Berlin is the capital and largest city of Germany with a population of "
        "approximately 3.7 million. It's famous for its history, culture, and nightlife."
    )
])
print(response["replies"][0].text)

>> {"city_name":"Berlin","country":"Germany","population":3700000,
>> "famous_for":"history, culture, and nightlife"}
Model Compatibility

Structured output support depends on the underlying model. OpenAI models starting from gpt-4o-2024-08-06 support Pydantic models and JSON schemas. For details on which models support this feature, refer to the respective model provider's documentation.
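
As a sketch of the JSON schema variant, assuming the gateway forwards an OpenAI-style json_schema response_format to the underlying model (the payload shape and field names below are illustrative assumptions, not a documented AIMLAPI contract):

python
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.aimlapi import AIMLAPIChatGenerator

# Assumed OpenAI-style JSON schema payload
city_schema = {
    "type": "json_schema",
    "json_schema": {
        "name": "CityInfo",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city_name": {"type": "string"},
                "country": {"type": "string"},
            },
            "required": ["city_name", "country"],
            "additionalProperties": False,
        },
    },
}

client = AIMLAPIChatGenerator(
    model="openai/gpt-4o-2024-08-06",
    generation_kwargs={"response_format": city_schema},
)
response = client.run(messages=[ChatMessage.from_user("Berlin is the capital of Germany.")])
print(response["replies"][0].text)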

Tool Support

AIMLAPIChatGenerator supports function calling through the tools parameter, which accepts flexible tool configurations:

  • A list of Tool objects: Pass individual tools as a list
  • A single Toolset: Pass an entire Toolset directly
  • Mixed Tools and Toolsets: Combine multiple Toolsets with standalone tools in a single list

This allows you to organize related tools into logical groups while also including standalone tools as needed.

python
from haystack.tools import Tool, Toolset
from haystack_integrations.components.generators.aimlapi import AIMLAPIChatGenerator

# Create individual tools
weather_tool = Tool(name="weather", description="Get weather info", ...)
news_tool = Tool(name="news", description="Get latest news", ...)

# Group related tools into a toolset
# (add_tool, subtract_tool, and multiply_tool are assumed to be defined elsewhere)
math_toolset = Toolset([add_tool, subtract_tool, multiply_tool])

# Pass mixed tools and toolsets to the generator
generator = AIMLAPIChatGenerator(
    tools=[math_toolset, weather_tool, news_tool]  # Mix of Toolset and Tool objects
)

For more details on working with tools, see the Tool and Toolset documentation.

Streaming

AIMLAPIChatGenerator supports streaming tokens from the LLM directly in its output. To enable it, pass a callable to the streaming_callback init parameter. The built-in print_streaming_chunk callback prints text tokens as well as tool events (tool calls and tool results) as they arrive.

python
from haystack.components.generators.utils import print_streaming_chunk
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.aimlapi import AIMLAPIChatGenerator

# Configure the generator with a streaming callback
component = AIMLAPIChatGenerator(streaming_callback=print_streaming_chunk)

# Pass a list of messages; tokens print as they are generated
component.run([ChatMessage.from_user("Your question here")])
Info: Streaming works only with a single response. If a provider supports multiple candidates, set n=1.

See our Streaming Support docs to learn more about how StreamingChunk works and how to write a custom callback.

We recommend using print_streaming_chunk by default. Write a custom callback only if you need a specific transport (for example, SSE or WebSocket) or custom UI formatting.
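
For illustration, here is a minimal sketch of a custom callback that collects tokens into a list instead of printing them. The collect_chunk name is hypothetical; it relies only on the content attribute of StreamingChunk.

python
from haystack.dataclasses import ChatMessage, StreamingChunk
from haystack_integrations.components.generators.aimlapi import AIMLAPIChatGenerator

collected_tokens = []

def collect_chunk(chunk: StreamingChunk) -> None:
    # Hypothetical callback: buffer each text token instead of printing it
    if chunk.content:
        collected_tokens.append(chunk.content)

component = AIMLAPIChatGenerator(streaming_callback=collect_chunk)
component.run([ChatMessage.from_user("Your question here")])
print("".join(collected_tokens))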

Usage

Install the aimlapi-haystack package to use the AIMLAPIChatGenerator:

shell
pip install aimlapi-haystack

On its own

python
from haystack.components.generators.utils import print_streaming_chunk
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.aimlapi import AIMLAPIChatGenerator

client = AIMLAPIChatGenerator(model="openai/gpt-5-chat-latest", streaming_callback=print_streaming_chunk)

response = client.run([ChatMessage.from_user("What's Natural Language Processing? Be brief.")])

>> Natural Language Processing (NLP) is a field of artificial intelligence that
>> focuses on the interaction between computers and humans through natural language.
>> It involves enabling machines to understand, interpret, and generate human
>> language in a meaningful way, facilitating tasks such as language translation,
>> sentiment analysis, and text summarization.

print(response)

>> {'replies': [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=
>> [TextContent(text='Natural Language Processing (NLP) is a field of artificial
>> intelligence that focuses on enabling computers to understand, interpret, and
>> generate human language in a meaningful and useful way.')], _name=None,
>> _meta={'model': 'openai/gpt-5-chat-latest', 'index': 0,
>> 'finish_reason': 'stop', 'usage': {'completion_tokens': 36,
>> 'prompt_tokens': 15, 'total_tokens': 51}})]}

With multimodal inputs:

python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.aimlapi import AIMLAPIChatGenerator

# Use a multimodal model
llm = AIMLAPIChatGenerator(model="openai/gpt-4o")

image = ImageContent.from_file_path("apple.jpg", detail="low")
user_message = ChatMessage.from_user(content_parts=[
    "What does the image show? Max 5 words.",
    image
])

response = llm.run([user_message])["replies"][0].text
print(response)

>>> Red apple on straw.

In a Pipeline

python
from haystack.components.builders import ChatPromptBuilder
from haystack_integrations.components.generators.aimlapi import AIMLAPIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack import Pipeline

# No-argument init: we don't use any runtime template variables
prompt_builder = ChatPromptBuilder()
llm = AIMLAPIChatGenerator()

pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
pipe.connect("prompt_builder.prompt", "llm.messages")

location = "Berlin"
messages = [
    ChatMessage.from_system("Always respond in German even if some input data is in other languages."),
    ChatMessage.from_user("Tell me about {{location}}")
]
pipe.run(data={"prompt_builder": {"template_variables": {"location": location}, "template": messages}})

>> {'llm': {'replies': [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>,
>> _content=[TextContent(text='Berlin ist die Hauptstadt Deutschlands und eine der
>> bedeutendsten Städte Europas. Es ist bekannt für ihre reiche Geschichte,
>> kulturelle Vielfalt und kreative Scene.')],
>> _name=None, _meta={'model': 'openai/gpt-5-chat-latest', 'index': 0,
>> 'finish_reason': 'stop', 'usage': {'completion_tokens': 120,
>> 'prompt_tokens': 29, 'total_tokens': 149}})]}

Using multiple models in one pipeline:

python
from haystack.components.builders import ChatPromptBuilder
from haystack_integrations.components.generators.aimlapi import AIMLAPIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack import Pipeline

# Create a pipeline that uses different models for different tasks
prompt_builder = ChatPromptBuilder()
# Use one model for complex reasoning
reasoning_llm = AIMLAPIChatGenerator(model="anthropic/claude-3-5-sonnet")
# Use another model for simple tasks
simple_llm = AIMLAPIChatGenerator(model="openai/gpt-5-chat-latest")

pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("reasoning", reasoning_llm)
pipe.add_component("simple", simple_llm)

# Feed the same prompt to both models
pipe.connect("prompt_builder.prompt", "reasoning.messages")
pipe.connect("prompt_builder.prompt", "simple.messages")

messages = [ChatMessage.from_user("Explain quantum computing in simple terms.")]
result = pipe.run(data={"prompt_builder": {"template": messages}})

print("Reasoning model:", result["reasoning"]["replies"][0].text)
print("Simple model:", result["simple"]["replies"][0].text)

With tool calling:

python
from haystack import Pipeline
from haystack.components.tools import ToolInvoker
from haystack.dataclasses import ChatMessage
from haystack.tools import Tool
from haystack_integrations.components.generators.aimlapi import AIMLAPIChatGenerator

def weather(city: str) -> str:
    """Get weather for a given city."""
    return f"The weather in {city} is sunny and 32°C"

tool = Tool(
    name="weather",
    description="Get weather for a given city",
    parameters={"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]},
    function=weather,
)

pipeline = Pipeline()
pipeline.add_component("generator", AIMLAPIChatGenerator(tools=[tool]))
pipeline.add_component("tool_invoker", ToolInvoker(tools=[tool]))

pipeline.connect("generator", "tool_invoker")

results = pipeline.run(
    data={
        "generator": {
            "messages": [ChatMessage.from_user("What's the weather like in Paris?")],
            "generation_kwargs": {"tool_choice": "auto"},
        }
    }
)

print(results["tool_invoker"]["tool_messages"][0].tool_call_result.result)
>> The weather in Paris is sunny and 32°C