
OpenAIResponsesChatGenerator

OpenAIResponsesChatGenerator enables chat completion using OpenAI's Responses API with support for reasoning models.

Most common position in a pipeline: After a ChatPromptBuilder
Mandatory init variables: api_key: An OpenAI API key. Can be set with the OPENAI_API_KEY env var.
Mandatory run variables: messages: A list of ChatMessage objects representing the chat
Output variables: replies: A list of ChatMessage objects containing the generated responses
API reference: Generators
GitHub link: https://github.com/deepset-ai/haystack/blob/main/haystack/components/generators/chat/openai_responses.py

Overview

OpenAIResponsesChatGenerator uses OpenAI's Responses API to generate chat completions. It supports gpt-4 and o-series models (reasoning models such as o1 and o3-mini). The default model is gpt-5-mini.

The Responses API is designed for reasoning-capable models and supports features like reasoning summaries, multi-turn conversations with previous response IDs, and structured outputs.

The component requires a list of ChatMessage objects to operate. ChatMessage is a data class that contains a message, a role (who generated the message, such as user, assistant, system), and optional metadata. See the usage section for examples.
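
For instance, you can construct a short chat history with the ChatMessage class methods:

python
from haystack.dataclasses import ChatMessage

messages = [
    ChatMessage.from_system("You are a concise technical assistant."),
    ChatMessage.from_user("Summarize the Responses API in one sentence."),
]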

You can pass any parameters valid for the OpenAI Responses API directly to OpenAIResponsesChatGenerator through the generation_kwargs parameter, both at initialization and when calling run(). For more details on the parameters supported by the OpenAI API, refer to the OpenAI Responses API documentation.
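
For example, you can set defaults at initialization and override them for a single call; max_output_tokens below is a standard Responses API parameter, used here for illustration:

python
from haystack.components.generators.chat import OpenAIResponsesChatGenerator
from haystack.dataclasses import ChatMessage

# Defaults set at initialization...
generator = OpenAIResponsesChatGenerator(generation_kwargs={"max_output_tokens": 512})

# ...can be overridden per call at run time
response = generator.run(
    [ChatMessage.from_user("Hello!")],
    generation_kwargs={"max_output_tokens": 128},
)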

OpenAIResponsesChatGenerator supports custom deployments of your OpenAI models through the api_base_url init parameter.
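
For example (the URL below is a placeholder for your own deployment):

python
from haystack.components.generators.chat import OpenAIResponsesChatGenerator

generator = OpenAIResponsesChatGenerator(api_base_url="https://my-deployment.example.com/v1")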

Authentication

OpenAIResponsesChatGenerator needs an OpenAI API key to work. It uses the OPENAI_API_KEY environment variable by default. Otherwise, you can pass an API key at initialization with api_key using a Secret:

python
from haystack.components.generators.chat import OpenAIResponsesChatGenerator
from haystack.utils import Secret

generator = OpenAIResponsesChatGenerator(api_key=Secret.from_token("<your-api-key>"))
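
Equivalently, you can read the key from an environment variable explicitly, which mirrors the default behavior:

python
from haystack.components.generators.chat import OpenAIResponsesChatGenerator
from haystack.utils import Secret

generator = OpenAIResponsesChatGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY"))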

Reasoning Support

One of the key features of the Responses API is support for reasoning models. You can configure reasoning behavior using the reasoning parameter in generation_kwargs:

python
from haystack.components.generators.chat import OpenAIResponsesChatGenerator
from haystack.dataclasses import ChatMessage

client = OpenAIResponsesChatGenerator(
    generation_kwargs={"reasoning": {"effort": "medium", "summary": "auto"}}
)

messages = [ChatMessage.from_user("What's the most efficient sorting algorithm for nearly sorted data?")]
response = client.run(messages)
print(response)

The reasoning parameter accepts:

  • effort: The level of reasoning effort - "low", "medium", or "high"
  • summary: How to generate reasoning summaries - "auto", "concise", or "detailed"
note

OpenAI does not return the actual reasoning tokens, but you can view the summary if enabled. For more details, see the OpenAI Reasoning documentation.

Multi-turn Conversations

The Responses API supports multi-turn conversations using previous_response_id. You can pass the response ID from a previous turn to maintain conversation context:

python
from haystack.components.generators.chat import OpenAIResponsesChatGenerator
from haystack.dataclasses import ChatMessage

client = OpenAIResponsesChatGenerator()

# First turn
messages = [ChatMessage.from_user("What's quantum computing?")]
response = client.run(messages)
response_id = response["replies"][0].meta.get("id")

# Second turn - reference previous response
messages = [ChatMessage.from_user("Can you explain that in simpler terms?")]
response = client.run(messages, generation_kwargs={"previous_response_id": response_id})

Structured Output

OpenAIResponsesChatGenerator supports structured output generation through the text_format and text parameters in generation_kwargs:

  • text_format: Pass a Pydantic model to define the structure
  • text: Pass a JSON schema directly

Using a Pydantic model:

python
from pydantic import BaseModel
from haystack.components.generators.chat import OpenAIResponsesChatGenerator
from haystack.dataclasses import ChatMessage

class BookInfo(BaseModel):
    title: str
    author: str
    year: int
    genre: str

client = OpenAIResponsesChatGenerator(
    model="gpt-4o",
    generation_kwargs={"text_format": BookInfo}
)

response = client.run(messages=[
    ChatMessage.from_user(
        "Extract book information: '1984 by George Orwell, published in 1949, is a dystopian novel.'"
    )
])
print(response["replies"][0].text)
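
Since the reply text is JSON matching the schema, you can parse it back into the Pydantic model; a small follow-up sketch:

python
book = BookInfo.model_validate_json(response["replies"][0].text)
print(book.title, book.year)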

Using a JSON schema:

python
from haystack.components.generators.chat import OpenAIResponsesChatGenerator
from haystack.dataclasses import ChatMessage

json_schema = {
    "format": {
        "type": "json_schema",
        "name": "BookInfo",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "author": {"type": "string"},
                "year": {"type": "integer"},
                "genre": {"type": "string"}
            },
            "required": ["title", "author", "year", "genre"],
            "additionalProperties": False
        }
    }
}

client = OpenAIResponsesChatGenerator(
    model="gpt-4o",
    generation_kwargs={"text": json_schema}
)

response = client.run(messages=[
    ChatMessage.from_user(
        "Extract book information: '1984 by George Orwell, published in 1949, is a dystopian novel.'"
    )
])
print(response["replies"][0].text)
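
As with the Pydantic variant, the reply text is a JSON string that you can parse directly:

python
import json

data = json.loads(response["replies"][0].text)
print(data["author"])
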
Model Compatibility and Limitations

  • Both Pydantic models and JSON schemas are supported for the latest models, starting from gpt-4o.
  • If both text_format and text are provided, text_format takes precedence and the JSON schema passed to text is ignored.
  • Streaming is not supported when using structured outputs.
  • Older models only support basic JSON mode through {"type": "json_object"}. For details, see OpenAI JSON mode documentation.
  • For complete information, check the OpenAI Structured Outputs documentation.

Tool Support

OpenAIResponsesChatGenerator supports function calling through the tools parameter. It accepts flexible tool configurations:

  • Haystack Tool objects and Toolsets: Pass Haystack Tool objects or Toolset objects, including mixed lists of both
  • OpenAI/MCP tool definitions: Pass pre-defined OpenAI or MCP tool definitions as dictionaries

Note that you cannot mix Haystack tools and OpenAI/MCP tools in the same call - choose one format or the other.

python
from haystack.tools import Tool
from haystack.components.generators.chat import OpenAIResponsesChatGenerator
from haystack.dataclasses import ChatMessage

def get_weather(city: str) -> str:
    """Get weather information for a city."""
    return f"Weather in {city}: Sunny, 22°C"

weather_tool = Tool(
    name="get_weather",
    description="Get current weather for a city",
    function=get_weather,
    parameters={"type": "object", "properties": {"city": {"type": "string"}}}
)

generator = OpenAIResponsesChatGenerator(tools=[weather_tool])
messages = [ChatMessage.from_user("What's the weather in Paris?")]
response = generator.run(messages)

You can control strict schema adherence with the tools_strict parameter. When set to True (default is False), the model will follow the tool schema exactly. Note that the Responses API has its own strictness enforcement mechanisms independent of this parameter.
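
For example, reusing the weather_tool defined above:

python
strict_generator = OpenAIResponsesChatGenerator(tools=[weather_tool], tools_strict=True)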

For more details on working with tools, see the Tool and Toolset documentation.

Streaming

You can stream output as it's generated. Pass a callback to streaming_callback. Use the built-in print_streaming_chunk to print text tokens and tool events (tool calls and tool results).

python
from haystack.components.generators.utils import print_streaming_chunk

# Configure any `Generator` or `ChatGenerator` with a streaming callback
component = SomeGeneratorOrChatGenerator(streaming_callback=print_streaming_chunk)

# If this is a `ChatGenerator`, pass a list of messages:
# from haystack.dataclasses import ChatMessage
# component.run([ChatMessage.from_user("Your question here")])

# If this is a (non-chat) `Generator`, pass a prompt:
# component.run(prompt="Your prompt here")
info

Streaming works only with a single response. If a provider supports multiple candidates, set n=1.

See our Streaming Support docs to learn more about how StreamingChunk works and how to write a custom callback.

Prefer print_streaming_chunk by default; write a custom callback only if you need a specific transport (for example, SSE/WebSocket) or custom UI formatting.
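
If you do need a custom callback, a minimal sketch looks like this (it only prints text deltas; adapt it for your own transport or UI):

python
from haystack.components.generators.chat import OpenAIResponsesChatGenerator
from haystack.dataclasses import StreamingChunk

def my_streaming_callback(chunk: StreamingChunk) -> None:
    # Each chunk carries the latest text delta in `content`
    if chunk.content:
        print(chunk.content, end="", flush=True)

client = OpenAIResponsesChatGenerator(streaming_callback=my_streaming_callback)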

Usage

On its own

Here is an example of using OpenAIResponsesChatGenerator independently with reasoning and streaming:

python
from haystack.dataclasses import ChatMessage
from haystack.components.generators.chat import OpenAIResponsesChatGenerator
from haystack.components.generators.utils import print_streaming_chunk

client = OpenAIResponsesChatGenerator(
    streaming_callback=print_streaming_chunk,
    generation_kwargs={"reasoning": {"effort": "high", "summary": "auto"}}
)
response = client.run(
    [ChatMessage.from_user("Solve this logic puzzle: If all roses are flowers and some flowers fade quickly, can we conclude that some roses fade quickly?")]
)
print(response["replies"][0].reasoning)  # Access the reasoning summary if available

In a pipeline

This example shows a pipeline that uses ChatPromptBuilder to create dynamic prompts and OpenAIResponsesChatGenerator with reasoning enabled to generate explanations of complex topics:

python
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIResponsesChatGenerator
from haystack.dataclasses import ChatMessage
from haystack import Pipeline

prompt_builder = ChatPromptBuilder()
llm = OpenAIResponsesChatGenerator(
    generation_kwargs={"reasoning": {"effort": "low", "summary": "auto"}}
)

pipe = Pipeline()
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)
pipe.connect("prompt_builder.prompt", "llm.messages")

topic = "quantum computing"
messages = [
    ChatMessage.from_system("You are a helpful assistant that explains complex topics clearly."),
    ChatMessage.from_user("Explain {{topic}} in simple terms")
]
result = pipe.run(data={
    "prompt_builder": {
        "template_variables": {"topic": topic},
        "template": messages
    }
})

print(result)