Open Responses Server

A plug-and-play server that speaks OpenAI’s Responses API — no matter which AI backend you’re running.

Find this useful? Star the repo to follow updates and show support!

Install from PyPI with pip install open-responses-server, then run otc start. See CLI Usage for all options.

Ollama, vLLM, LiteLLM, Groq, or even OpenAI itself — this server bridges them all to the OpenAI Responses API interface. It handles stateful chat, tool calls, and MCP server integration behind a familiar API.



Quick Start

Install

pip install open-responses-server

Or from source:

pip install uv
uv venv
uv pip install -e ".[dev]"

Configure

otc configure

Or set environment variables:

export OPENAI_BASE_URL_INTERNAL=http://localhost:11434  # Your LLM backend
export OPENAI_BASE_URL=http://localhost:8080             # This server
export OPENAI_API_KEY=sk-your-key

Run

otc start

Verify:

curl http://localhost:8080/v1/models
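Once the models endpoint responds, you can exercise the Responses endpoint itself. The snippet below is a minimal sketch assuming the server mirrors OpenAI's POST /v1/responses route; the model name "llama3" and the API key are placeholders for whatever your backend serves. It builds the request with the standard library but does not send it, so you can inspect the shape before pointing it at a live server.

```python
import json
import urllib.request

# Minimal Responses API payload; "llama3" is a placeholder model name.
payload = {
    "model": "llama3",
    "input": "Say hello in one sentence.",
}

# Build (but do not send) the request against the local server.
req = urllib.request.Request(
    "http://localhost:8080/v1/responses",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer sk-your-key",
    },
    method="POST",
)

print(req.full_url)      # http://localhost:8080/v1/responses
print(req.get_method())  # POST
```

To actually send it, pass req to urllib.request.urlopen while the server is running.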

Key Features

  • Drop-in replacement for OpenAI’s Responses API
  • Works with any OpenAI-compatible backend
  • MCP server support for both Chat Completions and Responses APIs
  • Supports OpenAI’s Codex CLI and other Responses API clients
  • Stateful multi-turn conversations via in-memory history
  • Tool call execution loop with configurable iteration limits
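OpenAI's Responses API chains multi-turn conversations by passing previous_response_id on the follow-up request. Assuming this server follows the same convention for its in-memory history, a second turn might be shaped like the sketch below; the model name and response id are hypothetical placeholders.

```python
import json

# First turn: a plain request ("llama3" is a placeholder model name).
first_turn = {"model": "llama3", "input": "Name a prime number."}

# Suppose the server answered with an id like "resp_abc123" (hypothetical).
previous_id = "resp_abc123"

# Follow-up turn: reference the earlier response so the server can
# replay its stored history before handling the new input.
second_turn = {
    "model": "llama3",
    "input": "Now double it.",
    "previous_response_id": previous_id,
}

print(json.dumps(second_turn, indent=2))
```

Because the history lives in memory, chained ids are only valid for the lifetime of the server process.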

About

Open Responses Server is an open-source project. It is not affiliated with or endorsed by OpenAI.

Licensed under MIT.

