Deploy a Workflow as an OpenAI Endpoint
Turn any Tensorify workflow into an OpenAI-compatible API endpoint. Any application that works with the OpenAI SDK can connect to your workflow — no custom client code needed.
Uses: API Trigger, AI Agent, Return
Time: ~10 minutes
The API Trigger's openai-chat protocol translates between the OpenAI chat completions format and your workflow:
- Incoming: The SDK sends a standard `POST /chat/completions` request with `messages[]`
- Translation: The trigger extracts the latest user message and passes it to your workflow
- Processing: Your AI Agent (or any logic) generates a response
- Outgoing: The response is wrapped in OpenAI's format and returned to the SDK
This makes your workflow a drop-in replacement for an OpenAI chat model in any existing application that uses the chat completions API.
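The four steps above can be sketched in plain Python. This is only an illustration of the idea, not Tensorify's actual implementation: the function names (`latest_user_message`, `wrap_as_chat_completion`, `handle_request`) are hypothetical, message content is assumed to be a plain string, and the response contains only the core fields of an OpenAI chat completion.

```python
import time
import uuid


def latest_user_message(messages):
    """Return the content of the most recent message with role 'user'."""
    for message in reversed(messages):
        if message.get("role") == "user":
            return message.get("content", "")
    raise ValueError("request contains no user message")


def wrap_as_chat_completion(text, model="tensorify"):
    """Wrap plain text in the shape of an OpenAI chat completion response."""
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": text},
                "finish_reason": "stop",
            }
        ],
    }


def handle_request(body, run_workflow):
    """Incoming -> Translation -> Processing -> Outgoing, in one call."""
    prompt = latest_user_message(body["messages"])
    answer = run_workflow(prompt)  # hypothetical stand-in for the AI Agent
    return wrap_as_chat_completion(answer, model=body.get("model", "tensorify"))
```

Calling `handle_request(body, my_agent)` with any function that maps a prompt string to a reply string returns a dictionary that an OpenAI client can read through `choices[0].message.content`.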
- A Tensorify workspace
- An API key for OpenAI or another LLM provider
Use the OpenAI Chatbot template or build from scratch:
- Add an API Trigger node
- In settings, set Protocol to `openai-chat`
- Set a Path (e.g., `/my-agent`)
- Add an AI Agent node and configure your LLM provider and system prompt
- Add a Return node
- Wire: API Trigger `POST` → AI Agent `message` → Return `input`
Go to Settings → Environment Variables and add your LLM API key:
- `OPENAI_API_KEY` for OpenAI models
- `ANTHROPIC_API_KEY` for Anthropic models
- Or use a custom provider with any OpenAI-compatible API
Click Deploy in the workflow editor:
- Cloud — Tensorify runs your workflow on managed infrastructure
- CLI — Your workflow runs on your own machine via the CLI runner
After deployment, you'll get an endpoint URL like:
https://triggers.tensorify.io/h/YOUR_HOOK_PATH
from openai import OpenAI

client = OpenAI(
    base_url="https://triggers.tensorify.io/h/YOUR_HOOK_PATH",  # endpoint URL from the deploy step
    api_key="your-tensorify-api-key",
)

response = client.chat.completions.create(
    model="tensorify",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What can you help me with?"},
    ],
)

print(response.choices[0].message.content)
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://triggers.tensorify.io/h/YOUR_HOOK_PATH", // endpoint URL from the deploy step
  apiKey: "your-tensorify-api-key",
});

const response = await client.chat.completions.create({
  model: "tensorify",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);
curl -X POST https://triggers.tensorify.io/h/YOUR_HOOK_PATH/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-tensorify-api-key" \
  -d '{
    "model": "tensorify",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
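If installing the OpenAI SDK is not an option, the same request as the curl command can be put together with Python's standard library. The following is a minimal sketch: the helper names `build_chat_request` and `send_chat_request` are made up for illustration, and the URL and key are the same placeholders used above.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, user_text, model="tensorify"):
    """Build (but do not send) the same POST request as the curl example."""
    # The SDK appends /chat/completions to the base URL; do the same here.
    url = base_url.rstrip("/") + "/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send_chat_request(request):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Splitting the build step from the send step makes it easy to inspect the URL, headers, and body before anything goes over the network.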
For real-time token delivery, enable streaming in two places:
- AI Agent settings → set Streaming to `true`
- SDK call → set `stream=True`
response = client.chat.completions.create(
    model="tensorify",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True,
)

for chunk in response:
    # Skip chunks that carry no choices or no text (e.g. role-only or final chunks)
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
Streaming uses Server-Sent Events (SSE) and delivers tokens as they're generated.
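The SDK parses the event stream for you, but it can help to see what travels over the wire. The sketch below assumes the endpoint follows OpenAI's streaming conventions (one `data: {json}` line per chunk, ending with `data: [DONE]`); that assumption is not confirmed here for Tensorify, and `iter_sse_tokens` is a hypothetical helper name.

```python
import json


def iter_sse_tokens(lines):
    """Yield content tokens from an iterable of SSE text lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel in OpenAI's format
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        token = choices[0].get("delta", {}).get("content")
        if token:
            yield token
```

Joining the yielded tokens with `"".join(...)` reconstructs the full reply, which is the same text a non-streaming call would return in `message.content`.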
For multi-turn conversations, add a Window Memory node:
- Drag a Window Memory node onto the canvas
- Connect its `memory` output to the AI Agent's `memory` input
- Set Session Key to track conversations per user:
{{ api_request.headers['x-tensorify-session-id'] || 'default' }}
Then pass a session ID from the client:
response = client.chat.completions.create(
    model="tensorify",
    messages=[{"role": "user", "content": "Remember, my name is Alice."}],
    extra_headers={"X-Tensorify-Session-Id": "user-alice-123"},  # read by the Session Key expression
)
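To make the session mechanism concrete, here is a conceptual sketch of windowed, per-session memory. It is not the Window Memory node's implementation: the `WindowMemory` class, its `window` parameter, and `session_key_from_headers` are hypothetical names, and the only behavior taken from the text above is the fallback to `'default'` when the header is missing.

```python
from collections import defaultdict, deque


class WindowMemory:
    """Keep only the last `window` messages for each session key."""

    def __init__(self, window=6):
        # deque(maxlen=...) silently drops the oldest message when full
        self._sessions = defaultdict(lambda: deque(maxlen=window))

    def append(self, session_key, role, content):
        self._sessions[session_key].append({"role": role, "content": content})

    def history(self, session_key):
        return list(self._sessions[session_key])


def session_key_from_headers(headers):
    """Mirror the Session Key expression: header value, else 'default'."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("x-tensorify-session-id") or "default"
```

Because each session key has its own window, two clients that send different `X-Tensorify-Session-Id` values never see each other's history, while clients that send no header all share the `default` session.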
When you deploy with the openai-chat protocol, the trigger exposes:
| Route | Purpose |
|---|---|
| POST /chat/completions | Chat completions (main endpoint) |
| GET /models | Lists available models |
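The `GET /models` route can be queried through the SDK's `client.models.list()` call. The sketch below wraps it in a small helper; `list_model_ids` and `main` are hypothetical names, and which model IDs the endpoint actually reports is not stated here, so none are assumed.

```python
def list_model_ids(client):
    """Return the model IDs reported by the endpoint's GET /models route."""
    return [model.id for model in client.models.list()]


def main():
    # Imported here so the helper above has no hard dependency on the SDK.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://triggers.tensorify.io/h/YOUR_HOOK_PATH",
        api_key="your-tensorify-api-key",
    )
    print(list_model_ids(client))
```

Call `main()` after replacing the placeholder URL and key with your own values.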
- Self-Host an AI Agent — Run on your own machine with local LLMs
- Build a RAG System — Add vector search for document Q&A
- Agent Memory — Deep dive into memory configuration
- API Trigger Reference — Full protocol and settings reference
