Deploy a Workflow as an OpenAI Endpoint

Turn any Tensorify workflow into an OpenAI-compatible API endpoint. Any application that works with the OpenAI SDK can connect to your workflow — no custom client code needed.

Uses: API Trigger, AI Agent, Return
Time: ~10 minutes

How It Works

The API Trigger's openai-chat protocol translates between the OpenAI chat completions format and your workflow:

Incoming: The SDK sends a standard POST /chat/completions request with messages[]
Translation: The trigger extracts the latest user message and passes it to your workflow
Processing: Your AI Agent (or any logic) generates a response
Outgoing: The response is wrapped in OpenAI's format and returned to the SDK

This means your workflow can stand in for an OpenAI chat model in existing applications that call the chat completions API.
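The four steps above can be pictured with a small sketch. This is not Tensorify's implementation: `extract_latest_user_message`, `wrap_as_chat_completion`, `run_workflow`, and `handle_chat_completions` are hypothetical names, and the response fields simply follow the public OpenAI chat completion shape.

```python
import time
import uuid


def extract_latest_user_message(messages):
    """Return the content of the most recent message with role 'user'."""
    for message in reversed(messages):
        if message.get("role") == "user":
            return message.get("content", "")
    raise ValueError("request contains no user message")


def wrap_as_chat_completion(text, model="tensorify"):
    """Wrap plain text in the shape of an OpenAI chat completion response."""
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": text},
                "finish_reason": "stop",
            }
        ],
    }


def run_workflow(user_message):
    """Hypothetical stand-in for the AI Agent node (assumption, not real logic)."""
    return f"You said: {user_message}"


def handle_chat_completions(request_body):
    """Incoming -> Translation -> Processing -> Outgoing, as listed above."""
    user_message = extract_latest_user_message(request_body["messages"])
    reply = run_workflow(user_message)
    return wrap_as_chat_completion(reply, request_body.get("model", "tensorify"))
```

Under these assumptions, a request whose `messages` list ends with a user message of "Hello!" produces a response whose `choices[0].message.content` holds the workflow's reply, which is the field the SDK examples later in this guide read.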

Prerequisites

A Tensorify workspace
An API key for OpenAI or another LLM provider

Step 1: Create the Workflow

Use the OpenAI Chatbot template or build from scratch:

Add an API Trigger node
In settings, set Protocol to openai-chat
Set a Path (e.g., /my-agent)
Add an AI Agent node and configure your LLM provider and system prompt
Add a Return node
Wire: API Trigger POST → AI Agent message → Return input

Step 2: Add Environment Variables

Go to Settings → Environment Variables and add your LLM API key:

OPENAI_API_KEY for OpenAI models
ANTHROPIC_API_KEY for Anthropic models
Or use a custom provider with any OpenAI-compatible API

Step 3: Deploy

Click Deploy in the workflow editor:

Cloud — Tensorify runs your workflow on managed infrastructure
CLI — Your workflow runs on your own machine via the CLI runner

After deployment, you'll get an endpoint URL like:

https://triggers.tensorify.io/h/YOUR_HOOK_PATH

Step 4: Connect from Your Application

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://triggers.tensorify.io/h/YOUR_HOOK_PATH",
    api_key="your-tensorify-api-key",
)

response = client.chat.completions.create(
    model="tensorify",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What can you help me with?"},
    ],
)

print(response.choices[0].message.content)

Node.js

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://triggers.tensorify.io/h/YOUR_HOOK_PATH",
  apiKey: "your-tensorify-api-key",
});

const response = await client.chat.completions.create({
  model: "tensorify",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);

curl

curl -X POST https://triggers.tensorify.io/h/YOUR_HOOK_PATH/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-tensorify-api-key" \
  -d '{
    "model": "tensorify",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Step 5: Enable Streaming

For real-time token delivery, enable streaming in two places:

AI Agent settings → set Streaming to true
SDK call → set stream=True

response = client.chat.completions.create(
    model="tensorify",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True,
)

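for chunk in response:
    # Skip chunks with no choices or no text (for example, a role-only or final chunk)
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)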

Streaming uses Server-Sent Events (SSE) and delivers tokens as they're generated.
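To make the SSE framing concrete, here is a minimal sketch of what the SDK does with a stream. It assumes the endpoint emits events in the same shape as OpenAI's streaming chunks (`data: {...}` lines ending with `data: [DONE]`); the sample events are hand-written for illustration, not captured from Tensorify, and `iter_sse_content` is a hypothetical helper.

```python
import json


def iter_sse_content(lines):
    """Yield text deltas from the lines of an OpenAI-style SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # sentinel that marks the end of the stream
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        content = choices[0].get("delta", {}).get("content")
        if content:
            yield content


# Hand-written sample events (assumed format, for illustration only).
sample_stream = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Once"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": " upon a time"}}]}',
    "",
    "data: [DONE]",
]

story = "".join(iter_sse_content(sample_stream))
print(story)  # prints "Once upon a time"
```

In practice the OpenAI SDK handles this parsing for you; the sketch is only useful if you consume the stream from a client without SDK support.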

Step 6: Add Session Memory

For multi-turn conversations, add a Window Memory node:

Drag a Window Memory node onto the canvas
Connect its memory output to the AI Agent's memory input
Set Session Key to track conversations per user:

{{ api_request.headers['x-tensorify-session-id'] || 'default' }}

Then pass a session ID from the client:

response = client.chat.completions.create(
    model="tensorify",
    messages=[{"role": "user", "content": "Remember, my name is Alice."}],
    extra_headers={"X-Tensorify-Session-Id": "user-alice-123"},
)
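Conceptually, a windowed memory keyed by session can be pictured as below. This is an illustrative sketch, not the Window Memory node's actual code: the `WindowMemory` class, its `window_size` parameter, and `session_key_from_headers` are hypothetical, and it assumes "window" means keeping only the most recent messages per session.

```python
from collections import defaultdict, deque


class WindowMemory:
    """Keep only the last `window_size` messages for each session key."""

    def __init__(self, window_size=4):
        # One bounded deque per session; old messages fall off the left end.
        self._sessions = defaultdict(lambda: deque(maxlen=window_size))

    def add(self, session_key, role, content):
        self._sessions[session_key].append({"role": role, "content": content})

    def history(self, session_key):
        return list(self._sessions[session_key])


def session_key_from_headers(headers):
    """Mirror the Session Key expression: use the header, else 'default'."""
    normalized = {name.lower(): value for name, value in headers.items()}
    return normalized.get("x-tensorify-session-id") or "default"


memory = WindowMemory(window_size=2)
alice = session_key_from_headers({"X-Tensorify-Session-Id": "user-alice-123"})
anonymous = session_key_from_headers({})

memory.add(alice, "user", "Remember, my name is Alice.")
memory.add(alice, "assistant", "Got it, Alice.")
memory.add(alice, "user", "What is my name?")

print(len(memory.history(alice)))  # 2: the oldest message fell out of the window
print(memory.history(anonymous))   # []: sessions do not share history
```

The point of the session key is visible in the last two lines: requests that send the same header value share one history, while requests without the header all fall back to the single `default` session.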

Available Routes

When you deploy with the openai-chat protocol, the trigger exposes:

| Route | Purpose |
|---|---|
| POST /chat/completions | Chat completions (main endpoint) |
| GET /models | Lists available models |
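Both routes hang off the endpoint URL from the deploy step. The sketch below builds (but does not send) a request for each one using only the standard library. `build_request` is a hypothetical helper, the URL and key are the same placeholders used earlier, and the Bearer header follows the curl example above.

```python
import json
import urllib.request

# Placeholders from the deploy step; replace with your own values.
BASE_URL = "https://triggers.tensorify.io/h/YOUR_HOOK_PATH"
API_KEY = "your-tensorify-api-key"


def build_request(route, method="GET", body=None):
    """Build an authenticated request for one of the exposed routes."""
    url = BASE_URL.rstrip("/") + route
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("Authorization", f"Bearer {API_KEY}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


models_request = build_request("/models")
chat_request = build_request(
    "/chat/completions",
    method="POST",
    body={"model": "tensorify", "messages": [{"role": "user", "content": "Hello!"}]},
)

print(models_request.get_method(), models_request.full_url)
print(chat_request.get_method(), chat_request.full_url)

# To actually send one of them against a deployed workflow:
# with urllib.request.urlopen(models_request) as response:
#     print(json.load(response))
```

This also shows why the SDK examples pass only the hook URL as the base URL: the client appends `/chat/completions` or `/models` itself.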

What's Next?

Self-Host an AI Agent — Run on your own machine with local LLMs
Build a RAG System — Add vector search for document Q&A
Agent Memory — Deep dive into memory configuration
API Trigger Reference — Full protocol and settings reference