Self-Host an AI Agent on Your Machine

Deploy an AI agent that runs entirely on your own machine. Your data never leaves your infrastructure, you can use local LLMs (zero API costs), and the agent can access local files, databases, and services.

Uses: API Trigger, AI Agent, Window Memory, Return, CLI Runner
Time: ~15 minutes

Why Self-Host?

Privacy — Data stays on your machine. No third-party cloud processing.
Cost — Use Ollama with open-source models like Llama 3.2 for zero API costs.
Access — The agent can read local files, connect to localhost databases, and call internal services.
Control — Update models, prompts, and tools without redeploying infrastructure.

Prerequisites

A Tensorify workspace (sign up free)
A machine to run the agent (your laptop, a VPS, or a server)
(Optional) Ollama installed for local LLMs

Step 1: Build the Workflow

The fastest way to start is with the OpenAI Chatbot template:

Go to Templates → AI Agents → OpenAI Chatbot
Click Use template

This creates a workflow with:

API Trigger (openai-chat protocol) — exposes your agent as an OpenAI-compatible endpoint
AI Agent — processes messages with an LLM
Return — sends the response back

You can also build this from scratch: drag an API Trigger, AI Agent, and Return node onto the canvas and wire them together.

Configure the AI Agent

In the AI Agent settings:

Provider: openai (or custom for local LLMs; see Step 7)
Model: gpt-4o (or any OpenAI model)
System Prompt: Customize for your use case
Streaming: Enable for real-time token delivery

Add Environment Variables

Go to Settings → Environment Variables and add:

OPENAI_API_KEY — your OpenAI API key (skip if using a local LLM)

Step 2: Install the CLI Runner

On the machine where you want to run the agent:

curl -fsSL https://cli.tensorify.io/install | sh

Then initialize the runner:

tensorify init

This creates ~/.tensorify/config.json with your runner configuration.
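Before moving on, it can help to confirm that the file was written and parses as JSON. The sketch below relies only on the path stated above. The helper name load_runner_config is hypothetical, and the code makes no assumption about which keys the file contains.

```python
import json
from pathlib import Path
from typing import Optional

# Path stated in this guide; adjust if your install differs.
DEFAULT_CONFIG = Path.home() / ".tensorify" / "config.json"


def load_runner_config(path: Path = DEFAULT_CONFIG) -> Optional[dict]:
    """Return the parsed config, or None if the file is missing or not a JSON object."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing/unreadable file, or content that is not valid JSON.
        return None
    return data if isinstance(data, dict) else None


if __name__ == "__main__":
    config = load_runner_config()
    if config is None:
        print(f"No readable config at {DEFAULT_CONFIG}; try running `tensorify init` again.")
    else:
        # Print key names only, in case the file holds credentials.
        print("Config keys:", sorted(config))
```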

Step 3: Authenticate

tensorify login

This opens a browser window to authenticate with your Tensorify account.

Step 4: Deploy to CLI

In the Tensorify dashboard:

Open your workflow
Click Deploy
Set Execution Mode to CLI
Choose your runner
Click Deploy

Your workflow is now assigned to your CLI runner.

Step 5: Start the Runner

tensorify runner start

The runner connects to Tensorify via WebSocket and starts processing incoming requests. You'll see logs for each request in the terminal.

To run as a background service:

tensorify runner install

This installs the runner as a systemd service (Linux) or launchd agent (macOS) that starts automatically on boot.

Step 6: Call Your Agent

Your agent is now accessible via the OpenAI SDK. The URL is shown in your deployment dashboard.

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://triggers.tensorify.io/h/YOUR_HOOK_PATH",
    api_key="your-tensorify-api-key",
)

response = client.chat.completions.create(
    model="tensorify",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

curl

curl -X POST https://triggers.tensorify.io/h/YOUR_HOOK_PATH/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-tensorify-api-key" \
  -d '{
    "model": "tensorify",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
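The same non-streaming request can be made from Python without installing the OpenAI SDK. The sketch below uses only the standard library. The function names build_chat_request and ask are hypothetical, the Authorization header mirrors what the OpenAI SDK sends when given an api_key (an assumption about what the endpoint expects), and the reply parsing assumes the standard OpenAI chat-completions response shape.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, user_message: str) -> urllib.request.Request:
    """Build (but do not send) a POST to <base_url>/chat/completions."""
    payload = {
        "model": "tensorify",
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: same bearer header the OpenAI SDK derives from api_key.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def ask(base_url: str, api_key: str, user_message: str, timeout: float = 60.0) -> str:
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(base_url, api_key, user_message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumes the OpenAI chat-completions response shape.
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    req = build_chat_request(
        "https://triggers.tensorify.io/h/YOUR_HOOK_PATH", "your-tensorify-api-key", "Hello!"
    )
    # Only inspects the request; call ask(...) with your real URL and key to send it.
    print(req.get_method(), req.full_url)
```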

Step 7 (Optional): Use a Local LLM

To eliminate cloud API costs entirely, use Ollama:

Install Ollama: curl -fsSL https://ollama.ai/install.sh | sh
Pull a model: ollama pull llama3.2
In your AI Agent settings, change:
- Provider: custom
- Custom Base URL: http://localhost:11434/v1
- Model: llama3.2
Remove OPENAI_API_KEY from environment variables (no longer needed)

The LLM now runs entirely on your machine: prompts are never sent to a cloud model provider, and there are no API costs.
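If the agent fails after switching providers, a quick check is whether Ollama is reachable at the Custom Base URL and has the model pulled. The sketch below is one way to test that from the runner machine. It assumes your Ollama version serves the OpenAI-compatible /v1/models route and reports model ids with a tag suffix such as llama3.2:latest; the function names are hypothetical.

```python
import json
import urllib.request


def model_listed(models_payload, model: str) -> bool:
    """Check an OpenAI-style model list for `model`, with or without a tag suffix."""
    if not isinstance(models_payload, dict):
        return False
    ids = [entry.get("id", "") for entry in models_payload.get("data", [])]
    # Assumption: ids may carry a tag, e.g. "llama3.2:latest".
    return any(i == model or i.startswith(model + ":") for i in ids)


def ollama_has_model(base_url: str = "http://localhost:11434/v1",
                     model: str = "llama3.2", timeout: float = 3.0) -> bool:
    """Return True if the server at base_url answers and lists the model."""
    try:
        with urllib.request.urlopen(base_url.rstrip("/") + "/models", timeout=timeout) as resp:
            return model_listed(json.load(resp), model)
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return False


if __name__ == "__main__":
    if ollama_has_model():
        print("Ollama is reachable and llama3.2 is available.")
    else:
        print("Ollama is not reachable or llama3.2 is missing; try `ollama pull llama3.2`.")
```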

Tip: Use the Local LLM Agent template for a pre-configured workflow with Ollama settings and conversation memory.

Step 8 (Optional): Access Local Files

Because the CLI runner executes on your machine, Code nodes can access local files:

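import os

project_dir = "/path/to/your/project"

# List the project directory on the runner's local filesystem
files = os.listdir(project_dir)

# Read the README; the with-block closes the file when done
with open(os.path.join(project_dir, "README.md"), encoding="utf-8") as f:
    content = f.read()

result = {"files": files, "readme": content}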

Add a Code node as a tool to your AI Agent, and the agent can read, analyze, and summarize files from your local filesystem.
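When the agent chooses which file to read, it is worth restricting the tool to one directory and capping how much text it returns. The following is a minimal sketch of such a guard, not part of the Tensorify API: read_project_file, its parameters, and the 100,000-byte default are all hypothetical choices for illustration.

```python
from pathlib import Path


def read_project_file(root: str, relative_path: str, max_bytes: int = 100_000) -> str:
    """Read a text file under `root`, refusing paths that escape it."""
    root_dir = Path(root).resolve()
    target = (root_dir / relative_path).resolve()
    try:
        # relative_to raises ValueError when target is outside root_dir.
        target.relative_to(root_dir)
    except ValueError:
        raise PermissionError(f"{relative_path!r} is outside {root_dir}") from None
    with open(target, "rb") as f:
        data = f.read(max_bytes)  # cap how much text is handed to the model
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as project:
        Path(project, "README.md").write_text("# Demo\n", encoding="utf-8")
        print(read_project_file(project, "README.md"))
        try:
            read_project_file(project, "../outside.txt")
        except PermissionError as err:
            print("Blocked:", err)
```

Resolving both paths before comparing them means symlinks and ".." segments cannot be used to step outside the project directory.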

What's Next?

Process Local Files with an AI Agent — Build a codebase assistant
Deploy as an OpenAI Endpoint — Expose your workflow as an OpenAI-compatible API
Build a RAG System — Add vector search for document Q&A
Agent Memory — Add conversation memory to your agent