Self-Host an AI Agent on Your Machine

Deploy an AI agent that runs entirely on your own machine. Your data never leaves your infrastructure, you can use local LLMs (zero API costs), and the agent can access local files, databases, and services.

Uses: API Trigger, AI Agent, Window Memory, Return, CLI Runner
Time: ~15 minutes


Why Self-Host?

  • Privacy — Data stays on your machine. No third-party cloud processing.
  • Cost — Use Ollama with open-source models like Llama 3.2 for zero API costs.
  • Access — The agent can read local files, connect to localhost databases, and call internal services.
  • Control — Update models, prompts, and tools without redeploying infrastructure.

Prerequisites

  • A Tensorify workspace (sign up free)
  • A machine to run the agent (your laptop, a VPS, or a server)
  • (Optional) Ollama installed for local LLMs

Step 1: Build the Workflow

The fastest way to start is with the OpenAI Chatbot template:

  1. Go to Templates → AI Agents → OpenAI Chatbot
  2. Click Use template

This creates a workflow with:

  • API Trigger (openai-chat protocol) — exposes your agent as an OpenAI-compatible endpoint
  • AI Agent — processes messages with an LLM
  • Return — sends the response back

You can also build this from scratch: drag an API Trigger, AI Agent, and Return node onto the canvas and wire them together.

Configure the AI Agent

In the AI Agent settings:

  • Provider: openai (or custom for local LLMs; see Step 7)
  • Model: gpt-4o (or any OpenAI model)
  • System Prompt: Customize for your use case
  • Streaming: Enable for real-time token delivery

Add Environment Variables

Go to Settings → Environment Variables and add:

  • OPENAI_API_KEY — your OpenAI API key (skip if using a local LLM)

Step 2: Install the CLI Runner

On the machine where you want to run the agent:

curl -fsSL https://cli.tensorify.io/install | sh

Then initialize the runner:

tensorify init

This creates ~/.tensorify/config.json with your runner configuration.
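
To confirm that initialization worked, a short script can load the file and list its top-level keys. This is a minimal sketch: it assumes only the path given above and that the file is valid JSON. The config schema is not described in this guide, so the script does not rely on any particular field name, and `load_runner_config` is a hypothetical helper, not part of the CLI.

```python
import json
from pathlib import Path

# Path stated in this guide; the keys inside the file are not assumed.
CONFIG_PATH = Path.home() / ".tensorify" / "config.json"


def load_runner_config(path=CONFIG_PATH):
    """Return the parsed config, or None if the file is missing or not valid JSON."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return None


if __name__ == "__main__":
    config = load_runner_config()
    if config is None:
        print(f"No readable config at {CONFIG_PATH}; try running `tensorify init` again.")
    elif isinstance(config, dict):
        print("Top-level keys:", sorted(config))
    else:
        print("Config parsed, but it is not a JSON object:", type(config).__name__)
```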

Step 3: Authenticate

tensorify login

This opens a browser window to authenticate with your Tensorify account.

Step 4: Deploy to CLI

In the Tensorify dashboard:

  1. Open your workflow
  2. Click Deploy
  3. Set Execution Mode to CLI
  4. Choose your runner
  5. Click Deploy

Your workflow is now assigned to your CLI runner.

Step 5: Start the Runner

tensorify runner start

The runner connects to Tensorify via WebSocket and starts processing incoming requests. You'll see logs for each request in the terminal.

To run as a background service:

tensorify runner install

This installs the runner as a systemd service (Linux) or launchd agent (macOS) that starts automatically on boot.

Step 6: Call Your Agent

Your agent is now accessible via the OpenAI SDK. The URL is shown in your deployment dashboard.

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://triggers.tensorify.io/h/YOUR_HOOK_PATH",
    api_key="your-tensorify-api-key",
)

response = client.chat.completions.create(
    model="tensorify",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

for chunk in response:
    # Skip chunks that carry no choices or an empty delta.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
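
If you need the full reply as one string rather than printed tokens, the streamed deltas can be joined. The sketch below assumes chunks shaped like those the OpenAI SDK yields (`choices[0].delta.content`). `collect_stream` and `fake_chunk` are hypothetical helpers, and `fake_chunk` exists only so the logic can be tried without a network call.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text deltas from a chat-completions stream into one string."""
    parts = []
    for chunk in chunks:
        # Some chunks carry no choices or an empty delta (for example the last one).
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)


def fake_chunk(text):
    """Hypothetical stand-in shaped like an SDK chunk, for offline testing only."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


if __name__ == "__main__":
    demo = [fake_chunk("Hel"), fake_chunk("lo"), fake_chunk(None), SimpleNamespace(choices=[])]
    print(collect_stream(demo))  # prints "Hello"
```

With a real client, pass the `response` object from the streaming call in place of `demo`.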

curl

curl -X POST https://triggers.tensorify.io/h/YOUR_HOOK_PATH/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-tensorify-api-key" \
  -d '{
    "model": "tensorify",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
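
The same request can be made without the OpenAI SDK. The following sketch mirrors the curl call using only the Python standard library. It rests on two assumptions: the endpoint returns an OpenAI-style non-streaming response (`choices[0].message.content`), and the API key is accepted as a Bearer token, which is how the OpenAI SDK transmits `api_key`. The function names are hypothetical.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, user_message):
    """Build (but do not send) a POST to <base_url>/chat/completions."""
    payload = {
        "model": "tensorify",
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the key is sent as a Bearer token, as the OpenAI SDK does.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send_chat_request(request, timeout=30):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # Assumption: OpenAI-compatible response shape.
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    req = build_chat_request(
        "https://triggers.tensorify.io/h/YOUR_HOOK_PATH",
        "your-tensorify-api-key",
        "Hello!",
    )
    print(req.get_method(), req.full_url)
```

Call `send_chat_request(req)` once the placeholder hook path and key are replaced with the values from your deployment dashboard.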

Step 7 (Optional): Use a Local LLM

To eliminate cloud API costs entirely, use Ollama:

  1. Install Ollama: curl -fsSL https://ollama.ai/install.sh | sh
  2. Pull a model: ollama pull llama3.2
  3. In your AI Agent settings, change:
    • Provider: custom
    • Custom Base URL: http://localhost:11434/v1
    • Model: llama3.2
  4. Remove OPENAI_API_KEY from environment variables (no longer needed)

With this setup, model inference runs entirely on your machine: prompts and responses are never sent to a third-party LLM provider, and there are no API costs.

Tip: Use the Local LLM Agent template for a pre-configured workflow with Ollama settings and conversation memory.
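
Before switching the provider, it can help to confirm that Ollama is running and that the model has been pulled. The sketch below queries Ollama's native `/api/tags` endpoint on the default port used above. The helper names are hypothetical, and the tag-matching rule (compare the part before the colon) is an assumption about how you want names matched.

```python
import json
import urllib.request


def list_ollama_models(host="http://localhost:11434", timeout=3):
    """Return the names of locally pulled models, or None if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=timeout) as response:
            data = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers connection failures; ValueError covers malformed JSON.
        return None
    return [model.get("name", "") for model in data.get("models", [])]


def has_model(names, wanted="llama3.2"):
    """True if any pulled model matches `wanted`, ignoring the tag suffix."""
    return any(name.split(":")[0] == wanted for name in names)


if __name__ == "__main__":
    names = list_ollama_models()
    if names is None:
        print("Ollama is not reachable on localhost:11434. Is it running?")
    elif has_model(names):
        print("llama3.2 is available locally.")
    else:
        print("llama3.2 not found. Run: ollama pull llama3.2")
```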

Step 8 (Optional): Access Local Files

Because the CLI runner executes on your machine, Code nodes can access local files:

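import os

project_dir = "/path/to/your/project"

# List the directory, then read the README with a context manager so the file is closed.
files = os.listdir(project_dir)
with open(os.path.join(project_dir, "README.md"), encoding="utf-8") as f:
    content = f.read()

# Expose the listing and the README text as the result.
result = {"files": files, "readme": content}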

Add a Code node as a tool to your AI Agent, and the agent can read, analyze, and summarize files from your local filesystem.
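
When an agent chooses which file to read, it is worth limiting what it can reach. The following sketch restricts reads to one project directory and caps the amount of text returned. It is an illustration under stated assumptions: `read_project_file` and the size cap are hypothetical, how a Code node receives its inputs is not covered in this guide, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path

MAX_BYTES = 64_000  # Arbitrary cap so large files do not flood the prompt.


def read_project_file(root, relative_path, max_bytes=MAX_BYTES):
    """Read a text file under `root`, refusing paths that escape it."""
    root_dir = Path(root).resolve()
    target = (root_dir / relative_path).resolve()
    # Reject paths such as "../../etc/passwd" that resolve outside the root.
    if not target.is_relative_to(root_dir):
        raise ValueError(f"{relative_path!r} is outside {root_dir}")
    with target.open("rb") as handle:
        data = handle.read(max_bytes + 1)
    truncated = len(data) > max_bytes
    text = data[:max_bytes].decode("utf-8", errors="replace")
    return {"path": str(target), "content": text, "truncated": truncated}


if __name__ == "__main__":
    try:
        info = read_project_file("/path/to/your/project", "README.md")
        print(info["path"], "truncated:", info["truncated"])
    except (OSError, ValueError) as error:
        print("Could not read file:", error)
```

Returning a `truncated` flag lets the agent tell the user when it has only seen part of a file.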
