Qdrant Memory

Vector store memory for AI agents — stores conversation history in Qdrant and retrieves semantically relevant past context. Combines a recent message window with vector similarity search for hybrid recall.

When to Use

Use Qdrant Memory when:

  • The agent needs to recall information from conversations that happened days or weeks ago
  • You need semantic search ("find messages where the user talked about billing") rather than just recent messages
  • Building long-running assistants, knowledge workers, or support agents with persistent context
  • The conversation window alone isn't enough — the agent forgets important context from earlier

For short conversations or prototyping, Window Memory is simpler.

Configuration

Qdrant Memory is a sub-node — it connects to an AI Agent's Memory handle.

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| qdrantUrl | string | http://localhost:6333 | Qdrant server URL. Supports bindings. Use Qdrant Cloud or self-hosted. |
| collectionName | string | tensorify_memory | Qdrant collection name. Auto-created on first use if it doesn't exist. |
| embeddingModel | select | text-embedding-3-small | OpenAI embedding model for vectorizing messages. Choices: text-embedding-3-small (fast), text-embedding-3-large (accurate), text-embedding-ada-002 (legacy). |
| topK | number | 5 | Number of semantically similar past messages to retrieve. |
| recentWindow | number | 4 | Number of most recent messages to always include (for immediate context). |
| sessionKey | string | | Scope key for per-user memory. Use bindings like user:{{ input.body.user_id }}. |
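
To make the sessionKey binding concrete, here is a minimal sketch of how a template such as user:{{ input.body.user_id }} could expand into a per-user scope key. The resolve_binding helper and the shape of the context dictionary are hypothetical and only illustrate the idea; the platform's actual binding engine may behave differently.

```python
import re

# Hypothetical pattern for a dotted-path binding such as {{ input.body.user_id }}.
_BINDING = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def resolve_binding(template: str, context: dict) -> str:
    """Replace each {{ dotted.path }} with the value found in a nested dict."""

    def lookup(match: re.Match) -> str:
        value = context
        for part in match.group(1).split("."):
            value = value[part]  # raises KeyError if the path is missing
        return str(value)

    return _BINDING.sub(lookup, template)


# Example request context (assumed shape, for illustration only).
request_context = {"input": {"body": {"user_id": "u-42"}}}
session_key = resolve_binding("user:{{ input.body.user_id }}", request_context)
```

With this context, each user ends up with a distinct key, which is what keeps one user's stored messages out of another user's recall.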

Required Secrets

| Environment Variable | Description |
| --- | --- |
| OPENAI_API_KEY | Required for generating text embeddings via OpenAI's embedding API. |
| QDRANT_API_KEY | Optional. Required if your Qdrant instance uses API key authentication (e.g. Qdrant Cloud). |
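
As an illustration of how these two secrets are typically used, the sketch below builds the HTTP headers for each service from an environment mapping. OpenAI expects a bearer token in the Authorization header, and Qdrant accepts its key in an api-key header. The build_auth_headers function is a hypothetical helper, not part of the plugin.

```python
import os
from typing import Mapping


def build_auth_headers(env: Mapping[str, str] = os.environ):
    """Return (openai_headers, qdrant_headers) built from environment secrets."""
    openai_key = env.get("OPENAI_API_KEY")
    if not openai_key:
        # Embeddings cannot be generated without this key.
        raise RuntimeError("OPENAI_API_KEY is required for embeddings")
    openai_headers = {"Authorization": f"Bearer {openai_key}"}

    # The Qdrant key is optional: an unauthenticated local instance needs no header.
    qdrant_headers = {}
    qdrant_key = env.get("QDRANT_API_KEY")
    if qdrant_key:
        qdrant_headers["api-key"] = qdrant_key
    return openai_headers, qdrant_headers
```

Passing a plain dictionary instead of os.environ makes the helper easy to exercise without touching real secrets.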

How It Works

Qdrant Memory uses a hybrid retrieval strategy:

  1. Load phase (when the agent starts processing):

    • Retrieves the recentWindow most recent messages (chronological order)
    • Embeds the current user message using the configured embedding model
    • Searches Qdrant for the topK most semantically similar past messages
    • Deduplicates and merges both sets, returning them as conversation context
  2. Save phase (after the agent responds):

    • Embeds the new user message and assistant response
    • Stores them in Qdrant with metadata (session key, timestamp, role)

This means the agent always has immediate context (recent messages) plus relevant historical context (semantic matches), even if the relevant information is hundreds of messages old.
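
The load phase can be sketched in plain Python. This is an in-memory illustration of the hybrid strategy, not the plugin's implementation: the StoredMessage type, the load_context function, the use of cosine similarity, and the choice to return the merged result in chronological order are all assumptions. In the real node, embeddings come from the configured OpenAI model and the similarity search runs inside Qdrant.

```python
import math
from dataclasses import dataclass


@dataclass
class StoredMessage:
    id: int
    session_key: str
    role: str
    text: str
    timestamp: float
    vector: list  # embedding of `text` (toy values in this sketch)


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def load_context(store, query_vector, session_key, top_k=5, recent_window=4):
    """Merge the most recent messages with the most similar ones."""
    # Only consider messages that belong to this session.
    scoped = [m for m in store if m.session_key == session_key]

    # Recent window: the last N messages by timestamp.
    by_time = sorted(scoped, key=lambda m: m.timestamp)
    recent = by_time[-recent_window:] if recent_window > 0 else []

    # Semantic matches: the top K messages by similarity to the query.
    similar = sorted(
        scoped, key=lambda m: cosine(m.vector, query_vector), reverse=True
    )[:top_k]

    # Deduplicate by id, then return in chronological order (an assumption).
    merged = {m.id: m for m in recent}
    for m in similar:
        merged.setdefault(m.id, m)
    return sorted(merged.values(), key=lambda m: m.timestamp)
```

A message that is both recent and semantically similar appears only once in the result, so the context holds at most topK + recentWindow messages.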

Example

Build a support agent with long-term memory:

  1. Add an AI Agent node with a customer support system prompt
  2. Add a Qdrant Memory node
  3. Connect Qdrant Memory to the agent's Memory handle
  4. Set qdrantUrl to your Qdrant Cloud instance URL
  5. Set sessionKey to customer:{{ webhook.body.customer_id }}
  6. Set topK to 5 and recentWindow to 4

The agent will remember recent messages AND recall relevant past interactions — even if the customer hasn't contacted you in weeks.

Common Gotchas

  • Qdrant must be running: The plugin connects to Qdrant via HTTP. If the server is unreachable, the agent falls back to no memory (it won't crash, but it won't have context).
  • Embedding costs: Every load() and save() call makes embedding API requests. With text-embedding-3-small, this is very affordable (~$0.02 per million tokens).
  • Collection auto-creation: The collection is created automatically on first use with the correct vector dimensions for your chosen embedding model.
  • Session key required for multi-user: Without a session key, all users share the same memory space. Always set a session key in production.
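
The collection and session-key points above can be made concrete with a sketch of the request payloads involved. The vector sizes are the default output dimensions of the three OpenAI models. The PUT /collections/{name} body and the filter structure follow Qdrant's REST API, but the Cosine distance metric, the session_key payload field name, and both helper functions are assumptions for illustration. Nothing here sends a request.

```python
# Default output dimensions of the supported OpenAI embedding models.
EMBEDDING_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}


def create_collection_request(qdrant_url, collection_name, embedding_model):
    """Build (method, url, body) for creating a collection sized to the model."""
    size = EMBEDDING_DIMENSIONS[embedding_model]
    url = f"{qdrant_url.rstrip('/')}/collections/{collection_name}"
    # "Cosine" is an assumed metric; the plugin may use a different one.
    body = {"vectors": {"size": size, "distance": "Cosine"}}
    return "PUT", url, body


def session_filter(session_key):
    """Build a Qdrant filter that restricts a search to one session.

    The payload field name "session_key" is hypothetical.
    """
    return {"must": [{"key": "session_key", "match": {"value": session_key}}]}
```

Because the vector size is fixed when a collection is created, switching embeddingModel to one with a different dimension generally means using a new collectionName rather than reusing the old collection.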
